PHP Syllable splitter/counter and Hyphenator for text and HTML. Multi-language, customisable, cached and fast!
Version 1.7
Copyright © 2011-2023 Martijn van der Lee. MIT Open Source license applies.
PHP Syllable splitting and hyphenation. Or rather... PHP Syl-la-ble split-ting and hy-phen-ation.
Based on the work by Frank M. Liang (http://www.tug.org/docs/liang/) and the many volunteers in the TeX community.
Many languages are supported, e.g. English (US/UK), Spanish, German, French, Dutch, Italian, Romanian and Russian; 76 languages in total.
Language sources: http://tug.org/tex-hyphen/#languages
Supports PHP 5.6 and up, so you can use it on older servers.
Install phpSyllable via Composer
composer require vanderlee/syllable
or simply add phpSyllable to your project and set up the project's autoloader for phpSyllable's src/ directory.
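If Composer is not available, a small PSR-4 style autoloader is one way to load the classes. The sketch below assumes that the Vanderlee\Syllable namespace maps directly onto the files in src/ (one class per file, named after the class) and that the package was copied to a hypothetical lib/phpSyllable folder. The helper name syllableClassToPath is made up for this example; adjust all of these to your project.

```php
<?php
// Hypothetical location of the copied package; adjust to your project.
$syllableSrcDir = __DIR__ . '/lib/phpSyllable/src';

/**
 * Map a class name in the Vanderlee\Syllable namespace to a file below $srcDir.
 * Returns null for classes from other namespaces.
 * Assumption: PSR-4 layout, one class per file.
 */
function syllableClassToPath($class, $srcDir)
{
    $prefix = 'Vanderlee\\Syllable\\';
    if (strncmp($class, $prefix, strlen($prefix)) !== 0) {
        return null;
    }
    $relative = substr($class, strlen($prefix));

    return rtrim($srcDir, '/') . '/' . str_replace('\\', '/', $relative) . '.php';
}

// Load phpSyllable classes on demand; other classes are left to other autoloaders.
spl_autoload_register(function ($class) use ($syllableSrcDir) {
    $path = syllableClassToPath($class, $syllableSrcDir);
    if ($path !== null && is_file($path)) {
        require $path;
    }
});
```

Keeping the path mapping in its own function makes it easy to check without touching the file system.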
Instantiate a Syllable object and start hyphenation.
Minimal example:
$syllable = new \Vanderlee\Syllable\Syllable('en-us');
echo $syllable->hyphenateText('Provide a plethora of paragraphs');
Extended example:
use Vanderlee\Syllable\Syllable;
use Vanderlee\Syllable\Hyphen;
// Globally set the directory where Syllable can store cache files.
// By default, this is the cache/ folder in this package, but usually
// you want to have the folder outside the package. Note that the cache
// folder must be created beforehand.
Syllable::setCacheDir(__DIR__ . '/cache');
// Globally set the directory where the .tex files are stored.
// By default, this is the languages/ folder of this package and
// usually does not need to be adapted.
Syllable::setLanguageDir(__DIR__ . '/languages');
// Create a new instance for the language.
$syllable = new Syllable('en-us');
// Set the style of the hyphen. In this case it is the "-" character.
// By default, it is the soft hyphen "­".
$syllable->setHyphen(new Hyphen\Dash());
// Set the minimum word length required for hyphenation.
// By default, all words are hyphenated.
$syllable->setMinWordLength(5);
// Output hyphenated text ..
echo $syllable->hyphenateText('Provide your own paragraphs...');
// .. or hyphenated HTML.
echo $syllable->hyphenateHtmlText('<b>... with highlighted text.</b>');
See the demo.php file for a working example.
Syllable API reference
The following describes the API of the main Syllable class. In most cases, you will not use any other functions. Browse the code under src/ for all available functions.
Create a new Syllable class, with defaults.
Set the directory where compiled language files may be stored. Defaults to the cache subdirectory of the current directory.
Set the character encoding to use. Specify null encoding to not apply any encoding at all.
Set the directory where language source files can be found. Defaults to the languages subdirectory of the current directory.
Set the language whose rules will be used for hyphenation.
Set the hyphen text or object to use as a hyphen marker.
Get the current hyphen object.
Words need to contain at least this many characters to be hyphenated.
Options to use for HTML parsing by libxml. See: https://www.php.net/manual/de/libxml.constants.php.
Exclude all elements.
Add one or more elements to exclude from HTML.
Add one or more elements with attributes to exclude from HTML.
Add one or more xpath queries to exclude from HTML.
Add one or more elements to include from HTML.
Add one or more elements with attributes to include from HTML.
Add one or more xpath queries to include from HTML.
Split a single word on where the hyphenation would go. Punctuation is not supported, only simple words. For parsing whole sentences please use Syllable::splitWords() or Syllable::splitText().
Split a text into an array of punctuation marks and words, splitting each word on where the hyphenation would go.
Split a text on where the hyphenation would go.
Hyphenate a single word.
Hyphenate all words in the plain text.
Hyphenate all readable text in the HTML, excluding HTML tags and attributes. Deprecated: Use the UTF-8 capable hyphenateHtmlText() instead. This method is kept only for backward compatibility and will be removed in the next major version 2.0.
Hyphenate all readable text in the HTML, excluding HTML tags and attributes. This method is UTF-8 capable and should be preferred over hyphenateHtml().
Count the number of syllables in the text and return a map with syllable count as key and number of words for that syllable count as the value.
Count the number of words in the text.
Count the number of syllables in the text.
Count the number of polysyllables in the text.
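To make the shape of the histogram described above concrete: it is a map with syllable counts as keys and word counts as values. The following sketch builds such a map from per-word syllable counts that are made up for illustration. It does not call phpSyllable and is not the library's implementation; the helper name buildSyllableHistogram is hypothetical.

```php
<?php
/**
 * Build a map of syllable count => number of words with that count.
 * $syllablesPerWord holds one integer per word.
 */
function buildSyllableHistogram(array $syllablesPerWord)
{
    $histogram = array();
    foreach ($syllablesPerWord as $count) {
        if (!isset($histogram[$count])) {
            $histogram[$count] = 0;
        }
        $histogram[$count]++;
    }
    ksort($histogram); // order by syllable count for readability

    return $histogram;
}

// Made-up counts for a five-word text:
// two words with 1 syllable, two with 2 syllables, one with 3 syllables.
$histogram = buildSyllableHistogram(array(2, 1, 3, 2, 1));
// $histogram is array(1 => 2, 2 => 2, 3 => 1)

// Both totals can be derived from the map.
$words = array_sum($histogram); // 5 words
$syllables = 0;
foreach ($histogram as $count => $wordCount) {
    $syllables += $count * $wordCount; // 1*2 + 2*2 + 3*1 = 9 syllables
}
```

Because the word and syllable totals follow directly from the map, a single histogram call may be enough when several of these figures are needed for the same text.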
Run
composer dump-autoload --dev
./build/update-language-files
to fetch the latest language files from their remote sources. Optionally, you can use environment variables to customize the update process:
Specify the absolute path of the configuration file where the language files to be downloaded are defined. The configuration file has the following format:
{
    "files": [
        {
            "_comment": "<comment>",
            "fromUrl": "<absolute-remote-file-url>",
            "toPath": "<relative-local-file-path>",
            "disabled": <true|false>
        }
    ]
}
where the attributes are self-explanatory and _comment and disabled are optional. See for example build/update-language-files.json.
Default: The build/update-language-files.json file of this package.
Specify the maximum number of URL redirects allowed when retrieving a language file.
Default: 1.
Create (1) or skip (0) a Git commit from the updated language files.
Default: 0.
LOG_LEVEL sets the verbosity of the script: verbose (6), warnings and errors (4), errors only (3) or silent (0).
Default: 6.
For example use
composer dump-autoload --dev
LOG_LEVEL=0 ./build/update-language-files
to silently run the script without outputting any logging.
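The configuration file format described above can also be produced and checked from PHP. The following is a minimal sketch, not part of this package: the helper name validateLanguageFileConfig, the URL and the target path are placeholders chosen for illustration, and the checks only cover what the format description states (fromUrl and toPath required, _comment and disabled optional).

```php
<?php
/**
 * Return a list of problems found in a decoded configuration array.
 * An empty list means the structure matches the documented format.
 */
function validateLanguageFileConfig($config)
{
    if (!is_array($config) || !isset($config['files']) || !is_array($config['files'])) {
        return array('Missing "files" list.');
    }

    $errors = array();
    foreach ($config['files'] as $index => $file) {
        foreach (array('fromUrl', 'toPath') as $required) {
            if (!isset($file[$required]) || !is_string($file[$required]) || $file[$required] === '') {
                $errors[] = sprintf('files[%s]: "%s" is required.', $index, $required);
            }
        }
        if (isset($file['disabled']) && !is_bool($file['disabled'])) {
            $errors[] = sprintf('files[%s]: "disabled" must be true or false.', $index);
        }
    }

    return $errors;
}

$config = array(
    'files' => array(
        array(
            '_comment' => 'Example entry',
            'fromUrl'  => 'https://example.org/hyph-en-us.tex', // placeholder URL
            'toPath'   => 'languages/hyph-en-us.tex',            // placeholder path
            'disabled' => false,
        ),
    ),
);

$errors = validateLanguageFileConfig($config);

// Encode to JSON; save this string to a file of your choice and point the
// update script at the absolute path of that file.
$json = json_encode($config, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
```

Validating before writing the file keeps malformed entries from reaching the update script.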
Run
composer dump-autoload --dev
./build/generate-docs
to update the API documentation in this README.md. This should be done when the Syllable class has been modified. Optionally, you can use environment variables to modify the documentation update process:
Create (1) or skip (0) a Git commit from the adapted files.
Default: 0.
Set the verbosity of the script to verbose (6), warnings and errors (4), errors only (3) or silent (0).
Default: 6.
Run
composer dump-autoload --dev
./build/create-release
to create a local release of the project by adding a changelog to this README.md. Optionally, you can use environment variables to modify the release process:
Set the release type to major (0), minor (1) or patch (2) release.
Default: 2.
Create (1) or skip (0) a Git commit from the adapted files and apply the release tag.
Default: 0.
Set the verbosity of the script to verbose (6), warnings and errors (4), errors only (3) or silent (0).
Default: 6.
Run
composer install
./vendor/bin/phpunit
to execute the tests.
1.7
1.6
1.5.5
1.5.4
1.5.3
1.5.2
1.5.1
1.5
1.4.6
setMinWordLength($length) and getMinWordLength() to limit hyphenation to words of a minimum length.
1.4.5
1.4.4
1.4.3
1.4.2
1.4.1
1.4
1.3.1
1.3
array histogramText($text), integer countWordsText($text) and integer countPolysyllableText($text) methods.
1.2