diff options
Diffstat (limited to 'intl/hyphenation/hyphen/README.compound')
-rw-r--r-- | intl/hyphenation/hyphen/README.compound | 87 |
1 files changed, 87 insertions, 0 deletions
diff --git a/intl/hyphenation/hyphen/README.compound b/intl/hyphenation/hyphen/README.compound new file mode 100644 index 000000000..bcb265853 --- /dev/null +++ b/intl/hyphenation/hyphen/README.compound @@ -0,0 +1,87 @@ +New option of Libhyphen 2.7: NOHYPHEN + +Hyphen, apostrophe and other characters may be word boundary characters, +but they don't need (extra) hyphenation. With NOHYPHEN option +it's possible to hyphenate the words parts correctly. + +Example: + +ISO8859-1 +NOHYPHEN -,' +1-1 +1'1 +NEXTLEVEL + +Description: + +1-1 and 1'1 declare hyphen and apostrophe as word boundary characters +and NOHYPHEN with the comma separated character (or character sequence) +list forbid the (extra) hyphens at the hyphen and apostrophe characters. + +Implicite NOHYPHEN declaration + +Without explicite NEXTLEVEL declaration, Hyphen 2.8 uses the +previous settings, plus in UTF-8 encoding, endash (U+2013) and +typographical apostrophe (U+2019) are NOHYPHEN characters, too. + +It's possible to enlarge the hyphenation distance from these +NOHYPHEN characters by using COMPOUNDLEFTHYPHENMIN and +COMPOUNDRIGHTHYPHENMIN attributes. + +Compound word hyphenation + +Hyphen library supports better compound word hyphenation and special +rules of compound word hyphenation of German languages and other +languages with arbitrary number of compound words. The new options, +COMPOUNDLEFTHYPHENMIN and COMPOUNDRIGHTHYPHENMIN help to set the right +style for the hyphenation of compound words. + +Algorithm + +The algorithm is an extension of the original pattern based hyphenation +algorithm. It uses two hyphenation pattern sets, defined in the same +pattern file and separated by the NEXTLEVEL keyword. First pattern +set is for hyphenation only at compound word boundaries, the second one +is for hyphenation within words or word parts. + +Recursive compound level hyphenation + +The algorithm is recursive: every word parts of a successful +first (compound) level hyphenation will be rehyphenated +by the same (first) pattern set. + +Finally, when first level hyphenation is not possible, Hyphen uses +the second level hyphenation for the word or the word parts. + +Word endings and word parts + +Patterns for word endings (patterns with ellipses) match the +word parts, too. + +Options + +COMPOUNDLEFTHYPHENMIN: min. hyph. dist. from the left compound word boundary +COMPOUNDRIGHTHYPHENMIN: min. hyph. dist. from the right comp. word boundary +NEXTLEVEL: sign second level hyphenation patterns + +Default hyphenmin values + +Default values of COMPOUNDLEFTHYPHENMIN and COMPOUNDRIGHTHYPHENMIN are 0, +and 0 under the hyphenation, too. ("0" values of +LEFTHYPHENMIN and RIGHTHYPHENMIN mean the default "2" under the hyphenation.) + +Examples + +See tests/compound* test files. + +Preparation of hyphenation patterns + +It hasn't been special pattern generator tool for compound hyphenation +patterns, yet. It is possible to use PATGEN to generate both of +pattern sets, concatenate it manually and set the requested HYPHENMIN values. +(But don't forget the preprocessing steps by substrings.pl before +concatenation.) One of the disadvantage of this method, that PATGEN +doesn't know recursive compound hyphenation of Hyphen. + +László Németh +<nemeth (at) openoffice.org> |