diff --git a/intl/hyphenation/hyphen/README.compound b/intl/hyphenation/hyphen/README.compound
new file mode 100644
index 000000000..bcb265853
--- /dev/null
+++ b/intl/hyphenation/hyphen/README.compound
@@ -0,0 +1,87 @@
+New option of Libhyphen 2.7: NOHYPHEN
+
+The hyphen, the apostrophe and other characters may be word boundary
+characters, but they don't need an (extra) hyphen of their own. With the
+NOHYPHEN option it's possible to hyphenate the word parts around them correctly.
+
+Example:
+
+ISO8859-1
+NOHYPHEN -,'
+1-1
+1'1
+NEXTLEVEL
+
+Description:
+
+The patterns 1-1 and 1'1 declare the hyphen and the apostrophe as word boundary
+characters, and NOHYPHEN, with its comma-separated list of characters (or
+character sequences), forbids (extra) hyphens at the hyphen and apostrophe.
+
+Implicit NOHYPHEN declaration
+
+Without an explicit NEXTLEVEL declaration, Hyphen 2.8 uses the
+previous settings; in addition, in UTF-8 encoding, the en dash (U+2013) and
+the typographical apostrophe (U+2019) are NOHYPHEN characters, too.
+
+It's possible to enlarge the hyphenation distance from these
+NOHYPHEN characters by using the COMPOUNDLEFTHYPHENMIN and
+COMPOUNDRIGHTHYPHENMIN options.
+
+Compound word hyphenation
+
+The Hyphen library supports better compound word hyphenation, including the
+special compound word hyphenation rules of Germanic languages and other
+languages with an arbitrary number of compound words. The new options,
+COMPOUNDLEFTHYPHENMIN and COMPOUNDRIGHTHYPHENMIN, help to set the right
+style for the hyphenation of compound words.
+
+Algorithm
+
+The algorithm is an extension of the original pattern-based hyphenation
+algorithm. It uses two hyphenation pattern sets, defined in the same
+pattern file and separated by the NEXTLEVEL keyword. The first pattern
+set is for hyphenation only at compound word boundaries; the second one
+is for hyphenation within words or word parts.
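+
+As a rough sketch of this layout (the patterns below are made up for
+illustration and are not taken from any shipped dictionary), a two-level
+pattern file could look like this:
+
+ISO8859-1
+1motor1
+1verseny1
+NEXTLEVEL
+o1t
+r1s
+
+Here the lines before NEXTLEVEL form the first (compound boundary) set and
+the lines after it form the second (word-internal) set.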
+
+Recursive compound level hyphenation
+
+The algorithm is recursive: every word part produced by a successful
+first (compound) level hyphenation is rehyphenated
+by the same (first) pattern set.
+
+Finally, when first level hyphenation is not possible, Hyphen uses
+the second level hyphenation for the word or the word parts.
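+
+A hand-worked illustration of these steps, assuming the hypothetical first
+level patterns 1motor1 and 1verseny1, the hypothetical second level patterns
+o1t and r1s, and the input word "motorverseny" ("=" marks a compound
+boundary and "-" a word-internal break in this sketch only):
+
+  first level on "motorverseny":   motor=verseny
+  first level on "motor":          no further compound boundary
+  first level on "verseny":        no further compound boundary
+  second level on the word parts:  mo-tor, ver-seny
+
+The exact result also depends on the HYPHENMIN settings, which this sketch
+leaves out.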
+
+Word endings and word parts
+
+Patterns for word endings (patterns with dots) match the
+word parts, too.
+
+Options
+
+COMPOUNDLEFTHYPHENMIN: minimum hyphenation distance from the left compound word boundary
+COMPOUNDRIGHTHYPHENMIN: minimum hyphenation distance from the right compound word boundary
+NEXTLEVEL: marks the start of the second level hyphenation patterns
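+
+For illustration, each option is written on its own line after the encoding
+line, as a keyword followed by its value. The values and patterns below are
+arbitrary and chosen only for this sketch:
+
+ISO8859-1
+LEFTHYPHENMIN 2
+RIGHTHYPHENMIN 3
+COMPOUNDLEFTHYPHENMIN 2
+COMPOUNDRIGHTHYPHENMIN 3
+1motor1
+1verseny1
+NEXTLEVEL
+o1t
+r1s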
+
+Default hyphenmin values
+
+The default values of COMPOUNDLEFTHYPHENMIN and COMPOUNDRIGHTHYPHENMIN are 0,
+and they are used as 0 during hyphenation, too. (By contrast, "0" values of
+LEFTHYPHENMIN and RIGHTHYPHENMIN mean the default "2" during hyphenation.)
+
+Examples
+
+See the tests/compound* test files.
+
+Preparation of hyphenation patterns
+
+There is no special pattern generator tool for compound hyphenation
+patterns yet. It is possible to use PATGEN to generate both
+pattern sets, concatenate them manually and set the requested HYPHENMIN values.
+(But don't forget the preprocessing step with substrings.pl before
+concatenation.) One disadvantage of this method is that PATGEN
+doesn't know about the recursive compound hyphenation of Hyphen.
+
+László Németh
+<nemeth (at) openoffice.org>