Hypherator
June 3, 2025
Hypherator is a lightweight, MIT-licensed hyphenation library for Java, based on Hunspell-style hyphenation patterns. It provides a simple iterator-based API for hyphenating words, which makes it easy to integrate into tokenizer-style workflows (like ICU4J).
Features
- Hunspell-style hyphenation rules (used by LibreOffice and OpenOffice)
- Lightweight and pure Java: no native code or JNI required
- Iterator-based API: works well in streaming and text-processing pipelines
- Multi-language: bundled with a broad collection of hyphenation dictionaries
- MIT license: free for commercial and open-source use
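For readers who have not used a boundary-style iterator before, the sketch below shows the general `first()` / `next()` / `DONE` loop shape. It uses the JDK's own `java.text.BreakIterator` as a stand-in, not Hypherator itself, and the class name `BoundaryLoopDemo` is made up for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: demonstrates the boundary-iterator loop with the JDK's
// BreakIterator. Hypherator's classes are not used here.
final class BoundaryLoopDemo {

    /** Splits text into the segments found between consecutive word boundaries. */
    static List<String> segments(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.US);
        it.setText(text);

        List<String> parts = new ArrayList<>();
        int start = it.first();
        // next() returns the following boundary offset, or DONE when none is left.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            parts.add(text.substring(start, end));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(segments("Hello world"));
    }
}
```

The same loop structure (fetch the first position, process it, advance until a sentinel value is returned) is what lets an iterator of this kind slot into a streaming pipeline without materializing every result up front.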
Installation
Maven
<dependency>
    <groupId>io.sevcik</groupId>
    <artifactId>hypherator</artifactId>
    <version>1.0</version>
</dependency>
Gradle
implementation("io.sevcik:hypherator:1.0")
Usage
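// DONE is assumed to be a statically imported sentinel constant;
// this snippet does not show where it is declared.
String word = "typography";
HyphenIterator iterator = Hypherator.getInstance("en_US");
iterator.setWord(word);

// Walk every potential break until the iterator reports DONE.
var potentialBreak = iterator.first();
while (potentialBreak != DONE) {
    // Split the word at the current break into its two halves.
    var parts = iterator.applyBreak(potentialBreak);
    System.out.println(parts.getFirst() + " - " + parts.getSecond());
    potentialBreak = iterator.next();
}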
This will print possible hyphenation points like:
ty - pography
typog - raphy
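Once the break points are known, a common next step is to mark them in the word with soft hyphens (U+00AD) so that a renderer can break the line there. The sketch below is independent of Hypherator: it takes plain character offsets, and the class `SoftHyphens` and the method `insertSoftHyphens` are hypothetical names. The offsets 2 and 5 correspond to the two splits printed above ("ty" and "typog").

```java
import java.util.List;

// Illustrative helper, not part of Hypherator: works on plain character offsets.
final class SoftHyphens {

    /** Unicode SOFT HYPHEN: normally invisible unless a line is broken at it. */
    static final char SOFT_HYPHEN = '\u00AD';

    /**
     * Returns the word with a soft hyphen inserted at each offset.
     * Offsets must be strictly ascending and lie strictly inside the word.
     */
    static String insertSoftHyphens(String word, List<Integer> breakOffsets) {
        StringBuilder out = new StringBuilder(word.length() + breakOffsets.size());
        int previous = 0;
        for (int offset : breakOffsets) {
            if (offset <= previous || offset >= word.length()) {
                throw new IllegalArgumentException("Invalid break offset: " + offset);
            }
            // Copy the characters up to the break, then mark the break.
            out.append(word, previous, offset).append(SOFT_HYPHEN);
            previous = offset;
        }
        // Copy whatever follows the last break.
        return out.append(word, previous, word.length()).toString();
    }

    public static void main(String[] args) {
        String marked = insertSoftHyphens("typography", List.of(2, 5));
        // Show the soft hyphens as visible hyphens for demonstration.
        System.out.println(marked.replace(SOFT_HYPHEN, '-')); // prints "ty-pog-raphy"
    }
}
```

Keeping this step separate from the hyphenation itself means the same break offsets can feed different outputs, for example soft hyphens for HTML or explicit line splits in a layout engine.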
Included dictionaries
Compatible LibreOffice hyphenation dictionaries are bundled directly, so Hypherator works out of the box for many languages with no extra setup required.
License
MIT License. See LICENSE file for more information.
Sponsored by pdf365.cloud.
Hypherator is developed and maintained as part of the pdf365.cloud project, a professional-grade PDF generator for structured and multilingual content.