Hypherator

June 3, 2025


Hypherator is a lightweight, MIT-licensed hyphenation library for Java, based on Hunspell-style hyphenation patterns. It provides a simple iterator-based API for hyphenating words, which makes it easy to integrate into tokenizer-style workflows (like ICU4J).
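For readers new to the tokenizer-style pattern mentioned above, here is a minimal sketch of that loop shape using the JDK's own `java.text.BreakIterator`. It does not use Hypherator at all, and the class name `BoundaryLoopSketch` and method `boundaries` are illustrative assumptions. It only shows the `first()` / `next()` / `DONE` idiom that such iterator APIs share.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BoundaryLoopSketch {
    // Collects every word-boundary offset the JDK iterator reports for the text.
    // This is the standard library's BreakIterator, not Hypherator.
    static List<Integer> boundaries(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.US);
        it.setText(text);
        List<Integer> offsets = new ArrayList<>();
        // first() returns the first boundary; next() advances until DONE.
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            offsets.add(pos);
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(boundaries("soft hyphens"));
    }
}
```

The loop never needs to know how many boundaries exist in advance, which is what makes this style convenient in streaming pipelines.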


✨ Features

  • 🔤 Hunspell-style hyphenation rules (used by LibreOffice and OpenOffice)
  • ⚙️ Lightweight & pure Java – no native code or JNI required
  • 🔁 Iterator-based API – works great in streaming/text processing pipelines
  • 🌍 Multi-language – bundled with a broad collection of hyphenation dictionaries
  • 🆓 MIT license – free for commercial and open-source use

🚀 Installation

Maven

<dependency>
    <groupId>io.sevcik</groupId>
    <artifactId>hypherator</artifactId>
    <version>1.0</version>
</dependency>

Gradle

implementation("io.sevcik:hypherator:1.0")

Usage

        String word = "typography";
        HyphenIterator iterator = Hypherator.getInstance("en_US");
        iterator.setWord(word);
        // Walk every potential break until the iterator reports DONE.
        var potentialBreak = iterator.first();
        while (potentialBreak != DONE) {
            // Split the word at the current break into its two halves.
            var parts = iterator.applyBreak(potentialBreak);
            System.out.println(parts.getFirst() + " - " + parts.getSecond());
            potentialBreak = iterator.next();
        }

This will print possible hyphenation points like:

ty - pography
typog - raphy

Included dictionaries

Compatible LibreOffice hyphenation dictionaries are bundled directly, so Hypherator works out of the box for many languages with no extra setup.

License

MIT License. See LICENSE file for more information.

Hypherator is developed and maintained as part of the pdf365.cloud project, a professional-grade PDF generator for structured and multilingual content.