Korean Word-IPA Dictionary

November 12, 2025

1. Getting the List of Word Entries

From the latest Kowiktionary dump, I extracted the title of every entry in the main namespace. I then filtered out all entries that are not written in Hangul and stored the remaining Korean word entries in the file kodict_entry.txt.
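The Hangul filter could look roughly like the sketch below. This is not the script used to build kodict_entry.txt. The names `is_hangul` and `filter_hangul` are hypothetical, and the sketch assumes that "written in Hangul" means every character is a precomposed syllable in the Unicode Hangul Syllables block (U+AC00 to U+D7A3).

```python
import re

# Assumption: an entry is "written in Hangul" only if every character
# is a precomposed syllable in the range U+AC00..U+D7A3.
HANGUL_ONLY = re.compile(r"[\uAC00-\uD7A3]+")


def is_hangul(title: str) -> bool:
    """Return True if the whole title consists of Hangul syllables."""
    return HANGUL_ONLY.fullmatch(title) is not None


def filter_hangul(titles):
    """Keep only the titles written entirely in Hangul."""
    return [t for t in titles if is_hangul(t)]


if __name__ == "__main__":
    sample = ["사전", "dictionary", "한글", "漢字", "한글a"]
    for title in filter_hangul(sample):
        print(title)
```

Entries containing spaces, hyphens, or standalone jamo would be dropped under this assumption; widening the character class would keep them.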

2. Crawling

I split kodict_entry.txt into 11 subsets of 6000 words each (except the last one) and ran crawl.py on all of them simultaneously to extract IPA information, forming a word-IPA dictionary for the Korean language. Once all crawling processes had completed, I merged the results in alphabetical order and deleted the entries with no extracted IPA.
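The splitting and merging steps can be sketched as follows. This is only an illustration, not the code in crawl.py. The helpers `split_into_chunks` and `merge_results` are hypothetical, each partial result is assumed to be a plain word-to-IPA mapping, and "alphabetical order" is assumed to mean sorting by Unicode code point.

```python
def split_into_chunks(words, size=6000):
    """Split the word list into consecutive chunks of at most `size` words."""
    return [words[i:i + size] for i in range(0, len(words), size)]


def merge_results(partial_dicts):
    """Merge per-chunk word -> IPA mappings, drop empty IPA, sort by word."""
    merged = {}
    for part in partial_dicts:
        for word, ipa in part.items():
            if ipa:  # skip entries for which no IPA was extracted
                merged[word] = ipa
    # Assumption: "alphabetical" means code-point order of the headword.
    return dict(sorted(merged.items()))


if __name__ == "__main__":
    chunks = split_into_chunks(["나", "가", "다", "라", "마"], size=2)
    print(len(chunks))  # number of subsets to crawl in parallel
    print(merge_results([{"나": "na"}, {"가": "ka", "다": ""}]))
```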

3. Converting IPA to X-SAMPA

You can convert any word-IPA dictionary file into a word-X-SAMPA dictionary:

from convert import Converter

conv = Converter()
conv.subst_dict(<NAME_OF_DICT>)
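To show what such a conversion involves, here is a minimal sketch of a symbol-by-symbol IPA to X-SAMPA substitution. It is not the implementation in convert.py. The table `IPA_TO_XSAMPA` and the function `ipa_to_xsampa` are hypothetical names, and the table covers only a handful of symbols that differ between the two notations.

```python
# Partial mapping for illustration only; a full converter needs many more
# symbols and diacritics. Symbols not listed are passed through unchanged.
IPA_TO_XSAMPA = {
    "ɯ": "M",
    "ʌ": "V",
    "ɛ": "E",
    "ŋ": "N",
    "ɕ": "s\\",
    "ɾ": "4",
    "ɡ": "g",   # IPA script g (U+0261) to ASCII g
    "ʰ": "_h",  # aspiration
    "ː": ":",   # length mark
}


def ipa_to_xsampa(ipa: str) -> str:
    """Replace each IPA character with its X-SAMPA equivalent, if known."""
    return "".join(IPA_TO_XSAMPA.get(ch, ch) for ch in ipa)


if __name__ == "__main__":
    print(ipa_to_xsampa("kʰɯŋ"))
```

A per-character lookup is enough here because each listed IPA symbol is a single code point; multi-character sequences such as tie-barred affricates would need longest-match handling.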

4. Licenses

The outputs of the scripts (i.e., the .dict files and kodict_entry.txt) are available under CC BY-SA. The scripts themselves are available under the MIT License.