Open Chinese Convert 開放中文轉換

May 10, 2026

Introduction 介紹

OpenCC

Open Chinese Convert (OpenCC, 開放中文轉換) is an open-source project for converting between Traditional Chinese, Simplified Chinese, and Japanese Kanji (Shinjitai). It supports character-level and phrase-level conversion, character variant conversion, and regional idioms among Mainland China, Taiwan, and Hong Kong. It is not a translation tool between languages such as Mandarin and Cantonese.

中文簡繁轉換開源項目,支持詞彙級別的轉換、異體字轉換和地區習慣用詞轉換(中國大陸、台灣、香港、日本新字體)。不提供普通話與粵語的轉換。

Discussion (Telegram): https://t.me/open_chinese_convert

Features 特點

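  • Strictly distinguishes "one Simplified character to many Traditional characters" from "one Simplified character to many variant forms".
  • Fully compatible with variant characters, which can be substituted dynamically.
  • One-to-many Simplified-to-Traditional entries are strictly reviewed under the principle "split wherever possible, never merge" (能分則不合).
  • Supports variant characters and regional idioms for Mainland China, Taiwan, and Hong Kong, such as 「裏」/「裡」 and 「鼠標」/「滑鼠」.
  • Dictionaries are fully separated from the library, so they can be freely modified, imported, and extended.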

Installation 安裝

Package Managers 包管理器

Prebuilt 預編譯

Usage 使用

Online 線上轉換

https://opencc.js.org/converter?config=s2t

Node.js

npm install opencc

The npm package supports Node.js >=20.17 <26. It uses bundled Node-API prebuilds when available and falls back to a local node-gyp build when the current platform does not have a matching prebuild.

To install the npm CLI:

npm install -g opencc
opencc -c s2t.json -i input.txt -o output.txt

The npm CLI supports basic text conversion. Plugins, --inspect, and --segmentation require the native OpenCC CLI.

import { OpenCC } from 'opencc';

async function main() {
  // Load the Simplified-to-Traditional configuration.
  const converter: OpenCC = new OpenCC('s2t.json');
  const result: string = await converter.convertPromise('汉字');
  console.log(result);  // 漢字
}

main().catch(console.error);

See demo.js and ts-demo.ts.

Python

pip install opencc (Windows, Linux, macOS)

import opencc
converter = opencc.OpenCC('s2t.json')
converter.convert('汉字')  # 漢字

C++

#include "opencc.h"

int main() {
  const opencc::SimpleConverter converter("s2t.json");
  converter.Convert("汉字");  // 漢字
  return 0;
}

Full example with Bazel

C

#include <string.h>  /* strlen */

#include "opencc.h"

int main() {
  opencc_t opencc = opencc_open("s2t.json");
  const char* input = "汉字";
  char* converted = opencc_convert_utf8(opencc, input, strlen(input));  // 漢字
  opencc_convert_utf8_free(converted);
  opencc_close(opencc);
  return 0;
}

Full Document 完整文檔

Command Line

  • opencc --help
  • opencc_dict --help

Segmentation and Inspection Modes

OpenCC CLI supports two diagnostic modes that output JSON instead of converted text:

--segmentation — Output segmentation result only (no conversion):

echo "他只看了几行日志,就一叶知秋,猜到整个系统是数据库连接池出了问题" | opencc -c s2twp.json --segmentation
# {"input":"他只看了几行日志,就一叶知秋,猜到整个系统是数据库连接池出了问题","segments":["他","只看","了几行","日志",",就","一叶知秋",",猜到","整个","系统","是","数据库","连接池","出了","问题"]}

--inspect — Output full inspection result (segmentation + per-stage conversion + final output):

echo "他只看了几行日志,就一叶知秋,猜到整个系统是数据库连接池出了问题" | opencc -c s2twp.json --inspect
# {"input":"他只看了几行日志,就一叶知秋,猜到整个系统是数据库连接池出了问题","segments":["他","只看","了几行","日志",",就","一叶知秋",",猜到","整个","系统","是","数据库","连接池","出了","问题"],"stages":[{"index":1,"segments":["他","只看","了幾行","日誌",",就","一葉知秋",",猜到","整個","系統","是","數據庫","連接池","出了","問題"]},{"index":2,"segments":["他","只看","了幾行","日誌",",就","一葉知秋",",猜到","整個","系統","是","資料庫","連線池","出了","問題"]},{"index":3,"segments":["他","只看","了幾行","日誌",",就","一葉知秋",",猜到","整個","系統","是","資料庫","連線池","出了","問題"]}],"output":"他只看了幾行日誌,就一葉知秋,猜到整個系統是資料庫連線池出了問題"}

# Pretty-print with jq:
echo "他只看了几行日志,就一叶知秋,猜到整个系统是数据库连接池出了问题" | opencc -c s2twp.json --inspect | jq .

These modes are useful for diagnosing conversion issues:

  1. Use --segmentation to verify that the input is segmented as expected.
  2. Use --inspect to see which conversion stage produces an unexpected result.

Rules:

  • --segmentation and --inspect are mutually exclusive.
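
The --inspect output can be post-processed to see which stage changed each segment. The following is a minimal Python sketch, assuming the JSON has the shape shown in the sample above (`input`, `segments`, `stages[].index`, `stages[].segments`, `output`) and that every stage lists its segments in the same positions as the initial segmentation. The helper name `stage_changes` is hypothetical, and `SAMPLE` is hand-trimmed from the sample above to four segments rather than captured from a real run.

```python
import json

# Hand-trimmed from the --inspect sample above (four segments only);
# this is an illustration, not captured CLI output.
SAMPLE = """
{"input": "日志数据库连接池问题",
 "segments": ["日志", "数据库", "连接池", "问题"],
 "stages": [
   {"index": 1, "segments": ["日誌", "數據庫", "連接池", "問題"]},
   {"index": 2, "segments": ["日誌", "資料庫", "連線池", "問題"]},
   {"index": 3, "segments": ["日誌", "資料庫", "連線池", "問題"]}],
 "output": "日誌資料庫連線池問題"}
"""


def stage_changes(report):
    """Map segment position -> list of (stage index, before, after).

    Assumes each stage keeps the same number of segments, in the same
    order, as the initial segmentation.
    """
    changes = {}
    previous = report["segments"]
    for stage in report["stages"]:
        current = stage["segments"]
        for pos, (before, after) in enumerate(zip(previous, current)):
            if before != after:
                changes.setdefault(pos, []).append((stage["index"], before, after))
        previous = current
    return changes


report = json.loads(SAMPLE)
# In this sample the segments concatenate back to the input text.
assert "".join(report["segments"]) == report["input"]

for pos, steps in sorted(stage_changes(report).items()):
    for index, before, after in steps:
        print(f"segment {pos}: stage {index} changed {before} to {after}")
```

With real output, the JSON text would come from piping `opencc -c s2twp.json --inspect` into the script instead of using `SAMPLE`.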

Other Ports (Unofficial)

Configurations 配置文件

Built-in configuration files

  • s2t.json Simplified Chinese to Traditional Chinese (OpenCC Standard) / 簡體 → OpenCC 標準繁體
  • t2s.json Traditional Chinese (OpenCC Standard) to Simplified Chinese / OpenCC 標準繁體 → 簡體
  • s2tw.json Simplified Chinese to Traditional Chinese (Taiwan Standard) / 簡體 → 台灣正體
  • tw2s.json Traditional Chinese (Taiwan Standard) to Simplified Chinese / 台灣正體 → 簡體
  • s2hk.json Simplified Chinese to Traditional Chinese (Hong Kong variant) / 簡體 → 香港繁體
  • hk2s.json Traditional Chinese (Hong Kong variant) to Simplified Chinese / 香港繁體 → 簡體
  • s2twp.json Simplified Chinese to Traditional Chinese (Taiwan Standard) with Taiwanese idioms / 簡體 → 台灣正體,並轉換爲台灣常用詞彙
  • tw2sp.json Traditional Chinese (Taiwan Standard) to Simplified Chinese with Mainland Chinese idioms / 台灣正體 → 簡體,並轉換爲中國大陸常用詞彙
  • t2tw.json Traditional Chinese (OpenCC Standard) to Traditional Chinese (Taiwan Standard) / OpenCC 標準繁體 → 台灣正體
  • tw2t.json Traditional Chinese (Taiwan Standard) to Traditional Chinese (OpenCC Standard) / 台灣正體 → OpenCC 標準繁體
  • t2hk.json Traditional Chinese (OpenCC Standard) to Traditional Chinese (Hong Kong variant) / OpenCC 標準繁體 → 香港繁體
  • hk2t.json Traditional Chinese (Hong Kong variant) to Traditional Chinese (OpenCC Standard) / 香港繁體 → OpenCC 標準繁體
  • t2jp.json Traditional Chinese Characters (Kyūjitai) to New Japanese Kanji (Shinjitai) / OpenCC 標準繁體(日文舊字體)→ 日文新字體
  • jp2t.json New Japanese Kanji (Shinjitai) to Traditional Chinese Characters (Kyūjitai) / 日文新字體 → OpenCC 標準繁體(日文舊字體)
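
The file names in the list above appear to follow a `<source>2<target>[p].json` pattern, where a trailing `p` marks idiom (phrase) conversion. The Python sketch below decodes a name under that assumption. The pattern is inferred from the list rather than taken from a documented OpenCC API, and `VARIANTS` and `describe_config` are hypothetical names. Plugin-backed names such as s2t_jieba.json are deliberately rejected.

```python
# Variant codes inferred from the preset list above (an assumption,
# not part of any OpenCC API).
VARIANTS = {
    "s": "Simplified Chinese",
    "t": "Traditional Chinese (OpenCC Standard)",
    "tw": "Traditional Chinese (Taiwan Standard)",
    "hk": "Traditional Chinese (Hong Kong variant)",
    "jp": "New Japanese Kanji (Shinjitai)",
}


def describe_config(filename):
    """Return (source, target, with_idioms) for a preset config name."""
    if not filename.endswith(".json"):
        raise ValueError(f"not a JSON config name: {filename}")
    stem = filename[: -len(".json")]
    source, sep, target = stem.partition("2")
    if not sep:
        raise ValueError(f"missing '2' separator: {filename}")
    with_idioms = False
    # A trailing "p" means idiom conversion, unless the whole target
    # is itself a known code (for example "jp").
    if target not in VARIANTS and target.endswith("p"):
        target, with_idioms = target[:-1], True
    if source not in VARIANTS or target not in VARIANTS:
        raise ValueError(f"unrecognized config name: {filename}")
    return VARIANTS[source], VARIANTS[target], with_idioms


for name in ("s2t.json", "s2twp.json", "tw2sp.json", "t2jp.json"):
    print(name, describe_config(name))
```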

Specifying configuration files

Set the OPENCC_DATA_DIR environment variable to load configuration files from a specific directory:

OPENCC_DATA_DIR=/path/to/your/config/dir opencc --help
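
The same environment variable can be set when calling the CLI from a script. The Python sketch below assumes the `opencc` executable is on `PATH`, reads its input from standard input when no `-i` is given (as in the piped examples in this document), and writes the converted text to standard output. The helper names `opencc_env` and `convert_with_data_dir` are hypothetical.

```python
import os
import subprocess


def opencc_env(data_dir, base_env=None):
    """Return a copy of the environment with OPENCC_DATA_DIR set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENCC_DATA_DIR"] = data_dir
    return env


def convert_with_data_dir(text, config, data_dir, executable="opencc"):
    """Pipe text through the opencc CLI using configs from data_dir.

    Assumes the CLI reads stdin and writes stdout when -i and -o are
    omitted; raises CalledProcessError if opencc exits with an error.
    """
    completed = subprocess.run(
        [executable, "-c", config],
        input=text,
        capture_output=True,
        encoding="utf-8",
        env=opencc_env(data_dir),
        check=True,
    )
    return completed.stdout
```

A call such as `convert_with_data_dir("汉字", "s2t.json", "/path/to/your/config/dir")` runs the conversion without changing the environment of the calling process.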

Experimental Plugins 試驗性插件

OpenCC 現已支援外部 C++ 分詞插件。當前第一個插件為 opencc-jieba,可通過 s2t_jieba.json、s2tw_jieba.json、s2hk_jieba.json、s2twp_jieba.json、tw2sp_jieba.json 等插件配置啓用。

OpenCC now supports external C++ segmentation plugins. The first plugin is opencc-jieba, which can be enabled through plugin-backed configs such as s2t_jieba.json, s2tw_jieba.json, s2hk_jieba.json, s2twp_jieba.json, and tw2sp_jieba.json.

注意:

  • 該插件機制目前仍為試驗性功能。
  • jieba 插件是可選組件,預設 OpenCC 構建、Python 套件和 Node.js 套件都不要求它。
  • opencc-jieba 額外依賴 cppjieba 及其配套詞典資源,這些依賴僅在構建或分發該插件時需要。
  • 在下一次正式發布版本之前,插件 ABI 仍可能發生變化,不應視為穩定介面。
  • 我們預計從下一次正式發布版本開始,將插件 ABI 視為穩定介面。
  • Windows 下插件必須與宿主 OpenCC 二進位使用 ABI 相容的工具鏈/執行時構建;MSVC 與 MinGW 產物不支援混用。

Notes:

  • The plugin mechanism is currently experimental.
  • The jieba plugin is optional and is not required for the default OpenCC build, Python package, or Node.js package.
  • opencc-jieba additionally depends on cppjieba and its dictionary resources. These dependencies are only needed when building or distributing the plugin itself.
  • The plugin ABI may still change before the next formal OpenCC release and should not yet be treated as stable.
  • We expect to treat the plugin ABI as stable starting with the next formal OpenCC release.
  • On Windows, plugins must be built with an ABI-compatible toolchain/runtime as the host OpenCC binary. Mixing MSVC-built hosts with MinGW-built plugins, or the reverse, is unsupported.

Build 編譯

Build with CMake

Linux & macOS

g++ 4.6+ or clang 3.2+ is required.

make

Windows Visual Studio:

build.cmd

Build with Bazel

bazel build //:opencc

Test 測試

Linux & macOS

make test

Windows Visual Studio:

test.cmd

Test with Bazel

bazel test --test_output=all //src/... //data/... //python/... //test/...

Benchmark 基準測試

make benchmark

See doc/benchmark.md for details.

Projects using OpenCC 使用 OpenCC 的項目

Please update this list if your project uses OpenCC.

License 許可協議

Apache License 2.0

Third Party Library 第三方庫

Change History 版本歷史

Contributors 貢獻者

Please feel free to update this list if you have contributed to OpenCC.