July 10, 2024
Symbol-LLM: Towards Foundational Symbol-centric Interface for Large Language Models
[Website] • [Paper] • [HF Models] • [HF Dataset] • [GitHub]
Repo for "Symbol-LLM: Towards Foundational Symbol-centric Interface for Large Language Models"
🔥 News
- [2024/05/16] 🔥🔥🔥 Symbol-LLM has been accepted to ACL 2024 (main conference)!
- [2023/12/28] 🔥🔥🔥 We release the Symbolic Collection (~880K) on 🤗 HuggingFace! Download it and try it out!
- [2023/10/08] 🔥🔥🔥 Model weights of Symbol-LLM are released on 🤗 HuggingFace!
- [2023/11/15] We make the Symbol-LLM paper public!
💡 Abstract
Detailed Abstract of Symbol-LLM
Although Large Language Models (LLMs) demonstrate remarkable ability in processing and generating human-like text, they are limited in comprehending and expressing world knowledge that extends beyond the boundaries of natural language (e.g., chemical molecular formulas). Injecting a collection of symbolic data directly into the training of LLMs can be problematic, as it disregards the synergies among different symbolic families and overlooks the need for a balanced mixture of natural and symbolic data. In this work, we tackle these challenges from both a data and a framework perspective and introduce the Symbol-LLM series of models. First, we curate a data collection consisting of 34 tasks and incorporating approximately 20 distinct symbolic families, intending to capture the interrelations and foster synergies between symbols. Then, a two-stage tuning framework injects symbolic knowledge without loss of general ability. Extensive experiments on both symbol- and NL-centric tasks demonstrate the balanced and superior performance of the Symbol-LLM series of models.
Quick Start
To try Symbol-LLM, load it with the Transformers library:
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Symbol-LLM/Symbol-LLM-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Symbol-LLM/Symbol-LLM-7B-Instruct")
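Once the tokenizer and model load, a completion can be produced with the standard `generate` API. The following is a minimal sketch, not the authors' inference script: the helper name `generate`, the `max_new_tokens` default, and greedy decoding are assumptions made for illustration, and this README does not specify a prompt template.

```python
MODEL_ID = "Symbol-LLM/Symbol-LLM-7B-Instruct"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Greedy-decode a continuation of `prompt` and return only the new text.

    Hypothetical helper: it reloads the weights on every call, so cache the
    tokenizer and model yourself if you call it repeatedly.
    """
    # Imported lazily so that defining this helper does not load transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")

    # do_sample=False selects greedy decoding (an assumption, not a recommendation).
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )

    # For decoder-only models the output contains prompt + continuation,
    # so slice off the prompt tokens before decoding.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate(...)` with a task instruction of your choice returns the decoded continuation as a string. Loading a 7B model this way needs substantial memory; passing a `torch_dtype` or `device_map` to `from_pretrained` is a common adjustment, left out here to keep the sketch minimal.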
To utilize our symbolic collection, please load the dataset:
from datasets import load_dataset
# If the dataset is gated/private, make sure you have run huggingface-cli login
dataset = load_dataset("Symbol-LLM/Symbolic_Collection")
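Before training on the collection it can help to check which splits and columns it exposes, since this README does not list them. The sketch below assumes `load_dataset` returns a `DatasetDict` (its usual behavior when no `split` is given); the helper names `summarize_splits` and `load_and_summarize` are hypothetical.

```python
def summarize_splits(dataset_dict) -> dict:
    """Map each split name to its row count and column names.

    Works on a datasets.DatasetDict or any mapping whose values expose
    `num_rows` and `column_names`.
    """
    return {
        name: {"rows": split.num_rows, "columns": list(split.column_names)}
        for name, split in dataset_dict.items()
    }


def load_and_summarize(repo_id: str = "Symbol-LLM/Symbolic_Collection") -> dict:
    """Download the dataset from the Hub and summarize its splits."""
    # Imported lazily so that summarize_splits can be used without datasets installed.
    from datasets import load_dataset

    return summarize_splits(load_dataset(repo_id))
```

Printing the result of `load_and_summarize()` shows the actual split names and fields, which is safer than assuming a particular schema.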
Deployed As A WebUI
The WebUI implementation is adapted from text-generation-webui. Launch it with:
cd demo-webui/
python server.py --model <model_name> --api --share --gpu-memory 40 40 --compute_dtype float32 --bf16
Note
The model weights and the symbolic collection have been released on HuggingFace (see News above). We will also open-source the code.
🔧 Repo Structure
This repo contains the training scripts and the demo deployment. The detailed structure is as follows:
.
├── README.md
├── logo.png
└── demo-webui
Citation
If you find it helpful, please kindly cite the paper.
@article{xu2023symbol,
title={Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models},
author={Xu, Fangzhi and Wu, Zhiyong and Sun, Qiushi and Ren, Siyu and Yuan, Fei and Yuan, Shuai and Lin, Qika and Qiao, Yu and Liu, Jun},
journal={arXiv preprint arXiv:2311.09278},
year={2023}
}