NLP4Code
April 25, 2023
Repository for the NLP4Code project at the LILY lab.
Installation
[Recommended] Create a virtualenv or conda environment:
conda create -n nlp4code python=3.8
conda activate nlp4code
Then, install the dependencies:
pip install -r requirements.txt
(Optional) If you run into a Python import problem at any point (e.g., ModuleNotFoundError), try running this in the main (NLP4Code) directory:
export PYTHONPATH=`pwd`
To run LLaMA-based models, you need to install the development version of the transformers library:
pip install git+https://github.com/huggingface/transformers
Wandb
We use Wandb for experiment tracking. Please register and ask Ansong for an invitation to the Wandb Yale-LILY team before running experiments. When you are ready to run experiments and log them to the cloud, do the following:
wandb login
Paste your API key and the login is complete. When you start running experiments, you should see something like:
wandb: Tracking run with wandb version 0.12.11
wandb: Run data is saved locally in /home/ansongni/Code/NLP4Code/wandb/run-20220309_150158-1ebacxm4
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run mathqa-gpt-finetuning
wandb: ⭐️ View project at https://wandb.ai/yale-lily/unified-codegen
wandb: 🚀 View run at https://wandb.ai/yale-lily/unified-codegen/runs/1ebacxm4
If you want to do some test runs without logging to the cloud, run wandb offline first as suggested above.
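As an alternative to the CLI command, offline mode can also be requested through the WANDB_MODE environment variable, which wandb reads when a run is initialized. The helper below is a minimal sketch; the function name use_offline_wandb is hypothetical and not part of this repository, and it assumes it is called before any run is created.

```python
import os


def use_offline_wandb(env=os.environ):
    """Ask wandb to keep run data local instead of syncing to the cloud.

    Hypothetical helper: it only sets WANDB_MODE, so it must be called
    before the training script initializes a wandb run.
    """
    env["WANDB_MODE"] = "offline"
    return env
```

Runs recorded this way stay in the local wandb directory.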
Naming of the experiments
In the configuration file, you should see a line like
default_root_dir: &exp_name results/mathqa-gpt_neo_1.3B-finetuning
The experiment name is taken automatically from the string after the /, and the tags for the experiment are generated
automatically by splitting that string on -. In this case, the experiment will be named mathqa-gpt_neo_1.3B-finetuning
and the tags will be ["mathqa", "gpt_neo_1.3B", "finetuning"]. Please follow this convention so that all of this only
needs to be written in one place.
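The convention can be illustrated with a short sketch. This is not the repository's actual implementation; the function name experiment_name_and_tags is hypothetical, and it assumes the name is the last path component of default_root_dir.

```python
from typing import List, Tuple


def experiment_name_and_tags(default_root_dir: str) -> Tuple[str, List[str]]:
    """Derive the experiment name and tags from a default_root_dir value.

    Assumption: the name is whatever follows the last "/" and the tags
    are the pieces of that name separated by "-".
    """
    name = default_root_dir.rstrip("/").split("/")[-1]
    tags = name.split("-")
    return name, tags


name, tags = experiment_name_and_tags("results/mathqa-gpt_neo_1.3B-finetuning")
print(name)  # mathqa-gpt_neo_1.3B-finetuning
print(tags)  # ['mathqa', 'gpt_neo_1.3B', 'finetuning']
```

Because tags are split on -, use _ rather than - inside a single tag (as in gpt_neo_1.3B).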
Fine-tuning
(If you are ready to run experiments, read the previous sections first.) For fine-tuning, run the following in the main directory:
python finetuning/trainer.py fit --config finetuning/training_configs/*.yaml
Testing
There are some basic tests in the tests folder. To run all of them, do:
python -m unittest discover <test_directory>
# or
python -m unittest discover -s <directory> -p '*_test.py'