LLM BugScanner

December 11, 2024

Overview

LLM BugScanner extends GPTLens by automating the identification and evaluation of potential vulnerabilities in code. It adds flexibility by allowing different Large Language Model (LLM) agents to be used. The tool pairs an auditor with a critic, which work together to find vulnerabilities and rank them by correctness and severity.

Features

Automated Vulnerability Detection: Automatically identifies potential vulnerabilities in code.
Flexible LLM Agents: Supports various LLM agents for auditing and critique.
Enhanced Accuracy: Ranks vulnerabilities based on correctness and severity scores.
User-Friendly: Easy to configure and extend.
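To make the ranking idea concrete, here is a minimal sketch of ordering findings by their critic scores. The `Finding` schema, the `rank_findings` helper, the sample findings, and the choice to simply add the two scores are all assumptions for illustration; the actual data format and weighting used by LLM BugScanner may differ.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One vulnerability reported by an auditor and scored by a critic (hypothetical schema)."""
    function_name: str
    description: str
    correctness: float  # critic's score for how likely the finding is real
    severity: float     # critic's score for how damaging the issue would be


def rank_findings(findings, top_k):
    """Return the top_k findings, highest combined score first.

    The combined score is a plain sum of the two scores; this weighting is an assumption.
    """
    ordered = sorted(findings, key=lambda f: f.correctness + f.severity, reverse=True)
    return ordered[:top_k]


# Made-up findings, used only to exercise the helper.
findings = [
    Finding("deposit", "unchecked return value", correctness=4, severity=3),
    Finding("withdraw", "reentrancy before balance update", correctness=9, severity=9),
    Finding("setOwner", "missing access control", correctness=7, severity=8),
]

top = rank_findings(findings, top_k=2)
print([f.function_name for f in top])
```

A sum treats both scores equally; a weighted sum or a correctness threshold followed by a severity sort would be equally reasonable choices.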

Setup

Follow these steps to set up the virtual environment, install dependencies, and run the example.

1. Clone the Repository

git clone https://github.com/Mayaaa311/GPTLens-2.0.git
cd GPTLens-2.0

2. Set Up Virtual Environment

Create and activate a virtual environment to manage dependencies.

python3 -m venv venv
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. Run the Example

python src/bugscanner_cli.py -a NTQAI/Nxcode-CQ-7B-orpo -c m-a-p/OpenCodeInterpreter-DS-6.7B -r NTQAI/Nxcode-CQ-7B-orpo -d data -o result_test_pipe -k 5 -log logs_nov3

5. Resume a Run

Rerun the same command, making sure the result file remains the same.

If you are running more than 10 data items, we recommend using the method in the run_batch folder.
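Resuming by rerunning only works if finished inputs can be recognized and skipped. The sketch below shows one way such a check could look. It is not the tool's actual logic: the `pending_inputs` helper, the one-result-file-per-input layout, the `.json` suffix, and the sample file names are all hypothetical.

```python
import tempfile
from pathlib import Path


def pending_inputs(data_dir, result_dir, result_suffix=".json"):
    """List input files in data_dir that have no matching result file yet.

    Assumes (hypothetically) one result file per input, named after the input's stem.
    """
    data_dir, result_dir = Path(data_dir), Path(result_dir)
    if result_dir.is_dir():
        done = {p.stem for p in result_dir.glob(f"*{result_suffix}")}
    else:
        done = set()  # nothing has been written yet, so everything is pending
    return sorted(p for p in data_dir.iterdir() if p.is_file() and p.stem not in done)


# Demonstrate with a throwaway directory tree instead of real project data.
with tempfile.TemporaryDirectory() as tmp:
    data, results = Path(tmp, "data"), Path(tmp, "results")
    data.mkdir()
    results.mkdir()
    for name in ("a.sol", "b.sol", "c.sol"):
        (data / name).write_text("// contract source")
    (results / "a.json").write_text("{}")  # pretend a.sol was already processed

    remaining = [p.name for p in pending_inputs(data, results)]
    print(remaining)
```

Under this layout, changing the result location between runs would make every input look unprocessed, which is one reason to keep it the same when resuming.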

Currently Tested Models:

AlfredPros/CodeLlama-7b-Instruct-Solidity	Huggingface_LLM	https://huggingface.co/AlfredPros/CodeLlama-7b-Instruct-Solidity	7B
m-a-p/OpenCodeInterpreter-DS-6.7B	Huggingface_LLM	https://huggingface.co/m-a-p/OpenCodeInterpreter-DS-6.7B	6.7B
NTQAI/Nxcode-CQ-7B-orpo	Huggingface_LLM	https://huggingface.co/NTQAI/Nxcode-CQ-7B-orpo	7B
Artigenz/Artigenz-Coder-DS-6.7B	Huggingface_LLM	https://huggingface.co/Artigenz/Artigenz-Coder-DS-6.7B	6.7B
bigcode/starcoder2-15b	Huggingface_LLM	https://huggingface.co/bigcode/starcoder2-15b	15B
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct	Huggingface_LLM	https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct	16B
Qwen/CodeQwen1.5-7B	pipeline_LLM	https://huggingface.co/Qwen/CodeQwen1.5-7B	7B
meta-llama/Llama-3.1-8B-Instruct	pipeline_LLM	https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct	8B
meta-llama/CodeLlama-7b-hf	pipeline_LLM	https://huggingface.co/meta-llama/CodeLlama-7b-hf	7B
WisdomShell/CodeShell-7B-Chat	Huggingface_LLM	https://huggingface.co/WisdomShell/CodeShell-7B-Chat	7B
THUDM/codegeex2-6b	Huggingface_LLM	https://huggingface.co/THUDM/codegeex2-6b	6B
google/codegemma-7b	gemma_LLM	https://huggingface.co/google/codegemma-7b	7B

Commands for testing

python src/bugscanner_cli.py -a AlfredPros/CodeLlama-7b-Instruct-Solidity -c m-a-p/OpenCodeInterpreter-DS-6.7B -r m-a-p/OpenCodeInterpreter-DS-6.7B -p m-a-p/OpenCodeInterpreter-DS-6.7B -d data_2 -o result_test_parser/trail3 -k 5 -log logs_oct1

Multiple auditors:

python src/bugscanner_cli.py -a AlfredPros/CodeLlama-7b-Instruct-Solidity m-a-p/OpenCodeInterpreter-DS-6.7B NTQAI/Nxcode-CQ-7B-orpo -c m-a-p/OpenCodeInterpreter-DS-6.7B -r NTQAI/Nxcode-CQ-7B-orpo -d data -o result_test_parser -k 5 -log logs_oct1

Single auditor:

python src/bugscanner_cli.py -a NTQAI/Nxcode-CQ-7B-orpo -c m-a-p/OpenCodeInterpreter-DS-6.7B -r NTQAI/Nxcode-CQ-7B-orpo -d data -o result_test_pipe -k 5 -log logs_nov3
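When launching several configurations from a script, it can help to assemble the command programmatically. The sketch below uses only the flags that appear in the commands above (`-a`, `-c`, `-r`, `-d`, `-o`, `-k`, `-log`). The `build_command` helper and its parameter names are hypothetical, and the meanings they suggest for each flag are inferred from the examples rather than taken from the CLI's documentation.

```python
import shlex


def build_command(auditors, critic, ranker, data_dir, out_dir, top_k, log_dir,
                  python="python"):
    """Assemble a bugscanner_cli.py argument list from the flags used in this README.

    Parameter names are guesses at what each flag means; only the flags
    themselves come from the example commands.
    """
    if not auditors:
        raise ValueError("at least one auditor model is required")
    return [
        python, "src/bugscanner_cli.py",
        "-a", *auditors,          # one or more auditor models
        "-c", critic,
        "-r", ranker,
        "-d", data_dir,
        "-o", out_dir,
        "-k", str(top_k),
        "-log", log_dir,
    ]


cmd = build_command(
    auditors=[
        "AlfredPros/CodeLlama-7b-Instruct-Solidity",
        "m-a-p/OpenCodeInterpreter-DS-6.7B",
        "NTQAI/Nxcode-CQ-7B-orpo",
    ],
    critic="m-a-p/OpenCodeInterpreter-DS-6.7B",
    ranker="NTQAI/Nxcode-CQ-7B-orpo",
    data_dir="data",
    out_dir="result_test_parser",
    top_k=5,
    log_dir="logs_oct1",
)

# Print the command as a shell line; to execute it, pass `cmd` to
# subprocess.run(cmd, check=True) from the repository root.
print(shlex.join(cmd))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if a path ever contains spaces.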

Batch submission (for the PACE environment only):

python src/1_run_batch_data.py

Batch evaluation (you need to manually change the directory to be evaluated):

python src/2_evaluate.py