LLM BugScanner

December 11, 2024

Overview

LLM BugScanner extends GPTLens by automating the identification and evaluation of potential vulnerabilities in code. It adds flexibility by allowing different Large Language Model (LLM) agents to be used. The tool pairs an auditor with a critic, which work together to find vulnerabilities and rank them by correctness and severity.
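The ranking step described above can be sketched roughly as follows. This is an illustration only, not the project's implementation: the `Finding` class, its fields, the 0-10 score scale, and the combined score (a plain sum of correctness and severity) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical record for one vulnerability reported by an auditor."""
    function_name: str
    description: str
    correctness: float  # critic's score: how likely the finding is real (assumed 0-10)
    severity: float     # critic's score: how damaging it would be (assumed 0-10)


def rank_findings(findings, top_k):
    """Sort by an assumed combined score (correctness + severity), keep the top k."""
    ranked = sorted(findings, key=lambda f: f.correctness + f.severity, reverse=True)
    return ranked[:top_k]


findings = [
    Finding("deposit", "unchecked arithmetic", 3, 4),
    Finding("withdraw", "reentrancy before balance update", 9, 9),
    Finding("setOwner", "missing access control", 8, 7),
]
top = rank_findings(findings, top_k=2)
print([f.function_name for f in top])  # ['withdraw', 'setOwner']
```

A real critic may weight the two scores differently or filter on a minimum correctness before ranking; the sum is used here only to keep the sketch short.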

Features

  • Automated Vulnerability Detection: Automatically identifies potential vulnerabilities in code.
  • Flexible LLM Agents: Supports various LLM agents for auditing and critique.
  • Enhanced Accuracy: Ranks vulnerabilities based on correctness and severity scores.
  • User-Friendly: Easy to configure and extend.

Setup

Follow these steps to set up the virtual environment, install dependencies, and run the example.

1. Clone the Repository

git clone https://github.com/Mayaaa311/GPTLens-2.0.git
cd GPTLens-2.0

2. Set Up Virtual Environment

Create and activate a virtual environment to manage dependencies.

python3 -m venv venv
source venv/bin/activate
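To confirm that the environment is active before installing dependencies, a small check like the one below can help. It is a generic Python sketch, not part of this repository; the helper name `in_virtualenv` is made up for the example.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
```

If this reports that no environment is active, run the activation command again in the current shell.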

3. Install Dependencies

pip install -r requirements.txt

4. Run the Example

python src/bugscanner_cli.py -a NTQAI/Nxcode-CQ-7B-orpo -c m-a-p/OpenCodeInterpreter-DS-6.7B -r NTQAI/Nxcode-CQ-7B-orpo -d data -o result_test_pipe -k 5 -log logs_nov3

5. Resume a Run

Rerun the same command and make sure the result file remains the same.

When running more than 10 data items, using the method in the run_batch folder is recommended.
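How the tool resumes internally is not documented here. As a general illustration of why the result file has to stay the same between runs, the sketch below shows one common checkpointing pattern: results are written after every item, and items already present in the file are skipped on the next run. The file name `results.json`, the function names, and the JSON layout are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_done(result_file):
    """Return results already stored in result_file, or an empty dict."""
    if result_file.exists():
        return json.loads(result_file.read_text())
    return {}


def run_with_resume(inputs, result_file, process):
    """Process only inputs without a stored result; save after every item."""
    results = load_done(result_file)
    processed_now = []
    for name in inputs:
        if name in results:
            continue  # finished in an earlier run, skip it
        results[name] = process(name)
        processed_now.append(name)
        result_file.write_text(json.dumps(results))  # checkpoint after each item
    return processed_now


def fake_scan(name):
    """Stand-in for the real (expensive) scan of one input file."""
    return {"findings": 0}


with tempfile.TemporaryDirectory() as tmp:
    result_file = Path(tmp) / "results.json"
    first = run_with_resume(["a.sol", "b.sol"], result_file, fake_scan)
    second = run_with_resume(["a.sol", "b.sol", "c.sol"], result_file, fake_scan)
    print(first, second)
```

In this pattern, pointing the second run at a different result file would discard the record of finished items and repeat all of the work.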

Currently Tested Models

| Model | LLM class | Link | Size |
| --- | --- | --- | --- |
| AlfredPros/CodeLlama-7b-Instruct-Solidity | Huggingface_LLM | https://huggingface.co/AlfredPros/CodeLlama-7b-Instruct-Solidity | 7B |
| m-a-p/OpenCodeInterpreter-DS-6.7B | Huggingface_LLM | https://huggingface.co/m-a-p/OpenCodeInterpreter-DS-6.7B | 6.7B |
| NTQAI/Nxcode-CQ-7B-orpo | Huggingface_LLM | https://huggingface.co/NTQAI/Nxcode-CQ-7B-orpo | 7B |
| Artigenz/Artigenz-Coder-DS-6.7B | Huggingface_LLM | https://huggingface.co/Artigenz/Artigenz-Coder-DS-6.7B | 6.7B |
| bigcode/starcoders2-15b | Huggingface_LLM | https://huggingface.co/bigcode/starcoders2-15b | 15B |
| deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct | Huggingface_LLM | https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct | 16B |
| Qwen/CodeQwen1.5-7B | pipeline_LLM | https://huggingface.co/Qwen/CodeQwen1.5-7B | 7B |
| meta-llama/Llama-3.1-8B-Instruct | pipeline_LLM | https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct | 8B |
| meta-llama/CodeLlama-7b-hf | pipeline_LLM | https://huggingface.co/meta-llama/CodeLlama-7b-hf | 7B |
| WisdomShell/CodeShell-7B-Chat | Huggingface_LLM | https://huggingface.co/WisdomShell/CodeShell-7B-Chat | 7B |
| THUDM/codegeex2-6b | Huggingface_LLM | https://huggingface.co/THUDM/codegeex2-6b | 6B |
| google/codegemma-7b | gemma_LLM | https://huggingface.co/google/codegemma-7b | 7B |

Commands for testing

python src/bugscanner_cli.py -a AlfredPros/CodeLlama-7b-Instruct-Solidity -c m-a-p/OpenCodeInterpreter-DS-6.7B -r m-a-p/OpenCodeInterpreter-DS-6.7B -p m-a-p/OpenCodeInterpreter-DS-6.7B -d data_2 -o result_test_parser/trail3 -k 5 -log logs_oct1

Multiple auditors:

python src/bugscanner_cli.py -a AlfredPros/CodeLlama-7b-Instruct-Solidity m-a-p/OpenCodeInterpreter-DS-6.7B NTQAI/Nxcode-CQ-7B-orpo -c m-a-p/OpenCodeInterpreter-DS-6.7B -r NTQAI/Nxcode-CQ-7B-orpo -d data -o result_test_parser -k 5 -log logs_oct1
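When the list of auditors changes often, it can be convenient to assemble the command from Python values instead of editing a long line by hand. The sketch below rebuilds the multi-auditor command shown above. The helper `build_command` and its parameter names are hypothetical, and the meaning assigned to each flag (`-c` as critic, `-r` as ranker, and so on) is an assumption read off the example commands rather than documented behavior.

```python
import shlex


def build_command(auditors, critic, ranker, data_dir, out_dir, top_k, log_dir):
    """Assemble an argument list for src/bugscanner_cli.py (flag meanings assumed)."""
    return [
        "python", "src/bugscanner_cli.py",
        "-a", *auditors,   # one or more auditor models, space-separated
        "-c", critic,
        "-r", ranker,
        "-d", data_dir,
        "-o", out_dir,
        "-k", str(top_k),
        "-log", log_dir,
    ]


cmd = build_command(
    auditors=[
        "AlfredPros/CodeLlama-7b-Instruct-Solidity",
        "m-a-p/OpenCodeInterpreter-DS-6.7B",
        "NTQAI/Nxcode-CQ-7B-orpo",
    ],
    critic="m-a-p/OpenCodeInterpreter-DS-6.7B",
    ranker="NTQAI/Nxcode-CQ-7B-orpo",
    data_dir="data",
    out_dir="result_test_parser",
    top_k=5,
    log_dir="logs_oct1",
)
print(shlex.join(cmd))  # prints the command line for inspection
```

The printed line can be pasted into a shell, or the list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root.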

Single auditor:

python src/bugscanner_cli.py -a NTQAI/Nxcode-CQ-7B-orpo -c m-a-p/OpenCodeInterpreter-DS-6.7B -r NTQAI/Nxcode-CQ-7B-orpo -d data -o result_test_pipe -k 5 -log logs_nov3

Batch submission (for the PACE environment only):

python src/1_run_batch_data.py

Batch evaluation (the directory to be evaluated must be changed manually first):

python src/2_evaluate.py