
coeoo/Vul-R2

 
 


Vul-R2

Vul-R2 Logo

(Logo generated by OpenAI-o4)


📅Dataset

We use PrimeVul and SVEN as our datasets.

🛠️Training

Installation

conda create -n logic python=3.9
conda activate logic
pip install datasets
pip install tiktoken
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
pip3 install vllm==0.6.3 ray
pip3 install flash-attn --no-build-isolation
pip install -e .  # For verl integration
pip install wandb IPython matplotlib
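
To confirm that the pinned versions above were actually installed into the active environment, a small check along these lines may help. It is a sketch and not part of the repository: `EXPECTED` and `check_environment` are hypothetical names, and the pins are copied from the installation commands.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the installation commands; None means "any version".
# This mapping is illustrative and not shipped with the repository.
EXPECTED = {
    "torch": "2.4.0",
    "vllm": "0.6.3",
    "datasets": None,
    "tiktoken": None,
    "ray": None,
    "wandb": None,
}


def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, ok)} for each package."""
    report = {}
    for name, pinned in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (None, False)
            continue
        # Compare only the public part so "2.4.0+cu121" matches "2.4.0".
        ok = pinned is None or installed.split("+")[0] == pinned
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "OK" if ok else "CHECK"
        print(f"{name:10s} {installed or 'missing':18s} {status}")
```

Running the script inside the `logic` environment prints one line per package; any line marked `CHECK` points at a missing package or a version that differs from the pin.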

Run

cd Code/Train
chmod +x ./main_re++_3.sh
./main_re++_3.sh

We will release our entire training data and model later.

⚖️Evaluation

Environment

Our evaluation code requires Python 3 (>= 3.9) and a few dependencies. The major libraries are:

pip install pandas
pip install tree-sitter==0.22.3
pip install tree-sitter-c==0.21.0
pip install tree-sitter-cpp==0.21.0
pip install codebleu==0.7.0

Run

The script test.py computes the EM (Exact Match) and CodeBLEU metrics.

cd Code/Eval
python3 test.py path_to_result
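
For readers unfamiliar with the EM metric, the following is a minimal sketch of how an exact-match score can be computed. It is not the contents of test.py: the function names are hypothetical, and collapsing whitespace before comparison is an assumption made here so that formatting differences do not count as mismatches.

```python
def normalize(code: str) -> str:
    """Collapse whitespace runs to single spaces (an assumption of this sketch)."""
    return " ".join(code.split())


def exact_match(predictions, references):
    """Return (number of exact matches, EM as a percentage)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0, 0.0
    hits = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return hits, 100.0 * hits / len(references)


if __name__ == "__main__":
    preds = ["int a = 1;", "return x;"]
    refs = ["int  a = 1;", "return y;"]
    hits, em = exact_match(preds, refs)
    print(f"matches={hits} EM={em:.2f}")
```

A stricter variant would compare the raw strings without normalization; which one matches the reported numbers depends on the actual implementation in test.py.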

Vul-R2 Rebuttal Phase

Table s1: How many SFT data samples are synthesized before and after filtering?

| DataType | Category | # Sample |
| --- | --- | --- |
| PrimeVul | Before Filter | 3789 |
| PrimeVul | After Filter | 3156 |
| VulData+CodeData | - | 9433 |
| Vul+Code+Math | - | 14307 |

Table s2: How many samples are used in the easy and hard stages of RL?

| Stages | Easy RL | Hard RL |
| --- | --- | --- |
| # Sample | 4715 | 2403 |

Table s3: Experimental results of established tools on PrimeVul.

| Models | Success | EM |
| --- | --- | --- |
| VulRepair | 8 | 1.84 |
| Vulmaster | 20 | 4.59 |
| Semgrep | 16 | 3.68 |
| Vul-R2 | 108 | 24.83 |
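
The Success and EM columns appear to be related by a simple ratio. The sketch below checks this under the assumption that EM is the success count divided by a test-set size of 435, expressed as a percentage. The value 435 is inferred from the numbers in the table and is not stated in this README, and `N_TEST` and `em_percent` are hypothetical names.

```python
# Assumption: test-set size inferred from the table, not stated in the source.
N_TEST = 435

# (Success, reported EM) pairs copied from Table s3.
ROWS = {
    "VulRepair": (8, 1.84),
    "Vulmaster": (20, 4.59),
    "Semgrep": (16, 3.68),
    "Vul-R2": (108, 24.83),
}


def em_percent(success: int, total: int = N_TEST) -> float:
    """Exact-match rate as a percentage of the test set."""
    return 100.0 * success / total


if __name__ == "__main__":
    for model, (success, reported) in ROWS.items():
        computed = em_percent(success)
        print(f"{model:10s} computed={computed:.4f} reported={reported:.2f}")
```

Under this assumption every row agrees with the reported EM to within 0.01 percentage points, with the small differences attributable to rounding.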

Table s4: Experimental results of critic model selection after training for 200 steps in Hard RL.

| Models | Success | EM | CodeBLEU |
| --- | --- | --- | --- |
| Qwen-2.5-7B-Instruct | 97 | 22.30 | 45.64 |
| Qwen-2.5-14B-Instruct | 103 | 23.69 | 45.14 |

Table s5: Experimental results of variants of reward clipping.

| Methods | Success | EM | CodeBLEU |
| --- | --- | --- | --- |
| Reward clipping [-1,1] | 102 | 23.45 | 44.79 |
| Vul-R2 [-2,2] | 108 | 24.83 | 46.17 |
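
The two rows differ only in the interval to which rewards are clipped. As a minimal illustration of what clipping to [-1, 1] versus [-2, 2] means, the sketch below clamps a scalar reward to a given range. How the raw reward is computed is not described in this README, so `clip_reward` is a hypothetical helper and the sample values are made up for the demonstration.

```python
def clip_reward(reward: float, low: float = -2.0, high: float = 2.0) -> float:
    """Clamp a scalar reward to the closed interval [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(high, reward))


if __name__ == "__main__":
    # Illustrative raw rewards only; not taken from the training runs.
    raw_rewards = [-3.5, -1.2, 0.4, 1.7, 3.0]
    wide = [clip_reward(r) for r in raw_rewards]               # [-2, 2]
    narrow = [clip_reward(r, -1.0, 1.0) for r in raw_rewards]  # [-1, 1]
    print("clipped to [-2, 2]:", wide)
    print("clipped to [-1, 1]:", narrow)
```

With the wider interval, rewards between 1 and 2 in magnitude keep their original value instead of being flattened to the bound, so more of the reward signal's range is preserved.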

Figure s1: Case Study of Vul-R2


Figure s2: The detailed prompt in baselines.


Detailed questions and case prompts will be provided in:

 ./Reasoning_data/prompt.md

Manually checked cases:

 ./Reasoning_data/check_result.xlsx

Overview


Acknowledgements
