We use PrimeVul and SVEN as our datasets.
conda create -n logic python=3.9
conda activate logic
pip install datasets
pip install tiktoken
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
pip3 install vllm==0.6.3 ray
pip3 install flash-attn --no-build-isolation
pip install -e . # For verl integration
pip install wandb IPython matplotlib
cd Code/Train
chmod +x ./main_re++_3.sh
./main_re++_3.sh
We will release our entire training data and model later.
Our evaluation code requires Python 3 (>= 3.9) and a few dependencies. The major libraries are listed below:
pip install pandas
pip install tree-sitter==0.22.3
pip install tree-sitter-c==0.21.0
pip install tree-sitter-cpp==0.21.0
pip install codebleu==0.7.0
We provide the code that calculates the EM (Exact Match) and CodeBLEU metrics in the file test.py.
cd Code/Eval
python3 test.py path_to_result
Table s1: How many SFT data samples are synthesized before and after filtering?
| DataType | Category | # Sample |
|---|---|---|
| PrimeVul | Before Filter | 3789 |
| PrimeVul | After Filter | 3156 |
| VulData+CodeData | - | 9433 |
| Vul+Code+Math | - | 14307 |
Table s2: How many samples are used in the easy and hard stages of RL?
| Stages | Easy RL | Hard RL |
|---|---|---|
| # Sample | 4715 | 2403 |
Table s3: Experimental results of established tools on PrimeVul.
| Models | Success | EM |
|---|---|---|
| VulRepair | 8 | 1.84 |
| Vulmaster | 20 | 4.59 |
| Semgrep | 16 | 3.68 |
| Vul-R2 | 108 | 24.83 |
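The Success and EM columns above report a match count and a match rate. The sketch below shows one way such numbers could be computed. It is not the logic of test.py: the function names `normalize` and `exact_match` are hypothetical, and collapsing whitespace before comparison is an assumption rather than a documented step.

```python
from typing import List, Tuple


def normalize(code: str) -> str:
    """Collapse every run of whitespace into one space (assumed normalization)."""
    return " ".join(code.split())


def exact_match(predictions: List[str], references: List[str]) -> Tuple[int, float]:
    """Return (number of exact matches, match rate in percent)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    # A prediction counts as a success only if it equals its reference
    # after whitespace normalization.
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    rate = 100.0 * hits / len(references) if references else 0.0
    return hits, rate


# Hypothetical predictions and references, for illustration only.
preds = ["if (len > 0) {\n  free(p);\n}", "return x;"]
refs = ["if (len > 0) { free(p); }", "return y;"]
hits, rate = exact_match(preds, refs)
print(hits, rate)
```

Whitespace normalization keeps formatting-only differences from counting as failures. A stricter comparison would drop `normalize` and compare the raw strings.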
Table s4: Experimental results of critic model selection after training for 200 steps in Hard RL.
| Models | Success | EM | CodeBLEU |
|---|---|---|---|
| Qwen-2.5-7B-Instruct | 97 | 22.30 | 45.64 |
| Qwen-2.5-14B-Instruct | 103 | 23.69 | 45.14 |
Table s5: Experimental results of variants of reward clipping.
| Methods | Success | EM | CodeBLEU |
|---|---|---|---|
| Reward clipping [-1,1] | 102 | 23.45 | 44.79 |
| Vul-R2 [-2,2] | 108 | 24.83 | 46.17 |
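The two rows above differ only in the interval that rewards are clipped to. As a minimal sketch, assuming a scalar reward that is clipped independently of how it was computed, the operation might look like the following. The function name `clip_reward` and its defaults are hypothetical and are not taken from the training scripts.

```python
def clip_reward(reward: float, low: float = -2.0, high: float = 2.0) -> float:
    """Clamp a scalar reward into the closed interval [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(high, reward))


# Default bounds correspond to the [-2, 2] setting.
print(clip_reward(3.5))
# Explicit bounds correspond to the [-1, 1] variant.
print(clip_reward(1.5, low=-1.0, high=1.0))
```

A wider interval preserves more of the difference between strongly and weakly rewarded outputs, while a narrower one limits the influence of outliers.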
Figure s1: Case Study of Vul-R2
Figure s2: The detailed prompt in baselines.
Detailed questions and case prompts will be provided in:
./Reasoning_data/prompt.md
Manual Checking Cases:
./Reasoning_data/check_result.xlsx



