Vul-R2

(Logo generated by OpenAI-o4)

📅Dataset

We use PrimeVul and SVEN as our datasets.

🛠️Training

Installation

conda create -n logic python=3.9
conda activate logic  # install the packages below into the new environment
pip install datasets
pip install tiktoken
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
pip install vllm==0.6.3 ray
pip install flash-attn --no-build-isolation
pip install -e .  # For verl integration
pip install wandb IPython matplotlib

Run

cd Code/Train
chmod +x ./main_re++_3.sh
./main_re++_3.sh

We will release our entire training data and model later.

⚖️Evaluation

Environment

Our evaluation code requires Python 3 (>= 3.9). The main dependencies are:

pip install pandas
pip install tree-sitter==0.22.3
pip install tree-sitter-c==0.21.0
pip install tree-sitter-cpp==0.21.0
pip install codebleu==0.7.0

Run

The file test.py computes the EM (Exact Match) and CodeBLEU metrics.

cd Code/Eval
python3 test.py path_to_result
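
For readers unfamiliar with the EM metric, the following is a minimal sketch of how an exact-match rate could be computed. It is not the code in test.py: the function names `normalize` and `exact_match_rate` are hypothetical, and collapsing whitespace before comparison is an assumption made here for illustration.

```python
from typing import List, Tuple


def normalize(code: str) -> str:
    """Collapse every run of whitespace into one space (assumed normalization)."""
    return " ".join(code.split())


def exact_match_rate(predictions: List[str], references: List[str]) -> Tuple[int, float]:
    """Return (number of exact matches, EM as a percentage)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    # A prediction counts as a success only if it equals the reference
    # after whitespace normalization.
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    total = len(references)
    em = 100.0 * hits / total if total else 0.0
    return hits, em


if __name__ == "__main__":
    preds = ["a  = 1;", "b=2;"]
    refs = ["a = 1;", "b = 2;"]
    print(exact_match_rate(preds, refs))
```

Under this sketch, the Success column in the tables below would correspond to the match count and EM to the percentage.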

Vul-R2 Rebuttal Phase

Table s1: How many SFT data samples are synthesized before and after filtering?

DataType	Category	# Sample
PrimeVul	Before Filter	3789
PrimeVul	After Filter	3156
VulData+CodeData	-	9433
Vul+Code+Math	-	14307

Table s2: How many samples are used in the easy and hard stages of RL?

Stages	Easy RL	Hard RL
# Sample	4715	2403

Table s3: The experimental results of established tools on PrimeVul.

Models	Success	EM
VulRepair	8	1.84
Vulmaster	20	4.59
Semgrep	16	3.68
Vul-R2	108	24.83

Table s4: The experimental results of critic model selection in training for 200 steps in Hard RL.

Models	Success	EM	CodeBLEU
Qwen-2.5-7B-Instruct	97	22.30	45.64
Qwen-2.5-14B-Instruct	103	23.69	45.14

Table s5: The experimental results of variants of reward clipping.

Methods	Success	EM	CodeBLEU
Reward clipping [-1,1]	102	23.45	44.79
Vul-R2 [-2,2]	108	24.83	46.17
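
As an illustration of what the two rows of Table s5 compare, the sketch below clamps a scalar reward to a closed interval. Only the bounds [-1, 1] and [-2, 2] come from the table; the helper name `clip_reward` is hypothetical, and how the raw reward is computed during training is not covered here.

```python
def clip_reward(reward: float, low: float = -2.0, high: float = 2.0) -> float:
    """Clamp a scalar reward to [low, high]; the default bounds follow the [-2, 2] row."""
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(high, reward))


if __name__ == "__main__":
    raw = 1.5  # hypothetical raw reward value, for illustration only
    print(clip_reward(raw))             # wider range keeps the value unchanged
    print(clip_reward(raw, -1.0, 1.0))  # narrower range saturates at the bound
```

With the narrower range, any reward whose magnitude exceeds 1 is mapped to the same value, so differences between such rewards are no longer visible after clipping.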

Figure s1: Case Study of Vul-R2

Figure s2: The detailed prompt in baselines.

Detailed questions and case prompts will be provided in:

 ./Reasoning_data/prompt.md

Manual Checking Cases:

 ./Reasoning_data/check_result.xlsx

Overview

Acknowledgements

Verl 🔗
TinyZero 🔗
Logic-RL 🔗
