Qwen3-olympiad-math-thinking-2507

This model is a LoRA adapter for Qwen/Qwen3-4B-Thinking-2507, fine-tuned with the Unsloth framework on the hf-imo-colab/olympiads-ref-base-math-word dataset.

How to Use

Install dependencies (if not installed):

pip install transformers peft torch

Load the base model, attach the adapter, and generate:

from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
from peft import PeftModel
import torch

model_id = "SkyAsl/Qwen3-olympiad-math-thinking-2507"

tokenizer = AutoTokenizer.from_pretrained(model_id)

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Thinking-2507")
model = PeftModel.from_pretrained(base_model, model_id)

prompt = ""  # Put your math problem here.

input_text = (
    f"<|user|>\n{prompt}\n"
    f"<|assistant|>\n<think>\n"
)

inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

streamer = TextStreamer(tokenizer, skip_prompt=True)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=3000, # Recommended to keep max_new_tokens high because of the long CoT.
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        repetition_penalty=1.2,
        streamer=streamer
    )

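# The streamer has already printed the generated tokens; this decodes the full sequence (prompt included) into one string.
print(tokenizer.decode(output[0], skip_special_tokens=False))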
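Because the prompt ends with an open <think> tag, the generated text starts with the reasoning and, if generation finishes, continues with the final answer after </think>. A minimal sketch of separating the two follows; the helper name split_thinking and the sample string are hypothetical and not part of this repository.

```python
def split_thinking(generated: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split generated text into (reasoning, answer) at the first close tag.

    If the tag is missing (for example, generation stopped at max_new_tokens
    while still reasoning), all text is treated as reasoning and the answer is "".
    """
    reasoning, found, answer = generated.partition(close_tag)
    if not found:
        return generated.strip(), ""
    return reasoning.strip(), answer.strip()


# Hypothetical generated text, used only to illustrate the helper.
sample = "Let n be even, so n = 2k.\n</think>\nThe answer is 4."
reasoning, answer = split_thinking(sample)
print(answer)
```

If the text is decoded with skip_special_tokens=False, the answer part may still carry end-of-sequence tokens that need to be stripped separately.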

Training Details

Training format

The training format follows:

<|user|>
{prompt}
<|assistant|>
<think>
{solution}
</think>
{response}
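The template above can be sketched as two small string builders, one for inference prompts and one for full training texts. The function names are hypothetical, and whether the original training script appended an end-of-sequence token or trailing newline is not stated here, so none is added.

```python
def build_prompt(problem: str) -> str:
    """Inference prompt: ends with an open <think> tag for the model to continue."""
    return f"<|user|>\n{problem}\n<|assistant|>\n<think>\n"


def build_training_text(problem: str, solution: str, response: str) -> str:
    """Full training text: the prompt, then the reasoning, close tag, and final response."""
    return build_prompt(problem) + f"{solution}\n</think>\n{response}"


# Hypothetical example values, for illustration only.
text = build_training_text("What is 1 + 1?", "1 + 1 = 2.", "2")
print(text)
```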

🛠 Hyperparameters

Setting	Value
Method	LoRA (PEFT)
Rank (r)	16
Alpha	32
LoRA Dropout	0.05
Max Length	4096
Batch Size	4
Gradient Accumulation Steps	8
Learning Rate	2e-4
Optimizer	adamw_torch_fused
Scheduler	cosine
Epochs	2
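Two derived quantities follow from the table: the effective batch size per optimizer step and the LoRA scaling factor. The sketch below assumes the batch size is per device, that training ran on a single device, and that standard LoRA scaling (alpha / r) was used rather than a variant such as rsLoRA; none of these is confirmed above.

```python
# Values copied from the hyperparameter table.
lora_r = 16
lora_alpha = 32
batch_size = 4
grad_accum_steps = 8

# Assumption: single device; the table does not say how many were used.
num_devices = 1

# Sequences contributing to each optimizer step.
effective_batch_size = batch_size * grad_accum_steps * num_devices

# Standard LoRA multiplies the low-rank update by alpha / r.
lora_scale = lora_alpha / lora_r

print(effective_batch_size, lora_scale)
```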

Training Results

Epoch	Training Loss	Validation Loss
1	0.870800	0.886106
2	0.844400	0.871447
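If these losses are mean per-token cross-entropy in nats (the usual default for causal language model training, though it is an assumption here), they convert to perplexity by exponentiation. A short sketch:

```python
import math


def perplexity(loss: float) -> float:
    """Perplexity for a mean per-token cross-entropy loss given in nats."""
    return math.exp(loss)


# (training loss, validation loss) per epoch, copied from the table above.
results = {1: (0.870800, 0.886106), 2: (0.844400, 0.871447)}

for epoch, (train_loss, val_loss) in results.items():
    print(f"epoch {epoch}: train ppl {perplexity(train_loss):.3f}, "
          f"val ppl {perplexity(val_loss):.3f}")
```

Both losses drop from epoch 1 to epoch 2, and the gap between training and validation loss stays small.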

Dataset

hf-imo-colab/olympiads-ref-base-math-word

The dataset was randomly split into training and test sets:
train_size = 0.9
test_size = 0.1
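A random 90/10 split of this kind can be sketched in plain Python as below. The function name, the seed, and the placeholder examples are all hypothetical; the seed and tooling actually used are not stated. With the Hugging Face datasets library, Dataset.train_test_split(test_size=0.1) performs an equivalent split.

```python
import random


def random_split(items, test_size=0.1, seed=0):
    """Shuffle indices with a fixed seed, then cut off the first test_size fraction as the test set."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(items) * test_size)
    test = [items[i] for i in indices[:n_test]]
    train = [items[i] for i in indices[n_test:]]
    return train, test


# Placeholder examples standing in for dataset rows.
examples = [f"problem {i}" for i in range(100)]
train, test = random_split(examples)
print(len(train), len(test))
```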
