Qwen3-olympiad-math-thinking-2507
This model is a LoRA adapter for Qwen/Qwen3-4B-Thinking-2507, trained with the Unsloth framework on the hf-imo-colab/olympiads-ref-base-math-word dataset.
How to Use
Install dependencies (if not installed):
pip install transformers peft torch
Load the base model, attach the adapter, and generate:
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
from peft import PeftModel
import torch

model_id = "SkyAsl/Qwen3-olympiad-math-thinking-2507"

# Load the tokenizer from the adapter repo, then the base model, and attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(model_id)
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Thinking-2507")
model = PeftModel.from_pretrained(base_model, model_id)

prompt = ""  # Put your math problem here.

# Build the input in the same format that was used for training.
input_text = (
    f"<|user|>\n{prompt}\n"
    f"<|assistant|>\n<think>\n"
)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Stream tokens to stdout as they are generated, omitting the prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=3000,  # Keep this high: the chain of thought is long.
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        repetition_penalty=1.2,
        streamer=streamer,
    )

# The streamer has already printed the completion; this prints the full
# sequence (prompt included) with special tokens left in.
print(tokenizer.decode(output[0], skip_special_tokens=False))
Training Details
Training format
Each training example follows this template:
<|user|>
{prompt}
<|assistant|>
<think>
{solution}
</think>
{response}
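
The template above can be wrapped in a pair of small helpers: one that builds the inference prompt and one that separates the reasoning from the final response in a decoded completion. This is a minimal sketch rather than part of the released code. The names `build_prompt` and `split_reasoning` are hypothetical, and it assumes the model closes its reasoning with a literal `</think>` tag, as in the training format.

```python
def build_prompt(problem: str) -> str:
    """Format a problem like a training example, ending with an open
    <think> tag so the model continues with its reasoning."""
    return f"<|user|>\n{problem}\n<|assistant|>\n<think>\n"


def split_reasoning(completion: str) -> tuple[str, str]:
    """Split generated text into (reasoning, response) at the first </think>.

    If the tag is missing (for example, generation hit max_new_tokens),
    everything is treated as reasoning and the response is empty.
    """
    reasoning, sep, response = completion.partition("</think>")
    if not sep:
        return completion.strip(), ""
    return reasoning.strip(), response.strip()


# Example with a hand-written completion (not real model output).
reasoning, response = split_reasoning("2 + 2 = 4\n</think>\nThe answer is 4.")
```

An empty `response` is a useful signal that the token budget ran out before the model finished thinking.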
🛠 Hyperparameters
| Setting | Value |
|---|---|
| Method | LoRA (PEFT) |
| Rank (r) | 16 |
| Alpha | 32 |
| LoRA Dropout | 0.05 |
| Max Length | 4096 |
| Batch Size | 4 |
| Grad Accum | 8 |
| LR | 2e-4 |
| Optimizer | adamw_torch_fused |
| Scheduler | cosine |
| Epochs | 2 |
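
Two derived quantities follow from the table. The sketch below rests on assumptions the card does not state: standard LoRA scaling (alpha / r, not the rank-stabilized variant), a per-device batch size, and a single device.

```python
# Values taken from the hyperparameter table.
rank = 16
alpha = 32
batch_size = 4
grad_accum = 8
num_devices = 1  # assumption: the card does not say how many GPUs were used

# Standard LoRA multiplies the low-rank update by alpha / r.
lora_scale = alpha / rank

# Number of sequences contributing to each optimizer step.
effective_batch = batch_size * grad_accum * num_devices

print(lora_scale, effective_batch)  # prints: 2.0 32
```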
Training Results
| Epoch | Training Loss | Validation Loss |
|---|---|---|
| 1 | 0.870800 | 0.886106 |
| 2 | 0.844400 | 0.871447 |
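
If the reported losses are mean per-token cross-entropy in nats (the usual convention for causal language model training, but an assumption here), they can be converted to perplexity with `exp(loss)`. A short sketch:

```python
import math

# Validation losses from the table above, keyed by epoch.
val_loss = {1: 0.886106, 2: 0.871447}

# Perplexity = exp(mean cross-entropy), assuming the loss is in nats per token.
val_ppl = {epoch: math.exp(loss) for epoch, loss in val_loss.items()}

for epoch, ppl in val_ppl.items():
    print(f"epoch {epoch}: validation perplexity ~ {ppl:.2f}")
```

Under that assumption both epochs land near a perplexity of 2.4, with a small improvement from epoch 1 to epoch 2.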
Dataset
hf-imo-colab/olympiads-ref-base-math-word
- The dataset was randomly split into train and test sets:
- train_size = 0.9
- test_size = 0.1
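
A 90/10 random split can be sketched with the standard library as follows. This only illustrates the idea: the card does not say which tool or seed was used, so `random_split` and `seed=0` are hypothetical. With the `datasets` library, `Dataset.train_test_split(test_size=0.1)` performs an equivalent split.

```python
import random


def random_split(items, test_size=0.1, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


# Example with 1000 placeholder items standing in for dataset rows.
train, test = random_split(range(1000))
print(len(train), len(test))  # prints: 900 100
```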