Qwen3 4B Thinking 2507 - MiMo V2 Flash Distill

This model was fine-tuned on a reasoning dataset distilled from MiMo V2 Flash.

- 🧬 **Datasets:**
  - TeichAI/MiMo-V2-Flash-2300x
- 🏗 **Base Model:**
  - unsloth/Qwen3-4B-Thinking-2507
- ⚡ **Use cases:**
  - Coding
  - Science
  - Chat
  - Deep Research
- ∑ **Stats (Dataset):**
  - Cost: $0.00 (USD)
  - Total tokens (input + output): 7.36M

This Qwen3 model was trained 2× faster with Unsloth and Hugging Face's TRL library.

An Ollama Modelfile is included for easy deployment.
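For reference, a minimal Modelfile for a model like this might look like the sketch below. The GGUF filename and sampling parameters here are illustrative assumptions, not the values shipped with this repository; check the included Modelfile for the actual settings.

```
# Hypothetical sketch — filename and parameters are assumptions, not the shipped values.
FROM ./Qwen3-4B-Thinking-2507-MiMo-V2-Flash-Distill.gguf
PARAMETER temperature 0.6
PARAMETER top_p 0.95
```

With a Modelfile in place, `ollama create <name> -f Modelfile` registers the model locally and `ollama run <name>` starts an interactive session.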
