Uploaded model

  • Developed by: aslon1213
  • License: apache-2.0
  • Finetuned from model: aslon1213/orpheus-checkpoints-87330

🧠 Base Model

  • Base model: meta-llama/Llama-3.2-3B-Instruct
  • Tokenizer: meta-llama/Llama-3.2-3B-Instruct
  • Architecture: LLaMA
  • License: Apache-2.0
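
Since the checkpoint uses the LLaMA architecture, it should load through the standard Transformers causal-LM classes. A minimal sketch (the repository id comes from this card; the prompt and generation settings are illustrative, since Orpheus-style models emit SNAC audio tokens that need a separate decoding step):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aslon1213/orpheus-checkpoints-87330"

# Per the card, the tokenizer is the Llama-3.2-3B-Instruct tokenizer.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Illustrative only: Orpheus-style TTS models generate SNAC audio codes
# as tokens, which must be decoded to a waveform with SNAC afterwards.
inputs = tokenizer("Hello, world!", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=256)
print(output_ids.shape)
```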

🚀 Pretraining Configuration

Training Setup

  • Epochs: 1
  • Batch size: 1
  • Learning rate: 5e-5
  • Training processes: 8
  • Padding token ID: 128263
  • Save steps: 12,000
  • Data mixing ratio: 0.05
  • Total batches: 87,330
  • Time per epoch: 8 hours on one GPU
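
The training code is not published on this card, but the hyperparameters above map naturally onto Hugging Face `TrainingArguments`. A hedged sketch (the trainer choice, output path, and launcher are assumptions, not stated on this card):

```python
from transformers import TrainingArguments

# Assumed mapping of the reported hyperparameters; illustrative only.
args = TrainingArguments(
    output_dir="orpheus-checkpoints",  # hypothetical output path
    num_train_epochs=1,
    per_device_train_batch_size=1,
    learning_rate=5e-5,
    save_steps=12_000,
)

PAD_TOKEN_ID = 128263  # padding token id reported above

# The 8 training processes would come from the launcher, e.g.:
#   torchrun --nproc_per_node=8 train.py
# (an assumption; the card lists a single H100, so these may instead
# be data-loading workers)
```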

Hardware

  • GPU: 1× NVIDIA H100

Pretraining Datasets

  • Text QA: aslon1213/orpheus_qa_dataset - 26.5K samples
  • TTS: aslon1213/orpheus-tts-dataset - 500K samples
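
For reference, both datasets are public on the Hub and can be combined with the `datasets` library. The sketch below assumes a `train` split and reads the 0.05 mixing ratio as "5% QA text interleaved with 95% TTS data"; that interpretation is not stated on the card:

```python
from datasets import interleave_datasets, load_dataset

# Dataset ids come from this card; split names are an assumption.
qa = load_dataset("aslon1213/orpheus_qa_dataset", split="train")
tts = load_dataset("aslon1213/orpheus-tts-dataset", split="train")

# Assumed reading of the 0.05 data mixing ratio: sample QA examples
# with probability 0.05 and TTS examples with probability 0.95.
mixed = interleave_datasets([qa, tts], probabilities=[0.05, 0.95], seed=42)
```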

Metrics

Available at this link: WandB report.

Acknowledgements

  1. Llama
  2. Orpheus-TTS
  3. SNAC