Uploaded model
- Developed by: aslon1213
- License: apache-2.0
- Finetuned from model: aslon1213/orpheus-checkpoints-87330
🧠 Base Model
- Base model: meta-llama/Llama-3.2-3B-Instruct
- Tokenizer: meta-llama/Llama-3.2-3B-Instruct
- Architecture: LLaMA
- License: Apache-2.0
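For reference, a minimal sketch of loading the base model and tokenizer with the transformers library; the repo id comes from this card, while the dtype and device placement are assumptions (bf16 is typical on an H100). The uploaded checkpoint can be loaded the same way with its own repo id.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Base model named in this card; dtype/device choices are assumptions.
model_id = "meta-llama/Llama-3.2-3B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16, as is typical on an H100
    device_map="auto",
)
```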
🚀 Pretraining Configuration
Training Setup
- Epochs: 1
- Batch size: 1
- Learning rate: 5e-5
- Training processes: 8
- Padding token ID: 128263
- Save steps: 12,000
- Data mixing ratio: 0.05
- Total batches: 87,330
- Time per epoch: 8 hours on a single GPU
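A hedged sketch of how these numbers might map onto `transformers.TrainingArguments`; the output path, the bf16 flag, and the reading of "Training processes: 8" as dataloader workers are assumptions for illustration, not the author's actual script:

```python
from transformers import TrainingArguments

# Sketch only: field names beyond the card's listed numbers are assumptions.
training_args = TrainingArguments(
    output_dir="orpheus-checkpoints",   # hypothetical output path
    num_train_epochs=1,                 # Epochs: 1
    per_device_train_batch_size=1,      # Batch size: 1
    learning_rate=5e-5,                 # Learning rate: 5e-5
    save_steps=12_000,                  # Save steps: 12,000
    dataloader_num_workers=8,           # assumed reading of "Training processes: 8"
    bf16=True,                          # assumption: mixed precision on an H100
)
```

Note that the padding token ID (128263) lies beyond the base Llama-3.2 vocabulary (128,256 tokens), which suggests an extended tokenizer; it would be set on the tokenizer or data collator rather than in `TrainingArguments`.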
Hardware
- GPU: 1× NVIDIA H100
Pretrain Datasets
- Text QA: aslon1213/orpheus_qa_dataset (26.5K samples)
- TTS: aslon1213/orpheus-tts-dataset (500K samples)
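A minimal sketch of loading and mixing the two datasets with the `datasets` library; reading the "data mixing ratio: 0.05" above as a 5% QA / 95% TTS sampling mix is an assumption, as is the `train` split name:

```python
from datasets import interleave_datasets, load_dataset

qa = load_dataset("aslon1213/orpheus_qa_dataset", split="train")    # ~26.5K samples
tts = load_dataset("aslon1213/orpheus-tts-dataset", split="train")  # ~500K samples

# Assumed reading of the 0.05 mixing ratio: sample a QA example 5% of the time.
mixed = interleave_datasets([tts, qa], probabilities=[0.95, 0.05], seed=42)
```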
Metrics: available at this link.