- Distillation: Transferring knowledge from a larger, more capable model into a smaller, faster one
- Teaching output style and format: Training a model to follow specific response formats, tone, or structure
- Warmup before RL: Training a model on supervised examples before applying reinforcement learning for further refinement
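All three use cases start from the same ingredient: a dataset of curated input-output pairs. The sketch below shows one plausible way to prepare such a dataset, assuming a chat-style JSONL format; the exact schema your training job expects may differ, and the example contents are placeholders (for distillation, the assistant turns would come from the larger model).

```python
import json

# Curated input-output examples in a chat-style JSONL format (assumed schema).
examples = [
    {
        "messages": [
            {"role": "user", "content": "Summarize: The quick brown fox jumps over the lazy dog."},
            {"role": "assistant", "content": "A fox jumps over a dog."},
        ]
    },
    {
        "messages": [
            # An output-format example: teach the model to answer in JSON.
            {"role": "user", "content": "Reply in JSON with a 'sentiment' key: I loved it!"},
            {"role": "assistant", "content": "{\"sentiment\": \"positive\"}"},
        ]
    },
]

# Write one JSON object per line, the common format for SFT datasets.
with open("sft_dataset.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```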
Why Serverless SFT?
Supervised fine-tuning (SFT) is a training technique where a model learns from curated input-output examples. Serverless SFT on W&B provides the following advantages:
- Lower training costs: By multiplexing shared infrastructure across many users, skipping the setup process for each job, and scaling your GPU costs down to zero when you’re not actively training, Serverless SFT significantly reduces training costs.
- Faster training time: By immediately provisioning training infrastructure when you need it, Serverless SFT speeds up your training jobs and lets you iterate faster.
- Automatic deployment: Serverless SFT automatically deploys every checkpoint you train, eliminating the need to manually set up hosting infrastructure. Trained models can be accessed and tested immediately in local, staging, or production environments.
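Because every checkpoint is deployed automatically, a trained model can be queried like any hosted model. The sketch below builds (but does not send) such a request, assuming an OpenAI-compatible chat-completions endpoint; the URL, model identifier, and environment variable are illustrative assumptions, so substitute the values shown for your actual deployment.

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible endpoint for deployed checkpoints (placeholder URL).
API_URL = "https://api.inference.wandb.ai/v1/chat/completions"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completions request targeting a deployed checkpoint."""
    payload = {
        "model": model,  # identifier of the trained checkpoint to query
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # Assumes your API key is available in the environment.
            "Authorization": f"Bearer {os.environ.get('WANDB_API_KEY', '')}",
        },
        method="POST",
    )

# Placeholder checkpoint name; sending the request with
# urllib.request.urlopen(request) would return the model's completion.
request = build_request("my-entity/my-sft-checkpoint:v3", "Say hello in JSON.")
```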
How Serverless SFT uses W&B services
Serverless SFT uses a combination of the following W&B components to operate:
- Inference: To run your models
- Models: To track performance metrics during the LoRA adapter’s training
- Artifacts: To store and version the LoRA adapters
- Weave (optional): To gain observability into how the model responds at each step of the training loop
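The Models and Artifacts pieces above can be sketched with the standard `wandb` Python client. The project name, metric names, losses, and adapter filename below are placeholders for illustration; the function takes the client as an argument so the flow is easy to follow.

```python
def log_training_run(wandb, losses, adapter_path):
    """Track SFT metrics (Models) and version the LoRA adapter (Artifacts).

    `wandb` is the imported wandb module; `losses` stands in for per-step
    training losses; `adapter_path` points at the trained adapter weights.
    """
    # Models: create a tracked run for this training job (placeholder project).
    run = wandb.init(project="serverless-sft-demo")

    # Log performance metrics at each step of the LoRA adapter's training.
    for step, loss in enumerate(losses):
        run.log({"train/loss": loss, "step": step})

    # Artifacts: store and version the trained LoRA adapter weights.
    adapter = wandb.Artifact("lora-adapter", type="model")
    adapter.add_file(adapter_path)
    run.log_artifact(adapter)

    run.finish()

# Usage (requires `pip install wandb` and a configured API key):
#   import wandb
#   log_training_run(wandb, losses=[0.9, 0.5, 0.3],
#                    adapter_path="adapter_model.safetensors")
```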