- Distillation: Transferring knowledge from a larger, more capable model into a smaller, faster one
- Teaching output style and format: Training a model to follow specific response formats, tone, or structure
- Warmup before RL: Training a model on supervised examples before applying reinforcement learning for further refinement
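All three use cases start from the same ingredient: a dataset of curated input-output pairs. The sketch below shows one plausible way to prepare such a dataset, assuming a chat-style JSONL format; the exact schema your training job expects may differ, and the example contents are placeholders (for distillation, the assistant turns would come from the larger model).

```python
import json

# Curated input-output examples in a chat-style JSONL format (assumed schema).
examples = [
    {
        "messages": [
            {"role": "user", "content": "Summarize: The quick brown fox jumps over the lazy dog."},
            {"role": "assistant", "content": "A fox jumps over a dog."},
        ]
    },
    {
        "messages": [
            # An output-format example: teach the model to answer in JSON.
            {"role": "user", "content": "Reply in JSON with a 'sentiment' key: I loved it!"},
            {"role": "assistant", "content": "{\"sentiment\": \"positive\"}"},
        ]
    },
]

# Write one JSON object per line, the common format for SFT datasets.
with open("sft_dataset.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```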
Why Serverless SFT?
Supervised fine-tuning (SFT) is a training technique where a model learns from curated input-output examples. Serverless SFT on W&B provides the following advantages:
- Lower training costs: By multiplexing shared infrastructure across many users, skipping the setup process for each job, and scaling your GPU costs down to zero when you’re not actively training, Serverless SFT significantly reduces training costs.
- Faster training time: By immediately provisioning training infrastructure when you need it, Serverless SFT speeds up your training jobs and lets you iterate faster.
- Automatic deployment: Serverless SFT automatically deploys every checkpoint you train, eliminating the need to manually set up hosting infrastructure. Trained models can be accessed and tested immediately in local, staging, or production environments.
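Because every checkpoint is deployed automatically, a trained model can be queried like any hosted model. The sketch below builds (but does not send) such a request, assuming an OpenAI-compatible chat-completions endpoint; the URL, model identifier, and environment variable are illustrative assumptions, so substitute the values shown for your actual deployment.

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible endpoint for deployed checkpoints (placeholder URL).
API_URL = "https://api.inference.wandb.ai/v1/chat/completions"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completions request targeting a deployed checkpoint."""
    payload = {
        "model": model,  # identifier of the trained checkpoint to query
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # Assumes your API key is available in the environment.
            "Authorization": f"Bearer {os.environ.get('WANDB_API_KEY', '')}",
        },
        method="POST",
    )

# Placeholder checkpoint name; sending the request with
# urllib.request.urlopen(request) would return the model's completion.
request = build_request("my-entity/my-sft-checkpoint:v3", "Say hello in JSON.")
```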
How Serverless SFT uses W&B services
Serverless SFT uses a combination of the following W&B components to operate:
- Inference: To run your models
- Models: To track performance metrics during the LoRA adapter’s training
- Artifacts: To store and version the LoRA adapters
- Weave (optional): To gain observability into how the model responds at each step of the training loop
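The Models and Artifacts pieces above can be sketched with the standard `wandb` Python client. The project name, metric names, losses, and adapter filename below are placeholders for illustration; the function takes the client as an argument so the flow is easy to follow.

```python
def log_training_run(wandb, losses, adapter_path):
    """Track SFT metrics (Models) and version the LoRA adapter (Artifacts).

    `wandb` is the imported wandb module; `losses` stands in for per-step
    training losses; `adapter_path` points at the trained adapter weights.
    """
    # Models: create a tracked run for this training job (placeholder project).
    run = wandb.init(project="serverless-sft-demo")

    # Log performance metrics at each step of the LoRA adapter's training.
    for step, loss in enumerate(losses):
        run.log({"train/loss": loss, "step": step})

    # Artifacts: store and version the trained LoRA adapter weights.
    adapter = wandb.Artifact("lora-adapter", type="model")
    adapter.add_file(adapter_path)
    run.log_artifact(adapter)

    run.finish()

# Usage (requires `pip install wandb` and a configured API key):
#   import wandb
#   log_training_run(wandb, losses=[0.9, 0.5, 0.3],
#                    adapter_path="adapter_model.safetensors")
```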