Now in public preview, W&B Training offers serverless post-training for large language models (LLMs), including both reinforcement learning (RL) and supervised fine-tuning (SFT).
  • Serverless RL: Improve model reliability on multi-turn, agentic tasks while increasing speed and reducing costs. RL is a training technique in which models learn to improve their behavior through feedback on their outputs.
  • Serverless SFT: Fine-tune models using curated datasets for distillation, teaching output style and format, or warming up before RL.
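The two modes above differ in where the training signal comes from: SFT pulls the model toward curated demonstrations, while RL reweights behavior based on feedback about the model's own outputs. The following toy sketch illustrates that contrast on a tiny categorical "policy" over candidate responses; it is purely illustrative (all names are hypothetical) and is not the W&B Training API.

```python
import math

def sft(policy, curated):
    """Supervised fine-tuning (sketch): blend the policy toward the
    empirical distribution of curated demonstrations."""
    counts = {r: 0 for r in policy}
    for r in curated:
        counts[r] += 1
    total = len(curated)
    return {r: 0.5 * p + 0.5 * counts[r] / total for r, p in policy.items()}

def rl_step(policy, reward, lr=1.0):
    """RL (sketch): reweight each response by exponentiated reward —
    feedback on outputs — then renormalize."""
    weighted = {r: p * math.exp(lr * reward(r)) for r, p in policy.items()}
    z = sum(weighted.values())
    return {r: w / z for r, w in weighted.items()}

# Toy policy over two candidate responses.
policy = {"good": 0.2, "bad": 0.8}
# SFT: curated demos teach the desired output style.
policy = sft(policy, ["good", "good", "good"])
# RL: feedback on outputs sharpens the preference further.
policy = rl_step(policy, lambda r: 1.0 if r == "good" else -1.0)
```

In a real post-training pipeline the "policy" is an LLM and the updates are gradient steps, but the shape of the signal — demonstrations for SFT, rewards for RL — is the same, which is also why SFT is often used to warm up a model before RL.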
To get started, satisfy the prerequisites to start using the service, then see the Serverless RL quickstart or the Serverless SFT docs to learn how to post-train your models.