In the previous chapter, we introduced parameter-efficient fine-tuning (PEFT) techniques, such as low-rank adaptation (LoRA), and demonstrated how the TRL library can be used to efficiently fine-tune large language models (LLMs) through supervised fine-tuning (SFT). While fine-tuning helps align a pretrained model with specific datasets or downstream tasks, it often falls short in aligning the model's behavior with nuanced human preferences. For this reason, SFT is typically regarded as the first phase in the reinforcement learning from human feedback (RLHF) pipeline.
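As a quick refresher, the snippet below is a minimal sketch of LoRA-based SFT with TRL, in the spirit of the previous chapter. The model and dataset names are illustrative placeholders, not the ones used in this book, and the hyperparameters are arbitrary.

```python
# A minimal sketch of LoRA-based SFT with TRL; model, dataset, and
# hyperparameters are illustrative examples only.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Example conversational dataset; any dataset in a format SFTTrainer
# understands (e.g., a "messages" or "text" column) will work.
dataset = load_dataset("trl-lib/Capybara", split="train")

peft_config = LoraConfig(
    r=16,                          # rank of the low-rank update matrices
    lora_alpha=32,                 # scaling applied to the LoRA update
    target_modules="all-linear",   # attach adapters to every linear layer
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",     # example base model
    args=SFTConfig(output_dir="sft-lora"),
    train_dataset=dataset,
    peft_config=peft_config,       # only the adapter weights are trained
)
trainer.train()
```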
To bridge the gap between task performance and human-aligned behavior, RLHF introduces two key stages beyond SFT, illustrated in the sketch after this list:

1. Reward modeling: training a reward model on human preference data (typically ranked pairs of model responses) so it assigns a scalar score to any candidate output.
2. Reinforcement learning: optimizing the SFT model against the reward model, typically with an algorithm such as proximal policy optimization (PPO).
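To make the first stage concrete, here is a minimal sketch of reward-model training with TRL's RewardTrainer, assuming a recent TRL version and a preference dataset with "chosen"/"rejected" response pairs; the model and dataset names are illustrative examples.

```python
# A minimal sketch of reward-model training with TRL (stage 1 of RLHF);
# all names below are illustrative, not the book's actual setup.
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from trl import RewardConfig, RewardTrainer

base = "Qwen/Qwen2.5-0.5B"  # example base model
# A sequence-classification head with a single output turns the LM
# into a scalar scorer for candidate responses.
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=1)
tokenizer = AutoTokenizer.from_pretrained(base)

# Example preference dataset with chosen/rejected response pairs.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

trainer = RewardTrainer(
    model=model,
    args=RewardConfig(output_dir="reward-model"),
    processing_class=tokenizer,
    train_dataset=dataset,
)
trainer.train()
```

The second stage would then optimize the SFT policy against this reward model; TRL provides trainers for this step as well, such as PPOTrainer.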