Supervised Fine-Tuning (SFT)

Tiya Vaj
2 min read · May 11, 2024

Supervised fine-tuning (SFT) is a technique for improving the performance of large language models (LLMs) on specific tasks. Here’s a breakdown of what SFT involves:

The Big Idea:

Imagine you have a powerful language model that’s good at understanding and generating text, but it’s not perfect for any specific job. Supervised fine-tuning takes that model and trains it further on a specific task, making it much better at that particular thing.

The Process:

  1. Pre-trained LLM: You start with a large language model that has already been trained on a massive dataset of text and code. This gives the model a strong foundation of general language knowledge.
  2. Labeled Data: You then provide the LLM with a new dataset that’s specific to your desired task. This data is “labeled,” which means each example includes both the input and the desired output. For example, if you want the LLM to write in different creative formats, you might give it a dataset of poems, code snippets, scripts, etc., where each piece of text is labeled with its format (the requested format is the input, and the text is the desired output).
  3. Fine-Tuning: The LLM is then fine-tuned on this specific data. This involves adjusting the model’s internal parameters (its weights) so that its outputs move closer to the desired outputs in the new data. Think of it like tweaking the dials on a radio to get a clearer signal.
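To make the three steps concrete, here is a minimal sketch in plain Python. It is not an LLM: a tiny two-feature logistic classifier stands in for the pre-trained model, and the starting weights, the dataset, the learning rate, and the function names (`predict`, `average_loss`, `fine_tune`) are all assumptions made up for illustration. The shape of the process is the same, though: start from existing parameters, show the model labeled examples, and nudge the parameters to reduce the error.

```python
import math

# Step 1: "pre-trained" parameters. In practice these come from an LLM
# checkpoint; here they are made-up numbers for a toy classifier.
pretrained_weights = [0.5, -0.5]
pretrained_bias = 0.0

# Step 2: labeled data. Each example pairs an input with the desired output.
# Hypothetical task: the label is 1 exactly when the second feature is 1.
labeled_data = [
    ([1.0, 0.0], 0),
    ([0.0, 1.0], 1),
    ([1.0, 1.0], 1),
    ([0.0, 0.0], 0),
]


def predict(weights, bias, features):
    """Return the model's probability that the label is 1."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def average_loss(weights, bias, data):
    """Mean cross-entropy between predictions and the desired outputs."""
    total = 0.0
    for features, label in data:
        p = predict(weights, bias, features)
        total += -(label * math.log(p) + (1 - label) * math.log(1 - p))
    return total / len(data)


def fine_tune(weights, bias, data, learning_rate=0.5, epochs=200):
    """Step 3: adjust a copy of the parameters with gradient descent."""
    weights = list(weights)  # leave the pre-trained weights untouched
    for _ in range(epochs):
        for features, label in data:
            # Gradient of the cross-entropy loss with respect to z.
            error = predict(weights, bias, features) - label
            for i, x in enumerate(features):
                weights[i] -= learning_rate * error * x
            bias -= learning_rate * error
    return weights, bias


loss_before = average_loss(pretrained_weights, pretrained_bias, labeled_data)
tuned_weights, tuned_bias = fine_tune(pretrained_weights, pretrained_bias, labeled_data)
loss_after = average_loss(tuned_weights, tuned_bias, labeled_data)

print(f"loss before fine-tuning: {loss_before:.3f}")
print(f"loss after fine-tuning:  {loss_after:.3f}")
```

With a real LLM the same loop is usually handled by a training library, and the loss is computed over the tokens of the desired output rather than a single 0/1 label.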

Benefits of SFT:

  • Task-Specific Knowledge: SFT allows the LLM to learn the nuances of a specific task. It can learn the patterns, styles, and formats used in that domain, which improves its performance.
  • Improved Performance: By leveraging the pre-trained knowledge and then specializing on a specific task, SFT can significantly improve the accuracy and effectiveness of the LLM for that task.
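One simple way to check whether fine-tuning actually helped is to score the base model and the fine-tuned model on the same held-out labeled examples. The sketch below assumes a hypothetical sentiment task; `base_model` and `fine_tuned_model` are hand-written stand-ins for real model calls, and the four examples are invented, so the numbers it prints say nothing about any real model.

```python
def exact_match_accuracy(predict_fn, examples):
    """Fraction of examples where the prediction equals the desired output."""
    correct = sum(1 for prompt, expected in examples if predict_fn(prompt) == expected)
    return correct / len(examples)


# Held-out labeled examples (invented for illustration), never used in training.
held_out = [
    ("great product", "positive"),
    ("terrible service", "negative"),
    ("really great", "positive"),
    ("not great, terrible", "negative"),
]


def base_model(prompt):
    # Stand-in for a general-purpose model that has not learned the task.
    return "positive"


def fine_tuned_model(prompt):
    # Stand-in for a model that has picked up the task's patterns.
    return "negative" if "terrible" in prompt else "positive"


base_accuracy = exact_match_accuracy(base_model, held_out)
tuned_accuracy = exact_match_accuracy(fine_tuned_model, held_out)

print(f"base model accuracy:       {base_accuracy:.2f}")
print(f"fine-tuned model accuracy: {tuned_accuracy:.2f}")
```

Keeping the evaluation examples separate from the fine-tuning data matters: a model can score well on examples it was trained on without having learned the task.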

Overall, supervised fine-tuning is a powerful tool for getting the most out of large language models. It allows you to take a general-purpose model and turn it into a specialist for your specific needs.



Tiya Vaj is a Ph.D. research scholar in NLP who is passionate about data-driven work for social good.