Here’s a breakdown of in-context learning (ICL), along with its significance and how it works:
What is In-Context Learning?
- Core Idea: In-context learning is a way of using large language models (LLMs) where you provide examples of a task directly within the text prompt, instead of needing to fine-tune the entire model on a massive dataset.
- Few-Shot Learning: ICL is closely related to “few-shot” and “zero-shot” learning: the LLM adapts to a new task from only a few examples (or sometimes none at all) at inference time, i.e., while you are prompting it.
- No Model Updates: Importantly, in-context learning doesn’t permanently change the underlying LLM. The model’s weights are never updated; the task-specific behavior comes entirely from the examples in the prompt and lasts only for that prompt.
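As a minimal sketch, the difference between zero-shot and few-shot prompting is purely in the prompt text; the model itself is identical in both cases. The sentiment-classification task and review texts below are illustrative placeholders:

```python
# Zero-shot: only an instruction, no worked examples.
zero_shot = (
    "Classify the sentiment of this review as Positive or Negative.\n"
    "Review: The battery died after a day.\n"
    "Sentiment:"
)

# Few-shot: the same instruction plus demonstration pairs inside the prompt.
# The model's weights are the same in both cases; only the input differs.
few_shot = (
    "Classify the sentiment of each review as Positive or Negative.\n"
    "Review: Great sound quality. Sentiment: Positive\n"
    "Review: Stopped working in a week. Sentiment: Negative\n"
    "Review: The battery died after a day. Sentiment:"
)
```

Either string would be sent to the model as-is; the few-shot version simply gives it patterns to imitate.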
Why is it Important?
- Rapid Adaptation: LLMs with ICL can quickly understand and perform new tasks without extensive training. It’s like the model learns on the fly as you’re using it.
- Flexibility: This approach makes LLMs more versatile for various applications. You don’t need a specialized dataset for every single task you want the model to do.
- Lower Barriers: In-context learning reduces the need for massive datasets and the complex fine-tuning process, potentially making powerful LLMs more accessible.
How Does In-Context Learning Work?
- Massive Pre-training: Large language models are trained on enormous text datasets. This allows them to develop a rich understanding of language patterns and relationships between words.
- Pattern Recognition in Prompts: In ICL, you provide a prompt that includes:
  - Instructions about the task you want the model to perform.
  - A few examples demonstrating the desired input and output format.
- Inference: The LLM processes the prompt, identifies patterns within the examples, and figures out how to perform the task based on those patterns.
- Response Generation: The model uses its understanding to generate a response that aligns with the provided examples.
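The prompt-construction part of these steps can be sketched as a small helper. The function name, the `Input:`/`Output:` labels, and the translation task are all illustrative choices, not any particular library’s API:

```python
def build_icl_prompt(instruction, examples, query):
    """Assemble an in-context-learning prompt: a task instruction,
    demonstration input/output pairs, then the new input to complete."""
    parts = [instruction]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # The prompt ends mid-pattern, so the model's natural continuation
    # is the output for the new query.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

prompt = build_icl_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("bread", "pain")],
    "milk",
)
print(prompt)
```

The resulting string is what gets sent to the model; the model’s continuation after the final `Output:` is its answer for the new query.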
Example
Let’s say you want an LLM to summarize articles:
Prompt: “Here are a few examples of how to summarize articles:
- Article: [Insert an article here] Summary: [Insert a short summary]
- Article: [Another article] Summary: [Its summary]
Now, summarize this article for me: [New article to be summarized]”
The LLM would analyze the examples, deduce the pattern of summarization, and then attempt to summarize the new article.
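The summarization prompt above can be assembled programmatically in the same way. The article texts and summaries here are placeholders standing in for real content:

```python
# Placeholder (article, summary) pairs standing in for real examples.
examples = [
    ("Article text about a city budget vote...",
     "The council approved next year's budget."),
    ("Article text about a local storm...",
     "A storm caused minor flooding downtown."),
]
new_article = "Article text about a new school opening..."

# Mirror the prompt template from the example above.
parts = ["Here are a few examples of how to summarize articles:"]
for article, summary in examples:
    parts.append(f"- Article: {article} Summary: {summary}")
parts.append(f"Now, summarize this article for me: {new_article}")
prompt = "\n".join(parts)
print(prompt)
```

Sending this string to an LLM is the entire “training” step in ICL: the examples travel with every request, and nothing about the model persists between calls.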