In-context learning (ICL)

Tiya Vaj
2 min read · Apr 6, 2024

Here’s a breakdown of in-context learning (ICL), along with its significance and how it works:

What is In-Context Learning?

  • Core Idea: In-context learning is a way of using large language models (LLMs) where you provide examples of a task directly within the text prompt, instead of needing to fine-tune the entire model on a massive dataset.
  • Few-Shot Learning: ICL is closely related to “few-shot” or “zero-shot” learning: the LLM adapts to a new task from only a few examples (or sometimes none at all) at inference time, i.e., while you’re querying it.
  • No Model Updates: Importantly, in-context learning doesn’t permanently change the underlying LLM. The model gains this temporary knowledge for a specific task only from the examples in the prompt.
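To make this concrete, here is a minimal sketch of the difference between a zero-shot and a few-shot prompt for a sentiment-classification task. The review texts are invented for illustration; the point is that the “learning” lives entirely in the prompt string, and no model weights are updated.

```python
# Zero-shot: the task is described, but no examples are given.
zero_shot = (
    "Classify the sentiment of this review as Positive or Negative.\n"
    "Review: The battery died within a week.\n"
    "Sentiment:"
)

# Few-shot: the same task, but with two demonstrations in the prompt.
# The model infers the pattern from these examples at inference time.
few_shot = (
    "Classify the sentiment of each review as Positive or Negative.\n\n"
    "Review: Absolutely loved the camera quality.\n"
    "Sentiment: Positive\n\n"
    "Review: The screen cracked on day one.\n"
    "Sentiment: Negative\n\n"
    "Review: The battery died within a week.\n"
    "Sentiment:"
)
```

Either string would be sent to the LLM as-is; the few-shot version usually steers the model toward the demonstrated format more reliably.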

Why is it Important?

  1. Rapid Adaptation: LLMs with ICL can quickly understand and perform new tasks without extensive training. It’s like the model learns on the fly as you’re using it.
  2. Flexibility: This approach makes LLMs more versatile for various applications. You don’t need a specialized dataset for every single task you want the model to do.
  3. Lower Barriers: In-context learning reduces the need for massive datasets and the complex fine-tuning process, potentially making powerful LLMs more accessible.

How Does In-Context Learning Work?

  1. Massive Pre-training: Large language models are trained on enormous text datasets. This allows them to develop a rich understanding of language patterns and relationships between words.
  2. Pattern Recognition in Prompts: In ICL, you provide a prompt that includes:
  • Instructions about the task you want the model to perform.
  • A few examples to demonstrate the desired input and output format.
  3. Inference: The LLM processes the prompt, identifies patterns within the examples, and figures out how to perform the task based on those patterns.
  4. Response Generation: The model uses its understanding to generate a response that aligns with the provided examples.
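The prompt-construction step above can be sketched as a small helper. The function name `build_icl_prompt` and the `Input:`/`Output:` labels are illustrative choices, not a standard API; any consistent format the model can pattern-match on will do.

```python
def build_icl_prompt(instructions, examples, query):
    """Assemble an in-context-learning prompt:
    task instructions, then demonstration (input, output) pairs,
    then the new input the model should complete."""
    parts = [instructions, ""]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}")
        parts.append(f"Output: {example_output}")
        parts.append("")  # blank line between demonstrations
    parts.append(f"Input: {query}")
    parts.append("Output:")  # the model continues from here
    return "\n".join(parts)
```

For example, `build_icl_prompt("Translate English to French.", [("cat", "chat"), ("dog", "chien")], "bird")` produces a prompt ending in `Output:`, which the LLM completes by following the demonstrated pattern.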


Let’s say you want an LLM to summarize articles:

Prompt: “Here are a few examples of how to summarize articles:

  • Article: [Insert an article here] Summary: [Insert a short summary]
  • Article: [Another article] Summary: [Its summary]

Now, summarize this article for me: [New article to be summarized]”

The LLM would analyze the examples, deduce the pattern of summarization, and then attempt to summarize the new article.
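The summarization prompt above can be generated programmatically when you have several example pairs. This is a hedged sketch: `make_summary_prompt` is a hypothetical helper, and the bracketed placeholders from the prose would be replaced by real article and summary text.

```python
def make_summary_prompt(examples, new_article):
    """Build a few-shot summarization prompt from
    (article, summary) example pairs plus a new article."""
    lines = ["Here are a few examples of how to summarize articles:", ""]
    for article, summary in examples:
        lines.append(f"Article: {article}")
        lines.append(f"Summary: {summary}")
        lines.append("")
    lines.append(f"Now, summarize this article for me: {new_article}")
    return "\n".join(lines)
```

The resulting string is exactly the kind of prompt shown above, ready to send to an LLM.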



Tiya Vaj

Ph.D. research scholar in NLP, passionate about data-driven approaches for social good. Let's connect here.