In-context learning (ICL)

Tiya Vaj
2 min read · Apr 6, 2024

Here’s a breakdown of in-context learning (ICL): what it is, why it matters, and how it works.

What is In-Context Learning?

  • Core Idea: In-context learning is a way of using large language models (LLMs) where you provide examples of a task directly within the text prompt, instead of needing to fine-tune the entire model on a massive dataset.
  • Few-Shot Learning: ICL is closely related to “few-shot” and “zero-shot” learning: the LLM adapts to a new task from only a few examples (or sometimes none at all) at inference time, i.e., while you are querying it.
  • No Model Updates: Importantly, in-context learning doesn’t permanently change the underlying LLM. The model gains this temporary knowledge for a specific task only from the examples in the prompt.
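To make this concrete, here is a minimal sketch of what a few-shot ICL prompt looks like as plain text. The sentiment-classification task, the reviews, and the labels are illustrative choices, not something from a particular model or dataset:

```python
# Build a few-shot ICL prompt: the "training examples" live entirely
# inside the prompt text, and no model weights are updated.
examples = [
    ("The movie was fantastic!", "positive"),
    ("I wasted two hours of my life.", "negative"),
]

lines = ["Classify the sentiment of each review as positive or negative.", ""]
for review, label in examples:
    lines.append(f"Review: {review}")
    lines.append(f"Sentiment: {label}")
lines.append("Review: The plot was gripping from start to finish.")
lines.append("Sentiment:")  # left blank for the model to complete

prompt = "\n".join(lines)
print(prompt)
```

The prompt deliberately ends mid-pattern (`Sentiment:`), so a capable LLM continues the pattern it has just seen rather than answering free-form.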

Why is it Important?

  1. Rapid Adaptation: LLMs with ICL can quickly understand and perform new tasks without extensive training. It’s like the model learns on the fly as you’re using it.
  2. Flexibility: This approach makes LLMs more versatile for various applications. You don’t need a specialized dataset for every single task you want the model to do.
  3. Lower Barriers: In-context learning reduces the need for massive datasets and the complex fine-tuning process, potentially making powerful LLMs more accessible.

How Does In-Context Learning Work?

  1. Massive Pre-training: Large language models are trained on enormous text datasets. This allows them to develop a rich understanding of language patterns and relationships between words.
  2. Pattern Recognition in Prompts: In ICL, you provide a prompt that includes:
  • Instructions about the task you want the model to perform.
  • A few examples to demonstrate the desired input and output format.
  3. Inference: The LLM processes the prompt, identifies patterns within the examples, and figures out how to perform the task based on those patterns.
  4. Response Generation: The model uses its understanding to generate a response that aligns with the provided examples.
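The prompt-construction steps above can be sketched as a small helper. The `Input:`/`Output:` labels and the country-capital task are assumptions for illustration, and `complete` is a stand-in for whichever LLM API you actually call:

```python
def build_icl_prompt(instruction, examples, query):
    """Assemble an ICL prompt: task instruction, worked examples, new input."""
    parts = [instruction, ""]
    for inp, out in examples:
        parts.append(f"Input: {inp}")
        parts.append(f"Output: {out}")
        parts.append("")
    parts.append(f"Input: {query}")
    parts.append("Output:")  # left open for the model to complete
    return "\n".join(parts)

prompt = build_icl_prompt(
    "Give the capital city of each country.",
    [("France", "Paris"), ("Japan", "Tokyo")],
    "Canada",
)
# `complete` is a hypothetical LLM client function, not defined here:
# answer = complete(prompt)
```

Because the examples are just text, swapping in a different task means swapping the instruction and example pairs; the model itself is untouched.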

Example

Let’s say you want an LLM to summarize articles:

Prompt: “Here are a few examples of how to summarize articles:

  • Article: [Insert an article here] Summary: [Insert a short summary]
  • Article: [Another article] Summary: [Its summary]

Now, summarize this article for me: [New article to be summarized]”

The LLM would analyze the examples, deduce the pattern of summarization, and then attempt to summarize the new article.
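The summarization template above can also be assembled programmatically. In this sketch, the angle-bracket strings are placeholders standing in for real article and summary text:

```python
def summarization_prompt(example_pairs, new_article):
    """Build the summarization prompt from (article, summary) example pairs."""
    lines = ["Here are a few examples of how to summarize articles:", ""]
    for article, summary in example_pairs:
        lines.append(f"Article: {article}")
        lines.append(f"Summary: {summary}")
        lines.append("")
    lines.append(f"Now, summarize this article for me: {new_article}")
    return "\n".join(lines)

# Placeholder texts stand in for real articles and their summaries.
prompt = summarization_prompt(
    [("<article 1 text>", "<short summary 1>"),
     ("<article 2 text>", "<short summary 2>")],
    "<new article to be summarized>",
)
```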


Tiya Vaj

Ph.D. Research Scholar in NLP, passionate about data-driven approaches for social good. Let's connect here: https://www.linkedin.com/in/tiya-v-076648128/