If Content is King, Then Context is God

Context Retrieval: An Overview

As we saw in the earlier section describing, step by step, how Copilot builds the final prompt, context retrieval is arguably the most important step in the process. Good context retrieval leads to better prompts, with relevant contextual elements sent to the model, and consequently to better results.

As a reminder, when Copilot generates a completion, the journey starts locally in your IDE. Copilot captures your editor state (cursor, prefix, suffix) and gathers extra snippets from related files. It ranks, filters, and assembles this context into a prompt, running quick checks before sending it. The prompt goes to GitHub’s servers, where an AI model generates candidate completions that the client cleans up and displays. If you are interested in learning more about the internal workings of context retrieval, I recommend reading "Copilot Internals" by Parth Thakkar. The author has reverse-engineered many aspects of Copilot and provides a detailed analysis of how it works under the hood. Even though some details are now outdated, the engineering principles and techniques discussed remain relevant and interesting.
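
To make the completion flow concrete, here is a minimal sketch in TypeScript of how snippets from related files might be scored against the text around the cursor and packed into a budget ahead of the prefix and suffix. All names (`buildPrompt`, `Snippet`, the character budget) are hypothetical, and the Jaccard-style scoring is illustrative rather than a description of Copilot's actual implementation.

```typescript
// Hypothetical sketch of completion-style prompt assembly: rank snippets
// from related files by token overlap with the text around the cursor,
// then pack the best ones into a fixed budget before the prefix/suffix.

interface Snippet {
  file: string;
  text: string;
}

// Crude tokenizer: lowercase word characters only.
function tokens(s: string): Set<string> {
  return new Set(s.toLowerCase().match(/\w+/g) ?? []);
}

// Jaccard similarity between two token sets (a scoring scheme reportedly
// used for neighboring-file snippets; treat it as illustrative here).
function jaccard(a: Set<string>, b: Set<string>): number {
  let inter = 0;
  for (const t of a) if (b.has(t)) inter++;
  return inter / (a.size + b.size - inter || 1);
}

function buildPrompt(
  prefix: string,          // text before the cursor
  suffix: string,          // text after the cursor
  neighbors: Snippet[],    // candidate snippets from related open files
  budgetChars = 2000       // rough stand-in for a token budget
): string {
  const anchor = tokens(prefix.slice(-1500)); // score against recent context
  const ranked = neighbors
    .map(s => ({ s, score: jaccard(anchor, tokens(s.text)) }))
    .sort((a, b) => b.score - a.score);

  const parts: string[] = [];
  let used = 0;
  for (const { s, score } of ranked) {
    if (score === 0 || used + s.text.length > budgetChars) continue;
    // Include provenance so the model knows where the snippet came from.
    parts.push(`// Snippet from ${s.file}:\n${s.text}`);
    used += s.text.length;
  }

  // Final prompt: ranked snippets, then the suffix hint, then the prefix
  // the model should continue from.
  return [...parts, `// Suffix:\n${suffix}`, prefix].join("\n\n");
}
```

The real client presumably uses proper tokenizers and more elaborate scoring; the point here is only the overall shape: gather, score, trim to a budget, then append suffix and prefix.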

A similar process happens for Copilot Chat and Copilot Agents, but with more emphasis on chat history, tool outputs, workspace-wide context, and custom instructions. The process often feels instantaneous, but there's a lot happening behind the scenes to make it work smoothly.
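
As a rough illustration, here is a sketch of how such a chat prompt might be put together. The names and message shape are hypothetical, not Copilot's real schema: custom instructions and workspace context come first, tool outputs next, and then as much recent chat history as still fits the budget.

```typescript
// Hypothetical sketch of chat-style prompt assembly: fixed parts
// (instructions, workspace context, tool outputs) go first, then as much
// recent chat history as fits the remaining character budget.

interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

function buildChatPrompt(
  customInstructions: string,   // e.g. repository-level instructions
  workspaceContext: string[],   // summaries or snippets from the workspace
  toolOutputs: string[],        // results from tool calls (agent mode)
  history: ChatMessage[],       // prior turns, oldest first
  userMessage: string,
  budgetChars = 8000
): ChatMessage[] {
  const messages: ChatMessage[] = [
    { role: "system", content: customInstructions },
    ...workspaceContext.map(c => ({ role: "system" as const, content: c })),
    ...toolOutputs.map(t => ({ role: "tool" as const, content: t })),
  ];

  // Keep as much recent history as fits, dropping the oldest turns first.
  let used = messages.reduce((n, m) => n + m.content.length, 0)
    + userMessage.length;
  const kept: ChatMessage[] = [];
  for (let i = history.length - 1; i >= 0; i--) {
    if (used + history[i].content.length > budgetChars) break;
    used += history[i].content.length;
    kept.unshift(history[i]);
  }

  return [...messages, ...kept, { role: "user", content: userMessage }];
}
```

Dropping the oldest turns first is just one simple policy, but it hints at why very long or noisy conversations degrade results: relevant early context eventually falls out of the window.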

However, context retrieval is not perfect. Many factors can affect its effectiveness; some are directly related to your environment, while others are outside your control. Let's explore a few of them:

  • The quality of the codebase: If the codebase is poorly structured, lacks comments, or has inconsistent naming conventions, it can be challenging for Copilot to extract meaningful context.

  • The history of the chat: If the chat history is long, inconsistent, or contains a lot of noise, it can be difficult for Copilot to maintain context and provide relevant suggestions.

  • The relevance of the open file (and cursor position): If the open file is not related to the task at hand, it may introduce noise and confusion rather than useful context. Copilot is generally good at ignoring irrelevant context, but why add unnecessary complexity? The same applies to the cursor position.
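
To see why the relevance of open files matters, here is a tiny self-contained check. The helper and sample strings are hypothetical and not taken from Copilot itself: a snippet from a related file shares identifiers with the code around the cursor, while an unrelated open file shares almost none, so even a simple overlap score separates them clearly.

```typescript
// Illustrative check (hypothetical helper, not Copilot's logic): count how
// many identifiers two pieces of code have in common.

function identifierOverlap(a: string, b: string): number {
  const ta = new Set(a.match(/\w+/g) ?? []);
  const tb = new Set(b.match(/\w+/g) ?? []);
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  return shared;
}

const aroundCursor = "function calculateInvoiceTotal(invoice: Invoice): number {";
const relatedFile = "export interface Invoice { lines: InvoiceLine[]; total?: number; }";
const unrelatedFile = "describe('date picker widget', () => { /* UI tests */ });";

console.log(identifierOverlap(aroundCursor, relatedFile));   // 2 (shares "Invoice" and "number")
console.log(identifierOverlap(aroundCursor, unrelatedFile)); // 0
```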
