Module (03): (MSLR) Introduction to Prompt Engineering #6
Module (03): (MSLR) Introduction to Prompt Engineering
**Document Type: MSLR (Microsoft Learn Reference)**
Scope: This document provides a verbatim reference extract of the “Introduction to prompt engineering” module from Microsoft Learn, with all examples preserved exactly as provided and code converted to C# when applicable. No assumptions or paraphrasing applied.
Introduction to prompt engineering with GitHub Copilot
Discover the essentials of creating effective prompts with GitHub Copilot. Uncover techniques to transform your coding comments into precise, actionable code, enhancing your development workflow and accelerating code delivery through advanced prompting strategies.
Learning objectives
By the end of this module, you're able to:
Prerequisites
This module is part of these learning paths
Introduction
GitHub Copilot, powered by OpenAI, is changing the game in software development by accelerating development workflows from initial code creation to production-ready implementations. GitHub Copilot can grasp the intricate details of your project through its training on data containing both natural language and billions of lines of source code from publicly available sources, including code in public GitHub repositories. This allows GitHub Copilot to provide you with more context-aware suggestions that help you rapidly deliver code changes and automate routine development tasks.
But to get the most out of GitHub Copilot and maximize your development velocity, you need to know about prompt engineering. Prompt engineering is how you tell GitHub Copilot what you need with precision and efficiency. The quality of the code it gives back, and how quickly you can iterate toward the right solution, depend on how clear and strategic your prompts are.
In this module, you'll learn about:
Next unit: Prompt engineering foundations and best practices
Prompt engineering foundations and best practices
In this unit, we'll cover:
What is prompt engineering?
Prompt engineering is the process of crafting clear instructions to guide AI systems, like GitHub Copilot, to generate context-appropriate code tailored to your project's specific needs. This ensures the code is syntactically, functionally, and contextually correct.
Now that you know what prompt engineering is, let's learn about some of its principles.
Principles of prompt engineering
Before we explore specific strategies, let's first understand the basic principles of prompt engineering, summed up in the 4 Ss: Single, Specific, Short, and Surround. These core rules are the basis for creating effective prompts.
These core principles lay the foundation for crafting efficient and effective prompts. Keeping the 4 Ss in mind, let's dive deeper into advanced best practices that ensure each interaction with GitHub Copilot is optimized.
Best practices in prompt engineering
The following advanced practices, based on the 4 Ss, refine and enhance your engagement with Copilot, ensuring that the generated code is not only accurate but also aligned with your project's specific needs and context.
Provide enough clarity
Building on the Single and Specific principles, always aim for explicitness in your prompts. For instance, a prompt like "Write a Python function to filter and return even numbers from a given list" is both single-focused and specific.
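As an illustration, rendered in C# per the scope note above (the method name and body are assumptions, not the module's own example), such a single, specific prompt might produce a completion along these lines:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class NumberFilters
{
    // Prompt: "Write a function to filter and return even numbers from a given list."
    public static List<int> GetEvenNumbers(List<int> numbers)
    {
        // Keep only the values divisible by 2.
        return numbers.Where(n => n % 2 == 0).ToList();
    }
}
```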
Note
Copilot also uses parallel open tabs in your code editor to get more context on the requirements of your code.
Provide enough context with details
Enrich Copilot's understanding with context, following the Surround principle. The more contextual information you provide, the more fitting the generated code suggestions are. For example, by adding comments at the top of your code that spell out what you want, you give Copilot more context to understand your prompt and provide better code suggestions.
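The module's original example isn't reproduced in this extract, so the following C# sketch is an assumed stand-in showing what a step-by-step comment prompt might look like (the identifiers and steps are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BasketCalculator
{
    // Create a method that calculates the total price of a shopping basket.
    // Step 1: Accept a list of item prices and a discount percentage.
    // Step 2: Sum the prices.
    // Step 3: Apply the discount to the subtotal.
    // Step 4: Return the result rounded to two decimal places.
    public static decimal CalculateTotal(List<decimal> prices, decimal discountPercent)
    {
        decimal subtotal = prices.Sum();
        decimal total = subtotal * (1 - discountPercent / 100m);
        return Math.Round(total, 2);
    }
}
```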
In the example above, steps are used to give more detail while keeping it short. This practice follows the Short principle, balancing detail with conciseness to ensure clarity and precision in communication.
Provide examples for learning
Using examples can clarify your requirements and expectations, illustrating abstract concepts and making the prompts more tangible for Copilot. Well-crafted examples help Copilot understand patterns quickly, leading to more accurate initial suggestions that require fewer revision cycles. This approach is particularly effective for generating boilerplate code, test templates, and repetitive implementations that form the foundation of larger features.
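As a hedged sketch (the scenario and names are assumptions, not from the module), including sample inputs and expected outputs in your comment gives Copilot a concrete pattern to follow:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class NameFormatting
{
    // Convert a list of full names to "Last, First" format.
    // Example input:  ["Ada Lovelace", "Grace Hopper"]
    // Example output: ["Lovelace, Ada", "Hopper, Grace"]
    public static List<string> ToLastFirst(List<string> fullNames)
    {
        return fullNames
            .Select(name => name.Split(' '))
            .Select(parts => $"{parts[^1]}, {parts[0]}")
            .ToList();
    }
}
```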
Assert and iterate
One of the keys to unlocking GitHub Copilot's full potential and accelerating your development workflow is the practice of strategic iteration. Your first prompt might not always yield production-ready code, and that's perfectly fine. Rather than spending time manually refining the output, treat it as the beginning of an efficient dialogue with Copilot.
If the first output isn't quite what you're looking for, don't start from scratch. Instead, erase the suggested code, enrich your initial comment with added details and examples, and prompt Copilot again. This iterative approach often gets you to high-quality, deployment-ready code faster than traditional development methods, as each iteration builds on Copilot's understanding of your specific requirements.
Now that you've learned best practices to improve your prompting skills, let's take a closer look at how you can provide examples Copilot can learn from.
How Copilot learns from your prompts
GitHub Copilot operates based on AI models trained on vast amounts of data. To enhance its understanding of specific code contexts, engineers often provide it with examples. This practice, commonly found in machine learning, led to different training approaches such as:
Zero-shot learning
Here, GitHub Copilot generates code without any specific examples, relying solely on its foundational training. This approach is ideal for rapidly implementing common patterns and standard functionality. For instance, suppose you want to create a function to convert temperatures between Celsius and Fahrenheit. You can start by only writing a comment describing what you want, and Copilot might be able to generate production-ready code for you, based on its previous training, without any other examples.
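A minimal sketch of that zero-shot prompt in C# (the completion shown is an assumption; Copilot's actual output can vary):

```csharp
public static class TemperatureConverter
{
    // Zero-shot prompt: only this comment, no examples.
    // Write a method that converts a temperature from Celsius to Fahrenheit.
    public static double CelsiusToFahrenheit(double celsius)
    {
        // F = C * 9/5 + 32
        return celsius * 9.0 / 5.0 + 32.0;
    }
}
```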
One-shot learning
With this approach, a single example is given, aiding the model in generating more context-aware responses that follow your specific patterns and conventions. This is particularly effective for creating consistent implementations across your codebase, accelerating feature development while maintaining code standards. Building upon the previous zero-shot example, you might provide an example of a temperature conversion function and then ask Copilot to create another similar function.
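A one-shot sketch in C#, assuming the earlier conversion function is supplied as the single example (the second method represents a plausible completion, not guaranteed output):

```csharp
public static class TemperatureConverter
{
    // The single example (the "one shot") provided to Copilot:
    public static double CelsiusToFahrenheit(double celsius)
    {
        return celsius * 9.0 / 5.0 + 32.0;
    }

    // Prompt: "Now write a similar method that converts Fahrenheit to Celsius."
    // A plausible completion that mirrors the example's style:
    public static double FahrenheitToCelsius(double fahrenheit)
    {
        return (fahrenheit - 32.0) * 5.0 / 9.0;
    }
}
```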
Few-shot learning
In this method, Copilot is presented with several examples, which strike a balance between zero-shot unpredictability and the precision of fine-tuning. This approach excels at generating sophisticated implementations that handle multiple scenarios and edge cases, reducing the time spent on manual testing and refinement. Let's say you want to generate code that sends you a greeting depending on the time of the day. Here's a few-shot version of that prompt.
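A few-shot sketch in C# (the example values and the completion are assumptions used for illustration):

```csharp
public static class Greeter
{
    // Examples supplied in the prompt (the "shots"):
    //   6  -> "Good morning"
    //   14 -> "Good afternoon"
    //   20 -> "Good evening"
    // Write a method that returns the appropriate greeting for a given hour.
    public static string GetGreeting(int hour)
    {
        if (hour < 12) return "Good morning";
        if (hour < 18) return "Good afternoon";
        return "Good evening";
    }
}
```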
Chain prompting and managing chat history
When working on complex features that require multiple steps, you might engage in extended conversations with GitHub Copilot Chat. While detailed context helps Copilot understand your requirements, maintaining long conversation histories can become inefficient and costly in terms of processing.
For example, you might start with a basic implementation, then iteratively add refinements such as validation, error handling, and rate limiting. Each turn builds on the previous context, but the full history grows longer with every request, as in the hypothetical exchange below:
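1. "Create a login endpoint that validates a username and password."
2. "Now add input validation with clear error messages." (resends turn 1)
3. "Now add logging for failed sign-in attempts." (resends turns 1–2)
4. "Now add rate limiting to prevent brute-force attacks." (resends turns 1–3)

By the fourth turn, each request is carrying the entire earlier conversation, which is where the processing cost adds up.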
Note
Long prompts with full conversation history can consume 2–3 PRUs per turn. Summarizing context or resetting the conversation can keep it closer to 1 PRU per request.
To manage this efficiently, summarize the relevant decisions from earlier turns, or start a fresh conversation and restate only the context you need. For example:
“Based on our previous discussion about user authentication, now add rate limiting to prevent brute-force attacks.”
Role prompting for specialized tasks
Role prompting involves instructing GitHub Copilot to act as a specific type of expert, which can significantly improve the quality and relevance of generated code for specialized domains. This approach helps accelerate development by getting more targeted solutions on the first try.
Security expert role
When working on security-critical features, prompt Copilot to think like a security expert:
"Act as a cybersecurity expert. Create a password validation function that checks for common vulnerabilities and follows OWASP guidelines."
This approach typically generates code that includes checks such as minimum length, character variety, and rejection of commonly used passwords, along with clear validation messages.
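As a hedged sketch of what such a role-prompted request might yield (the rules, thresholds, and the small weak-password list are assumptions, not the module's output):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PasswordValidator
{
    // A tiny, illustrative deny-list; a real implementation would use a
    // maintained list of breached or common passwords.
    private static readonly HashSet<string> CommonPasswords =
        new(StringComparer.OrdinalIgnoreCase) { "password", "123456", "qwerty", "letmein" };

    public static bool IsValid(string password, out List<string> errors)
    {
        errors = new List<string>();
        password ??= string.Empty;

        if (password.Length < 12)
            errors.Add("Use at least 12 characters.");
        if (!password.Any(char.IsUpper) || !password.Any(char.IsLower))
            errors.Add("Mix upper- and lower-case letters.");
        if (!password.Any(char.IsDigit))
            errors.Add("Include at least one digit.");
        if (CommonPasswords.Contains(password))
            errors.Add("Avoid commonly used passwords.");

        return errors.Count == 0;
    }
}
```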
Performance optimization role
For performance-critical code, use a performance expert role:
"Act as a performance optimization expert. Refactor this sorting algorithm to handle large datasets efficiently."
This often results in more algorithmically efficient code, for example replacing an O(n²) approach with an O(n log n) one and avoiding unnecessary allocations.
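As a hedged sketch (the before/after methods are hypothetical), the kind of change such a prompt tends to encourage is swapping a quadratic algorithm for an O(n log n) one:

```csharp
using System;

public static class Sorting
{
    // Before: an O(n^2) insertion sort that struggles on large inputs.
    public static void SlowSort(int[] values)
    {
        for (int i = 1; i < values.Length; i++)
        {
            int current = values[i];
            int j = i - 1;
            while (j >= 0 && values[j] > current)
            {
                values[j + 1] = values[j];
                j--;
            }
            values[j + 1] = current;
        }
    }

    // After: delegate to the framework's O(n log n) sort for large datasets.
    public static void FastSort(int[] values) => Array.Sort(values);
}
```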
Testing specialist role
When creating comprehensive test suites, leverage a testing expert perspective:
"Act as a testing specialist. Create comprehensive unit tests for this payment processing module, including edge cases and error scenarios."
This typically produces test suites that cover the normal path, boundary values, invalid inputs, and failure scenarios such as declined or timed-out payments.
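As a hedged sketch, assuming xUnit as the test framework and a hypothetical PaymentProcessor class standing in for the payment processing module mentioned in the prompt:

```csharp
using System;
using Xunit;

// Minimal hypothetical payment module so the tests below compile.
public class PaymentResult { public bool Succeeded { get; set; } }

public class PaymentProcessor
{
    public PaymentResult Charge(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount), "Amount must be positive.");
        return new PaymentResult { Succeeded = true };
    }
}

public class PaymentProcessorTests
{
    [Fact]
    public void Charge_WithValidAmount_Succeeds()
    {
        var result = new PaymentProcessor().Charge(25.00m);
        Assert.True(result.Succeeded);
    }

    [Theory]
    [InlineData(0)]
    [InlineData(-10)]
    public void Charge_WithNonPositiveAmount_Throws(int amount)
    {
        Assert.Throws<ArgumentOutOfRangeException>(() => new PaymentProcessor().Charge(amount));
    }
}
```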
Role prompting helps you get production-ready code faster by incorporating domain expertise into initial implementations, reducing the need for multiple revision cycles.
Now that you know how Copilot uses your prompts to learn, let's take an in-depth look at how it actually uses your prompt to suggest code for you.
GitHub Copilot user prompt process flow
In this unit, we'll break down how GitHub Copilot turns your prompts into smart, usable code. Generally, GitHub Copilot receives prompts and returns code suggestions or responses; this data flow can be split into an inbound flow and an outbound flow.
Inbound flow
Let's walk through all the steps Copilot takes to process a user's prompt into a code suggestion.
1. Secure prompt transmission and context gathering
The process begins with the secure transmission of the user prompt over HTTPS. This ensures that your natural language comment is sent to GitHub Copilot's servers securely and confidentially, protecting sensitive information.
GitHub Copilot securely receives the user prompt, which could be a Copilot Chat message or a natural language comment you write within your code.
Simultaneously, Copilot collects context details, such as the contents of the file you're working in and other files open in your editor.
These steps translate the user's high-level request into a concrete coding task.
2. Proxy filter
Once the context is gathered and the prompt is built, it passes securely to a proxy server hosted in a GitHub-owned Microsoft Azure tenant. The proxy filters traffic, blocking attempts to hack the prompt or manipulate the system into revealing details about how the model generates code suggestions.
3. Toxicity filtering
Copilot incorporates content filtering mechanisms before proceeding with intent extraction and code generation, to ensure that the generated code and responses don't include or promote harmful material such as hate speech or other offensive content.
4. Code generation with LLM
Finally, the filtered and analyzed prompt is passed to the large language model, which generates appropriate code suggestions. These suggestions are based on Copilot's understanding of the prompt and the surrounding context, ensuring that the generated code is relevant, functional, and aligned with project-specific requirements.
Outbound flow
5. Post-processing and response validation
Once the model produces its responses, the toxicity filter removes any harmful or offensive generated content, and the proxy server applies a final layer of checks covering code quality, security, and ethical standards.
If any part of the response fails these checks, it is either truncated or discarded.
6. Suggestion delivery and feedback loop initiation
Only responses that pass all filters are delivered to the user. Copilot then initiates a feedback loop based on your actions, such as whether you accept, reject, or edit a suggestion, to refine its future suggestions.
7. Repeat for subsequent prompts
The process is repeated as you provide more prompts, with Copilot continuously handling user requests, understanding their intent, and generating code in response. Over time, Copilot applies the cumulative feedback and interaction data, including context details, to improve its understanding of user intent and refine its code generation capabilities.
GitHub Copilot data
In this unit, we'll cover how GitHub Copilot handles data for different environments, features, and configurations.
Data handling for GitHub Copilot code suggestions
GitHub Copilot in the code editor doesn't retain the prompts, such as your code or other context, that it uses to provide suggestions for training the foundational models. It discards the prompts once a suggestion is returned.
GitHub Copilot Individual subscribers can opt out of sharing their prompts with GitHub, which will otherwise be used to fine-tune GitHub’s foundational model.
Data handling for GitHub Copilot chat
GitHub Copilot Chat operates as an interactive platform, allowing developers to engage in conversational interactions with the AI assistant to receive coding assistance. Its handling of prompts and responses differs in some respects from features like code completion. The same handling applies to the CLI, Mobile, and GitHub Copilot Chat on GitHub.com.
Prompt types supported by GitHub Copilot Chat
GitHub Copilot Chat processes a wide range of coding-related prompts, including questions about code, requests to generate or explain code, test generation, and help with debugging and fixing errors.
Its ability to process diverse input types enhances its utility as a conversational coding companion.
Limited context windows
While GitHub Copilot Chat excels at understanding and responding to prompts, it has a context window limitation, which refers to how much code and text the model can process at once. For best results, break complex problems into smaller prompts and provide only the relevant snippets.
GitHub Copilot Large Language Models (LLMs)
GitHub Copilot is powered by Large Language Models (LLMs) to assist you in writing code seamlessly. In this unit, we focus on understanding the integration and impact of LLMs in GitHub Copilot. Let's review the following topics:
What are LLMs?
Large Language Models (LLMs) are artificial intelligence models designed and trained to understand, generate, and manipulate human language. They're trained on extensive amounts of text data and can perform a broad range of language-related tasks. Here are some key aspects:
Volume of training data
LLMs are exposed to vast amounts of text from diverse sources. This exposure equips them with a broad understanding of language, context, and the intricacies involved in communication.
Contextual understanding
LLMs excel in generating contextually relevant and coherent text. Their ability to understand context allows them to contribute meaningfully—whether completing sentences or generating whole documents.
Machine learning and AI integration
LLMs are grounded in machine learning and artificial intelligence principles. They're neural networks with millions or even billions of parameters fine-tuned during training to understand and predict text effectively.
Versatility
These models aren’t limited to a specific type of text or language. They can be tailored and fine-tuned to perform specialized tasks, making them highly versatile and applicable across various domains.
Role of LLMs in GitHub Copilot and prompting
GitHub Copilot utilizes LLMs to provide context-aware code suggestions. The LLM considers the current file, other open files, and surrounding context to generate relevant, tailored code completions that enhance developer productivity.
Fine-tuning LLMs
Fine-tuning is a process used to tailor pretrained LLMs for specific tasks or domains. It involves training the model on a smaller, task-specific dataset (target dataset) while leveraging the knowledge and parameters learned from a large pretrained dataset (source model).
Fine-tuning enhances performance on specialized tasks by adapting the model’s behavior to domain-specific needs.
LoRA fine-tuning
Traditional full fine-tuning requires updating all parameters of a neural network, which is computationally expensive. LoRA (Low-Rank Adaptation) offers an efficient alternative: it freezes the pretrained weights and injects small trainable low-rank matrices into the model's layers, drastically reducing the number of trainable parameters and the memory required for adaptation.
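For reference, the core idea (summarized here from the LoRA paper rather than from the module text) can be written as:

$$
h = W_0 x + \Delta W x = W_0 x + B A x, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
$$

The pretrained weight matrix $W_0$ stays frozen; only $A$ and $B$ are trained, so the number of trainable parameters for that matrix drops from $d \cdot k$ to $r \cdot (d + k)$.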
LoRA outperforms other adaptation techniques such as adapters and prefix tuning, offering a “work smarter, not harder” method for customizing LLMs to specific tasks within GitHub Copilot.