Microsoft’s open source tool helps you write code to work with generative AI, ensuring results give correct information and stay on topic.

The launch of Microsoft’s new AI-powered Bing shed new light on the company’s investments in OpenAI’s large language models and in generative AI, turning them into a consumer-facing service. Early experiments with the service quickly revealed details of the predefined prompts Microsoft was using to keep the Bing chatbot focused on delivering search results.

Large language models, like OpenAI’s GPT series, are best thought of as prompt-and-response tools. You give the model a prompt and it responds with a series of words that fits both the content and the style of the prompt and, in some cases, even its mood. The models are trained on large amounts of data and then fine-tuned for a specific task. By providing a well-designed prompt and limiting the size of the response, it’s possible to reduce the risk of the model producing grammatically correct but inherently false output.

Introducing prompt engineering

Microsoft’s Bing prompts showed that the chatbot was being constrained to simulate a helpful personality that would construct content from search results, using Microsoft’s own Prometheus model as a set of additional feedback loops to keep results on topic and in context. What’s perhaps most interesting about these prompts is that they make clear Microsoft has been investing in a new software engineering discipline: prompt engineering.

It’s an approach you should invest in too, especially if you’re working with Microsoft’s Azure OpenAI APIs. Generative AIs, like large language models, are going to be part of the public face of your application and your business, and you’re going to need to keep them on brand and under control. That requires prompt engineering: designing an effective configuration prompt, tuning the model, and ensuring user prompts don’t result in unwanted outputs.

Both Microsoft and OpenAI provide sandbox environments where you can build and test base prompts. You can paste in a prompt body, add sample user content, and see the typical output. Although there’s an element of randomness in the model, you’ll get similar outputs for any given input, so you can test out the features and construct the “personality” of your model.

This approach is not just necessary for chat- and text-based models; you’ll need some aspect of prompt engineering in a Codex-based AI-powered developer tool or in a DALL-E image generator being used for slide clip art or as part of a low-code workflow. Adding structure and control to prompts keeps generative AI productive, helps avoid errors, and reduces the risk of misuse.

Using prompts with Azure OpenAI

It’s important to remember that you have tools beyond the prompt to control both context and consistency with large language models. One option is to control the length of the response (or, in the case of a ChatGPT-based system, the responses) by limiting the number of tokens that can be used in an interaction. This keeps responses concise and less likely to go off topic.

Working with the Azure OpenAI APIs is a relatively simple way to integrate large language models into your code, but while those APIs simplify delivering strings to a model, what’s needed is a way to manage those strings. It takes a lot of code to apply prompt engineering disciplines to your application, implementing the appropriate patterns and practices beyond the basic question-and-answer options.
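To make the token-limit point concrete, here’s a minimal sketch of a completion call against an Azure OpenAI deployment using the OpenAI Python library; the resource endpoint, key, API version, and deployment name are all placeholders you’d swap for your own resource’s values.

```python
# A minimal sketch using the OpenAI Python library (pre-1.0) against an
# Azure OpenAI resource; endpoint, key, API version, and deployment name
# below are placeholders, not real values.
import openai

openai.api_type = "azure"
openai.api_base = "https://YOUR-RESOURCE.openai.azure.com/"
openai.api_version = "2022-12-01"
openai.api_key = "YOUR-API-KEY"

response = openai.Completion.create(
    engine="text-davinci-003",  # the name of your Azure model deployment
    prompt="Summarize why limiting response length keeps a chatbot on topic.",
    max_tokens=100,   # cap the tokens in the response to keep output concise
    temperature=0.2,  # a low temperature trims the model's randomness
)
print(response.choices[0].text.strip())
```

Capping max_tokens is the bluntest of these controls, but it’s an effective first line of defense against rambling, off-topic responses.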
Manage prompts with Prompt Engine

Microsoft has been working on an open source project, Prompt Engine, to manage prompts and deliver the expected outputs from a large language model, with JavaScript, C#, and Python releases in separate GitHub repositories. All three have the same basic purpose: to manage the context of any interaction with a model.

If you’re using the JavaScript version, there’s support for three different classes of model: a generic prompt-based model, a code model, and a chat-based system. It’s a useful way to manage the various components of a well-designed prompt, supporting both your own inputs and user interactions (including model responses). That last part is important as a way of managing context between interactions, ensuring that state is preserved within chats and between lines of code in an application.

You get the same options from the Python version, letting you quickly apply the same processes as in JavaScript code. The C# version offers only generic and text analysis model support, but these can easily be repurposed for your choice of applications. The JavaScript option is a good fit for web applications and Visual Studio Code extensions, whereas the Python tool is a logical choice for anyone working with the wider ecosystem of machine learning tools.

The intent is to treat the large language model as a collaborator with the user, allowing you to build your own feedback loops around the AI, much like Microsoft’s Prometheus. By having a standard pattern for working with the model, you can iterate around your own base prompts, tracking outputs and refining inputs where necessary.

Managing GPT interactions with Prompt Engine

Prompt Engine installs as a library from familiar repositories like npm and pip, with sample code in its GitHub repositories. Getting started is easy enough once you’ve installed the module and imported the appropriate libraries. Start with a Description of your prompt, followed by some example Interactions. For example, where you’re turning natural language into code, each Interaction is a pair: a sample query followed by the expected output code in the language you’re targeting. Provide several Interactions to build the most effective prompt. The default target language is Python, but you can configure your choice of language using a CodeEngineConfig call.

With a target language and a set of samples, you can now build a prompt from a user query. The resulting prompt string can be used in a call to the Azure OpenAI API, as the sketch below shows. If you want to keep context with your next call, simply add the response as a new Interaction, and it will carry across to the next call. Because it’s not part of the original sample Interactions, it won’t persist beyond the current user session and can’t be used by another user or in another call.

This approach simplifies building dialogs, though it’s important to keep track of the total tokens used so your prompt doesn’t overrun the token limits of the model. Prompt Engine includes a way to ensure prompt length doesn’t exceed the maximum token count for your current model, pruning older dialogs where necessary. Pruning does mean that dialogs can lose context, so you may need to help users understand that there are limits to the length of a conversation.
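Here’s a minimal sketch of that flow using the Python release. The class and method names (CodeEngine, CodeEngineConfig, ModelConfig, Interaction, build_prompt, add_interaction) follow the project’s README at the time of writing and may differ between releases; the example queries and responses are placeholders.

```python
# A minimal sketch using Microsoft's prompt-engine Python package
# (pip install prompt-engine). Class and method names follow the
# project's README and may vary between releases.
from prompt_engine.code_engine import CodeEngine, CodeEngineConfig
from prompt_engine.interaction import Interaction
from prompt_engine.model_config import ModelConfig

# Describe the prompt, then give sample query/code pairs that show
# the model the output you expect.
description = "Translate natural language commands into Python code"
examples = [
    Interaction("what's 10 plus 18", "print(10 + 18)"),
    Interaction("what's 10 times 18", "print(10 * 18)"),
]

# max_tokens is what lets Prompt Engine prune older dialog turns before
# the prompt overruns the model's limit; CodeEngineConfig is also where
# you'd set a target language other than the default Python.
config = CodeEngineConfig(ModelConfig(max_tokens=1024))
engine = CodeEngine(config=config, description=description, examples=examples)

# Build the prompt string for a new user query; this is the string you
# pass to the Azure OpenAI completion call shown earlier.
query = "what's 1018 minus 33?"
prompt = engine.build_prompt(query)

# Feed the model's response back in so the next prompt carries this
# session's context. It won't persist beyond the current session.
engine.add_interaction(query, "print(1018 - 33)")
```

The pattern stays the same as the dialog grows: build a prompt, call the model, cache the new Interaction, and let the engine trim older turns as the token budget fills up.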
If you’re explicitly targeting a chat system, you can configure user and bot names, along with a contextual description that sets the bot’s behaviors and tone and is reinforced by the sample Interactions, again passing responses back to Prompt Engine to build context into the next prompt.

You can use cached Interactions to add a feedback loop to your application: for example, looking for unwanted terms and phrases, or using the user’s rating of a response to determine which Interactions persist between prompts (see the sketch at the end of this article). Logging successful and unsuccessful prompts will let you build a more effective default prompt, adding new examples as needed. Microsoft suggests building a dynamic bank of examples that can be compared to incoming queries, using a set of similar examples to dynamically generate a prompt that approximates your user’s query and, hopefully, produces more accurate output.

Prompt Engine is a simple tool that helps you construct an appropriate pattern for building prompts. It’s an effective way to manage the limitations of large language models like GPT-3 and Codex, and at the same time to build the necessary feedback loops that help avoid a model behaving in unanticipated ways.
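To round things out, here’s a minimal sketch of that chat configuration and feedback loop, again using the prompt-engine Python package. The ChatEngineConfig parameter names are assumptions based on the project’s README and may differ between releases; the persona, sample dialog, and block list are placeholders.

```python
# A minimal sketch of a chat-focused prompt with a simple feedback loop;
# ChatEngineConfig's user_name/bot_name parameters are assumptions based
# on the project's README and may differ between releases.
from prompt_engine.chat_engine import ChatEngine, ChatEngineConfig
from prompt_engine.interaction import Interaction
from prompt_engine.model_config import ModelConfig

# Name the participants and describe the bot's behavior and tone; the
# sample Interactions reinforce that persona.
config = ChatEngineConfig(ModelConfig(max_tokens=1024),
                          user_name="USER", bot_name="BOT")
description = "A friendly, concise assistant that only discusses billing"
examples = [
    Interaction("Can you reset my password?",
                "I can only help with billing, but our account team can do that."),
]

engine = ChatEngine(config=config, description=description, examples=examples)

query = "Why was I charged twice this month?"
prompt = engine.build_prompt(query)

# ...send prompt to the model, as in the earlier Azure OpenAI sketch...
reply = "Sorry about that! Let me look into the duplicate charge."

# A simple feedback loop: only cache the Interaction if the response
# passes a block-list check. A real application might also use the
# user's rating of the response to decide what persists.
blocked_terms = {"example-unwanted-phrase"}  # placeholder block list
if not any(term in reply.lower() for term in blocked_terms):
    engine.add_interaction(query, reply)
```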