I have said this before - with the advent of large AI models, Prompt Engineering is critical and is the next challenge for us to master.
What is prompt engineering?
Prompt engineering is the process of crafting the input to a large model - typically written in natural language - to express the intention of the user. Prompt engineering is a key element in getting output that is accurate and reflects the user's needs. A prompt should not be thought of as one explicit input to the model; it can encode multiple tasks for the model.
We use large language models (#LLM) such as #GPT3, or #Text2Image models like #DALLE and #StableDiffusion, via a text prompt. The prompt is a string and is our way to ask the model to do what it is meant to. It is also our way to provide hints and direction on what we need, ultimately helping the model understand which patterns are important to us and should be represented in the output.
The way we write a prompt is important - including the phrasing, the order of the words, hints, etc. Prompts also need to fit the context of the use case (see the screenshot below on GPT3 use-case examples). For example, language generation prompts are different from code generation, summarization, or image generation prompts. Prompts are closely tied to their intended use cases.
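To make the "prompt is just a string" point concrete, here is a minimal sketch of phrasing the same topic for different use cases. The task names and template text are made up for illustration - they are not from any real API:

```python
# A prompt is just a string: the same topic, phrased per use case.
# These templates are illustrative, not from any real model's API.
TEMPLATES = {
    "generation": "Write a paragraph about {topic}.",
    "summarization": "Summarize the following text in one sentence:\n{topic}",
    "code": "# Python 3\n# Write a function that {topic}\ndef solution():",
}

def build_prompt(task: str, topic: str) -> str:
    """Fill in the template for the given use case."""
    return TEMPLATES[task].format(topic=topic)

print(build_prompt("generation", "prompt engineering"))
# -> Write a paragraph about prompt engineering.
```

Each resulting string would then be sent as-is to the model; only the phrasing around the topic changes with the use case.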
Examples of prompt engineering
We start with a couple of examples related to language generation. I figured, what better way to show prompt engineering than by asking GPT3 about prompt engineering itself. 😇
In this first screenshot below, we use GPT3’s Davinci model and ask for a paragraph on prompt engineering. The first sentence is the prompt that was the input, and the text with the green background is what was generated.
And in this second example, it is mostly the same prompt, but we ask for a blog post instead of a paragraph. As we can see, the output is of course quite different, but the essence of it is still much the same.
And finally, another example, same as before, but in this case we ask that it be written for a 5-year-old child (setting aside whether a 5-year-old would understand the notion of AI and models 😶).
Even though the changes might seem subtle in the examples shown earlier, consider them toy examples.
Small changes to the prompt can lead to significant changes in the output. To show this, below are two examples from #StableDiffusion - an open-source text-to-image model. I used Harry Potter for inspiration: Hogwarts and the dark forest that first-years were forbidden to enter.
For the first prompt example: a beautiful view of hogwarts school of witchcraft and wizardry and the dark forest, by Laurie Lipton, Impressionist Mosaic, Diya Lamp architecture, atmospheric, sense of awe and scale
And for the second example, the prompt was: a beautiful view of hogwarts school of witchcraft and wizardry and the dark forest, by Laurie Lipton, Impressionist Mosaic, atmospheric, sense of awe and scale
The only difference between the two prompts was removing "Diya Lamp architecture", resulting in dramatically different outputs. I suspect that with image generation the changes are more dramatic and easier to see.
Prompts also are not universal and are very dependent on the model being used - what counts as a good prompt for one model (from one institution) won't carry over to another model from another institution. For example, the same prompt as above (a beautiful view of hogwarts school of witchcraft and wizardry and the dark forest, by Laurie Lipton, Impressionist Mosaic, atmospheric, sense of awe and scale), when used with OpenAI's DALLE model, generates the image shown below - which is, of course, very different.
And if I want to tweak the same prompt specifically for DALLE, here is another example using the prompt: Beautiful view of Hogwarts school of witchcraft and wizardry and the dark forest with a sense of awe and scale, Awesome, Highly Detailed
As a side note, I particularly like this one:
This has also given rise to several tools that help us craft prompts, given that many of us don't quite understand the options and styles that can go into them. Some, like promptoMANIA, cover multiple large models (image models in this case) and can get quite sophisticated themselves. Others are simpler, like this DALLE prompt generator by Adam Brown; and more, like prompts.ai, allow for tweaking and fine-tuning of prompts - effectively creating templates for GPT3.
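Under the hood, a prompt template is little more than a string with named slots that get filled per request. Here is a minimal sketch in that spirit - the template text and slot names are made up for illustration, not taken from any of the tools mentioned above:

```python
# A tiny illustration of prompt templating: a reusable template with
# named slots, filled in per request. Slot names are illustrative only.
from string import Template

PROMPT_TEMPLATE = Template(
    "Explain $topic in a way a $audience can understand, "
    "in at most $length sentences."
)

def render(topic: str, audience: str, length: int) -> str:
    """Fill the template's slots and return the final prompt string."""
    return PROMPT_TEMPLATE.substitute(topic=topic, audience=audience, length=length)

print(render("prompt engineering", "5-year-old", 3))
# -> Explain prompt engineering in a way a 5-year-old can understand, in at most 3 sentences.
```

Swapping the slot values is exactly the kind of tweaking these tools expose through a UI, with the filled-in string then sent to the model.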
Prompt engineering is a brand new and fascinating space for the industry and I for one am quite intrigued to see where it will lead us.