当前位置:首页|资讯|ChatGPT|提示词|大语言模型

Iterative-ChatGPT Prompt Engineering

作者:诗人的朦胧诗发布时间:2023-05-06

When I've been building applications with large language models, I don't think I've ever come to the prompt that I ended up using in the final application on my first attempt. And this isn't what matters.

当我一直在使用大型语言模型构建应用程序时,我认为我从来没有遇到过我在第一次尝试时最终在最终应用程序中使用的提示。而这不是最重要的。

As long as you have a good process to iteratively make your prompt better, then you'll be able to come to something that works well for the task you want to achieve.

只要你有一个好的过程来迭代地使你的提示更好,那么你就能够找到适合你想要完成的任务的东西。

You may have heard me say that when I train a machine learning model, it almost never works the first time. In fact, I'm very surprised if the first model I train works.

你可能听我说过,当我训练一个机器学习模型时,它几乎从来没有在第一次成功。事实上,如果我训练的第一个模型有效,我会感到非常惊讶。

I think we're prompting, the odds of it working the first time is maybe a little bit higher, but as he's saying, it doesn't matter if the first prompt works.

我认为我们在提示,第一次成功的几率可能会高一点,但正如他所说,第一次提示是否有效并不重要。

What matters most is the process for getting to the prompts that work for your application.

最重要的是获得适用于您的应用程序的提示的过程。

So with that, let's jump into the code and let me show you some frameworks to think about how to iteratively develop a prompt.

因此,让我们跳入代码,让我向您展示一些框架,以思考如何迭代开发提示。

Alright, so if you've taken a machine learning class with me, before you may have seen me use a diagram saying that with machine learning development, you often have an idea and then implement it.

好吧,所以如果你和我一起上过机器学习课,在你看到我用一张图说机器学习开发之前,你通常会有一个想法然后实施它。

So write the code, get the data, train your model, and that gives you an experimental result.

所以编写代码,获取数据,训练你的模型,然后给你一个实验结果。

And you can then look at that output, maybe do error analysis,  figure  out where it's working or not working, and then maybe even change your idea of exactly what problem you want to solve or how to approach it, and then change your implementation and run another experiment and so on, and iterate over and over to get to an effective machine learning model.

然后你可以查看那个输出,也许做错误分析,找出它在哪里工作或不工作,然后甚至可能改变你对你想要解决什么问题或如何解决它的想法,然后改变你的实施和运行另一个实验等等,一遍又一遍地迭代以获得有效的机器学习模型。

If you're not familiar with machine learning and haven't seen this diagram before, don't worry about it, not that important for the  rest  of this presentation.

如果您不熟悉机器学习并且之前没有看过此图,请不要担心,它对于本演示文稿的其余部分并不重要。

But when you are writing prompts to develop an application using an OOM, the process can be quite similar where you have an idea for what you want to do, the task you want to complete, and you can then take a first attempt at writing a prompt that hopefully is clear and specific and maybe, if  appropriate  , gives the system time to think, and then you can run it and see what result you get.

但是,当您编写使用 OOM 开发应用程序的提示时,过程可能非常相似,您对要做什么、要完成的任务有一个想法,然后您可以第一次尝试编写一个提示希望是清晰和具体的,并且如果合适的话,可能会给系统时间思考,然后你可以运行它并查看你得到的结果。

And if it doesn't work well enough the first time, then the iterative process of  figuring  out why the instructions, for example, were not clear enough or why it didn't give the algorithm enough time to think, allows you to  refine  the idea, refine the prompt, and so on, and to go around this loop multiple times until you end up with a prompt that works for your application.

如果第一次效果不佳,那么通过迭代过程找出为什么指令不够清晰,或者为什么没有给算法足够的时间来思考,可以让你改进构思、细化提示等,并多次绕过此循环,直到最终得到适用于您的应用程序的提示。

This too is why I personally have not paid as much attention to the internet articles that say 30  perfect  prompts, because I think there probably isn't a perfect prompt for everything under the sun.

这也是为什么我个人没有太关注那些说 30 条完美提示的互联网文章,因为我认为在阳光下可能没有完美的提示。

It's more important that you have a process for developing a good prompt for your specific application.

更重要的是你有一个为你的特定应用程序开发好的提示的过程。

So let's look at an example together in code.

那么让我们在代码中一起看一个例子。

I have here the starter code that you saw in the previous videos, have been port open AI and port OS. Here we get the open AI API key, and this is the same helper function that you saw as last time. And I'm going to use as the running example in this video the task of summarizing a fact sheet for a chair.

我这里有您在之前的视频中看到的起始代码,已经移植开放 AI 和移植操作系统。在这里我们获得了开放的 AI API 密钥,这与您上次看到的帮助函数相同。在本视频中,我将使用总结椅子情况说明书的任务作为运行示例。

So let me just paste that in here. Feel free to pause the video and read this more carefully in the notebook on the left if you want.

所以让我把它粘贴在这里。如果需要,请随时暂停视频并在左侧的笔记本中更仔细地阅读此内容。

But here's a fact sheet for a chair with a description saying it's part of a beautiful family of mid-century inspired, and so on. Talks about the construction, has the dimensions, options for the chair, materials, and so on. Comes from Italy.

但这里有一张椅子的说明书,上面描述说它是一个美丽的中世纪家庭的一部分,等等。谈论结构,有椅子的尺寸、选项、材料等。来自意大利。

So let's say you want to take this fact sheet and help a marketing team write a description for an online retail website.

假设您想使用这份资料单帮助营销团队为在线零售网站撰写描述。

as follows, and I'll just... and I'll just paste this in, so my prompt here says your task is to help a marketing team create the description for retail website or product based on a techno fact sheet, write a product description, and so on. Right?

如下,我只是......我只是把它粘贴进去,所以我在这里的提示说你的任务是帮助营销团队根据技术说明书创建零售网站或产品的描述,写一个产品描述等。正确的?

So this is my first attempt to explain the task to the large-language model. So let me hit shift enter, and this takes a few seconds to run, and we get this result.

所以这是我第一次尝试向大语言模型解释任务。所以让我按 shift enter,这需要几秒钟的时间来运行,我们得到了这个结果。

It looks like it's done a nice job writing a description, introducing a stunning mid-century inspired office chair,  perfect  edition, and so on, but when I look at this, I go, boy, this is really long. It's done a nice job doing exactly what I asked it to, which is start from the technical fact sheet and write a product description. But when I look at this, I go, this is kind of long. Maybe we want it to be a little bit shorter. So I have had an idea. I wrote a prompt, got the result. I'm not that happy with it because it's too long, so I will then clarify my prompt and say use at most 50 words to try to give better guidance on the desired length of this, and let's run it again.

看起来它在写描述、介绍令人惊叹的中世纪办公椅、完美版等方面做得很好,但当我看到这个时,我走了,男孩,这真的很长。它完全按照我的要求做了很好的工作,即从技术说明书开始并编写产品说明。但是当我看到这个时,我走了,这有点长。也许我们希望它短一点。所以我有了一个主意。我写了一个提示,得到了结果。我对它不太满意,因为它太长了,所以我会澄清我的提示并说最多使用 50 个词来尝试就所需的长度提供更好的指导,让我们再次运行它。

Okay, this actually looks like a much nicer short description of the product, introducing a mid-century inspired office chair, and so on, five you just, yeah, both stylish and  practical  . Not bad.

好吧,这实际上看起来像是对产品的更好的简短描述,介绍了一款受中世纪启发的办公椅,等等,五个,是的,既时尚又实用。不错。

And let me double check the length that this is. So I'm going to take the response, split it according to where the space is, and then you'll print out the length. So it's 52 words. Actually not bad. Large language models are okay, but not that great at following instructions about a very  precise  word count, but this is actually not bad. Sometimes it will print out something with 60 or 65 and so on words, but it's kind of within reason.

让我仔细检查一下它的长度。所以我要获取响应,根据空格的位置将其拆分,然后您将打印出长度。所以这是52个字。其实还不错。大型语言模型还可以,但不太擅长遵循有关非常精确的字数统计的说明,但这实际上还不错。有时它会打印出 60 或 65 等字样的内容,但这是在合理范围内的。

Some of the things you Let me run that again. But these are different ways to tell the large-language model what's the length of the output that you want. So this is one, two, three. I count these sentences. Looks like I did a pretty good job.

有些东西你让我再运行一遍。但是这些是告诉大型语言模型你想要的输出长度是多少的不同方法。所以这是一、二、三。我数着这些句子。看起来我做得很好。

And then I've also seen people sometimes do things like, I don't know, use at most 280 characters. Large-language models, because of the way they  interpret  text, using something called a tokenizer, which I won't talk about. But they tend to be so-so at counting characters. But let's see, 281 characters. It's actually surprisingly close. Usually a large-language model doesn't get it quite this close. But these are different ways they can play with to try to control the length of the output that you get.

然后我还看到人们有时会做一些事情,比如,我不知道,最多使用 280 个字符。大型语言模型,因为它们解释文本的方式,使用一种叫做分词器的东西,我不会谈论它。但他们在数字符方面往往马马虎虎。但是让我们看看,281 个字符。它实际上非常接近。通常,大型语言模型不会如此接近。但是他们可以使用这些不同的方式来尝试控制您获得的输出的长度。

But then just switch it back to use at most 50 words. And that's that result that we had just now. As we continue to  refine  this text for our website, we might decide that, boy, this website isn't selling  direct  to consumers, it's actually intended to sell furniture to furniture retailers that would be more interested in the technical details of the chair and the materials of the chair. In that case, you can take this prompt and say, I want to modify this prompt to get it to be more  precise  about the technical details.

但随后只需将其切换回最多使用 50 个单词。这就是我们刚才的结果。随着我们继续为我们的网站完善这段文字,我们可能会决定,天啊,这个网站不是直接向消费者销售,它实际上是为了向家具零售商销售家具,他们会对椅子的技术细节更感兴趣,并且椅子的材料。在这种情况下,您可以接受此提示并说,我想修改此提示以使其更准确地了解技术细节。

So let me keep on modifying this prompt. And I'm going to say, this description is intended for furniture retailers, so it should be technical and focus on materials, products and constructs it from. Well, let's run that. And let's see. Not bad. It says, coated aluminum base and pneumatic chair. High-quality materials. So by changing the prompt, you can get it to focus more on specific characters, on specific characteristics you want it to.

所以让我继续修改这个提示。我要说的是,这个描述是为家具零售商准备的,所以它应该是技术性的,并且专注于材料、产品和构造它的来源。好吧,让我们运行它。让我们看看。不错。它说,涂层铝制底座和气动椅。优质材料。所以通过改变提示,你可以让它更专注于特定的字符,你想要的特定特征。

And when I look at this, I might decide, hmm, at the end of the description, I also wanted to include the product ID. So the two offerings of this chair, SWC 110, SOC 100. So maybe I can further improve this prompt. And to get it to give me the product IDs, I can add this instruction at the end of the description, include every 7 character product ID in the technical specification.

当我看到这个时,我可能会决定,嗯,在描述的最后,我也想包括产品 ID。所以这把椅子的两个产品,SWC 110,SOC 100。所以也许我可以进一步改进这个提示。为了让它给我产品 ID,我可以在描述的末尾添加这条指令,在技术规范中包含每个 7 个字符的产品 ID。

And let's run it and see what happens. And so it says, introduce you to our mid-century inspired office chair, shell colors, talks about plastic coating aluminum base,  practical  , some options, talks about the two product IDs. So this looks pretty good.

让我们运行它,看看会发生什么。所以它说,向您介绍我们中世纪风格的办公椅、外壳颜色、塑料涂层铝底座、实用性、一些选项、两个产品 ID。所以这看起来很不错。

And what you've just seen is a short example of the iterative prompt development that many developers will go through. And I think a guideline is, in the last video, you saw Yisa share a number of best practices.

您刚刚看到的是许多开发人员将经历的迭代提示开发的一个简短示例。我认为一个指导方针是,在上一个视频中,您看到 Yisa 分享了一些最佳实践。

And so what I usually do is keep best practices like that in mind, be clear and specific, and if necessary, give the model time to think.

所以我通常做的是牢记这样的最佳实践,清晰具体,如果有必要,给模型时间思考。

With those in mind, it's worthwhile to often take a first attempt at writing a prompt, see what happens, and then go from there to iteratively  refine  the prompt to get closer and closer to the result that you need.

考虑到这些,值得经常第一次尝试编写提示,看看会发生什么,然后从那里开始反复优化提示,以越来越接近您需要的结果。

And so a lot of the successful prompts that you may see used in various programs was arrived at an iterative process like this.

因此,您可能在各种程序中看到的许多成功提示都是通过这样的迭代过程得出的。

Just for fun, let me show you an example of an even more complex prompt that might give you a sense of what ChatGPT can do, which is I've just added a few extra instructions here. After description, include a table that gives the product dimensions, and then you'll format everything as HTML.

只是为了好玩,让我向您展示一个更复杂的提示示例,它可能会让您了解 ChatGPT 可以做什么,我刚刚在这里添加了一些额外的说明。在描述之后,包括一个给出产品尺寸的表格,然后您将所有内容格式化为 HTML。

So let's run that. And in practice, you would end up with a prompt like this, really only after multiple iterations. I don't think I know anyone that would write this exact prompt the first time they were trying to get the system to process a fact sheet. And so this actually outputs a  bunch  of HTML.

所以让我们运行它。在实践中,你最终会得到这样的提示,实际上只是在多次迭代之后。我想我认识的任何人都不会在他们第一次尝试让系统处理情况说明书时写下这个确切的提示。所以这实际上输出了一堆 HTML。

Let's display the HTML to see if this is even valid HTML and see if this works. And I don't actually know it's going to work, but let's see.

让我们显示 HTML,看看这是否是有效的 HTML,看看它是否有效。我实际上并不知道它会起作用,但让我们看看。

Oh, cool. All right. Looks like a rendit. So it has this really nice looking description of a chair. Construction, materials, product dimensions. Oh, it looks like I left out the use at most 50 words instruction, so this is a little bit long, but if you want that, you can even feel free to pause the video, tell it to be more succinct and regenerate this and see what results you get.

哦,酷。好的。看起来像一个 rendit。所以它有一个非常漂亮的椅子描述。结构、材料、产品尺寸。哦,好像我遗漏了最多使用 50 个字的说明,所以这有点长,但如果你想要那个,你甚至可以随意暂停视频,让它更简洁并重新生成这个和看看你得到什么结果。

So I hope you take away from this video that prompt development is an iterative process. Try something, see how it does not yet,  fulfill  exactly what you want, and then think about how to clarify your instructions, or in some cases, think about how to give it more space to think, to get it closer to delivering the results that you want.

因此,我希望您从这段视频中了解到提示开发是一个迭代过程。尝试一些东西,看看它如何还没有,完全实现你想要的,然后思考如何明确你的指令,或者在某些情况下,思考如何给它更多的思考空间,让它更接近交付结果你要的那个。

And I think the key to being an effective prompt engineer isn't so much about knowing the  perfect  prompt, it's about having a good process to develop prompts that are effective for your application.

而且我认为成为一名有效的提示工程师的关键不在于了解完美的提示,而在于拥有一个良好的流程来开发对您的应用程序有效的提示。

And in this video I  illustrated  developing a prompt using just one example.

在本视频中,我仅使用一个示例说明了如何开发提示。

For more sophisticated applications, sometimes you will have multiple examples, say a list of 10 or even 50 or 100 fact sheets, and iteratively develop a prompt and evaluate it against a large set of cases.

对于更复杂的应用程序,有时您会有多个示例,例如 10 个甚至 50 个或 100 个事实表的列表,并迭代地开发提示并根据大量案例对其进行评估。

But for the early development of most applications, I see many people developing it sort of the way I am with just one example, but then for more  mature  applications, sometimes it could be useful to evaluate prompts against a larger set of examples, such as to test different prompts on dozens of fact sheets to see how this average or worst case performance is on multiple fact sheets.

但是对于大多数应用程序的早期开发,我看到很多人只用一个示例就按照我的方式进行开发,但是对于更成熟的应用程序,有时根据大量示例评估提示可能会很有用,例如在数十个情况说明书上测试不同的提示,以了解这种平均或最差情况下的性能在多个情况说明书上的表现。

But usually you end up doing that only when an application is more  mature  and you have to have those metrics to drive that incremental last few steps of prompt improvement.

但通常只有当应用程序更成熟时,您才会这样做,并且您必须有这些指标来推动最后几个增量步骤的迅速改进。

So with that, please do play with the Jupyter code notebook examples and try out different  variations  and see what results you get. And when you're done, let's go on to the next video where we'll talk about one very common use of large language models in software applications, which is to summarize text.

因此,请尝试使用 Jupyter 代码笔记本示例并尝试不同的变体,看看您会得到什么结果。完成后,让我们继续下一个视频,我们将讨论大型语言模型在软件应用程序中的一个非常常见的用途,即总结文本。


Copyright © 2025 aigcdaily.cn  北京智识时代科技有限公司  版权所有  京ICP备2023006237号-1