Our approach to AI safety
Ensuring that AI systems are built, deployed, and used safely is critical to our mission.
OpenAI is committed to keeping powerful AI safe and broadly beneficial. We know our AI tools provide many benefits to people today. Our users around the world have told us that ChatGPT helps to increase their productivity, enhance their creativity, and offer tailored learning experiences. We also recognize that, like any technology, these tools come with real risks—so we work to ensure safety is built into our system at all levels.
Building increasingly safe AI systems
Prior to releasing any new system we conduct rigorous testing, engage external experts for feedback, work to improve the model's behavior with techniques like reinforcement learning with human feedback, and build broad safety and monitoring systems.
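As a rough sketch of one ingredient of reinforcement learning with human feedback, the example below shows the pairwise preference loss commonly used to train a reward model on human comparisons of two completions. The scores here are invented placeholders; in a real pipeline a learned reward model produces them, and its output then drives a reinforcement learning step (for example, PPO) that nudges the language model toward responses labelers preferred.

```python
# Minimal sketch of the pairwise preference loss often used to train a
# reward model for reinforcement learning with human feedback (RLHF).
# The reward scores below are invented; a real system obtains them from
# a learned model that scores full prompt/completion pairs.
import numpy as np

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """-log(sigmoid(r_chosen - r_rejected)); small when the reward model
    already ranks the human-preferred completion higher."""
    return float(-np.log(1.0 / (1.0 + np.exp(-(reward_chosen - reward_rejected)))))

# Labelers preferred completion A over completion B for the same prompt.
print(preference_loss(1.8, 0.3))  # ~0.20: ranking agrees with the labelers
print(preference_loss(0.3, 1.8))  # ~1.70: ranking disagrees, so the loss is large
```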
For example, after our latest model, GPT-4, finished training, we spent more than 6 months working across the organization to make it safer and more aligned prior to releasing it publicly.
We believe that powerful AI systems should be subject to rigorous safety evaluations. Regulation is needed to ensure that such practices are adopted, and we actively engage with governments on the best form such regulation could take.
Learning from real-world use to improve safeguards
We work hard to prevent foreseeable risks before deployment; however, there is a limit to what we can learn in a lab. Despite extensive research and testing, we cannot predict all of the beneficial ways people will use our technology, nor all the ways people will abuse it. That’s why we believe that learning from real-world use is a critical component of creating and releasing increasingly safe AI systems over time.
We cautiously and gradually release new AI systems—with substantial safeguards in place—to a steadily broadening group of people and make continuous improvements based on the lessons we learn.
We make our most capable models available through our own services and through an API so developers can build this technology directly into their apps. This allows us to monitor for and take action on misuse, and continually build mitigations that respond to the real ways people misuse our systems—not just theories about what misuse might look like.
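For developers building on the API, a minimal integration can look like the sketch below. This is illustrative only: the model name, prompt, and parameters are placeholders, and the current API reference should be consulted for the exact request format.

```python
# Illustrative sketch of calling the chat completions endpoint over HTTPS.
# The model name and message content are placeholders, not recommendations.
import os
import requests

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4",
        "messages": [
            {"role": "user", "content": "Summarize our AI safety policy in one sentence."}
        ],
    },
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```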
Real-world use has also led us to develop increasingly nuanced policies against behavior that represents a genuine risk to people while still allowing for the many beneficial uses of our technology.
Crucially, we believe that society must have time to update and adjust to increasingly capable AI, and that everyone who is affected by this technology should have a significant say in how AI develops further. Iterative deployment has helped us bring various stakeholders into the conversation about the adoption of AI technology more effectively than if they hadn't had firsthand experience with these tools.
Protecting children
One critical focus of our safety efforts is protecting children. We require that people be 18 or older—or 13 or older with parental approval—to use our AI tools, and we are looking into verification options.
We do not permit our technology to be used to generate hateful, harassing, violent, or adult content, among other categories. Our latest model, GPT-4, is 82% less likely to respond to requests for disallowed content compared to GPT-3.5, and we have established a robust system to monitor for abuse. GPT-4 is now available to ChatGPT Plus subscribers and we hope to make it available to even more people over time.
We have made significant effort to minimize the potential for our models to generate content that harms children. For example, when users try to upload Child Sexual Abuse Material to our image tools, we block and report it to the National Center for Missing and Exploited Children.
In addition to our default safety guardrails, we work with developers like the non-profit Khan Academy—which has built an AI-powered assistant that functions as both a virtual tutor for students and a classroom assistant for teachers—on tailored safety mitigations for their use case. We are also working on features that will allow developers to set stricter standards for model outputs to better support developers and users who want such functionality.
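As one hedged illustration of how a developer can already layer an extra check on top of the default guardrails, the sketch below screens user-supplied text with the moderation endpoint before acting on it. The stricter output controls mentioned above are separate, forthcoming features, and the exact request and response shapes should be confirmed against the current API reference.

```python
# Illustrative sketch: screen user-supplied text with the moderation
# endpoint before passing it to a model or displaying a response.
import os
import requests

def is_flagged(text: str) -> bool:
    """Return True if the moderation endpoint flags the text."""
    resp = requests.post(
        "https://api.openai.com/v1/moderations",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={"input": text},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["results"][0]["flagged"]

if is_flagged("some user-submitted text"):
    print("Request blocked by the application's safety policy.")
```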
Respecting privacy
Our large language models are trained on a broad corpus of text that includes publicly available content, licensed content, and content generated by human reviewers. We don’t use data for selling our services, advertising, or building profiles of people—we use data to make our models more helpful for people. ChatGPT, for instance, improves by further training on the conversations people have with it.
While some of our training data includes personal information that is available on the public internet, we want our models to learn about the world, not private individuals. So we work to remove personal information from the training dataset where feasible, fine-tune models to reject requests for personal information of private individuals, and respond to requests from individuals to delete their personal information from our systems. These steps minimize the possibility that our models might generate responses that include the personal information of private individuals.
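As a simplified illustration of what removing personal information from text can involve (not a description of OpenAI's actual pipeline), the sketch below masks two common categories of personal data with regular expressions; production systems rely on far more robust detection.

```python
# Simplified, illustrative sketch of scrubbing obvious personal information
# from text before it enters a training corpus. Real pipelines use much more
# sophisticated detection than these two patterns.
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\b(?:\+?\d{1,2}[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b")

def scrub_personal_info(text: str) -> str:
    """Replace email addresses and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text

print(scrub_personal_info("Contact Jane at jane.doe@example.com or 555-867-5309."))
# -> "Contact Jane at [EMAIL] or [PHONE]."
```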
Improving factual accuracy
Today’s large language models predict the next series of words based on patterns they have previously seen, including the text input the user provides. In some cases, the next most likely words may not be factually accurate.
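To make that concrete, the toy sketch below turns invented model scores (logits) over a four-word vocabulary into next-word probabilities and picks the most likely continuation; with these made-up numbers, a plausible-looking but incorrect year comes out on top.

```python
# Toy illustration of next-token prediction for the prompt
# "Einstein was born in ...". With these invented logits the model would
# output 1889 even though the correct year is 1879, showing that the most
# statistically likely continuation is not necessarily factually accurate.
import numpy as np

vocabulary = ["1889", "1879", "1901", "unknown"]
logits = np.array([2.4, 2.1, 0.3, -1.0])  # invented scores, not from any real model

probabilities = np.exp(logits) / np.exp(logits).sum()  # softmax over the vocabulary
for word, p in zip(vocabulary, probabilities):
    print(f"{word}: {p:.2f}")

print("predicted next word:", vocabulary[int(np.argmax(probabilities))])
```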
Improving factual accuracy is a significant focus for OpenAI and many other AI developers, and we’re making progress. User feedback on ChatGPT outputs that were flagged as incorrect has served as a main source of data for improving the factual accuracy of GPT-4, which is 40% more likely to produce factual content than GPT-3.5.
When users sign up to use the tool, we strive to be as transparent as possible that ChatGPT may not always be accurate. However, we recognize that there is much more work to do to further reduce the likelihood of hallucinations and to educate the public on the current limitations of these AI tools.
Continued research and engagement
We believe that a practical approach to solving AI safety concerns is to dedicate more time and resources to researching effective mitigations and alignment techniques and testing them against real-world abuse.
Importantly, we also believe that improving AI safety and capabilities should go hand in hand. Our best safety work to date has come from working with our most capable models because they are better at following users’ instructions and easier to steer or “guide.”
We will be increasingly cautious with the creation and deployment of more capable models, and will continue to enhance safety precautions as our AI systems evolve.
While we waited over 6 months to deploy GPT-4 in order to better understand its capabilities, benefits, and risks, it may sometimes be necessary to take longer than that to improve AI systems' safety. Therefore, policymakers and AI providers will need to ensure that AI development and deployment is governed effectively at a global scale, so no one cuts corners to get ahead. This is a daunting challenge requiring both technical and institutional innovation, but it’s one that we are eager to contribute to.
Addressing safety issues also requires extensive debate, experimentation, and engagement, including on the bounds of AI system behavior. We have and will continue to foster collaboration and open dialogue among stakeholders to create a safe AI ecosystem.