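On 9 October 2024, the Nobel Prize in Chemistry was awarded to John Jumper, Demis Hassabis and David Baker, recognising the revolutionary impact of AI technologies, typified by AlphaFold, on protein structure prediction and computational protein design. AlphaFold 2's limitation was that it could predict only individual proteins and could not address the complex interactions between proteins and other biomolecules. AlphaFold 3 was introduced for exactly this purpose: it retains its predecessor's accuracy and can also model the complex interactions between proteins and other biomolecules such as DNA, RNA and modified molecules, further advancing structural biology. At present, AlphaFold 3 is not yet fully open source, and its full functionality is largely limited to researchers in specific fields.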
Bilingual transcript (English original, each turn followed by its Chinese translation):
Nick Petrić Howe
AlphaFold and its successor AlphaFold 2 have been game changers in biology, as the AIs have made it easier than ever to predict the structures of proteins, the molecules that make up so much of life. And now there's a new version, AlphaFold 3, which promises even greater predictive abilities. To dive into the details of this upgrade, I'm joined by Ewen Callaway, who has been writing about AlphaFold, well, basically since the beginning, for years now, and has been working on a new story about the latest iteration, AlphaFold 3. Ewen, hi, how's it going?
AlphaFold及其继任者AlphaFold 2在生物学领域掀起了革命性的变化,AI使预测蛋白质结构——构成生命的重要分子——变得前所未有的容易。现在,全新的AlphaFold 3版本带来了更强的预测能力。为了深入探讨此次升级的细节,我邀请了Ewen Callaway,他几乎从一开始就一直在撰写关于AlphaFold的报道,已经多年了,并且最近正在撰写关于最新版本AlphaFold 3的文章。Ewen,你好,近况如何?
Ewen Callaway
It's good. Yeah, I like to see AlphaFold 3 as the “Return of the Jedi” of AlphaFold, kind of completing the trilogy. Or “The Godfather III”. I don’t know which was deemed better.
很好。我觉得AlphaFold 3有点像《绝地归来》那样,完成了AlphaFold的三部曲。或者像《教父3》,不过我不知道哪一个评价更好。
Nick Petrić Howe
Hopefully it's not Godfather III because that one was not the best one in my opinion. But let's start off by talking about what this version can do. So, what is it that AlphaFold 3 promises that maybe the other versions lacked?
希望它不像《教父3》,因为在我看来那一部不算最好的。让我们从这个版本的功能开始谈起吧。AlphaFold 3承诺带来哪些其他版本可能缺乏的功能呢?
Ewen Callaway
Yeah. So the real revolution was AlphaFold 2. Mark one, it did a lot and it won this competition of protein prediction, but AlphaFold 2 was the game changer. And what it did was, you know, you'd input a sequence of amino acids, which are the building blocks of proteins, and it would give you a pretty darn good prediction of what that protein looked like in three dimensions. The only problem (I wouldn't say a problem) was that it just predicted protein structures; it didn't predict proteins alongside all the other, you know, players of the cellular ecosystem. That was just not part of AlphaFold 2’s language, part of its abilities. And so AlphaFold 3, this latest update, is exactly that: it brings in the rest of the ecosystem, all these other players, and, you know, predicts proteins alongside them.
是的,真正的革命性进展出现在AlphaFold 2。虽然第一代版本做了很多工作,并赢得了蛋白质预测的竞赛,但AlphaFold 2才是真正的游戏规则改变者。它的工作原理是,输入一段氨基酸序列,也就是蛋白质的构建模块,它就可以非常精确地预测出蛋白质的三维结构。唯一的问题——我不认为这是个问题——就是它只预测了蛋白质结构,而没有预测与之共存的细胞生态系统中的其他“角色”。这并不是AlphaFold 2的设计语言或能力。而AlphaFold 3,这一最新的更新,正是弥补了这一点,它将整个生态系统中的其他“角色”引入,并能同时预测这些蛋白质与它们之间的相互作用。
Nick Petrić Howe
So, when you say like these other players, this is things like DNA, RNA…
那么,当你提到这些“其他角色”时,是指像DNA、RNA之类的分子吗?
Ewen Callaway
Yeah, all of these things. So say you've got a protein that's involved in copying our DNA, which is one of the most basic features of life. Well, that protein needs to attach to DNA, and AlphaFold 3 can do that. So you've got proteins that are helping to turn, you know, DNA into proteins via an RNA intermediate, and you've got some proteins that recognise RNA. Well, AlphaFold 3 can do that. There are lots of modifications; you know, you plonk on something called a phosphate group. It's called phosphorylation, and that activates many proteins, and they propagate signals throughout cells. And so it's really kind of embedding proteins in their ecosystem, in their environment. You know, to understand what they do in all these complex roles, you really need to know about these other players.
是的,所有这些分子。比如你有一个参与DNA复制的蛋白质,而DNA复制是生命最基本的功能之一,这个蛋白质需要与DNA结合,而AlphaFold 3可以模拟这种结合。还有一些帮助将DNA通过RNA中间体转化为蛋白质的蛋白质,AlphaFold 3也能预测它们的相互作用。有许多修饰,比如你会添加一个叫磷酸基团的物质,这个过程叫做磷酸化,它能够激活许多蛋白质,并在细胞内传播信号。因此,AlphaFold 3实际上是将蛋白质嵌入它们的生态系统中,结合它们所承担的复杂功能,了解这些其他“角色”的存在是非常重要的。
Nick Petrić Howe
And I mean, given how many different kinds of molecules and how many different proteins there are, like, it seems like this would be a huge challenge to actually bring these things together in all the possible, you know, iterations they could be. How have they managed it?
我想考虑到有这么多种不同的分子和不同的蛋白质,似乎将这些东西以各种可能的方式结合起来会是一个巨大的挑战。他们是如何实现这一点的呢?
Ewen Callaway
I mean, it's a very sophisticated neural network. But I think the principle that, you know, I understood from talking with John Jumper, who led the development of AlphaFold 3, was that for all this information, you know, all these modifications, all these accessory molecules, there are experimental examples, real-world examples of them with their protein partners, sitting in this database called the Protein Data Bank that AlphaFold 1, 2 and now 3 were trained on. And so you've got lots of examples, lots of good data for a machine-learning model and artificial intelligence to learn from. And you know, with a bunch of bells and whistles that I won't bore you with, like transformers and diffusion and embeddings, they've, you know, created a model that can represent all this additional data, not just the protein sequence, but all these other atoms that are sitting there in this database.
这是一个非常复杂的神经网络。但我认为,我从与John Jumper(他领导了AlphaFold 3的开发)交谈中理解到的原则是,所有这些信息、所有这些修饰、所有这些辅助分子,它们都有实验实例,这些实例与它们的蛋白质伙伴一起存储在一个叫做蛋白质数据库的地方,AlphaFold 1、2以及现在的AlphaFold 3都是在这个数据库上进行训练的。因此,你有很多实例,很多优质的数据,供机器学习模型和人工智能学习。再加上一些我不会让你感到无聊的技术细节,比如变压器、扩散和嵌入等,他们创造了一个模型,能够代表所有这些额外的数据,不仅是蛋白质序列,还有数据库中存在的其他原子。
Nick Petrić Howe
And so this has been developed, as you said, by Google DeepMind. Have other researchers got to try out this new tool?
正如你所说的,这是由Google DeepMind开发的。其他研究人员是否有机会尝试这个新工具?
Ewen Callaway
Yeah, it seems like a fair number of researchers have got a sneak preview of it. I actually spoke with somebody this morning who didn't get an official sneak preview, but reviewed the paper for me, and he was able to get onto the server. And people like it, you know, they say it's really fast, it's really convenient. And it lowers the barrier to entry, I think, compared to AlphaFold 2, where to use it you sometimes almost had to download your own version and run it, or run it on a server. This is, as far as I can tell, kind of a web form where you input your sequence, pick some boxes for the sorts of modifications that you want to see, and bada bing bada boom, 10 minutes later, you've got a prediction that can help you do some experiments. So the limited feedback I've gotten so far is that it's really helpful.
是的,看起来有相当多的研究人员已经提前体验了这个工具。今天早上我还跟一个人谈过,他并没有获得官方的抢先体验,但为我审阅了论文,并且能够进入服务器。人们很喜欢它,他们说它非常快,非常方便。我认为与AlphaFold 2相比,它降低了使用门槛。使用AlphaFold 2时,有时你几乎需要下载自己的版本运行,或者在服务器上运行。而AlphaFold 3,至少据我所知,它类似于一个网页表单,你输入序列,选择一些你想查看的修饰,然后“砰”的一下,10分钟后,你就得到了一个预测,能够帮助你进行实验。到目前为止,我收到的反馈是它非常有用。
Nick Petrić Howe
What are researchers hoping that this tool could be used for?
研究人员希望这个工具能够用于哪些方面呢?
Ewen Callaway
I think it's part of maybe getting a better approximation of how your protein of interest is doing its job, how it's playing its part in the cellular ballet, whatever we want to call it. So the one example, you know: I talked with a scientist who used it. He's at the Crick Institute, across from, you know, Nature HQ in London. He studies DNA replication, and a lot of the proteins that his lab is interested in directly bind DNA. And there'll be, you know, portions of these proteins that bind DNA. And so with the predictions that he got from AlphaFold 3, his lab started making mutations to their protein to try and alter, you know, how it bound DNA, and found that some of these really panned out, that the predictions were kind of spot on. So it gives him some insight into how this protein he's interested in does its job.
我认为,这可能是为了更好地近似你的目标蛋白质如何发挥其作用,如何在细胞的“芭蕾”中扮演角色,不论我们怎么称呼它。我举个例子,我与使用过它的科学家交谈过,他是一位在伦敦克里克研究所的科学家,他研究DNA复制,很多他实验室感兴趣的蛋白质直接与DNA结合。那些蛋白质中有一些部分会与DNA结合。因此,利用从AlphaFold 3获得的预测,他的实验室开始对蛋白质进行突变,试图改变它与DNA结合的方式,结果发现其中一些预测非常准确。这给了他对他感兴趣的蛋白质如何发挥作用的一些见解。
Nick Petrić Howe
So one thing about AlphaFold’s previous iterations that researchers were a bit sceptical of was its ability to help with drug discovery. Do you think this new tool will help bridge that gap?
关于AlphaFold的前几个版本,研究人员有些怀疑其在药物研发中的作用。你认为这个新工具能帮助弥合这个差距吗?
Ewen Callaway
AlphaFold has been, it’s been revolutionary, but I think, you know, with drug discovery, it's been maybe a mixed bag. There has been, you know, a lot of scepticism about whether it's really a game changer. There have been some studies suggesting that its structures can be useful for drug discovery, but drug discovery is a complex process with lots and lots of steps. And you know, one AI is not going to disrupt this. With AlphaFold 3, I think, you know, from the people I've talked with, because it can potentially predict how other molecules interact with proteins, it could potentially be very useful for drug discovery. And in fact, Google DeepMind has a spin-off called Isomorphic Labs and they're using AlphaFold 3 to do just that. The hitch is that, you know, the way that Google DeepMind is making AlphaFold 3 accessible to the scientific community, they're not going to allow researchers to straightforwardly look for how their protein of choice binds to a new drug; that's just not possible for people to do. You know, and that was a decision that Google DeepMind made. You know, they put a lot of resources into developing it, and they're going to reserve the commercial pursuits for their partners. But I did speak with, you know, some scientists who said that, you know, the paper that they published in Nature, you know, releases enough information about how this model was developed that within a year or so, other researchers can develop open-source versions that you can plug any potential drug into. So it could potentially have, you know, a really significant impact on drug discovery, not just for Google and its partners, but for the field as a whole. I think that remains to be seen. But you know, that's a possibility.
AlphaFold确实是一次革命性的突破,但在药物研发领域,情况可能有些复杂。许多人对其是否真正改变了游戏规则持怀疑态度。有一些研究表明,它的结构预测对药物研发是有帮助的,但药物研发是一个非常复杂的过程,包含很多步骤。一个AI系统无法颠覆这个过程。AlphaFold 3因为能够模拟其他分子与蛋白质的相互作用,可能在药物研发中非常有用。实际上,Google DeepMind有一个名为Isomorphic Labs的衍生公司,他们正在利用AlphaFold 3进行这方面的工作。问题在于,Google DeepMind让科学界使用AlphaFold 3的方式,不允许研究人员直接查找他们的目标蛋白与新药的结合方式,这对于大多数人来说是不可能的。这是Google DeepMind的一个决定,他们投入了大量资源来开发这个系统,并将商业追求保留给他们的合作伙伴。但我与一些科学家交谈时,他们说Google在《自然》发表的论文中提供了足够的信息,足以让其他研究人员在一年左右的时间内开发出开源版本,这样你就可以将任何潜在的药物输入系统进行预测。因此,它可能会对药物研发产生非常重要的影响,不仅对Google及其合作伙伴,对整个领域都是如此。我认为这一点尚待观察,但这是有可能的。
Nick Petrić Howe
Well, Ewen, thank you so much for joining me.
好的,Ewen,非常感谢你加入我们。
Ewen Callaway
Yeah, thank you.
是的,谢谢你。
Nick Petrić Howe
And listeners, for more on that, check out our show notes for a link to Ewen’s News article.
各位听众,如果想了解更多相关信息,请查看我们的节目说明,我们会附上Ewen新闻文章的链接。
Link: https://www.nature.com/articles/d41586-024-01385-x
Editor: 牛子璇
Layout: Bonbon
Proofreading: 陈玮洁
Review: 曹秋晨
Reprint notice:
This article is an original note by 枢界centrangle.
Please credit the source when reprinting.