A16z: A Simple Explanation of Luma's Hundred-Trillion-Token Video Model; Summer Reading List

Digest   2024-07-31 23:59   Zhejiang

Article 1: Beyond Language: Inside Luma's Hundred-Trillion-Token Video Model

Original link: https://a16z.com/podcast/beyond-language-inside-a-hundred-trillion-token-video-model/

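Luma is an AI company whose core products include FieldsEditor, Imagine3D, and Luma Unreal Engine, which together offer an end-to-end solution for 3D capture, modeling, and rendering. Particularly noteworthy is Genie, which it also developed.

Luma AI also has an advanced AI video generation model called Dream Machine, which can quickly produce high-quality, highly realistic video.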

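Luma AI was co-founded by a group of founders from Apple and Berkeley. It focuses on AI technology, especially applications of NeRF (neural radiance fields).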

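Luma AI has also done well with investors, raising a $20 million Series A round from well-known firms including Amplify Partners, NVIDIA, and General Catalyst. Its use cases are broad, ranging from real estate showcases and travel sightseeing to virtual open days and personal creative work.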

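In this episode of the AI + a16z podcast, Luma chief scientist Jiaming Song joins a16z general partner Anjney Midha to discuss Song's distinguished career in video models, which has culminated in Luma's recently released Dream Machine 3D video model, a model that shows an ability to reason about the world along multiple dimensions.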

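Song recounts the history of image and video models, shares his vision for the future of multimodal models, and explains why he believes Dream Machine demonstrates emergent reasoning capabilities.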

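In short: because it was trained on a huge amount of high-quality video data which, measured in terms of language data, would amount to hundreds of trillions of tokens.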

Song explains one important reason why Dream Machine can do these things with context-rich video data:

“For a lot of the problems related to artificial intelligence, it is often more productive in the long run to use methods that are simpler but use more compute, [rather] than trying to develop priors and then trying to leverage the priors so that you can use less compute.

“Cases in this question first happened in language, where people were initially working on language understanding, trying to use grammar or semantic parsing, these kinds of techniques. But eventually these tasks began to be replaced by large language models. And a similar case is happening in the vision domain, as well . . . and now people have been using deep learning features for almost all the tasks. This is a clear demonstration of how using more compute and having less priors is good.

“But how does it work with language? Language by itself is also a human construct. Of course, it is a very good and highly compressed kind of knowledge, but it’s definitely a lot less data than what humans take in day to day from the real world . . .

“[And] it is a vastly smaller data set size than visual signals. And we are already almost exhausting the . . . high-quality language sources that we have in the world. The speed at which humans can produce language is definitely not enough to keep up with the demands of the scaling laws. So even if we have a world where we can scale up the compute infrastructure for that, we don’t really have the infrastructure to scale up the data efforts . . .

“Even though people would argue that the emergence of large language models is already evidence of the scaling law . . . against the rule-based methods in language understanding, we are arguing that language by itself is also a prior in the face of more of the richer data signal that is happening in the physical world.”

Parts of this article were translated with the help of ChatGPT and Kimi.

Article 2: A16z 2024 Book Recommendations

Original link: https://a16zcrypto.com/posts/article/book-recommendation-reading-list-summer-2024/

The a16z crypto team recommended a varied reading list, covering topics that range from free-market economics to dark academia fantasy.

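1. Unreasonable Hospitality and What I Talk About When I Talk About Running are repeat recommendations, reflecting a16z's long-standing taste.

2. Babel is a moving novel that combines fantasy and revolution.

3. A Memory of Light is the final volume of the Wheel of Time series, with an epic ending and real emotional impact.

4. The Cat Who Saved Books is a heartwarming read about the love of books.

5. Broken Money explores the potential of decentralized digital currency to fix problems in the monetary system. The title is well chosen.

6. Payments Systems in the U.S. is a professional guide to understanding the complexity of payment systems.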

Repeated Recommendations

1. Unreasonable Hospitality by Will Guidara

   - Lessons from the restaurant industry that apply to many fields.

2. What I Talk About When I Talk About Running by Haruki Murakami

   - A memoir about running and its deeper meanings.

3. Various works by James C. Scott

   - Insights on state formation and social planning.

New Recommendations

From Brittney Burrows (Events Team):

1. Babel by R.F. Kuang

   - A beautifully written novel combining elements of fantasy and a heartbreaking story of revolution.

2. A Memory of Light by Robert Jordan and Brandon Sanderson

   - The epic conclusion to the Wheel of Time series.

3. The Cat Who Saved Books by Sosuke Natsukawa

   - A warm and touching read about the love of books.

From Michael Blau (Deal Team):

1. Broken Money by Lyn Alden

   - Insights into the history of money and how decentralized digital currency can solve monetary system issues.

2. Payments Systems in the U.S. by Carol Coye Benson, Scott Loftesness, and Russ Jones

   - A detailed guide on payment systems.

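B Impact出海 (B Impact Going Global)
Reader contact: jiangxiangyuting2. Covers frontier technology and business; four years as an entrepreneur; former deputy technology director and reporter at the Chinese edition of Bloomberg Businessweek.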