This is the 577th article from 高翻考研.
Yesterday, after eating zongzi, I came across an interesting story while browsing the web.
A paper from the AI research nonprofit LAION has been attracting a lot of attention. I read through it carefully, and it's quite interesting.
Here are two paragraphs from the foreign-press coverage:
A fascinating new paper from scientists at the AI research nonprofit LAION finds that even the most sophisticated large language models (LLMs) are frequently stumped by the same simple logic question — a finding that the researchers believe casts doubt on whether frontier AI language models are quite as advanced as their creators often claim.
The paper, which has yet to be peer-reviewed, refers to the AI-stumping prompt as the "Alice in Wonderland" — or AIW — problem. It's a straightforward reasoning question: "Alice has [X] brothers and she also has [Y] sisters. How many sisters does Alice's brother have?" (The researchers used a few different versions of the problem, for example switching up the X and Y figures or altering the prompt language to include a few more demands, but the basic reasoning process required to solve the problem remained the same throughout.)
The report is from Futurism; interested readers can go take a look.
According to the report, the study found that even leading large language models (LLMs) frequently get a simple logic question wrong.
The question is dubbed the "Alice in Wonderland" (AIW) problem, namely:
Alice has [X] brothers and she also has [Y] sisters. How many sisters does Alice's brother have?
A quick Chinese translation:
爱丽丝有[X]个兄弟,她还有[Y]个姐妹。爱丽丝的兄弟有多少个姐妹?
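For reference, the correct answer is Y + 1: each of Alice's brothers has her Y sisters as his sisters, plus Alice herself, while the number of brothers X is a pure distractor. A minimal sketch of the arithmetic in Python (the function name and sample values are just illustrative):

def aiw_answer(x_brothers: int, y_sisters: int) -> int:
    # A brother of Alice has all of Alice's sisters as his sisters,
    # plus Alice herself; the brother count X is irrelevant.
    return y_sisters + 1

print(aiw_answer(3, 2))  # -> 3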
What the study reveals is rather surprising: frontier AI language models may not be as capable as their creators claim.
Out of curiosity, I tested the question myself. Just as the paper describes, when I ran it on ChatGPT (both GPT-4 and GPT-4o), the answers I got were wrong.
Telling it that it was wrong didn't help either; GPT-4o then produced an answer that was even further off.
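For anyone who wants to reproduce the test programmatically rather than in the chat window (my own tests above were done in the ChatGPT web interface), here is a minimal sketch using the official openai Python client; the model name and the X/Y values filled in are illustrative assumptions:

# Minimal sketch: posing the AIW problem to GPT-4o via the OpenAI API.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

prompt = ("Alice has 3 brothers and she also has 2 sisters. "
          "How many sisters does Alice's brother have?")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)  # the correct answer here would be 3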
I also tried Tencent's Yuanbao (腾讯元宝). Its answer was correct, and practically airtight.