Sam Altman says the research strategy that birthed ChatGPT is played out and future strides in artificial intelligence will require new ideas.

THE STUNNING CAPABILITIES of ChatGPT, the chatbot from startup OpenAI, have triggered a surge of new interest and investment in artificial intelligence. But late last week, OpenAI’s CEO warned that the research strategy that birthed the bot is played out. It’s unclear exactly where future advances will come from.

In recent years, OpenAI has delivered a series of impressive advances in AI that works with language by taking existing machine-learning algorithms and scaling them up to previously unimagined size. GPT-4, the latest of those projects, was likely trained using trillions of words of text and many thousands of powerful computer chips. The process cost over $100 million.

But the company’s CEO, Sam Altman, says further progress will not come from making models bigger. “I think we're at the end of the era where it's going to be these, like, giant, giant models,” he told an audience at an event held at MIT late last week. “We'll make them better in other ways.”

Altman’s declaration suggests an unexpected twist in the race to develop and deploy new AI algorithms. Since OpenAI launched ChatGPT in November, Microsoft has used the underlying technology to add a chatbot to its Bing search engine, and Google has launched a rival chatbot called Bard. Many people have rushed to experiment with using the new breed of chatbot to help with work or personal tasks.

Meanwhile, numerous well-funded startups, including Anthropic, AI21, Cohere, and Character.AI, are throwing enormous resources into building ever larger algorithms in an effort to catch up with OpenAI’s technology. The initial version of ChatGPT was based on a slightly upgraded version of GPT-3, but users can now also access a version powered by the more capable GPT-4.

Altman’s statement suggests that GPT-4 could be the last major advance to emerge from OpenAI’s strategy of making the models bigger and feeding them more data. He did not say what kind of research strategies or techniques might take its place. In the paper describing GPT-4, OpenAI says its estimates suggest diminishing returns on scaling up model size. Altman said there are also physical limits to how many data centers the company can build and how quickly it can build them.

Nick Frosst, a cofounder at Cohere who previously worked on AI at Google, says Altman’s feeling that going bigger will not work indefinitely rings true. He, too, believes that progress on transformers, the type of machine learning model at the heart of GPT-4 and its rivals, lies beyond scaling. “There are lots of ways of making transformers way, way better and more useful, and lots of them don’t involve adding parameters to the model,” he says. Frosst says that new AI model designs, or architectures, and further tuning based on human feedback are promising directions that many researchers are already exploring.

Each version of OpenAI’s influential family of language algorithms consists of an artificial neural network, software loosely inspired by the way neurons work together, which is trained to predict the words that should follow a given string of text.
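
Concretely, the objective is next-word prediction: for every position in a training text, the network scores each word in its vocabulary as the likely continuation, and its weights are nudged toward the word that actually came next. Below is a minimal sketch of that objective, assuming PyTorch; the toy vocabulary, single transformer layer, and sizes are invented for illustration and bear no resemblance to OpenAI’s actual architecture or scale.

```python
import torch
import torch.nn as nn

vocab = ["the", "cat", "sat", "on", "a", "mat"]
stoi = {w: i for i, w in enumerate(vocab)}

class TinyLM(nn.Module):
    def __init__(self, dim=32, heads=4):
        super().__init__()
        self.embed = nn.Embedding(len(vocab), dim)
        self.layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, len(vocab))  # one score per candidate next word

    def forward(self, tokens):
        x = self.embed(tokens)
        # Causal mask: each position may attend only to itself and earlier positions.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        return self.head(self.layer(x, src_mask=mask))

model = TinyLM()
seq = torch.tensor([[stoi[w] for w in ["the", "cat", "sat", "on", "a", "mat"]]])
inputs, targets = seq[:, :-1], seq[:, 1:]  # shift by one: predict each next word
logits = model(inputs)  # (batch, seq, vocab)
loss = nn.functional.cross_entropy(logits.reshape(-1, len(vocab)), targets.reshape(-1))
print(f"cross-entropy against the true next words: {loss.item():.3f}")
```

GPT-style training is, at heart, this same loss computed over vastly more text and vastly more parameters.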

GPT-2, announced in 2019, was an early model in this line (the original GPT preceded it in 2018). In its largest form, GPT-2 had 1.5 billion parameters, a measure of the number of adjustable connections between its crude artificial neurons.
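
A parameter is a single trainable weight, and headline counts like 1.5 billion are simply the sum across every layer. Here is a self-contained illustration, assuming PyTorch; the two-layer toy below borrows GPT-2’s vocabulary size (50,257) and smallest embedding width (768) only to make the arithmetic concrete, and is not GPT-2’s architecture.

```python
import torch.nn as nn

# A toy two-layer stack: an embedding table plus an output projection.
tiny = nn.Sequential(nn.Embedding(50257, 768), nn.Linear(768, 50257))

# Count every trainable weight, the same way figures like "1.5 billion"
# are derived for the full-scale models.
n_params = sum(p.numel() for p in tiny.parameters() if p.requires_grad)
print(f"{n_params:,} trainable parameters")  # about 77 million for this toy
```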

At the time, that was extremely large compared to previous systems, thanks in part to OpenAI researchers finding that scaling up made the model more coherent. And the company made GPT-2’s successor, GPT-3, still bigger, with a whopping 175 billion parameters. That system’s broad abilities to generate poems, emails, and other text helped convince other companies and research institutions to push their own AI models to similar and even greater size.

After ChatGPT debuted in November, meme makers and tech pundits speculated that GPT-4, when it arrived, would be a model of vertigo-inducing size and complexity. Yet when OpenAI finally announced the new artificial intelligence model, the company didn’t disclose how big it is—perhaps because size is no longer all that matters. At the MIT event, Altman was asked if training GPT-4 cost $100 million; he replied, “It’s more than that.”

ChatGPT在11月首次亮相后,表情包制作者和科技評論家推測,GPT-4一旦到來,將是一個(gè)令人眩暈的大小和復(fù)雜度的模型。然而,當(dāng)OpenAI最終宣布了新的人工智能模型時(shí),公司并沒有透露它有多大——也許是因?yàn)榇笮〔辉倌敲粗匾?。在麻省理工學(xué)院的活動(dòng)中,奧特曼被問及訓(xùn)練GPT-4是否花費(fèi)了1億美元;他回答說,“超過那個(gè)數(shù)目。”

Although OpenAI is keeping GPT-4’s size and inner workings secret, it is likely that some of its intelligence already comes from looking beyond just scale. One possibility is that it used a method called reinforcement learning from human feedback, which was used to enhance ChatGPT. It involves having humans judge the quality of the model’s answers to steer it toward providing responses more likely to be judged as high quality.
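
Below is a minimal sketch of the preference-comparison step at the core of that method, assuming PyTorch; `reward_model`, the random feature tensors, and all sizes are invented stand-ins, not OpenAI’s implementation. A reward model is trained so that answers human raters preferred score higher than answers they rejected, and that learned score is then used to steer the language model toward responses likely to be judged high quality.

```python
import torch
import torch.nn as nn

# Stand-in reward model: maps a response's features to a scalar score.
reward_model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Stand-in features for (preferred, rejected) answer pairs from human raters.
preferred, rejected = torch.randn(8, 16), torch.randn(8, 16)

# Bradley-Terry-style preference loss: push each preferred answer's score
# above its rejected counterpart's.
loss = -nn.functional.logsigmoid(
    reward_model(preferred) - reward_model(rejected)
).mean()
opt.zero_grad()
loss.backward()
opt.step()
```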

The remarkable capabilities of GPT-4 have stunned some experts and sparked debate over the potential for AI to transform the economy but also spread disinformation and eliminate jobs. Some AI experts, tech entrepreneurs including Elon Musk, and scientists recently wrote an open letter calling for a six-month pause on the development of anything more powerful than GPT-4.

At MIT last week, Altman confirmed that his company is not currently developing GPT-5. “An earlier version of the letter claimed OpenAI is training GPT-5 right now,” he said. “We are not, and won't for some time.”
