Chinese translation: Liu Jinlong (Fortune China)
Proofreading: Wang Hao
Amidst the frenzy that is the generative AI market, major players are fiercely vying for the shiniest product. For its part, Google, traditionally a more measured participant in this race, unveiled a teaser video for its Gemini large language model this week. However, things took a controversial turn when reports revealed the video was not actually a real-time representation of the AI in action.
In the demo video released by Google, the AI model shows off its multimodal capabilities, appearing to deftly decipher and handle information gleaned from live video and audio. It’s a formidable achievement for Google, particularly in the fierce competition against the likes of OpenAI, where it has lagged behind. However, as reported by Bloomberg, the demo was crafted by “using still image frames from the footage, and prompting via text,” rather than through the real-time voice and video processing it seemed to achieve.
On stage at Fortune’s Brainstorm AI conference in San Francisco on Monday, Sissie Hsiao, vice president and general manager of Google Assistant and Bard, spoke about the contentious demo video, focusing on the benchmarks Gemini reached as a model and how it will propel Google’s chatbot Bard.
“The video is completely real. All the prompts and the model responses are real,” Hsiao said. “We did shorten parts for brevity, which we put in the video as information on making the video,” she noted.
The demo video displays the new AI model’s multimodal capabilities: it identifies a squiggly line, then the curves of newly added lines, and finally the finished drawing of a duck. Throughout this process, the model consistently recognizes each element, offering duck-related facts and answers in real time.
Hsiao highlighted the milestones reached by Gemini, showcasing its results on benchmarks that put AI models to the test, spanning high school physics, professional legal quandaries, and moral scenarios. According to The Verge, Gemini Ultra beat OpenAI’s GPT-4 in 30 out of 32 benchmarks, an achievement worth boasting about, although Gemini Ultra will not be released until next year. For now, Bard uses the less advanced Gemini Pro, which is roughly akin to GPT-3.5.
Hsiao said these Gemini models will continue to improve Google Search as well as the Bard chatbot, which she said is “the most preferred free chat bot now in the market.”