A trio of viral videos allegedly depicting the actor Tom Cruise performing a magic trick, telling a not-so-funny joke, and practicing his golf swing are some of the most sophisticated examples yet seen of deepfakes, highly convincing fake videos created using A.I. technology, according to experts in the forensic analysis of digital images.
The three videos, which were posted last week on the social media platform TikTok from an account called @deeptomcruise, have collectively been viewed about 11 million times. The account has garnered more than 342,000 followers and 1 million likes from other users of the social media platform.
The person or people behind @deeptomcruise have not yet been definitively identified, but Cruise impersonator Evan Ferrante told the website Mic over the weekend that he believed the videos were the work of an actor named Miles Fisher, who resembles Cruise and has done impressions of him in the past. Several people on social media sites also said they believed Fisher is depicting Cruise in the videos, with his face modified using deepfake technology.
Hany Farid, a professor at the University of California at Berkeley who specializes in the analysis of digital images, says he is convinced that the videos are deepfakes but that they are “incredibly well done.”
According to an analysis by Farid and one of his graduate students, Shruti Agarwal, there are a few tiny pieces of evidence that give away the fact that the videos are A.I.-generated fakes. In one video, in which Cruise seems to perform a magic trick with a coin, Cruise’s eye color and eye shape change slightly at the end of the video. There are also two unusual small white dots seen in Cruise’s iris—ostensibly reflected light—that Farid says change more than would be expected in an authentic video.
The A.I. methods used to create deepfakes often leave subtle visual oddities in imagery and videos they create—inconsistencies in eye color or shape, or strange ear contours or anomalies around the hairline.
Deepfakes are most often used to swap one person’s head or face for another’s as opposed to generating the entire body, and Farid notes that the hands performing the coin trick don’t look like the real Cruise’s hands. Presumably they belong to an actor, who was filmed performing the coin trick and then had Cruise’s face substituted for his.
Farid also says that while a true deepfake often involves a full-face swap, a more convincing result can sometimes be obtained by using the A.I. technique to generate only a portion of the face. He and Agarwal suspect that this is the case with the three Cruise videos. They think that the mouth is probably real, but that the eye region has been created with deepfake technology.
“This would make sense if the actual person in the video resembles Cruise, did some good work with makeup perhaps, and the swapping of the distinct eyes is enough to finalize a compelling likeness,” Farid says. “It is also possible that there was some postproduction video editing.”
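The partial swap Farid describes amounts to masked compositing: only one region of each real frame is replaced with generated pixels, feathered at the edges so no seam is visible. A minimal, stdlib-only sketch of that blending step (tiny grayscale grids stand in for video frames, and the mask geometry is invented for illustration):

```python
import math

W, H = 8, 8
real_frame = [[100] * W for _ in range(H)]   # stand-in for the filmed performer
fake_patch = [[200] * W for _ in range(H)]   # stand-in for A.I.-generated pixels

def mask(x, y, cx=4, cy=2, r=2.0):
    """Soft mask: 1.0 inside the 'eye' region, feathering to 0.0 outside."""
    d = math.hypot(x - cx, y - cy)
    return max(0.0, min(1.0, 1.5 - d / r))

# Blend: generated pixels where the mask is 1, untouched footage where it is 0.
out = [
    [round(mask(x, y) * fake_patch[y][x] + (1 - mask(x, y)) * real_frame[y][x])
     for x in range(W)]
    for y in range(H)
]
print("eye-region pixel:", out[2][4], "| far corner:", out[7][0])
```

Real pipelines track facial landmarks from frame to frame to place and warp the mask, but the compositing itself is this simple per-pixel blend.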
Deepfakes are created using a machine-learning technique called a GAN (generative adversarial network), in which two deep neural networks—a type of machine learning loosely based on the way the human brain works—are trained in tandem. One network is trained from pictures or videos of the real Cruise to generate new images of Cruise in different settings or poses that are realistic enough to fool the other network, which is trained to pick out images of Tom Cruise from those of other people.
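The tandem training described above can be sketched with each "network" reduced to a single parameterized function: a generator that maps noise to samples, and a discriminator that scores how real a sample looks, each updated against the other. The toy below is stdlib-only and one-dimensional, not an image model; the "real" distribution, learning rate, and step count are invented for illustration, but the alternating adversarial loop is the same idea:

```python
import math
import random

random.seed(0)

sigmoid = lambda u: 1.0 / (1.0 + math.exp(-u))

# "Real" data: a stand-in for photos of the real subject.
REAL_MEAN, REAL_STD = 4.0, 0.5
real_sample = lambda: random.gauss(REAL_MEAN, REAL_STD)

# Generator g(z) = a*z + b and discriminator D(x) = sigmoid(w*x + c):
# the two networks of a GAN, each reduced to one parameterized function.
a, b = 1.0, 0.0          # generator parameters
w, c = 0.0, 0.0          # discriminator parameters
lr = 0.05

def gen_mean(n=500):
    return sum(a * random.gauss(0, 1) + b for _ in range(n)) / n

before = abs(gen_mean() - REAL_MEAN)

for step in range(3000):
    # --- Discriminator update: learn to tell real from generated ---
    z = random.gauss(0, 1)
    x_real, x_fake = real_sample(), a * z + b
    d_real, d_fake = sigmoid(w * x_real + c), sigmoid(w * x_fake + c)
    # gradient descent on -[log D(x_real) + log(1 - D(x_fake))]
    w -= lr * (-(1 - d_real) * x_real + d_fake * x_fake)
    c -= lr * (-(1 - d_real) + d_fake)

    # --- Generator update: learn to fool the discriminator ---
    z = random.gauss(0, 1)
    x_fake = a * z + b
    d_fake = sigmoid(w * x_fake + c)
    # gradient descent on -log D(x_fake), w.r.t. generator parameters
    a -= lr * (-(1 - d_fake) * w * z)
    b -= lr * (-(1 - d_fake) * w)

after = abs(gen_mean() - REAL_MEAN)
print(f"distance to real mean: before={before:.2f} after={after:.2f}")
```

Real deepfake systems use deep convolutional networks for both roles and train on thousands of images, but the alternating generator/discriminator updates follow this same pattern.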
As with most A.I. methods, the amount and quality of the data help determine how good the system is. That helps explain why Cruise has been a frequent target for deepfakes: He is one of the most photographed celebrities on the planet. All that data makes it easier to train a very good Tom Cruise image generator.
Farid says it also doesn’t hurt that Cruise has a distinctive voice and mannerisms that add to the entertainment value and social media virality of deepfakes involving him.
Prior to the current trio of Cruise videos, one of the most widely circulated and uncanny examples of a deepfake also involved Cruise. Released last year by a person who goes by the Internet handle Ctrl Shift Face, who has created a number of highly realistic deepfakes, it involves a video of the comedian Bill Hader doing an impersonation of Cruise on the David Letterman show in 2008. Deepfake technology is used to modify the video so that Hader’s face seamlessly morphs into Cruise’s as he does the impression.

Deepfakes first surfaced in 2017, about three years after GANs were invented. Some of the earliest examples were videos in which the head of a celebrity was swapped with the body of an actress in a pornographic film. But since then the technique has been used to create fake videos of many different celebrities in different settings. There is now off-the-shelf software that enables users to create fairly convincing deepfakes, and security researchers have become increasingly alarmed that deepfakes could be used for sophisticated political disinformation campaigns. But so far, despite a couple of possible examples that are still being debated by experts, deepfakes have not become a major factor in disinformation efforts.
While today’s deepfakes are usually identifiable with careful digital forensic analysis, this process is time-consuming and requires a certain amount of expertise. Researchers are working to create A.I. systems that can automatically identify deepfakes, and in 2019 Facebook launched an annual competition to find the best of these. But in the competition’s inaugural run, the top-performing system was accurate only 65% of the time.
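A detection system of the kind that competition solicits is, at its core, a binary classifier over forensic features. The sketch below is entirely synthetic: the feature names are merely inspired by the cues Farid mentions (highlight drift, eye-shape variance), the data is randomly generated, and a logistic regression stands in for the deep networks that actual entries used:

```python
import math
import random

random.seed(1)

# Hypothetical per-video features loosely modeled on forensic cues:
# [iris-highlight drift, eye-shape variance, hairline-artifact score].
def make_video(is_fake):
    base = 0.6 if is_fake else 0.2   # fakes score higher, with overlap
    return [random.gauss(base, 0.2) for _ in range(3)], is_fake

data = [make_video(i % 2 == 1) for i in range(400)]
train, test = data[:300], data[300:]

# Logistic-regression detector trained by stochastic gradient descent.
wts, bias, lr = [0.0, 0.0, 0.0], 0.0, 0.1
sigmoid = lambda u: 1.0 / (1.0 + math.exp(-u))

for _ in range(30):
    for x, y in train:
        p = sigmoid(sum(w * xi for w, xi in zip(wts, x)) + bias)
        err = p - (1.0 if y else 0.0)       # gradient of the log loss
        wts = [w - lr * err * xi for w, xi in zip(wts, x)]
        bias -= lr * err

correct = sum(
    (sigmoid(sum(w * xi for w, xi in zip(wts, x)) + bias) > 0.5) == y
    for x, y in test
)
print(f"held-out accuracy: {correct / len(test):.0%}")
```

On real video the hard part is extracting features that survive compression, lighting changes, and re-encoding; the classifier itself is the easy half of the problem.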
Agarwal says it is possible to create deepfakes of the quality seen in the three Cruise videos using commercial software for deepfake generation. But doing so requires some skill, as well as a significant amount of data and training time for the A.I. system involved—and that training time can be expensive. So whether it would have been worth that sort of effort and cost for a viral TikTok video remains uncertain.