The Deepfakes Analysis Unit (DAU) analysed a video that seems to feature Mark Zuckerberg, founder and CEO of Meta, apparently promoting an investment platform associated with his company. After putting the video through A.I. detection tools and escalating it to our detection and forensic partners, we assessed that synthetic speech had been layered over original footage from an interview with Mr. Zuckerberg to produce the manipulated video.
The 46-second video in English was sent to the DAU tipline for assessment. It opens with a voice-over accompanying visuals of a Coca-Cola bottle being crushed. The narration cautions against wasting time watching random videos, and as Zuckerberg’s visuals fade in, the voice introduces an investment platform promising high returns.
The voice in the video sounds very similar to Zuckerberg’s. While the words are synchronised with the lip movements, the synchronisation seems unnatural as the mouth opens at the same angle with each enunciation. The area around the lips gets blurry in some parts of the video, and the chin movement also seems odd. The overall delivery is monotonous and robotic.
The video is packaged with a logo bearing a resemblance to Meta’s official logo and includes visuals of purported payout certificates from people who have supposedly invested in the platform.
We undertook a reverse image search using screenshots from the video that we were analysing and tracked down the original video featuring Zuckerberg, which was published on Feb. 16, 2024, on the YouTube channel of Morning Brew Daily, an American daily talk show. In both videos he can be seen wearing identical clothes in the same studio setting; however, the original video bears the logo of the show, and the audio tracks in the two videos are distinct.
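The DAU does not describe its reverse-search tooling, but matching screenshots back to a source video is commonly built on perceptual hashing, where near-duplicate frames produce nearly identical hashes even after compression or overlays. The following is a minimal sketch of an "average hash" comparison, assuming grayscale frames as NumPy arrays; all function names, parameters, and thresholds here are illustrative, not the DAU's actual method.

```python
import numpy as np

def average_hash(gray: np.ndarray, hash_size: int = 8) -> np.ndarray:
    """Shrink a grayscale frame to hash_size x hash_size block means,
    then threshold each block at the overall mean to get a 64-bit hash."""
    h, w = gray.shape
    ys = np.linspace(0, h, hash_size + 1, dtype=int)
    xs = np.linspace(0, w, hash_size + 1, dtype=int)
    small = np.array([[gray[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean()
                       for j in range(hash_size)] for i in range(hash_size)])
    return (small > small.mean()).astype(np.uint8).ravel()

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing hash bits; small values indicate near-duplicates."""
    return int(np.count_nonzero(a != b))

# Synthetic stand-ins for video frames (real use would decode actual frames)
rng = np.random.default_rng(0)
frame = rng.random((360, 640))                                   # "screenshot"
noisy = np.clip(frame + rng.normal(0, 0.02, frame.shape), 0, 1)  # recompressed copy
other = rng.random((360, 640))                                   # unrelated frame

h_frame = average_hash(frame)
print(hamming(h_frame, average_hash(noisy)))  # small: near-duplicate
print(hamming(h_frame, average_hash(other)))  # large: different content
```

In practice a library such as ImageHash offers more robust variants (pHash, dHash), but the principle is the same: a low Hamming distance between frame hashes flags a likely match with the original footage.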
To discern if A.I. had been used to manipulate the video we received on the tipline, we put the video through A.I. detection tools.
Hive AI’s deepfake video detection tool indicated A.I. manipulation in different parts of the video; its audio tool picked up A.I. elements in the audio track as well.
We also used TrueMedia’s deepfake detector, which categorised the video overall as “highly suspicious”, pointing to a high probability of manipulation. It gave a 100 percent confidence score to the subcategories of “AI generated audio detection” and “face manipulation”.
To better understand the A.I. elements used in the video, we escalated it to our partner IdentifAI, a San Francisco-based deepfake security startup. They used their audio detection software to check the authenticity of the audio in the clip.
First, they took two real voice samples of Zuckerberg’s to generate an audio profile of him. Then, the audio sample retrieved from the manipulated video was isolated and compared with that profile through a heat-map analysis.
The image on the left displays a comparison between Zuckerberg’s real voice and the audio profile generated by our partner; the green circles highlight the similarity in the audio patterns. The image on the right compares the voice sample from the manipulated video with the generated audio profile, and it shows visibly more patterns of dissimilarity than similarity between the two.
Based on the heat-map analysis and iterative testing, the team at IdentifAI was able to establish that the audio in the manipulated video was highly likely produced using generative A.I. and is not Zuckerberg’s real voice.
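IdentifAI’s actual pipeline is proprietary, but the general idea of a spectral heat-map comparison can be illustrated: compute a spectrogram for each audio sample, then plot the cosine similarity between every pair of time frames, so matching voices produce a bright grid and mismatched ones a dim one. The sketch below, using only NumPy with synthetic tones in place of real voice samples, is our own rough illustration; every function name and parameter is an assumption, not IdentifAI’s software.

```python
import numpy as np

def spectrogram(signal: np.ndarray, n_fft: int = 256, hop: int = 128) -> np.ndarray:
    """Magnitude spectrogram via a short-time FFT with a Hann window."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft, hop)]
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (time, freq)

def similarity_heatmap(spec_a: np.ndarray, spec_b: np.ndarray) -> np.ndarray:
    """Cosine similarity between every pair of time frames of two spectrograms.
    A bright (high-valued) map means the spectral patterns match."""
    a = spec_a / (np.linalg.norm(spec_a, axis=1, keepdims=True) + 1e-9)
    b = spec_b / (np.linalg.norm(spec_b, axis=1, keepdims=True) + 1e-9)
    return a @ b.T

# Synthetic stand-ins: a "reference voice" and two test samples
sr = 8000
t = np.arange(sr) / sr
real = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)
same = real + 0.01 * np.random.default_rng(1).normal(size=sr)  # noisy copy of the voice
fake = np.sin(2 * np.pi * 330 * t)                             # different spectral profile

hm_same = similarity_heatmap(spectrogram(real), spectrogram(same))
hm_fake = similarity_heatmap(spectrogram(real), spectrogram(fake))
print(hm_same.mean(), hm_fake.mean())  # matching audio scores higher on average
```

A real voice-profiling system would compare speaker embeddings or mel-scale features rather than raw tones, but the readout is analogous: large regions of dissimilarity in the heat map, as seen in the manipulated sample, point to a different underlying voice.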
We also sought an expert view on the video from a lab run by the team of Dr. Hany Farid, a professor of computer science at the University of California, Berkeley, who specialises in digital forensics and A.I. detection. They noted that the video was a lip-sync deepfake and that the words heard in the video are inauthentic.
On the basis of our findings and analyses from experts, we can conclude that the words attributed to Zuckerberg were not uttered by him and that a synthetic voice was used over original visuals to fabricate the video.
(Written by Debopriya Bhattacharya with inputs from Areeba Falak, and edited by Pamposh Raina.)
Kindly Note: The manipulated audio/video files that we receive on our tipline are not embedded in our assessment reports because we do not intend to contribute to their virality.