Screenshots of the video sent to the DAU tipline

The Deepfakes Analysis Unit (DAU) reviewed a video that apparently features India TV news anchor and chairperson Rajat Sharma presenting a news story about a doctor's purported cure for several diseases that impair vision. After running the video through A.I. detection tools and escalating it to a lab at the University of California, Berkeley, we concluded that the video has been doctored using fake audio.

In the 45-second clip, sent to the DAU tipline for assessment, two men besides Mr. Sharma can be heard speaking in Hindi: one narrates his experience of regaining vision as a result of the treatment, and the other explains how the treatment is administered. Neither speaker's sound bite carries a name, which is contrary to usual practice in news story production. The speakers' lip movements are inconsistent with their speech at various points in the video, and one of the speakers sounds very robotic in his delivery.

We ran the video through a video deepfake detection tool, but the results did not point to any elements of the video being a deepfake. Since we were suspicious about the audio, we put the video through the voice detection tool of Loccus.ai, a company that specialises in artificial intelligence solutions for voice safety. The results indicated that only 8.94 percent of the audio was real, which means that a significant percentage of the speech in the video is synthetic.

Screenshot of the analysis from Loccus.ai's audio detection tool.
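For readers curious about the mechanics, the sketch below shows, in broad strokes, how a clip can be screened for synthetic speech: the audio track is extracted from the video and submitted to a voice-authenticity model, which returns a score. The endpoint URL, credential, response field, and threshold are hypothetical placeholders for illustration; they do not represent Loccus.ai's actual API.

```python
# A minimal sketch, for illustration only, of screening a clip for synthetic
# speech. The API endpoint, response fields, and threshold are hypothetical
# stand-ins, not Loccus.ai's actual interface.
import subprocess
import requests

VIDEO = "suspect_clip.mp4"   # hypothetical input file
AUDIO = "suspect_audio.wav"
API_URL = "https://api.example-voice-detector.com/v1/score"  # placeholder URL
API_KEY = "YOUR_API_KEY"     # placeholder credential

# Step 1: strip the audio track from the video with ffmpeg
# (-vn drops video; 16 kHz mono PCM is a common input format for speech models).
subprocess.run(
    ["ffmpeg", "-y", "-i", VIDEO, "-vn", "-ac", "1", "-ar", "16000", AUDIO],
    check=True,
)

# Step 2: send the audio to a (hypothetical) voice-authenticity service.
with open(AUDIO, "rb") as f:
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f},
        timeout=60,
    )
resp.raise_for_status()

# Step 3: interpret the score. We assume the service returns a probability
# that the speech is real, e.g. {"authenticity": 0.0894}, where 0.0894 would
# correspond to the 8.94 percent figure cited above.
score = resp.json()["authenticity"]
print(f"Probability that the speech is real: {score:.2%}")
if score < 0.5:
    print("Flag for review: a significant portion of the speech may be synthetic.")
```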

We further expanded the scope of our investigation and escalated the video for analysis to a lab run by the team of Dr. Hany Farid, a professor of computer science at the University of California, Berkeley, who specialises in digital forensics and A.I. detection. They identified the video as a lip sync deepfake after analysing it with their own tools as well as running the audio through the A.I. detection tool of ElevenLabs.

Dr. Farid’s team mentioned that during their analysis they observed that the first speaker in the video had the most prevalent lip sync artifacts, and that the clip of the news anchor at the end of the video had out-of-sync lip movements. They classified the audio as fake and noted that in lip sync deepfakes the mouth or face region is also generated.
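To make the idea of lip sync artifacts concrete, here is a toy sketch of one heuristic such detectors build on: mouth opening, tracked from facial landmarks, should broadly rise and fall with speech loudness in genuine footage. The file names and threshold below are illustrative assumptions, and this simple correlation check is not the method Dr. Farid's lab used.

```python
# A minimal sketch, assuming MediaPipe FaceMesh and librosa are installed, of a
# lip-sync consistency heuristic: if mouth opening does not track speech
# loudness, the lips and audio may not belong together. Illustrative only.
import cv2
import librosa
import mediapipe as mp
import numpy as np

VIDEO = "suspect_clip.mp4"   # hypothetical input file
AUDIO = "suspect_audio.wav"  # audio track extracted beforehand (e.g. with ffmpeg)

# Per-frame mouth opening: vertical gap between inner-lip landmarks 13 and 14.
face_mesh = mp.solutions.face_mesh.FaceMesh(static_image_mode=False)
cap = cv2.VideoCapture(VIDEO)
fps = cap.get(cv2.CAP_PROP_FPS)
mouth_open = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if result.multi_face_landmarks:
        lm = result.multi_face_landmarks[0].landmark
        mouth_open.append(abs(lm[13].y - lm[14].y))
    else:
        mouth_open.append(0.0)  # no face found in this frame
cap.release()
mouth_open = np.array(mouth_open)

# Audio loudness envelope, resampled to one value per video frame.
y, sr = librosa.load(AUDIO, sr=16000)
rms = librosa.feature.rms(y=y, hop_length=int(sr / fps))[0]
rms = np.interp(np.linspace(0, 1, len(mouth_open)),
                np.linspace(0, 1, len(rms)), rms)

# Pearson correlation between the two signals; genuine talking-head footage
# tends to score noticeably higher than badly lip-synced material.
corr = np.corrcoef(mouth_open, rms)[0, 1]
print(f"Mouth-movement / loudness correlation: {corr:.2f}")
if corr < 0.3:  # illustrative threshold, not a calibrated value
    print("Weak correspondence: possible lip-sync manipulation.")
```

Production detectors go well beyond this single signal, modelling fine-grained phoneme-to-mouth-shape correspondence, but the underlying intuition is the same.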

Their findings corroborated our analysis that the audio is synthetic and that the video has been manipulated.

(Written by Areeba Falak and edited by Pamposh Raina.)