Voice Clones of Shah Rukh Khan, Priyanka Chopra Jonas Used in Fake Gaming App Video

August 20, 2024
Manipulated Media/Altered Media
Screengrabs of the video analysed by the DAU

The Deepfakes Analysis Unit (DAU) analysed a video that features actors Shah Rukh Khan and Priyanka Chopra Jonas apparently promoting an investment gaming app. After putting the video through A.I. detection tools and escalating it to our expert partners, we were able to conclude that the fabricated video was produced using a mosaic of unrelated video clips, graphics, and an A.I.-generated audio track. 

The 33-second video in Hindi, embedded in a Facebook post, was sent to the DAU by a fact-checking partner for analysis. It opens with the visuals of a news presenter talking to the camera in a studio setting, with the accompanying male voice announcing that Mr. Khan has introduced a highly lucrative mobile gaming app.

A logo resembling that of an Indian Hindi news channel can be seen on the top right corner of the video frame. Superimposed text at the bottom left reads “breaking news”, and below it somewhat garbled Hindi lettering arranged to mimic a news ticker is visible. Bangla script can also be seen in the anchor's backdrop.

The remaining footage includes separate clips of the two actors, peppered with visuals of young women gasping in excitement, a graphical interface showing multiplier returns on investment, and money being credited to an account bearing a logo that resembles that of ICICI Bank, an Indian multinational bank.

The distinct male and female voices recorded over the visuals of Khan and Ms. Chopra Jonas, respectively, endorse the app as an easy and quick way of making big money. The male voice resembles Khan's real voice but sounds too scripted and hurried, unlike his characteristic delivery. His lip movements seem consistent with the audio in his close-ups, which span no more than two seconds at a stretch.

The lip movements of Chopra Jonas are not consistent with the audio in her close-ups, which span two and four seconds. Her visuals are marked by a distinct quivering around her lips as well as odd chin movement. The colour and shape of her lips appear to change as her mouth moves.

In this GIF Priyanka Chopra Jonas’s chin changes shape as she speaks

The voice sounds somewhat like hers but has a nasal quality to it. The Hindi used in the audio is literary rather than conversational, out of character with her style of speaking. The speaker also refers to herself in the masculine rather than the feminine, which is grammatically incorrect in Hindi.

The GIF here shows the anchor’s mouth stretching unnaturally

The lip-sync of the anchor seems imperfect. The audio accompanying the anchor’s visuals sounds monotonous, with no variation in tone or pitch.

We ran a reverse image search using screenshots from the video under review to locate the original videos from which the short clips were apparently lifted.
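
For readers curious about the mechanics, below is a minimal sketch, assuming a locally saved copy of the video, of how individual frames can be pulled out with the open-source OpenCV library and then uploaded to a reverse image search engine. The file name and sampling interval are illustrative, not details of our workflow.

```python
# Minimal sketch: pull evenly spaced frames out of a video with OpenCV
# so they can be uploaded to a reverse image search engine.
# "suspect_clip.mp4" and the sampling interval are illustrative assumptions.
import cv2

VIDEO_PATH = "suspect_clip.mp4"  # hypothetical local copy of the video
FRAME_EVERY_N = 30               # roughly one frame per second at 30 fps

cap = cv2.VideoCapture(VIDEO_PATH)
index = saved = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break  # end of video reached
    if index % FRAME_EVERY_N == 0:
        cv2.imwrite(f"frame_{index:05d}.png", frame)
        saved += 1
    index += 1
cap.release()
print(f"Saved {saved} frames for reverse image search")
```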

A reverse image search for the anchor’s visuals did not yield useful results. Taking a cue from the logo resembling that of a Hindi news channel that we spotted in the video, we checked if the visuals were from the Bengali channel owned by the same media house, but that was not the case. However, the Bangla script in the backdrop helped with a keyword search on YouTube, through which we located this video, published on Aug. 4 by the official YouTube channel of Jamuna TV, a privately owned news channel in Bangladesh.

Khan’s clip could be traced to an interview he gave to the Asian Network, a radio station owned and operated by the British Broadcasting Corporation (BBC). It was published on their official website on Jan. 25, 2017.

Chopra Jonas’s clip is from an interview of hers, published on May 1, 2023 on the official YouTube channel of the Jennifer Hudson Show, an American talk show focusing on celebrities and lifestyle.

The clothing, backdrop, and body language of each of the subjects are identical in the original and the doctored clips. Khan and Chopra Jonas spoke in English in the original clips, and the anchor spoke in the Bangladeshi dialect of Bengali. None of them mentions a gaming app or anything related to it in those clips.

The logos of Jamuna TV, Asian Network, and the Jennifer Hudson Show, visible in the original clips, do not feature in the manipulated video. And the logo resembling that of an Indian news channel is not visible in any of the original clips. The visuals of people gasping and the dubious gaming app’s user interface graphics could not be traced to any of the original footage.

To discern if A.I. had been used to manipulate the video, we put it through A.I. detection tools.

Hive AI’s deepfake video detection tool indicated that the video was manipulated using A.I. and pointed out a marker on Khan’s face. However, their audio tool did not detect the use of A.I. in the audio track.

Screenshot of the analysis from Hive AI’s deepfake video detection tool

We also ran the video through our partner TrueMedia’s deepfake detector, which overall suggested substantial evidence of manipulation in the video.

In a further breakdown of the analysis, the tool gave a 93 percent confidence score to “face manipulation” and an 82 percent confidence score to “generative convolutional vision transformer”; both subcategories indicate manipulation of the faces featured in the video using A.I. The tool gave a 63 percent confidence score to the “audio analysis” subcategory, which assesses whether the audio track was generated by an A.I. audio generator.

Screenshot of the overall analysis from TrueMedia’s deepfake detection tool
Screenshot of the audio and video analysis from TrueMedia’s deepfake detection tool

To get a further analysis of the audio track, we put it through the A.I. speech classifier of ElevenLabs, a company specialising in A.I. voice research and deployment. The classifier returned a “very unlikely” result, indicating that it was highly unlikely that the audio track featured in the video was generated using their software.

We reached out to ElevenLabs for a comment on the results from the classifier. They told the DAU that they could not confirm that the audio was A.I.-generated, nor could they confirm that the audio originated from their platform.

For a further analysis of the video, we escalated it to the Global Online Deepfake Detection System (GODDS), a detection service set up by Northwestern University's Security & AI Lab (NSAIL). They used a combination of 22 deepfake detection algorithms and analyses from four human analysts trained to detect deepfakes to analyse the video escalated by the DAU.

Of the 22 predictive models used to analyse the video, 14 gave a lower probability that the video is fake, while the remaining eight indicated a much higher probability that it is.

The team noted in their report that despite a lack of consensus among the automated detectors’ predictions, the high precision of the models that identified the video as fake suggests a stronger likelihood that the video under review was automatically generated.
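
GODDS has not published how it weighs its detectors against one another, but the intuition, that a minority of high-precision models flagging a video can outweigh a numerical majority that did not, can be illustrated with a simple log-odds weighted vote. Every number below is invented for illustration.

```python
# Illustrative only: GODDS's actual aggregation method is not public.
# Votes are weighted by the log odds of each detector's (assumed) precision,
# so a few high-precision "fake" votes can outweigh many low-precision
# "real" votes. All precision values here are invented.
import math

# (vote_fake, assumed_precision) for 14 "real" votes and 8 "fake" votes
detectors = [(False, 0.60)] * 14 + [(True, 0.95)] * 8

score = 0.0
for vote_fake, precision in detectors:
    weight = math.log(precision / (1 - precision))  # log-odds weight
    score += weight if vote_fake else -weight

print("weighted score:", round(score, 2))  # positive means "fake" wins
print("verdict:", "likely fake" if score > 0 else "likely real")
```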

Analyses from their trained human analysts pointed out several visual indicators suggesting digital manipulation in the video under review. The team marked inconsistencies in the subjects’ faces and voices, which corroborated our own observations about the video. In their overall verdict, the team concluded that the video is likely fake.

To get yet another expert to weigh in on the audio, we escalated it to our partner IdentifAI, a San Francisco-based deepfake security startup. They told us that the manipulation in the video fits perfectly with a lip-sync deepfake, in which only the lips are manipulated in an original video. They added that popular tools such as synclabs and wav2lip use this technique.
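
To illustrate how accessible this technique has become, the sketch below shows how the open-source Wav2Lip project's inference script is typically invoked to re-render only the mouth region of a source video so that it matches a new audio track. The file paths are placeholders, and this is not a reconstruction of how the video under review was actually produced.

```python
# Hedged sketch: invoking the open-source Wav2Lip inference script, which
# re-renders just the mouth region of a source video to match a new audio
# track. File paths are placeholders, not details of the video we analysed.
import subprocess

subprocess.run(
    [
        "python", "inference.py",                            # script from the Wav2Lip repo
        "--checkpoint_path", "checkpoints/wav2lip_gan.pth",  # pretrained model weights
        "--face", "source_interview.mp4",                    # original video of the speaker
        "--audio", "synthetic_voice.wav",                    # new (e.g. cloned) audio track
    ],
    check=True,
)
```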

For further clarification on the nature of the audio featured in the video, we escalated it to our partner GetReal Labs, co-founded by Dr. Hany Farid; the company specialises in digital forensics and A.I. detection. Analysis from their tools and team indicated evidence of synthetic voices throughout the video. For assessment purposes, their team isolated the speakers’ voices, denoised them, and removed background music, which helped them effectively evaluate the synthetic quality of the voices featured in the video.
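
GetReal Labs has not described its tooling, but a first pass at the kind of preprocessing mentioned above, suppressing background noise so the voices can be evaluated more cleanly, can be sketched with the open-source librosa, noisereduce, and soundfile libraries. The file names are hypothetical, and this is not their pipeline.

```python
# Minimal sketch of the preprocessing described above: load the soundtrack,
# suppress steady background noise, and save a cleaner speech track for
# listening tests. Not GetReal Labs's pipeline; file names are hypothetical.
import librosa
import noisereduce as nr
import soundfile as sf

# Hypothetical audio track extracted from the video beforehand
audio, sr = librosa.load("video_audio.wav", sr=None, mono=True)

# Spectral-gating noise reduction; works best on steady background noise.
# Fully removing music would need a source-separation model such as Demucs.
cleaned = nr.reduce_noise(y=audio, sr=sr)

sf.write("video_audio_denoised.wav", cleaned, sr)
```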

Based on our observations and expert analyses, we assessed that the video featuring the celebrities has been fabricated using synthetic audio. It is yet another case where a dubious gaming app is being promoted in a format similar to a news package.

(Written by Debraj Sarkar and Debopriya Bhattacharya, edited by Pamposh Raina.)

Kindly Note: The manipulated audio/video files that we receive on our tipline are not embedded in our assessment reports because we do not intend to contribute to their virality.

You can read below the fact-checks related to this piece published by our partners:

गेमिंग ऐप का प्रचार करते शाहरुख खान और प्रियंका चोपड़ा का वीडियो फर्जी है (Video of Shah Rukh Khan and Priyanka Chopra promoting a gaming app is fake)

Fact Check: ‘एविएटर प्ले’ एप का प्रमोशन कर रहे शाहरुख़ खान और प्रियंका चोपड़ा का यह वीडियो डीपफेक है (Fact Check: This video of Shah Rukh Khan and Priyanka Chopra promoting the ‘Aviator Play’ app is a deepfake)