The Deepfakes Analysis Unit (DAU) analysed an audio clip purported to be a phone conversation between Swati Maliwal, a Rajya Sabha member from the Aam Aadmi Party (AAP), and Dhruv Rathee, a popular YouTuber. After putting the audio through A.I. detection tools and escalating it to our detection partners, we were able to establish that A.I. voice clones of both of them were used to fabricate a conversation.
The DAU tipline received multiple X links for assessment with the 43-second audio embedded. The supposed dialogue is mostly in Hindi with some bits in English. The audio plays over a graphic with static pictures of the alleged speakers and a phone call symbol in between. On the top left of the graphic there is a recording icon with captions providing additional context behind the presumed conversation.
The sound levels of the two speakers are highly inconsistent throughout the recording, and their pitch and tonality seem synthetic. We noticed that the voice purported to be Ms. Maliwal’s seems to convey cues of distress in some parts, as if she were choking up while speaking.
We put the audio through A.I. detection tools to identify any A.I. elements in it.
The voice detection tool of Loccus.ai, a company that focuses on artificial intelligence solutions for voice safety, indicated that the probability of the audio being real was very low.
We also ran the audio through TrueMedia’s deepfake detector, which overall categorised the audio as “highly suspicious”. In a further breakdown of the analysis, the tool gave a 100 percent confidence score to the subcategory of “audio analysis” and a 99 percent confidence score to “advanced foundational features”. Both are indicators that it is highly probable that an A.I. audio generator was used to synthesise the audio.
The tool gave a low confidence score of nine percent to the “AI generated audio detection” category. This suggests that the tool has more confidence in the audio having been synthesised using A.I. than in it being fully A.I.-generated.
We also used the A.I. speech classifier of ElevenLabs, a company specialising in voice A.I. research and deployment, to further analyse the audio. The results indicated a very high probability that the audio file was generated using their software.
ElevenLabs told the DAU that after analysing the audio they were able to identify the user who had used their software to generate this audio. They have banned that account from using any of their tools in the future.
They also noted that they will use the audio generated by the user, escalated to them by the DAU, to further train their in-house automated moderation system to better capture similar content in the future.
We also shared the audio file for analysis with our detection partner ConTrails AI, a Bangalore-based startup that has its own A.I. tools for detecting audio and video spoofs. Their assessment picked up patterns of A.I. voice cloning in both the voices heard in the audio.
They added that it is possible that techniques such as Retrieval-based Voice Conversion (RVC) cloning or Text-to-Speech (TTS) cloning were used to generate the voices.
On the basis of our investigation and expert analyses, we can conclude that the purported audio of Maliwal and Rathee is a deepfake.
(Written by Debraj Sarkar and edited by Pamposh Raina.)
Kindly Note: The manipulated audio/video files that we receive on our tipline are not embedded in our assessment reports because we do not intend to contribute to their virality.
You can read below the fact-checks related to this piece published by our partners:
Fact Check: Viral audio claimed to be a call recording of Swati Maliwal and Dhruv Rathee is AI-generated
Viral Phone Call Between Swati Maliwal And Dhruv Rathee Is A Deepfake
Fact Check: Viral audio of a conversation between Swati Maliwal and Dhruv Rathee is AI-created