Protecting against deepfakes in the era of LLMs
The rise of AI has had many by-products, and the use of deepfakes in financial scams is among the most unpalatable.
The emergence of large language models (LLMs) has further heightened the risk of deepfake technology being exploited for fraud, particularly in the financial sector, where the stakes are highest.
A cat-and-mouse game is unfolding worldwide as regulatory bodies and fintechs work tirelessly to stay ahead of fraudsters and to prevent or minimise user losses.
Human-like qualities
I recently interviewed Lei Chen from FinVolution Group, who explained that LLM-generated voices display striking human-like qualities, making it challenging to discern authenticity.
FinVolution Group is a US-listed fintech platform providing customers with credit services and anti-fraud technologies in the pan-Asian region and beyond.
Chen, vice president of FinVolution and head of its big data and AI division, adds that earlier attempts only produced voices that sounded similar. By contrast, modern deepfake technology creates textually coherent and free-flowing dialogues that closely mimic real human conversation, raising the likelihood of deception.
The recent release of GPT-4o by OpenAI highlights just how good technology is getting at mimicking human speech. Based on text-to-speech (TTS) technology, the model can parse text prompts to create highly natural and emotionally rich voice outputs.
The result is an almost flawless imitation that complicates the task of detection.
Chen views GPT-4o with both delight and concern. On the one hand, it represents yet another significant step toward realising artificial general intelligence. On the other, he is deeply wary of the risks it brings.
One immediate concern around the advancement of AI and deepfake technology involves financial transactions, where the elderly and even financial professionals are at heightened risk of being deceived by fraudsters.
In February this year, an accountant at the Hong Kong branch of a multinational company unwittingly transferred HK$200 million across 15 transactions on the instructions of people he believed to be the company’s chief financial officer and other members of staff on a video call – only to realise later that it was a sham and that everyone else on the call was a deepfake impostor.
There has been an alarming spike in similar cases in recent years.
Rampant voice forgery
According to a research report by Sumsub, an identity verification service provider, the number of reported deepfake-related frauds jumped tenfold across all industries globally from 2022 to 2023.
Notably, fraud attempts in the Philippines skyrocketed by a whopping 4,500 percent year on year, followed by nations like Vietnam, the United States and Belgium, the Sumsub study finds.
Chen says that while fraudulent videos garner more attention globally, a bigger challenge lies in voice forgery.
This is because voice cloning and recognition are more difficult than their image counterparts, according to Qiang Lyu, an algorithm scientist at FinVolution Group.
Human speech, being a one-dimensional continuous signal, involves more intricate processing logic than that for two-dimensional images, he notes.
What’s more, human voices encompass various personal traits such as accents, intonation, speech habits, and dialects – making them more complex than images or videos.
Voice processing, Lyu believes, takes longer, is more prone to interference, and is technologically more demanding.
“This leads fraudsters outside China to prefer fake videos,” he says, adding that image and video cloning still dominate in overseas markets like Indonesia, the Philippines, and Pakistan, where FinVolution has a presence.
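To see what that extra dimension means in practice, consider the preprocessing step every voice detector must perform before analysis can even begin: converting the one-dimensional waveform into a two-dimensional time-frequency representation. The Python sketch below, using the open-source librosa library with a placeholder file name, illustrates that step – a generic example, not FinVolution’s pipeline.

```python
# Speech arrives as a 1-D waveform; before a detector can work on it,
# it is typically converted into a 2-D time-frequency representation.
# A minimal sketch using librosa; "call.wav" is a placeholder path.
import librosa
import numpy as np

# Load the recording as a 1-D array of amplitude samples.
y, sr = librosa.load("call.wav", sr=16000)
print(y.shape)  # e.g. (160000,) for a 10-second clip: one long 1-D signal

# Convert to a mel spectrogram: a 2-D grid of (frequency band, time frame),
# roughly the "image" that downstream voice-forgery detectors operate on.
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
log_mel = librosa.power_to_db(mel, ref=np.max)
print(log_mel.shape)  # e.g. (64, 313): only now is there a 2-D input
```

Image detectors, by contrast, can operate on pixels directly – one reason voice forgery is harder to police.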
As the global tech community pivots from traditional deep learning models to LLMs, detecting synthesised voices is even trickier.
Future advances in fake-voice recognition will therefore rely on LLMs to capture finer detail, Lyu says.
FinVolution has been on guard against voice-based fraud attempts in the markets where it operates. Last year, the company logged and intercepted more than 1,000 such cases in China within a span of just two to three months.
Modelling the polygraph
In the face of this constantly evolving AI arms race, bolstering defence mechanisms to protect consumers and third-party partners has become imperative for fintech experts like Lyu.
To better spot financial hoaxes that employ cloned voices, novel approaches are needed alongside sophisticated questioning.
He points to the polygraph, which detects subtle tremors in speech to probe for emotional fluctuations or other tell-tale signs of lying.
He suggests modelling detection systems on the polygraph, using LLMs to capture these nuances.
“We can sift through vast amounts of real data to pinpoint minute deceptive details,” Lyu explains. “These details can then be categorised to determine if they are genuine or not.”
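FinVolution has not published its detection models, but a toy version of this “software polygraph” idea might extract prosodic cues – pitch instability, pausing behaviour, loudness dynamics – from a call and score them with a classifier trained on labelled genuine and fraudulent calls. The sketch below is purely illustrative; the feature set, file names, and librosa/scikit-learn pipeline are assumptions, not the company’s method.

```python
# Illustrative "software polygraph": extract simple prosodic cues from a
# call recording and score them with a classifier trained on labelled
# genuine and fraudulent calls. A hypothetical sketch, not FinVolution's
# actual system; all file names are placeholders.
import librosa
import numpy as np
from sklearn.linear_model import LogisticRegression

def prosodic_features(path: str) -> np.ndarray:
    y, sr = librosa.load(path, sr=16000)
    # Pitch track: unusual pitch variance can betray stress or synthesis.
    # pyin returns NaN for unvoiced frames, so filter those out first.
    f0, _, _ = librosa.pyin(y, fmin=65.0, fmax=400.0, sr=sr)
    f0 = f0[~np.isnan(f0)]
    if f0.size == 0:
        f0 = np.zeros(1)
    # Pause behaviour: the share of the call that is silence.
    spans = librosa.effects.split(y, top_db=30)
    voiced_len = sum(int(end - start) for start, end in spans)
    pause_ratio = 1.0 - voiced_len / len(y)
    # Loudness dynamics across the call.
    rms = librosa.feature.rms(y=y)[0]
    return np.array([np.mean(f0), np.std(f0), pause_ratio, np.std(rms)])

# Training labels (1 = fraudulent) would come from a curated corpus of
# real calls; the two files here are placeholders.
X = np.stack([prosodic_features(p) for p in ["genuine.wav", "fraud.wav"]])
labels = np.array([0, 1])
clf = LogisticRegression().fit(X, labels)

# Score a new call; a high probability flags it for human review.
risk = clf.predict_proba(prosodic_features("incoming.wav").reshape(1, -1))[0, 1]
print(f"estimated fraud risk: {risk:.2f}")
```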
In addition to combatting phone scams involving fake voices, FinVolution is also exploring ways to leverage voice recognition technology to better serve users and protect their financial well-being.
With the ascent of technologies like digital personas and the metaverse, more individuals will have AI-powered personal assistants. This exposes them to elevated risks of identity theft.
Likewise, financial professionals will increasingly be tasked with deciding whether a call is from a real person and with making more nuanced judgments.
Lyu says call centres are in particular need of capabilities to detect illicit activity and fraud, which entails gradually amassing databases of fraud samples.
Consequently, when high-risk calls occur, they can be promptly identified and flagged, alerting users to exercise caution during transactions.
On the consumer protection front, FinVolution has integrated voiceprint services into its apps, enabling users to record their voices during registration.
These voiceprints can later be used for ID verification, expediting sensitive operations like changes to credit limits.
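Both the fraud-sample database Lyu describes and this voiceprint service can be pictured as the same underlying primitive: map each voice to a fixed-length embedding and compare embeddings by similarity. In the toy sketch below, mean MFCC vectors stand in for the trained speaker-embedding model a production system would use; the threshold and file names are illustrative assumptions.

```python
# Voiceprint enrolment and verification, pictured as embedding comparison.
# A toy sketch: mean MFCC vectors stand in for a trained speaker-embedding
# model; the 0.85 threshold and file names are illustrative, not
# FinVolution's actual parameters.
import librosa
import numpy as np

def embed(path: str) -> np.ndarray:
    """Crude fixed-length voice embedding (stand-in for a neural model)."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return mfcc.mean(axis=1)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Enrolment: store the user's voiceprint at registration.
enrolled = {"user_42": embed("user_42_registration.wav")}

# Verification: before a sensitive operation (e.g. a credit-limit change),
# compare a fresh sample against the enrolled print.
score = cosine(embed("user_42_new_call.wav"), enrolled["user_42"])
print("verified" if score > 0.85 else "step-up authentication required")

# The fraud database works the same way in reverse: flag a call if it is
# too close to any voiceprint in a blacklist of known fraud samples.
blacklist = [embed("fraud_sample_1.wav"), embed("fraud_sample_2.wav")]
incoming = embed("incoming_call.wav")
if max(cosine(incoming, b) for b in blacklist) > 0.85:
    print("high-risk call: alert the user before any transaction")
```

In practice, the similarity threshold trades false rejections against false accepts and would be tuned on held-out data.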
In the case of malicious attempts to acquire user identities, traditional verification methods such as behavioural analytics or document checks may not be enough.
For instance, in the Philippines, FinVolution has encountered numerous cases where worn-out local ID documents pose authentication challenges.
In response, the fintech introduced facial and voiceprint recognition technologies to detect fake identities, assisting the local credit risk team.
Proactive regulation
But above all, Chen emphasises the importance of collaboration with regulatory authorities in combatting deepfakes.
“Clear legislation and stringent enforcement are necessary, both in China and globally, to ensure the proper use of personal data and privacy,” he says.
Indeed, the proper acquisition and use of sensitive data like voices are at the heart of the current debate on AI ethics. Some commentators argue that financial and social platforms have an obligation to disclose when AI is involved in communications, or to mark content produced by AI. Others call for legislation to be stepped up to keep pace with technological progress.
By labelling datasets, fintechs like FinVolution are able to promptly detect instances of misuse. This proactive strategy can effectively mitigate potential damages before they escalate.
Fostering innovations in regtech is crucial. “Most commercial institutions focus on applying technology in business. They may overlook systemic risks,” Chen says.
In his opinion, this is an area where regulators can play an active role, uniting ecosystem partners to drive collective progress in fighting fraud.
On the technological side, Lyu underscores the importance of data governance. Stricter regulations and security measures can raise the cost for fraudsters to access the computing power and data necessary to commit a scam.
Lyu believes it’s the mission of fintechs like FinVolution to showcase superior AI capabilities. “This can act as a potent deterrent,” he says.
“Persistent efforts can dissuade reckless AI usage, given our detection capabilities,” Lyu adds. “We must foster a spirit of technology for benevolence.”