Deepfake Vishing Attacks: A Growing Threat to Personal Security
AI-based voice cloning has become a significant threat to personal security: it allows attackers to convincingly impersonate trusted individuals and trick victims into divulging sensitive information or taking harmful actions. These deepfake vishing (voice phishing) attacks have been on the rise in recent years, with researchers and government officials warning of their exponential growth.
How Deepfake Vishing Attacks Work
Deepfake vishing attacks follow a series of steps that are easy to execute and hard to detect. The workflow begins with collecting voice samples of the person being impersonated; a usable sample can be as short as three seconds. The samples are then fed into AI-based speech-synthesis engines, such as Google’s Tacotron 2 or Microsoft’s VALL-E, to create a convincing replica of the person’s voice.
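As a rough illustration of that two-stage workflow, the sketch below models it with hypothetical stand-in functions. Nothing here corresponds to a real engine's API; `clone_voice` and `synthesize` are stubs invented for this example, and engines such as Tacotron 2 or VALL-E each expose their own interfaces.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for illustration only: clone_voice() and
# synthesize() are not the API of any real speech-synthesis engine.

@dataclass
class VoiceProfile:
    """A cloned-voice model derived from a short audio sample."""
    source_seconds: float

def clone_voice(sample_seconds: float) -> VoiceProfile:
    # Modern engines can reportedly build a usable profile from as
    # little as ~3 seconds of audio; shorter samples are rejected here.
    if sample_seconds < 3.0:
        raise ValueError("sample too short to clone")
    return VoiceProfile(source_seconds=sample_seconds)

def synthesize(profile: VoiceProfile, text: str) -> bytes:
    # Stub: a real engine would return audio of `text` spoken in the
    # cloned voice. Here we just return a placeholder byte string.
    return f"[audio: {len(text)} chars, {profile.source_seconds}s profile]".encode()
```

The two-call shape (build a profile once, then synthesize arbitrary text against it) is what makes the attack cheap to repeat once a sample has been captured.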
The Role of AI-Based Speech-Synthesis Engines
AI-based speech-synthesis engines are the core of a deepfake vishing attack. They give the attacker a text-to-speech interface that renders arbitrary, attacker-chosen words in the voice, tone, and conversational tics of the person being impersonated. Most services bar such impersonation, but as Consumer Reports found in March, the safeguards these companies have in place to curb the practice could be bypassed with minimal effort.
Spoofing Numbers and Initiating Scam Calls
An optional step in deepfake vishing attacks is to spoof the phone number of the person or organization being impersonated, a technique that predates AI-based attacks by decades and makes it harder for victims to distinguish legitimate from illegitimate calls. Once the scam call is connected, the attacker uses the cloned voice to present a pretext that pressures the recipient into taking immediate action.
Real-Time Impersonation and the Future of Deepfake Vishing
While real-time impersonation has been demonstrated by open-source projects and commercial APIs, real-time deepfake vishing in the wild remains limited. Given ongoing advances in processing speed and model efficiency, however, real-time use is expected to become more common in the near future. At that point, attackers will be able to respond interactively to questions a skeptical recipient may ask, making the attacks even more convincing.
The Alarming Ease of Breaching Organizations
Mandiant’s security team executed such a scam in a simulated red-team exercise designed to test defenses and train personnel. The results were alarming: the victim bypassed security prompts from both Microsoft Edge and Windows Defender SmartScreen, unknowingly downloading and executing a pre-prepared malicious payload on their workstation.
Precautions for Preventing Deepfake Vishing Scams
While deepfake vishing attacks are hard to detect and prevent, there are precautions both parties can take to minimize the risk. These include agreeing in advance on a randomly chosen word or phrase that the caller must provide before the recipient complies with any request. Recipients should also end the call and dial the person back at a number known to belong to the caller.
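A pre-agreed challenge phrase of the kind described above is easy to generate. The sketch below uses Python's standard `secrets` module (a cryptographically secure random source) to pick words from a small list; the wordlist here is a placeholder assumption, and a longer list, such as a diceware wordlist, would give far more entropy in practice.

```python
import secrets

# Placeholder wordlist for illustration; use a large published
# wordlist (e.g. a diceware list) for real entropy.
WORDS = ["copper", "lantern", "orchid", "harbor", "pebble",
         "quill", "saffron", "timber", "velvet", "willow"]

def challenge_phrase(n_words: int = 3) -> str:
    """Pick n_words uniformly at random using a CSPRNG and join them."""
    return "-".join(secrets.choice(WORDS) for _ in range(n_words))

print(challenge_phrase())
```

Both parties would store the agreed phrase out of band; during a suspicious call, the recipient asks for it before acting on any request.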
The Importance of Staying Calm and Alert
Preventing deepfake vishing scams requires recipients to remain calm and alert despite the sense of urgency that would be legitimate if the feigned scenario were real. That is even harder when the recipient is tired, overextended, or otherwise not at their best. Because these pressures work so reliably, vishing attacks, whether AI-enabled or not, are unlikely to go away any time soon.
Conclusion
Deepfake vishing attacks are a growing threat to personal security that require immediate attention and action. As technology continues to evolve and improve, it’s essential for individuals and organizations to stay vigilant and take necessary precautions to prevent these types of scams from succeeding. By understanding the anatomy of deepfake vishing attacks and taking proactive steps to protect ourselves, we can reduce the risk of falling victim to these convincing impersonations.