Engineering a Twitter spearphishing bot from machine learning

June 9, 2021 by Dimitar Kostadinov

The term “bot” is short for “web robot”: a type of software that performs automated tasks over the internet and can be enhanced with machine learning capabilities.

In 2014, Twitter established that 23 million bots operate on its platform. After analyzing 88 million accounts, another report from 2018 estimated that Twitter hosts at least 15,000 crypto scam bots. Researchers used machine learning to map out how this automated architecture works.

But the trend of increasing bot activity online is ubiquitous: 76% of companies in the e-commerce, entertainment, travel and financial services sectors report that they have been targeted by bots. According to another research report, bots generate as much as 59% of all online traffic on dating websites, social media services and even online poker websites.

How bots are used for phishing

Countless interactions occur between humans and chatbots, yet many people do not realize that they are not communicating with another human. Bots can prompt users to pay membership fees, as was exactly the case in connection with the Ashley Madison data breach. Other bots try to obtain users’ personal information. The goal of phishing is to persuade users to click on links or download infected attachments so that the threat can spread throughout the network.

Unfortunately, phishing campaigns can have grievous real-life consequences. For example, following a social engineering attack, innovative designs for a new model of heavy construction equipment were stolen, according to a Verizon Data Breach Digest report. The initial compromise took place after the chief design engineer fell prey to a spearphishing campaign initiated through a fake LinkedIn recruiter profile.

Unlike email, social media provides attackers with abundant personal data about the victim. From a social engineer’s point of view, machine learning can be used to train artificial intelligence (AI) algorithms to recognize patterns and then create convincing spam, phishing and spearphishing messages in an automated manner. Sentiment analysis, for instance, is a popular application of natural language processing (NLP) that enables malicious actors to scan thousands of text documents against given filters in seconds.

To illustrate, the researcher Eugene Aiken analyzed the Twitter posts of two people, Donald Trump and Hillary Clinton, to estimate the probability that a specific tweet came from one or the other. The whole process consists of four stages:

  1. Scraping tweets
  2. Using them with a natural language processor
  3. Classifying the tweets based on a machine learning algorithm
  4. Determining probability based on a predict and probe method
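The four stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Aiken’s actual code: the scraping stage is replaced by a handful of invented placeholder texts for two hypothetical authors, the “natural language processor” is a naive regular-expression tokenizer, and the classifier is a hand-written naive Bayes model, chosen here simply because it is a common baseline for authorship problems.

```python
import math
import re
from collections import Counter, defaultdict

# Stage 1 (scraping) is assumed to have happened already. These texts are
# invented placeholders for two hypothetical authors, not real tweets.
TRAINING = [
    ("author_a", "huge rally tonight great crowd"),
    ("author_a", "great jobs numbers huge win"),
    ("author_b", "proud of our volunteers and families"),
    ("author_b", "families deserve affordable health care"),
]


def tokenize(text):
    """Stage 2: a deliberately naive stand-in for an NLP tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def train(samples):
    """Stage 3: collect the per-author word counts naive Bayes needs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for author, text in samples:
        doc_counts[author] += 1
        word_counts[author].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def predict_proba(text, model):
    """Stage 4: probability that each known author wrote `text`."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for author in doc_counts:
        total_words = sum(word_counts[author].values())
        score = math.log(doc_counts[author] / total_docs)  # prior
        for word in tokenize(text):
            # Add-one (Laplace) smoothing keeps unseen words from
            # driving the probability to zero.
            count = word_counts[author][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        log_scores[author] = score
    # Convert log scores into probabilities that sum to 1.
    peak = max(log_scores.values())
    weights = {a: math.exp(s - peak) for a, s in log_scores.items()}
    norm = sum(weights.values())
    return {a: w / norm for a, w in weights.items()}


model = train(TRAINING)
probs = predict_proba("great crowd at the rally", model)
for author, p in sorted(probs.items()):
    print(f"{author}: {p:.2f}")
```

With real data, the training set would contain thousands of tweets per author, and the same structure would apply unchanged.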

As a side note: most election administrators during the 2020 U.S. presidential race did not have basic controls to prevent phishing.

Case study in focus: ZeroFOX’s SNAP_R

Conventional spearphishing methods can be both time-consuming and labor-intensive. A real person needs to research, craft and send a message that fits the target’s interests while sounding plausible. It can take five to 10 minutes to draft a convincing spearphishing email. By contrast, generating a great number of tweets may take only seconds or minutes, provided that appropriate tools and hardware are used.

In a video from the hacker conference DEF CON 24 in 2016, two data experts working at ZeroFOX, a company that specializes in threats associated with social media, demonstrated an automated tool dubbed SNAP_R that writes targeted tweets containing malicious links.

“The model is trained using spearphishing pen-testing data, and in order to make a click-through more likely, it is dynamically seeded with topics extracted from timeline posts of both the target and the users they retweet or follow,” explained the creators of SNAP_R, John Seymour and Philip Tully.

The whole process seemed to work disturbingly well: the machine learning model is designed to write convincing tweets at machine speed. It was about as fast as spam, yet its success rate was close to that of manual messaging. The researchers estimated that between 30% and 67% of the spearphished individuals took the bait. For comparison, spam and other similar automated phishing campaigns can achieve a 5% to 14% click-through rate.
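To put those percentages in perspective, the short calculation below converts both click-through ranges into expected clicks. The rates come from the figures quoted above; the batch size of 1,000 messages is a hypothetical number chosen only to make the comparison easy to read.

```python
# Click-through ranges quoted in the text, expressed as fractions.
RATES = {
    "SNAP_R (automated spearphishing)": (0.30, 0.67),
    "spam and similar automated phishing": (0.05, 0.14),
}

RECIPIENTS = 1000  # hypothetical batch size, not a figure from the research


def expected_clicks(recipients, low, high):
    """Return the (low, high) number of recipients expected to click."""
    return round(recipients * low), round(recipients * high)


for label, (low, high) in RATES.items():
    low_clicks, high_clicks = expected_clicks(RECIPIENTS, low, high)
    print(f"{label}: {low_clicks} to {high_clicks} clicks per {RECIPIENTS} messages")
```

Even at the low end of its range, the automated spearphishing approach would draw more than twice as many clicks as conventional spam does at the high end of its own.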

Make your bird tweet in unison with the environment

The nature of Twitter contributes greatly to the results potential attackers can achieve with a machine-learning spearphishing bot. Such a bot takes advantage of the following aspects:

  • One of the reasons why machine-generated spearphishing is so successful on Twitter is that grammar errors are not very noticeable on this platform, since many users are not strict about the grammatical accuracy of their messages.
  • Furthermore, if a victim replies to a malicious message, the reply will not show on other people’s timelines, which helps hide attempted tricks by not drawing attention. In addition, shortened links are normal in the Twitter environment, which can lure people into opening them when they are embedded in a tweet. The phishing links are obfuscated, so the target cannot know what is hidden behind them until after clicking on them.
  • One way chatbots give away their identity is by writing suspiciously fast. This will not raise suspicion on Twitter because the messages there are traditionally short.
  • Popular networking websites like Twitter create a trusting culture, meaning their users do not expect them to harbor malicious content. SNAP_R can further exploit that trust by creating a believable profile that mixes non-phishing with phishing posts.
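From a defender’s point of view, the point about shortened links can be turned into a simple check. The sketch below flags links in a message whose host is a known URL shortener, so that a user or a filter can treat them with extra caution before clicking. The domain list is a small, non-exhaustive sample and the function name is hypothetical; a real filter would need a much longer, regularly updated list.

```python
import re
from urllib.parse import urlparse

# A small, non-exhaustive sample of link-shortening domains (assumption:
# a production filter would maintain a far more complete list).
SHORTENER_DOMAINS = {"t.co", "bit.ly", "tinyurl.com", "goo.gl", "ow.ly"}

URL_PATTERN = re.compile(r"https?://\S+")


def find_shortened_links(text):
    """Return the URLs in `text` whose host is a known shortener."""
    flagged = []
    for raw in URL_PATTERN.findall(text):
        url = raw.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host in SHORTENER_DOMAINS:
            flagged.append(url)
    return flagged


sample = (
    "Loved your talk! Slides here: https://bit.ly/example "
    "and the full paper at https://example.com/paper."
)
flagged = find_shortened_links(sample)
print(flagged)
```

A flagged link is not necessarily malicious, since shortened links are the norm on Twitter. The check only signals that the real destination is hidden and should be verified before the link is opened.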

There are also other experiments in this field. A group of scholars from Universidade de Brasília, Campus Universitário Darcy Ribeiro (Brazil), developed a bot for various social engineering attacks, including phishing, which operated for 10 days without detection on Twitter, attacking users aggressively at times.

Protecting yourself against Twitter spearphishing attacks

According to Carnegie Mellon University, almost half of the accounts that tweet about the coronavirus seem to be bot accounts. The pandemic itself has created a situation where many companies are shifting to remote work, which means that spearphishing attacks could become even more popular as employees rely on indirect online communication with one another more than ever. Twitter went even further, deciding to allow some of its employees to work from home permanently.

In the summer of 2020, Twitter fell victim to a phone spearphishing attack, also known as vishing, which targeted specific employees to obtain access to internal systems and tools. While this type of social engineering attack does not involve bots, it goes to show that even the people who support the platform can fall prey to a spearphishing attack, which may later be used for further exploitation against users.

All things considered, bot phishing and bot attacks are becoming worryingly prevalent, and although they are not given as much attention in news outlets as ransomware campaigns, they are equally difficult to mitigate.


Sources

At Black Hat: A free tool for spear phishing Twitter, CSO
Bot Attacks Are Increasingly Targeting Businesses. How to Stay Safe, Heimdal Security
Hackers unleash smart Twitter phishing tool that snags two in three users, The Register
Researchers: Nearly Half Of Accounts Tweeting About Coronavirus Are Likely Bots, NPR
Researchers Unveil Crypto Scam Involving 15,000 Twitter Bots, Medium
The Sexbot Scam: Don’t Be Fooled By A Fake Dating Profile, Truthfinder

Dimitar Kostadinov applied for a 6-year Master’s program in Bulgarian and European Law at the University of Ruse, and was enrolled in 2002 following high school. He obtained a Master’s degree in 2009. From 2008 to 2012, Dimitar worked in data entry and research for the American company Law Seminars International and its Bulgarian-Slovenian business partner DATA LAB. In 2011, he was admitted to the Law and Politics of International Security program at Vrije Universiteit Amsterdam, the Netherlands, graduating in August of 2012. Dimitar also holds an LL.M. diploma in Intellectual Property Rights & ICT Law from KU Leuven (Brussels, Belgium). Besides legal studies, he is particularly interested in the Internet of Things, Big Data, privacy & data protection, electronic contracts, electronic business, electronic media, telecoms, and cybercrime. Dimitar attended the 6th Annual Internet of Things European summit organized by Forum Europe in Brussels.