In the security field, we have the idea of offensive security: building malicious capabilities before attackers do, so we can learn to defend against them. In that spirit, at the NIPS Workshop on Machine Deception, my co-author and I brought the idea of using machine learning for spear-phishing back to the machine learning community. We showed that it is surprisingly easy for attackers to use machine learning to drastically increase the click-through rates of phishing links, with rates as high as 66% in our tests.
Generative models combined with features from social media profiles make it possible to pair the effectiveness of spear-phishing with the automation of regular phishing. There are many ways to do this, but the implementation we demonstrated is split into two components: a machine learning component that predicts whether a given user is likely to click a link, and a second component that generates text tailored specifically to that user.
Because we have no labeled data on which links users actually click, we use an unsupervised technique to measure a user’s engagement. We take features from the user’s profile: whether they’ve changed default settings, their number of followers and posts, words from their bio, and so on.
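To make this concrete, here is a minimal sketch of turning a profile into a numeric feature vector. The field names and feature choices are illustrative, not the exact ones we used:

```python
def profile_to_vector(profile: dict) -> list:
    """Convert selected profile fields into a fixed-length feature vector."""
    return [
        1.0 if profile.get("default_profile_image") else 0.0,  # kept default avatar?
        float(profile.get("followers_count", 0)),
        float(profile.get("statuses_count", 0)),
        float(len(profile.get("description", "").split())),    # bio length in words
    ]

vec = profile_to_vector({
    "default_profile_image": False,
    "followers_count": 120,
    "statuses_count": 450,
    "description": "security researcher and coffee enthusiast",
})
print(vec)  # [0.0, 120.0, 450.0, 5.0]
```

Each user becomes one such vector, and the vectors are stacked into the matrix that the clustering step operates on.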
Twitter keeps track of whether you change your default profile image and other settings.
We convert each user to a row vector in a feature matrix based on those features, and train unsupervised models to cluster the users. Of the methods we tested, K-Means performed best by Silhouette score, showing “reasonable structure” (a Silhouette Coefficient of 0.5–0.7).
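The clustering-and-scoring step can be sketched as follows, assuming scikit-learn and a synthetic stand-in for the real user-feature matrix:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the user-feature matrix (rows = users, cols = features).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 4)),  # e.g. low-engagement users
    rng.normal(loc=5.0, scale=0.5, size=(50, 4)),  # e.g. high-engagement users
])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
score = silhouette_score(X, km.labels_)
print(f"silhouette: {score:.2f}")  # well-separated clusters score close to 1
```

In practice the number of clusters is itself a parameter to sweep, comparing Silhouette scores across methods and settings as in the plots below.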
Examples of output plots for clustering using K-Means (left). Silhouette scores for different methods/parameters (right).
In order to generate text, we compare two different methods. Markov models are simple: they estimate the probability that each term follows another in the training corpus, and generate new terms from those probabilities. So, in the example below, a new post starting with the term “I” has a 62% chance of being followed by the word “like” and a 38% chance of being followed by the word “don’t”. Markov models can work reasonably well even when trained on small datasets (making them perfect for training directly on a user’s timeline!) but have trouble maintaining state throughout the generated message.
A Markov chain.
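A word-level Markov model like the one above can be sketched in a few lines; the training posts here are made up for illustration:

```python
import random
from collections import defaultdict

def train_markov(corpus: list) -> dict:
    """Record the observed successors of each token in the training posts."""
    successors = defaultdict(list)
    for post in corpus:
        tokens = post.split()
        for a, b in zip(tokens, tokens[1:]):
            successors[a].append(b)
    return successors

def generate(successors: dict, start: str, max_len: int = 10, seed: int = 0) -> str:
    """Walk the chain, sampling each next token from the observed successors."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(max_len - 1):
        choices = successors.get(out[-1])
        if not choices:
            break
        out.append(rng.choice(choices))
    return " ".join(out)

model = train_markov(["I like security", "I don't like phishing", "I like coffee"])
print(generate(model, "I"))
```

Sampling from the raw successor lists reproduces the transition probabilities directly: here “I” is followed by “like” two times out of three in the corpus, so “like” is sampled two-thirds of the time.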
In contrast, Recurrent Neural Networks (RNNs) are great at remembering what happened earlier in a sentence, at the expense of a more complex model. We use a particular type of RNN called a Long Short-Term Memory (LSTM) network. At each timestep, an LSTM will:
- Forget old information from the hidden state that is no longer relevant,
- Update the hidden state with new information, and
- Output a value between -1 and 1 (the activation).
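The three steps above can be sketched as a single LSTM timestep in NumPy. This is a minimal illustration with random weights and tiny dimensions, not a trained model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM timestep: forget old state, update with new info, output.

    W stacks the four gate weight matrices; b stacks the four biases.
    """
    z = np.concatenate([x, h_prev]) @ W + b        # all four gates in one product
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)   # gates squashed into (0, 1)
    g = np.tanh(g)                                 # candidate new information
    c = f * c_prev + i * g                         # forget, then update the state
    h = o * np.tanh(c)                             # activation, bounded in (-1, 1)
    return h, c

# Tiny illustrative dimensions: 3 input features, 2 hidden units.
rng = np.random.default_rng(1)
n_in, n_hid = 3, 2
W = rng.normal(size=(n_in + n_hid, 4 * n_hid))
b = np.zeros(4 * n_hid)
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
print(h)  # each component lies in (-1, 1)
```

The forget gate `f` and input gate `i` implement the first two bullet points; the bounded output `h` is the third.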
Because of this added complexity, more samples are needed to train; instead of training on each user’s timeline, we train the model on a more general corpus and seed generation with a topic the user frequently posts about.
Example diagram showing the inner workings of an LSTM.
Using these techniques, increasing the click-through rate for a malicious web page is nearly trivial.
This is one example from recent research of the realistic threats we as a field should prepare for before AI and machine learning are misused at scale. Attacks using AI are already appearing in the real world: consider the concerns around commodity software that lets users replace the original subjects in photos or videos with anyone they have pictures of. Preparing for such attacks is crucial.