Published: March 26, 2021
Reading time: 4 minute read
Written by: Forter Team

Modern fraudsters are innovative; they use automated tools like bots, machine learning, and device emulators to hide their activities and launch massive online fraud attacks.  Data indicates that medium to large retailers with high website traffic see an average of 206,000 web attacks every month. And the cost of a single cyber incident averages $200,000 for businesses of all sizes.

Without an effective fraud prevention solution in place, online fraud and abuse can greatly impact your business. And an effective fraud prevention solution requires machine learning (ML). Machine learning drives most fraud prevention platforms today because of its ability to identify patterns in massive volumes of data, including sophisticated fraud behaviors, and then use that information to assess user behaviors in real time.

You need machine learning to accurately identify fraudulent transactions and user behavior. But what kind of machine learning?

Machine learning approaches for fraud prevention.

Two of the most common machine learning approaches in fraud prevention are supervised learning and unsupervised learning.

What is supervised machine learning?

Supervised learning (SML) involves humans teaching algorithms patterns and values by providing them with labeled training examples, that is, known inputs paired with known outputs. Each training example is labeled with a specific value of interest, and the algorithm looks for patterns that connect the inputs to those labels. For fraud prevention, the training data would include known fraud incidents drawn from historical data. The algorithms learn from the fraud data, and what they learned is saved as a model. That model is used later to make fraud predictions on new data. Most companies in the fraud detection and prevention market use supervised learning.
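As a rough illustration of the train-then-predict cycle described above, here is a minimal sketch in Python. It uses a nearest-centroid classifier, chosen only because it fits in a few lines; it is not the method any particular fraud platform uses. The features (order amount in hundreds of dollars, orders placed in the last hour) and all of the data are hypothetical.

```python
from statistics import mean

def train(examples):
    """Learn one centroid (average feature vector) per label.

    `examples` is a list of (features, label) pairs: the labeled
    training data that humans provide in supervised learning.
    """
    rows_by_label = {}
    for features, label in examples:
        rows_by_label.setdefault(label, []).append(features)
    # The saved "model" is just a dict mapping each label to its centroid.
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in rows_by_label.items()
    }

def predict(model, features):
    """Return the label whose centroid is closest to `features`."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

# Hypothetical history: (amount in $100s, orders in the last hour), label
labeled_history = [
    ((0.5, 1), "legit"), ((0.8, 1), "legit"),
    ((1.2, 2), "legit"), ((0.3, 1), "legit"),
    ((9.0, 6), "fraud"), ((7.5, 8), "fraud"), ((8.2, 5), "fraud"),
]

model = train(labeled_history)        # learn from known outcomes
print(predict(model, (8.0, 7)))       # fraud
print(predict(model, (0.6, 1)))       # legit
```

Note that the model can only echo the labels it was given. A fraud pattern that never appeared in `labeled_history` has no centroid to match against.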

What is unsupervised machine learning?

In unsupervised learning (UML), humans provide the algorithms with input data only. UML algorithms can learn to identify patterns on their own – they don’t need labeled training examples to learn from the input data. Types of unsupervised learning methods include clustering, anomaly detection, and density estimation. Unlike supervised learning, unsupervised learning doesn’t require that humans constantly retrain the models with newly labeled examples. And UML algorithms can analyze and detect patterns of fraud without needing prior fraud knowledge. This ability enables UML algorithms to detect new and unknown forms of fraud.
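To make the "input data only" idea concrete, the sketch below flags transactions that sit in sparse regions of the data, a very simple form of density-based anomaly detection. No labels are supplied. The function name, the `radius` and `min_neighbors` values, and the sample data are all assumptions made for illustration.

```python
from math import dist

def find_anomalies(points, radius=2.0, min_neighbors=2):
    """Return points that have fewer than `min_neighbors` other
    points within `radius` (Euclidean distance)."""
    anomalies = []
    for i, point in enumerate(points):
        neighbors = sum(
            1
            for j, other in enumerate(points)
            if j != i and dist(point, other) <= radius
        )
        if neighbors < min_neighbors:
            anomalies.append(point)
    return anomalies

# Hypothetical unlabeled transactions: (amount in $100s, orders in last hour)
transactions = [(0.5, 1), (0.8, 1), (1.2, 2), (0.3, 1), (0.9, 2), (9.0, 6)]

outliers = find_anomalies(transactions)
print(outliers)  # [(9.0, 6)]
```

The algorithm was never told what fraud looks like; it only noticed that one transaction is unlike the rest. Whether that transaction is actually fraudulent is a separate question.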

Both supervised and unsupervised learning can help you battle forms of fraud and abuse. However, each one comes with some challenges and limitations.

Challenges of supervised learning.

While supervised learning is predominantly used in fraud prevention platforms, it comes with a few significant challenges.

First and foremost, supervised learning cannot detect unknown fraud.

Supervised learning can accurately detect known fraud types. However, the challenge lies in detecting new and evolving fraud. Fraudsters constantly change their tactics and often use machine learning themselves to commit many types of online fraud. Supervised learning requires that humans train the algorithms with examples of fraud incidents they know about. Humans can’t train algorithms about fraud techniques they aren’t aware of yet.

You need a lot of training data.

Eventually, “unknown fraud” becomes “known fraud.” You need a massive amount of high-quality training data for SML algorithms to detect known fraud techniques. Few companies have the human resources necessary to keep up with emerging fraud trends and then train SML algorithms to detect them.

Challenges of unsupervised learning.

Unsupervised learning can assess transactions and user behavior in real time. However, this type of machine learning comes with significant challenges of its own. For example:

It’s difficult to assess the accuracy of the models.

Assessing the accuracy of UML models is one of the biggest challenges of unsupervised learning. With UML algorithms, you have only unlabeled input data. SML algorithms, on the other hand, are trained with input data and labeled outputs. So, for example, if you train an SML algorithm to detect a “cat” in a set of photos, and the result includes photos of both cats and dogs, you know something went wrong with your model. With SML models, the results are fairly cut and dried because you can check them against the labels. With UML models, there is no such answer key.
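The difference is easiest to see in code. When true labels exist, standard metrics such as precision and recall can be computed directly, as in the sketch below. The predictions and labels are made up for illustration; the point is that the `actual` list is exactly what an unsupervised setting lacks.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall for boolean fraud flags.

    `predicted` holds the model's flags; `actual` holds the known labels.
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Hypothetical results for five transactions (True = fraud)
predicted = [True, True, False, False, True]
actual    = [True, False, False, True, True]

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three flagged transactions were really fraud, and two of the three real fraud cases were caught. Without the `actual` list, neither number can be calculated, which is why UML models usually need some other form of validation, such as expert review of a sample.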

Typical UML approaches are prone to false positives.

Some fraud prevention companies use unsupervised learning to find patterns of fraud within transactional data. The typical approach involves analyzing all transactions to establish statistical bounds associated with normal transactions – and then recognizing new transactions that stray too far from the statistical normal path and considering them suspicious. This approach can detect new forms of fraud, but it delivers high false-positive rates.
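A minimal sketch of this statistical-bounds approach, assuming a single feature (order amount) and a cutoff of three standard deviations from the mean. Both choices, along with the sample amounts, are assumptions made for illustration.

```python
from statistics import mean, stdev

def fit_bounds(amounts, k=3.0):
    """Return (low, high) bounds: mean plus or minus k standard deviations."""
    center = mean(amounts)
    spread = stdev(amounts)
    return center - k * spread, center + k * spread

def is_suspicious(amount, bounds):
    """Flag any amount that falls outside the learned bounds."""
    low, high = bounds
    return amount < low or amount > high

# Hypothetical order amounts (in dollars) seen so far
history = [20, 25, 22, 30, 27, 24, 26, 23]
bounds = fit_bounds(history)

for amount in (28, 40, 250):
    print(amount, is_suspicious(amount, bounds))
```

With this data, both 40 and 250 fall outside the bounds. The $250 order might well be fraud, but the $40 order could just as easily be a regular customer buying one extra item. The rule cannot tell the two apart, which is where the false positives come from.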

A false positive means that your system rejected a legitimate customer along with all the potential future purchases that customer would have made from your business.

To prevent fraud successfully, you need both.

You need supervised and unsupervised learning to prevent fraud effectively. You also need human expertise and ongoing research so that your ML models return accurate results. You need humans to continually research:

  • Consumer buying patterns
  • Behavioral analytics
  • Payment trends
  • The fraudster ecosystem
  • Advanced fraud analytics

The knowledge and insights gained from ongoing research allow you to better train SML algorithms and evaluate UML models more effectively.
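One way the pieces might fit together is sketched below: a supervised fraud score and an unsupervised anomaly flag feed a single decision, and ambiguous cases are routed to human analysts whose verdicts can later become training labels. The function name, the thresholds, and the "review" outcome are all assumptions for illustration, not a description of any specific product.

```python
def decide(fraud_probability, is_anomalous, decline_at=0.9, review_at=0.5):
    """Combine a supervised score with an unsupervised anomaly flag.

    fraud_probability: score from a supervised model, between 0 and 1.
    is_anomalous: True if an unsupervised model found the behavior unusual.
    """
    if fraud_probability >= decline_at:
        return "decline"   # closely matches known fraud patterns
    if is_anomalous or fraud_probability >= review_at:
        return "review"    # unusual or uncertain: send to a human analyst
    return "approve"       # looks like ordinary customer behavior

print(decide(0.95, False))  # decline
print(decide(0.20, True))   # review
print(decide(0.10, False))  # approve
```

In this sketch the anomaly flag never declines an order on its own. That design choice reflects the false-positive problem described earlier: unusual behavior is a reason to look closer, not proof of fraud.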

A modern approach to fraud prevention means a solution powered by machine learning that allows you to automate fraud prevention and incorporate real-time decisioning into the customer purchasing journey. It allows you to approve more good customers while preventing bad actors from successfully defrauding your business.

Full automation, with real-time approve/decline decisions at any point in the customer purchasing journey, is impossible to achieve without both supervised and unsupervised machine learning, and data curation by the foremost human experts.

Ultimately, machine learning is only a tool. It is only as good as the data it has to work with and the experts who continually refine it. On its own, machine learning is not enough to deliver accurate results; it needs ongoing intervention from human experts.

To learn more about how you can use machine learning to improve fraud prevention accuracy, download our white paper “Cyber Crime is Costing the World $6 Trillion Annually. Is Your Fraud Prevention Ready?”
