Fraud detection & machine learning

Every year, online fraud costs businesses an average of $4.5 million – and it’s growing. According to Checkout.com’s State of Retail report, 25% of ecommerce companies globally are experiencing a spike in card-not-present (CNP) fraud and chargebacks.

Between CNP fraud, account takeovers, and chargeback abuse, cybercriminals are evolving: finding inventive, ever-improving ways to beat the system, and cheat honest merchants out of hard-earned revenue. Fortunately, though, it isn’t just the perpetrators of fraud who are getting smarter – it’s the detectors, too.

Here, we’re talking about machine learning. It’s a subset of AI (Artificial Intelligence) which harnesses large data sets to clamp down quicker and more efficiently on online fraud – taking a more scalable, accurate, and cost-effective approach to safeguarding your brand and revenue.

Below, we’ll explain what machine learning is, and how it’s already being used across a range of industries to detect online fraud. We’ll discuss machine learning’s benefits for your business – and how Checkout.com can help you put them into practice.

What is fraud detection in machine learning?

Fraud detection in machine learning refers to the application of artificial intelligence (AI) and machine learning techniques to identify and prevent fraudulent activities.

Let’s break that down some more.

Machine learning is a form, or subset, of artificial intelligence (AI). It involves using historical data to train algorithms – feeding them huge amounts of information to allow them to make decisions about the data. Like humans, these algorithms can learn from the patterns and characteristics of past behaviors – then use this knowledge to be better at their job.

For our purposes, that job is fraud detection: the act of identifying and putting a stop to fraudulent activity, such as chargeback fraud and the various types of payment fraud.

By using machine learning, algorithms can be trained to analyze data around both genuine and fraudulent transactions. Having ‘learned’ the differences and nuances within this transaction data, these programs and platforms can detect anomalies, or any suspicious patterns that could indicate fraudulent activity, going forward – and in real time, too.

Learn more: A merchant's guide for preventing payment fraud

How does machine learning in fraud detection work?

Machine learning in fraud detection has six key stages:

Collecting the data: first, you need to provide your AI-powered algorithms with the data they need to learn. In online fraud detection, that could be information about your transactions – both legitimate and illegitimate – as well as the attack attempts (plus their sources and methods) your website has received.

Preprocessing and extracting the data: not every piece of data will be relevant to teaching the algorithm. So here, you’ll need to decide what’s relevant – and extract it. This stage also includes feature engineering: selecting the aspects of the data that will help you distinguish between genuine and fraudulent transactions. Preprocessing the data also helps you work around missing values, outliers, and inconsistencies.

Creating the model: next, select the right machine learning algorithm for the task. Ones you can choose from include decision trees, random forests, gradient boosting, neural networks, and anomaly detection algorithms.

Training – and testing – the model: split the data into training and testing sets. The training set teaches the model to identify patterns common to fraudulent transactions, and to learn the relationships between features and outcomes through a gradual, iterative process. The testing set, which the model never sees during training, evaluates the model’s performance.

Setting a threshold: determine a threshold (or ‘decision boundary’) to separate legitimate transactions from potentially fraudulent ones. This balances the trade-off between false positives (legitimate transactions flagged as fraudulent) and false negatives (fraudulent transactions the model misses).

Deploying the model: after it’s earned its stripes in a sandbox environment, let your machine learning model loose in real time to analyze incoming transactions.
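To make the six stages above more concrete, here is a minimal sketch in Python. It is an illustration under loud assumptions, not Checkout.com’s implementation: the data is synthetic, the two features (`amount` and `new_device`) are made up, the model is a small hand-written logistic regression chosen only for brevity, and the 0.5 threshold is a hypothetical starting point.

```python
import math
import random

random.seed(7)  # fixed seed so the sketch is reproducible

# Stage 1 - collect: synthetic stand-in for labeled transactions (1 = fraud).
def make_transaction():
    is_fraud = random.random() < 0.2
    # Assumption for illustration only: fraud skews to larger amounts and new devices.
    amount = random.gauss(900, 200) if is_fraud else random.gauss(120, 60)
    new_device = random.random() < (0.8 if is_fraud else 0.1)
    return {"amount": max(amount, 1.0), "new_device": new_device, "label": int(is_fraud)}

data = [make_transaction() for _ in range(1000)]

# Stage 2 - preprocess and engineer features: rescale the amount, encode the flag.
def features(tx):
    return [tx["amount"] / 1000.0, 1.0 if tx["new_device"] else 0.0]

# Stage 3 - create the model: logistic regression, chosen here for brevity.
def predict_proba(weights, bias, x):
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

# Stage 4 - train and test: 80/20 split, then batch gradient descent.
random.shuffle(data)
cut = int(0.8 * len(data))
train, test = data[:cut], data[cut:]

def fit(rows, epochs=300, learning_rate=1.0):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for tx in rows:
            x = features(tx)
            error = predict_proba(weights, bias, x) - tx["label"]
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias

weights, bias = fit(train)

# Stage 5 - set a threshold and count both kinds of error on the test set.
def confusion(rows, threshold):
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for tx in rows:
        flagged = predict_proba(weights, bias, features(tx)) >= threshold
        if flagged:
            counts["tp" if tx["label"] else "fp"] += 1
        else:
            counts["fn" if tx["label"] else "tn"] += 1
    return counts

THRESHOLD = 0.5  # hypothetical starting point, not a recommendation
results = confusion(test, THRESHOLD)
accuracy = (results["tp"] + results["tn"]) / len(test)

# Stage 6 - deploy: score one incoming transaction at a time.
def is_suspicious(tx):
    return predict_proba(weights, bias, features(tx)) >= THRESHOLD

print(results, round(accuracy, 3))
```

Calling `confusion(test, 0.2)` and `confusion(test, 0.8)` shows the trade-off from stage five: a lower threshold can only flag more transactions, so false negatives fall while false positives rise.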

Learn more: How to avoid vendor fraud?

Benefits of machine learning for fraud detection

Before we get to the benefits of machine learning, we first need to understand the alternative.

Without machine learning to detect fraud, businesses have to rely on ‘rules’ alone. Basically, these are a series of set parameters that, if met, suggest fraud. One rule may dictate, for instance, that a transaction over $10,000 from a high-risk region equals fraud.
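As a rough picture of what such a rule looks like in code, here is a short Python sketch. The region codes are hypothetical placeholders, and the function name `rule_flags_fraud` is ours for illustration; only the $10,000 figure comes from the example above.

```python
# Hypothetical placeholders: a real rule set would use the merchant's own risk list.
HIGH_RISK_REGIONS = {"REGION_A", "REGION_B"}
AMOUNT_LIMIT = 10_000  # dollars, taken from the example rule above

def rule_flags_fraud(amount: float, region: str) -> bool:
    """Return True when the fixed rule fires: large amount AND high-risk region."""
    return amount > AMOUNT_LIMIT and region in HIGH_RISK_REGIONS

print(rule_flags_fraud(10_001, "REGION_A"))  # True: just over the limit
print(rule_flags_fraud(9_999, "REGION_A"))   # False: just under, so it passes
print(rule_flags_fraud(50_000, "REGION_C"))  # False: region is not on the list
```

The hard cut-off is the point: a $9,999 order sails through, while a $10,001 order from the same region is treated as fraud.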

Although rules are still an important part of online fraud detection, they’re limited. For one, they lead to a lot of false positives; for another, they can be inefficient and tough to scale as your business grows. Plus, as time goes by and your prices (and your average order value) increase, rules’ fixed thresholds mean they can quickly become obsolete.

So – why should your business use machine learning for fraud detection? Well, machine learning’s many benefits include:

Accuracy: free from human error and with reams of data at their disposal, machine learning models can compare and contrast established patterns to instantly identify suspected fraud – and reduce instances of false positives and false negatives.
Speed: machine learning for fraud detection takes place in milliseconds, and happens in real time. This reduces friction at the checkout and, let’s face it – is far quicker than any human would be!
Cost- and time-effectiveness: by automating your fraud detection processes through machine learning, you free your team from having to do this manually – and cut back on the associated time, labor, and technological costs. (Algorithms don’t, after all, draw a salary!)
Scalability: unlike traditional rules – which struggle as data increases, and have to expand to adapt – machine learning models thrive on the addition of new data. Machine learning systems get better as the data does: allowing them to scale in tandem with your customer base and transaction volume.
Efficiency: machine learning models don’t eat, sleep, take breaks, or go on vacation – meaning they can detect fraud 24/7, 365 days a year.

To learn more about rules and risk strategies – and to develop a better understanding of fraud detection – explore how we categorize risk and set rules here at Checkout.com.

Learn more: 10 fraud rules you need at minimum

Use cases of machine learning for fraud detection

Whether you’re a fintech company needing to authenticate a user or an ecommerce business verifying a transaction, machine learning can prevent and detect fraud across a range of situations and scenarios.

Let’s take a look at some.

Ecommerce and online payment fraud

Ecommerce businesses process hundreds, often thousands, of transactions every day. That’s a lot of sales to keep track of – and a lot of opportunity for fraud.

Card-not-present (CNP) fraud is a particularly prevalent concern for ecommerce businesses. CNP fraud is among the most common types of payment fraud, and involves the thief using a stolen or compromised card to make a purchase. Often, however, these fraudulent transactions aren’t flagged automatically by your ecommerce website’s fraud detection tool. Many slip through the net – and end up costing your business the time, money, and hassle of credit card disputes.

In this use case, machine learning can be employed to analyze your transaction data in bulk. By feeding it information around legitimate transactions – alongside those of detected and undetected fraud – your ecommerce business can understand:

Which items fraudsters target most
Which kind of shipping information incurs the most risk
Which devices fraudulent users are targeting your site from

You can then use this knowledge to arm your fraud detection tools with the informational arsenal they need to deal with the threats of today – and tomorrow.
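The kind of bulk analysis described above can start very simply. The Python sketch below assumes a handful of made-up, already-labeled transactions with hypothetical field names (`category`, `device`, `is_fraud`), and computes the share of fraud per value. A trained model goes much further, but this is the sort of signal it learns from.

```python
from collections import defaultdict

# Made-up labeled transactions; field names are assumptions for illustration.
transactions = [
    {"category": "electronics", "device": "mobile",  "is_fraud": True},
    {"category": "electronics", "device": "desktop", "is_fraud": False},
    {"category": "electronics", "device": "mobile",  "is_fraud": True},
    {"category": "clothing",    "device": "mobile",  "is_fraud": False},
    {"category": "clothing",    "device": "desktop", "is_fraud": False},
    {"category": "gift_cards",  "device": "desktop", "is_fraud": True},
    {"category": "gift_cards",  "device": "mobile",  "is_fraud": False},
    {"category": "clothing",    "device": "mobile",  "is_fraud": False},
]

def fraud_rate_by(field, rows):
    """Return {value: share of transactions with that value that were fraud}."""
    totals = defaultdict(int)
    frauds = defaultdict(int)
    for row in rows:
        totals[row[field]] += 1
        frauds[row[field]] += int(row["is_fraud"])
    return {value: frauds[value] / totals[value] for value in totals}

by_category = fraud_rate_by("category", transactions)
by_device = fraud_rate_by("device", transactions)

# The category with the highest fraud share in this toy data set.
riskiest_category = max(by_category, key=by_category.get)
print(riskiest_category, by_category, by_device)
```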

BNPL and account takeover fraud

Buy now, pay later (BNPL) fraud can be as simple as a customer refusing to make a payment they owe. However, it can also happen when what’s known as an account takeover (ATO) occurs.

ATO attacks happen when fraudsters illegally obtain (often through phishing) usernames and passwords for a customer’s BNPL account. From here, they can gain access to the account, then place orders to their own address (but under the guise of the legitimate customer).

Machine learning can help you prevent BNPL ATO fraud by enabling a better understanding of how customers log on to your platform. If you know from a customer’s IP address that they regularly log in from Colorado, a sudden login from New Delhi might sound the alarm.

Over time, running machine learning on these various data points (not only where your customers log in, but when; and on which devices) gives you a deeper understanding of what fraudulent login attempts look like – and helps you combat them accordingly.
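Here is a small Python sketch of the idea, using the Colorado and New Delhi example. It is a plain frequency count standing in for a trained model, and everything in it is assumed for illustration: the login history, the device names, and the `unfamiliarity` score itself.

```python
from collections import Counter

# Hypothetical login history for one customer: (region, device, hour of day).
history = [
    ("Colorado", "iphone", 8), ("Colorado", "iphone", 9),
    ("Colorado", "laptop", 20), ("Colorado", "iphone", 8),
    ("Colorado", "laptop", 21),
]

def build_profile(logins):
    """Count how often each region, device, and hour appears in past logins."""
    regions, devices, hours = Counter(), Counter(), Counter()
    for region, device, hour in logins:
        regions[region] += 1
        devices[device] += 1
        hours[hour] += 1
    return {"region": regions, "device": devices, "hour": hours, "n": len(logins)}

def unfamiliarity(profile, region, device, hour):
    """Score from 0 (every attribute always seen) to 3 (none ever seen)."""
    n = profile["n"]
    # Treat neighboring hours as familiar too, so 9am is not "new" after 8am.
    nearby_hours = sum(profile["hour"][h % 24] for h in (hour - 1, hour, hour + 1))
    seen_shares = [
        profile["region"][region] / n,
        profile["device"][device] / n,
        nearby_hours / n,
    ]
    return sum(1.0 - min(share, 1.0) for share in seen_shares)

profile = build_profile(history)
usual = unfamiliarity(profile, "Colorado", "iphone", 8)
suspicious = unfamiliarity(profile, "New Delhi", "android", 3)
print(usual, suspicious)
```

In practice a score like this would be one input feature among many, rather than a verdict on its own: customers do travel and buy new phones.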

Chargeback fraud

Chargeback fraud (another big bugbear for ecommerce businesses) happens when a customer falsely claims that the goods or services you sold them weren’t delivered – or didn’t arrive as promised – and disputes the charge with their bank.

In the case of chargeback fraud, machine learning can be used to sort through a range of historical transaction and customer information, including:

The time and date of the transaction
The customer’s IP address
Network information (the number of payment methods, emails, or phone numbers a network shares, as well as that network’s age)
Device information (type, browser, location)
The purchase amount and product category
Customer behavior (account age, purchase history, chargeback frequency; how many pages they browse before making an order)

After extracting these relevant features and labeling each chargeback as either legitimate (some are) or fraudulent, you can train a machine learning model to distinguish patterns that indicate genuine chargebacks – as well as those suggesting potential fraud.
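As a sketch of that extraction and labeling step, the Python below turns one raw chargeback record into a numeric feature list and a 0/1 label. The record, its field names, and the category list are all hypothetical; a real data set would have its own schema and far more fields (IP, network, and device information among them).

```python
from datetime import datetime

# Hypothetical raw record; every field name here is an assumption for illustration.
record = {
    "timestamp": "2023-03-14T02:30:00",
    "amount": 249.99,
    "product_category": "electronics",
    "account_created": "2023-03-10T18:00:00",
    "past_orders": 1,
    "past_chargebacks": 1,
    "pages_before_order": 2,
    "label": "fraudulent",  # or "legitimate"
}

CATEGORIES = ["clothing", "electronics", "gift_cards"]  # assumed category list

def to_features(rec):
    """Convert one chargeback record into a flat list of numbers."""
    placed = datetime.fromisoformat(rec["timestamp"])
    created = datetime.fromisoformat(rec["account_created"])
    orders = rec["past_orders"] + 1  # count the disputed order; avoids dividing by zero
    one_hot = [1.0 if rec["product_category"] == c else 0.0 for c in CATEGORIES]
    return [
        float(placed.hour),                           # time of the transaction
        float(placed.weekday()),                      # 0 = Monday
        rec["amount"],                                # purchase amount
        (placed - created).total_seconds() / 86400,   # account age in days
        rec["past_chargebacks"] / orders,             # chargeback frequency
        float(rec["pages_before_order"]),             # browsing behavior
    ] + one_hot

def to_label(rec):
    """1 for a fraudulent chargeback, 0 for a legitimate one."""
    return 1 if rec["label"] == "fraudulent" else 0

x, y = to_features(record), to_label(record)
print(x, y)
```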

Card testing fraud

Card testing fraud occurs when fraudsters ‘test’ stolen credit card information by making small purchases. (They often use automated bots to this end.) If the transaction is successful, the fraudster tends to then progress to larger purchases, or even an account takeover.

In this context, a machine learning model can identify patterns and behaviors – like small transactions, and the devices and geographic areas they come from – to better understand and prevent card testing fraud.
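One way to surface those patterns is to count recent small transactions per device and feed the counts to a model as features. The Python sketch below assumes a $5 cut-off for a ‘small’ purchase and a ten-minute window; both values, and the class name `CardTestingFeatures`, are hypothetical.

```python
from collections import defaultdict, deque

SMALL_AMOUNT = 5.00     # assumed cut-off for a "small" test purchase, in dollars
WINDOW_SECONDS = 600    # assumed look-back window (10 minutes)

class CardTestingFeatures:
    """Track recent small transactions per device and expose them as features."""

    def __init__(self):
        # device_id -> deque of (timestamp, card_id), oldest first
        self._recent = defaultdict(deque)

    def observe(self, device_id, card_id, amount, timestamp):
        window = self._recent[device_id]
        if amount <= SMALL_AMOUNT:
            window.append((timestamp, card_id))
        # Drop events that have fallen out of the look-back window.
        while window and window[0][0] < timestamp - WINDOW_SECONDS:
            window.popleft()
        return {
            "small_tx_count": len(window),
            "distinct_cards": len({card for _, card in window}),
        }

tracker = CardTestingFeatures()

# A bot on one device tries four different cards, 30 seconds apart.
for i, card in enumerate(["c1", "c2", "c3", "c4"]):
    bot = tracker.observe("d1", card, amount=1.00, timestamp=30 * i)

# An ordinary shopper makes a single small purchase on another device.
shopper = tracker.observe("d2", "c9", amount=3.50, timestamp=100)

# Much later, the first device's old activity has aged out of the window.
later = tracker.observe("d1", "c5", amount=1.00, timestamp=10_000)
print(bot, shopper, later)
```

Many distinct cards on one device within minutes is the signature to watch for; a single small purchase on its own is not.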

How to use machine learning in fraud detection

By now, you’re aware of the benefits of machine learning – and have seen it in action to fight CNP fraud, account takeovers, and illegitimate chargebacks.

So how can your business use machine learning to fight fraud? And what specific applications does this subset of AI have for merchants looking to safeguard their reputation and revenue?

Identity verification

Whether it’s onboarding new users, enabling remote login to an online dashboard, or authenticating payments via 3DS, verifying your customers’ identity is vital.

It’s especially crucial in industries like gaming and online gambling, where fraudsters may seek to create multiple accounts on a casino website to claim signup incentives – and, once there, engage in collusion to cheat the system.

To drive down this form of fraud, you can employ machine learning to analyze images of government-issued ID documents, as well as the user’s location and historical behavior (such as any prior transaction patterns and interactions). Machine learning models can also sort through large volumes of data from other sources – such as a user’s social media profile and online presence – to cross-reference and validate the identity details they provide.

Machine learning can also be combined with a biometric approach to identity verification. This involves applying algorithms to biometric data sets, which include:

Fingerprint, voice, and facial pattern data
User behavioral biometrics (such as mouse movement and typing rhythm)

For more help with this, explore what our Identity Verification product can do for you. It allows your business to confirm the authenticity of your customer’s identity – and that the documents and their holder correspond – through a seamless digital flow.

Real-time fraud monitoring

A key feature of machine learning models is in the name – they learn.

Machine learning algorithms aren’t just chewing through huge amounts of data to help you make decisions – they’re assessing customer behavior as it takes place, and using this to inform their approach. And, because these AI-driven algorithms are always ‘on’, they – by necessity – operate in real time; constantly iterating, optimizing, and improving their approach.

What’s more, machine learning’s features are based on the latest, real-world incidents of fraud – working with the most relevant, up-to-date fraud statistics from around the world.

This helps machine learning models monitor fraud in real time and, with access to real-time categorical data, helps you move confidently into new international markets.

Predictive analytics

Machine learning enables predictive fraud analytics – the practice of using historical data to predict what’s ahead. By developing risk scores for previous transactions, for example, you can automate actions for future, similar transactions – ensuring your business remains agile and alert to any suspicious goings on.
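Automating actions from risk scores can be as simple as mapping score bands to outcomes. In the Python sketch below, the score is assumed to be a fraud probability between 0 and 1 produced by a model, and the two cut-offs are hypothetical values that each business would tune to its own risk appetite.

```python
# Assumed cut-offs for illustration; real values depend on your risk appetite.
REVIEW_THRESHOLD = 0.30
DECLINE_THRESHOLD = 0.70

def action_for(risk_score: float) -> str:
    """Map a model's risk score (0 to 1) to an automated action."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score >= DECLINE_THRESHOLD:
        return "decline"
    if risk_score >= REVIEW_THRESHOLD:
        return "review"   # for example, step-up authentication or manual review
    return "approve"

for score in (0.05, 0.45, 0.92):
    print(score, action_for(score))
```

Keeping a middle ‘review’ band means borderline transactions get a second look rather than an outright decline, which helps limit false positives.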

For an even more detailed take on how organizations are using data analytics and AI to fight fraud online, explore our comprehensive guide to fraud analytics.

How Checkout.com helps with fraud detection machine learning

A study by Juniper Research expects the total cost of ecommerce fraud to merchants to surpass $48 billion in 2023. This is up from $41 billion in 2021, and suggests an undeniable, if unpalatable, fact – that the fraudsters aren’t slowing down.

Fortunately, big data is on your side. And with machine learning, you can sort, process, extract, and learn from this information – then wield it to prevent fraud’s damaging impact on your brand. Better still? When you take payments with Checkout.com, this approach is built into your system from the get-go: keeping your business and reputation safe, without requiring any additional integrations or effort.

Our Fraud Detection Pro solution harnesses machine learning to stay on top of fraudulent trends across the entire Checkout.com network. By combining this flexible approach with the reliability of robust rules – and the power of approve and deny lists – you’ll have everything you need to prevent fraud and reduce false positives. And do so in a way that’s tailored to you: whether that’s going live fast, or customizing the approach to your business’s exact needs.

Want to know more? Ready to equip your business with machine learning, protect your revenue, and build frictionless payment flows with Fraud Detection Pro?

Get in touch today to start the conversation.
