You don’t need a gun to rob a bank any more. Now a bank robber’s most powerful weapon is a computer and an internet connection.


Fraud detection in banking and finance is a critical activity. It has to cover a wide range of fraud schemes, whether they are carried out by bank employees or by customers.

Step 1:

As before, the data needed to build such algorithms comes from a variety of sources, from in-house databases to third-party applications to CRMs. Divide it into training and test data so that overfitting can be detected (that is, to avoid ending up with a model that corresponds too closely to one particular dataset and therefore fails to fit additional data or predict future observations reliably).
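As a rough illustration of the split, here is a minimal sketch in plain Python. The field names `amount` and `is_fraud`, the 80/20 ratio and the toy data are assumptions made for the example and do not come from any real banking system. The split is done per class (a stratified split) so that the few fraud cases do not all land on one side.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key="is_fraud", test_fraction=0.2, seed=0):
    """Split rows into (train, test), keeping the fraud ratio similar in both."""
    rng = random.Random(seed)

    # Group the rows by class so that each class is split separately.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    train, test = [], []
    for group in by_label.values():
        group = group[:]  # copy, so the caller's list is not reordered
        rng.shuffle(group)
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])

    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Hypothetical toy data: 95 legitimate transactions and 5 fraudulent ones.
transactions = [{"amount": 10.0 + i, "is_fraud": 0} for i in range(95)]
transactions += [{"amount": 900.0 + i, "is_fraud": 1} for i in range(5)]

train, test = stratified_split(transactions)
print(len(train), len(test))
```

The test rows are set aside at this point and are not touched again until the model is evaluated.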

The more relevant information that is available, the more accurate the model is likely to be.

Step 2:

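Oversample the fraudulent accounts in the training data. Fraudulent activity is heavily outnumbered by legitimate activity, so it is preferable to artificially increase its representation. Leave the test data untouched so that it still reflects the real proportion of fraud.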
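One simple way to do this is random oversampling, where fraud rows are duplicated at random until both classes are the same size. The sketch below assumes a hypothetical `is_fraud` label and made-up training rows. Other techniques exist (synthetic sampling, class weights), and this is only one possible choice.

```python
import random


def oversample_minority(rows, label_key="is_fraud", seed=0):
    """Duplicate randomly chosen fraud rows until both classes are the same size."""
    rng = random.Random(seed)
    fraud = [r for r in rows if r[label_key] == 1]
    legit = [r for r in rows if r[label_key] == 0]

    if not fraud or len(fraud) >= len(legit):
        return list(rows)  # nothing to balance

    # Sample with replacement from the fraud rows to fill the gap.
    extra = rng.choices(fraud, k=len(legit) - len(fraud))
    balanced = legit + fraud + extra
    rng.shuffle(balanced)
    return balanced


# Hypothetical training rows only: 76 legitimate and 4 fraudulent.
training_rows = [{"amount": 20.0 + i, "is_fraud": 0} for i in range(76)]
training_rows += [{"amount": 950.0 + i, "is_fraud": 1} for i in range(4)]

balanced = oversample_minority(training_rows)
fraud_count = sum(r["is_fraud"] for r in balanced)
print(len(balanced), fraud_count)
```

The function is meant to be applied to training rows only. Duplicating fraud cases in the test data would make the results look better than they would be on live transactions.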

Step 3:

More raw data is generally better, but not every field in it is useful. Prune the data down to the variables that actually matter, so that the model is simpler and the final algorithm is more relevant and more accurate.
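How the pruning is done depends on the data. As one hedged example, the sketch below keeps only the numeric fields whose correlation with the fraud label is above a cut-off. The field names (`amount`, `night_time`, `branch_id`), the rows and the 0.3 threshold are all invented for illustration. A simple correlation filter like this misses non-linear relationships, so treat it as a starting point.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, feature_names, label_key="is_fraud", threshold=0.3):
    """Keep features whose absolute correlation with the label reaches the threshold."""
    labels = [r[label_key] for r in rows]
    kept = []
    for name in feature_names:
        score = abs(pearson([r[name] for r in rows], labels))
        if score >= threshold:
            kept.append(name)
    return kept


# Hypothetical rows; branch_id is deliberately unrelated to the label.
rows = [
    {"amount": 12.0, "night_time": 0, "branch_id": 3, "is_fraud": 0},
    {"amount": 25.0, "night_time": 0, "branch_id": 1, "is_fraud": 0},
    {"amount": 18.0, "night_time": 1, "branch_id": 2, "is_fraud": 0},
    {"amount": 30.0, "night_time": 0, "branch_id": 2, "is_fraud": 0},
    {"amount": 940.0, "night_time": 1, "branch_id": 3, "is_fraud": 1},
    {"amount": 870.0, "night_time": 1, "branch_id": 1, "is_fraud": 1},
]

selected = select_features(rows, ["amount", "night_time", "branch_id"])
print(selected)
```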

Step 4:

Run the data through a model that computes, for each transaction, a score on which to decide whether it might be fraudulent. Since the final prediction is either a yes or a no, the model selected is generally a logistic regression.
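To show what a logistic regression does under the hood, here is a small from-scratch sketch trained with plain gradient descent. In practice a statistics or machine learning library would normally be used instead. The two features (a scaled amount and a night-time flag), the learning rate and the number of epochs are assumptions chosen for the example.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Estimated probability that a transaction is fraudulent."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic_regression(X, y, learning_rate=0.5, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(X)
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(X, y):
            error = predict_proba(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical features per transaction: [amount scaled to 0..1, night_time flag]
X = [[0.01, 0], [0.03, 0], [0.02, 1], [0.03, 0],
     [0.94, 1], [0.87, 1], [0.05, 0], [0.90, 0]]
y = [0, 0, 0, 0, 1, 1, 0, 1]

weights, bias = train_logistic_regression(X, y)
large_night_payment = predict_proba(weights, bias, [0.95, 1])
small_day_payment = predict_proba(weights, bias, [0.02, 0])
print(large_night_payment > 0.5, small_day_payment > 0.5)
```

The model outputs a probability between 0 and 1. The yes or no answer comes from comparing that probability with a threshold, 0.5 in this sketch.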

Step 5:

Test the model on another dataset, one it has not seen during training, and finally apply it to the live transactional data to check for fraud.
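When testing, plain accuracy can be misleading: with so few fraud cases, a model that never flags anything still scores very high. A common alternative is to look at precision (how many flagged transactions really were fraud) and recall (how many fraud cases were caught). The sketch below uses made-up labels and scores, and the 0.5 threshold is an assumption that would normally be tuned to the cost of a missed fraud versus a false alarm.

```python
def flag_transactions(scores, threshold=0.5):
    """Turn model scores into yes/no flags (1 = suspected fraud)."""
    return [1 if s >= threshold else 0 for s in scores]


def evaluate(y_true, y_pred):
    """Count the four outcomes and derive precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate, passed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall}


# Hypothetical held-out test set: true labels and the model's scores.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.70, 0.02, 0.30, 0.15, 0.90, 0.40, 0.85]

y_pred = flag_transactions(scores)
report = evaluate(y_true, y_pred)
print(report)
```

The same `flag_transactions` step is what would run against live transactions once the test results are acceptable.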