Demystifying Machine Learning in Today's Banking Industry
by Gary M. Shiffman, PhD, on Dec 10, 2018, 4:00:00 PM
Last week, the ABA/ABA Financial Crimes Enforcement Conference shed light on the critical importance of technology in the financial industry. But before we can evaluate technology, we need to understand what drives it.
The explosion of the digital age has brought with it more efficient banking, more convenient, technology-based customer solutions and increasingly customer-specific services. But it has also brought something else with it: an overwhelming amount of data. In 2017, an IBM Marketing Cloud report found that 90 percent of the data available had been created in the past two years; today, this percentage is likely even higher. But what exactly does this mean? And what can those working in the financial industry do with all this data?
The writer Pico Iyer argues that our society sees too much: too much data, too much information, and too much technology. He has proposed that our increasing reliance on technology and the bombardment of data that accompanies it have also tremendously increased the urgency of stillness. This is a valid point. But the sheer enormity of available data can have a positive result as well, especially for the banking industry: it increases our ability to monitor risks and threats.
This not only makes the financial industry more secure; it makes our society, as a whole, safer.
But how do machines make us safer, exactly? To understand this, we have to understand what machine learning is. The terms “big data,” “machine learning,” and “artificial intelligence” have become ubiquitous. But they are often so overused that their real meaning and applications have become obscured.
When we type in the word “cat” in a search bar, we take for granted that Google presents us with thousands of images of cats. There was a time not long ago, however, when none of us could have imagined this was possible. Today, these kinds of search results are everywhere. And they are prime examples of machine learning.
Through millions of user clicks, the “machine”—in this case, Google—has been trained to locate the most accurate pictures of “cat,” so that we will rarely, if ever, get a result that is something other than a cat. If you tell a machine ten million times, “This is a cat,” eventually the machine learns to identify and display only pictures of cats.
We see this when we search for people by name in a commercial search engine. If we type in, for example, the name of a famous actor, most of the results a search engine displays will be photographs of this celebrity. If we type in the name of a person who is not a public figure, a search engine may only display the person we are looking for in a small percentage of the results. The more clicks a search term receives, the better “trained” the machine becomes, and the more accurate the results.
More data is better, in the sense that it allows for more effective search. This can be understood in terms of columns and rows in a spreadsheet. If, in the case of a bank, customers are listed in rows (the larger the bank, the more customers and the more rows), then the columns represent the attributes of each customer. What has been transformative in the last decade is the sheer number of columns, or attributes, available about customers. We can see this in the incredible precision of Facebook and Google marketing. Ten years ago, marketing used broad population demographics. Today, marketing is about targeting you and me as individuals, not our broad demographic. In other words, marketers can now tailor their advertising to a specific person’s interests, because the amount of data available has increased exponentially.
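To make the spreadsheet picture concrete, here is a minimal sketch in Python. The customer records, the attribute names, and the helper functions `table_shape` and `add_attribute` are all invented for illustration; they are assumptions, not a description of any real bank's data.

```python
# Hypothetical customer table: each row is a customer, each key is a column (attribute).
# All values are made up for illustration.
customers = [
    {"customer_id": 1, "age": 34, "country": "US", "monthly_deposits": 4200.0},
    {"customer_id": 2, "age": 51, "country": "US", "monthly_deposits": 18500.0},
    {"customer_id": 3, "age": 27, "country": "CA", "monthly_deposits": 950.0},
]


def table_shape(rows):
    """Return (number of rows, number of distinct columns)."""
    columns = set()
    for row in rows:
        columns.update(row.keys())
    return len(rows), len(columns)


def add_attribute(rows, name, values):
    """Return a new table with one extra column; values are keyed by customer_id."""
    return [{**row, name: values.get(row["customer_id"])} for row in rows]


# Adding an attribute widens the table without adding any customers.
wider = add_attribute(customers, "wire_transfers_last_90d", {1: 0, 2: 14, 3: 1})

print(table_shape(customers))  # (3, 4)
print(table_shape(wider))      # (3, 5)
```

The number of rows stays at three while the number of columns grows. That growth in columns, repeated across thousands of attributes, is the change described above.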
If a bank wants to use "big data" to determine whether a potential customer poses a money-laundering or terrorist-financing threat, it needs data to train its models. The difficulty arises when the data becomes overwhelming. When the number of columns goes from 5 to 500,000, a human alone can’t process this information. So we must train machines to identify patterns, infer, and predict.
This has remarkable applications when it comes to security in the banking world. If, for example, we want to identify money launderers, we can train a machine to find correlations between certain attributes, or columns, and a person's likelihood of laundering money. The catch here is that we cannot do this effectively without enough data. The more data we have on the attributes of past money launderers, the more accurately we can identify current but as yet unrecognized money launderers.
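As a rough illustration of what "training a machine" on past cases can look like, the sketch below fits a small logistic regression with plain gradient descent. Everything in it is an assumption made for the example: the two attributes, the eight labeled rows, and the learning settings are invented, and logistic regression is only one of many possible techniques, not a method this article prescribes.

```python
import math

# Entirely made-up toy data. Each row holds two hypothetical attributes scaled to 0..1:
# [share of deposits made in cash, relative frequency of international transfers].
# Label 1 = past confirmed money launderer, 0 = not.
X = [
    [0.90, 0.80], [0.80, 0.90], [0.70, 0.70], [0.95, 0.60],
    [0.10, 0.20], [0.20, 0.10], [0.30, 0.30], [0.15, 0.40],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, row):
    """Estimated probability that a row belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)


def train(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights with batch gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, label in zip(X, y):
            error = predict_proba(weights, bias, row) - label
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


weights, bias = train(X, y)

# Score two new, unlabeled (and equally hypothetical) customers.
high = predict_proba(weights, bias, [0.85, 0.75])
low = predict_proba(weights, bias, [0.20, 0.20])
print(f"resembles past launderers: {high:.2f}")
print(f"resembles ordinary customers: {low:.2f}")
```

With only eight rows the model is a toy. The point is the workflow: labeled past cases go in, learned weights come out, and new customers are scored against them. More rows and more relevant columns generally give the model more to learn from.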
This principle applies to all elements of financial security and risk mitigation. The combination of advanced technology and behavioral science can have significant implications not just for streamlining bank processes, but for preventing substantial financial losses.
This is the upside of data. While its enormity may be overwhelming at times, the availability of big data, machine learning and artificial intelligence, and behavioral science will transform the banking industry in the next few years as it has transformed so many other aspects of our lives. And because criminal activity is often tied to financial transactions, the applications of machine learning are many, and invaluable. The more we invest in technology that processes data, the better equipped we can be to take advantage of these applications.