LO 75.3: Analyze the application of machine learning in three use cases: | Ken Li, FRM

LO 75.3: Analyze the application of machine learning in three use cases: credit risk and revenue modeling; fraud; and surveillance of conduct and market abuse in trading.
Credit Risk and Revenue Modeling
Financial institutions have recently begun combining machine learning methods with traditional models to improve their ability to predict financial risk. In doing so, they have moved away from the less complex, traditional linear regressions used in credit risk modeling.
2018 Kaplan, Inc.
Topic 75 Cross Reference to GARP Assigned Reading – van Liebergen
However, machine learning models are often ill-suited to the ongoing risk monitoring of financial institutions. They can be overly complex and prone to overfitting. Their (often extreme) complexity makes it difficult to apply jurisdictionally consistent definitions of data, and it makes the models hard to use for regulatory purposes, including internal models under the Basel internal ratings-based (IRB) approach, because auditors find them very difficult to understand.
Despite their disadvantages, machine learning models can be successfully used in optimizing existing models with regulatory functions. For example, both linear and less complex nonlinear machine learning models can be applied to existing regulatory and revenue forecasting models.
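The overfitting concern raised above can be illustrated with a small sketch. It is not taken from the reading: the data are simulated, and the helper names (`fit_line`, `nearest_neighbor`, `mse`, `sample`) are hypothetical. A nearest-neighbor predictor that memorizes the training sample is compared with a simple linear fit on noisy data whose true relationship is linear.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def nearest_neighbor(train_x, train_y, x):
    """Predict by copying the target of the closest training point (memorizes the sample)."""
    i = min(range(len(train_x)), key=lambda k: abs(train_x[k] - x))
    return train_y[i]

def mse(preds, ys):
    """Mean squared error between predictions and observed values."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

rng = random.Random(42)  # fixed seed so the simulated data are reproducible

def sample(n):
    """Simulated data (an assumption for illustration): y = 2x plus unit-variance noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys

train_x, train_y = sample(500)
test_x, test_y = sample(500)

a, b = fit_line(train_x, train_y)
line_train = mse([a + b * x for x in train_x], train_y)
line_test = mse([a + b * x for x in test_x], test_y)
nn_train = mse([nearest_neighbor(train_x, train_y, x) for x in train_x], train_y)
nn_test = mse([nearest_neighbor(train_x, train_y, x) for x in test_x], test_y)

print(f"linear fit:       train MSE {line_train:.2f}, test MSE {line_test:.2f}")
print(f"nearest neighbor: train MSE {nn_train:.2f}, test MSE {nn_test:.2f}")
```

The memorizing model fits the training sample perfectly because it reproduces the noise, yet it should do worse than the linear fit on new data. That gap between in-sample and out-of-sample error is the symptom of overfitting.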
Fraud
Banks have successfully used machine learning to detect credit card fraud. Models detect fraudulent transactions, which can then be blocked in real time. Credit card fraud detection is better suited to machine learning than other risk areas because the very large number of credit card transactions provides the data needed to train, backtest, and validate models. The models identify the key features of a fraudulent transaction and can distinguish such transactions from normal ones. Models can also be used successfully in anti-money laundering or combating the financing of terrorism (AML/CFT) activities through unsupervised learning methods, such as clustering. Clustering identifies outliers that do not have strong connections with the rest of the data. In this way, financial institutions can detect anomalies and reduce the number of false positives.
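The idea of flagging observations that lack strong connections to the rest of the data can be sketched as follows. This is a simplified neighbor-count rule in the spirit of density-based clustering, not a full clustering algorithm and not the method of any particular institution. The function name `flag_outliers`, the two features, and the account data are all hypothetical.

```python
from math import dist

def flag_outliers(points, radius, min_neighbors):
    """Return indices of points with fewer than min_neighbors other points within radius."""
    flagged = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points) if j != i and dist(p, q) <= radius
        )
        if neighbors < min_neighbors:
            flagged.append(i)
    return flagged

# Hypothetical features per account:
# (average transaction size in $000s, transfers per week)
accounts = [
    (1.0, 3.0), (1.2, 2.8), (0.9, 3.1), (1.1, 3.3),     # one dense group
    (5.0, 10.0), (5.2, 9.7), (4.9, 10.4), (5.1, 10.1),  # a second dense group
    (9.5, 1.0),                                         # isolated account
]

outliers = flag_outliers(accounts, radius=1.0, min_neighbors=2)
print(outliers)  # [8]
```

No labels are used: the isolated account is flagged only because it has no close neighbors, which is why this kind of approach counts as unsupervised. In practice the features would need to be scaled, and `radius` and `min_neighbors` would need tuning.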
Many banks still rely on traditional fraud detection that identifies individual transactions or simple patterns, but these systems produce a large number of false positives and lack the predictive capabilities of more sophisticated machine learning models. In addition, traditional models still require significant human involvement to separate false positives from genuinely suspicious activity. Constraints on data sharing and data usage, along with entrenched regulatory frameworks, can also hinder the successful use of machine learning.
Other factors also make the use of machine learning more difficult. Money laundering is difficult to define, and banks do not receive adequate feedback from law enforcement agencies on which transactions were truly fraudulent. As a result, it is difficult to use only historical data to teach money-laundering detection algorithms to detect fraudulent activity.
Surveillance of Conduct and Market Abuse in Trading
Surveillance of trader conduct breaches is another area in which machine learning is increasingly used, in this case to detect rogue trading, insider trading, and benchmark rigging. Early detection of these violations is important to financial institutions because the violations can cause material financial and reputational damage.
Early monitoring techniques tended to rely on monitoring trading behavior and assessing single trades. With machine learning, monitoring techniques were enhanced to evaluate entire trading portfolios and to connect that information to other activities of the trader, including emails, calendar items, phone calls, and check-in and check-out times. The trader's behavior could then be compared to other traders' normal behavior. The system detects any deviation from the normal pattern and alerts the financial institution's compliance team.
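The comparison of one trader against peers' normal behavior can be sketched with a simple z-score rule. This is a statistical stand-in for illustration, not a learned model and not the approach described in the reading. The function name `deviation_alerts`, the metric names, the threshold of three standard deviations, and all figures are assumptions.

```python
from statistics import mean, stdev

def deviation_alerts(peer_history, trader_today, threshold=3.0):
    """Return {metric: z-score} where the trader is more than `threshold`
    peer standard deviations away from the peer mean."""
    alerts = {}
    for metric, peer_values in peer_history.items():
        mu = mean(peer_values)
        sigma = stdev(peer_values)
        if sigma == 0:
            continue  # no variation among peers, so a z-score is undefined
        z = (trader_today[metric] - mu) / sigma
        if abs(z) > threshold:
            alerts[metric] = round(z, 2)
    return alerts

# Hypothetical peer observations for three activity metrics
peer_history = {
    "trades_per_day": [40, 45, 38, 42, 44, 41, 39, 43],
    "after_hours_logins": [0, 1, 0, 0, 1, 0, 1, 0],
    "emails_to_external": [5, 7, 6, 4, 6, 5, 7, 6],
}
# Hypothetical activity for the trader under review
trader_today = {"trades_per_day": 43, "after_hours_logins": 6, "emails_to_external": 6}

alerts = deviation_alerts(peer_history, trader_today)
print(alerts)  # only after_hours_logins exceeds the threshold
```

Because each alert names the metric and the size of the deviation, a rule like this is easy to explain to a compliance officer. A model that learns continuously from the data may not offer that transparency.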
One challenge facing financial institutions in successfully applying machine learning is the legal complexity of sharing past breach information with developers. In addition, systems need to be auditable, but because machine learning models are designed to learn continuously from the data, it can be difficult to explain to a compliance officer why a certain behavior set off an alert. As a remedy, systems can be designed to combine machine learning with human decisions. Combining the two allows systems data to be used to build a comprehensive picture of a trader, and it results in a system that is less complex and more suitable for audit and regulatory purposes.
Key Concepts
LO 75.1 Machine learning uses algorithms that allow computers to learn without being explicitly programmed. Supervised machine learning predicts outcomes based on specific inputs, whereas unsupervised machine learning analyzes data to identify patterns without estimating a dependent variable.
The three broad classes of statistical problems are regression, classification, and clustering. Regression problems make predictions on quantitative, continuous variables, such as inflation and GDP growth. Classification problems make predictions on discrete dependent variables. Clustering observes input variables without including a dependent variable.
Overfitting is a problem in nonparametric, nonlinear models, which tend to be complex by nature. Boosting overweights less frequent observations to train the model to detect them more easily. Bagging involves running a very large number of model subsets to improve the model's predictive ability.
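Bagging (bootstrap aggregating) can be sketched as follows: many simple models are each fit to a random resample of the data, and their predictions are averaged. The base learner here is a one-split regression stump, chosen only for brevity. The function names, the number of models, and the toy data are assumptions for illustration.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression stump; return (threshold, left_mean, right_mean)."""
    best = None
    # Candidate thresholds skip the smallest x so the left side is never empty.
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # every x in this resample is identical: predict the mean
        m = mean(ys)
        return (xs[0], m, m)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x < t else rm

def bagged_stumps(xs, ys, n_models=200, seed=0):
    """Fit one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """Average the predictions of all stumps."""
    return mean(predict_stump(m, x) for m in models)

# Toy data (hypothetical): the target steps from 1 to 5 at x = 5
xs = list(range(10))
ys = [1.0] * 5 + [5.0] * 5

models = bagged_stumps(xs, ys)
low, mid, high = (predict_bagged(models, x) for x in (0, 5, 9))
print(low, mid, high)
```

Each stump sees a slightly different sample, so the stumps disagree about exactly where the step occurs. Averaging them smooths the prediction near the boundary, which is how bagging reduces the variance of an unstable base model.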
Deep learning differs from classical learning models in that it applies many layers of algorithms into the learning process to identify complex patterns.
LO 75.2 Machine learning is a powerful tool for financial institutions because it allows them to adequately structure, analyze, and interpret the very large sets of data they collect, and to improve the quality of their supervisory data.
Financial institutions can use conventional machine learning techniques to analyze high-quality, structured data and deep learning techniques to analyze low-quality, high-frequency data.
LO 75.3 Three use cases of machine learning are (1) credit risk and revenue modeling, (2) fraud detection, and (3) surveillance of conduct and market abuse in trading.
In credit risk and revenue modeling, machine learning models, despite the disadvantages stemming from their complexity and tendency to overfit, have been used successfully to optimize existing models with regulatory functions. Both linear and less complex nonlinear machine learning models can be paired with existing regulatory and revenue forecasting models.
Traditional fraud detection systems identify individual transactions or simple patterns, which leads to a large number of false positives and requires significant human involvement to separate the false positives from genuinely suspicious activity. Machine learning systems can help financial institutions detect fraudulent transactions and block them in real time. Clustering identifies outliers that do not have strong connections with the rest of the data.
Obstacles to using machine learning in anti-money laundering include the difficulty of defining money laundering and the lack of adequate feedback from law enforcement agencies.
Surveillance of trader conduct breaches through machine learning allows monitoring techniques to evaluate entire trading portfolios, connect that information to other activities of the trader, and compare the trader's behavior to other traders' normal behavior.
Concept Checkers
1. Which of the following classes of statistical problems typically cannot be solved through supervised machine learning?
A. Regression problems.
B. Penalized regression.
C. Classification problems.
D. Clustering.
2. Which of the following concepts best identifies the problem where a highly complex model describes random error or noise rather than true underlying relationships in the data?
A. Bagging.
B. Boosting.
C. Overfitting.
D. Deep learning.
3. Which data type is most characteristic of big data?
A. High-quality data.
B. Low frequency data.
C. Structured supervisory data.
D. Low-quality, unstructured data.
4. Which of the following factors does not explain why machine learning systems have been less widespread in the anti-money laundering (AML) space?
A. Existence of unsupervised learning methods.
B. Lack of a universal definition of money laundering.
C. Inadequate feedback from law enforcement agencies.
D. Inadequacy of historical data for money laundering detection algorithms.
5. A credit analyst makes the following statements:
Statement 1: Financial institutions face barriers in applying machine learning systems because supervised learning approaches are difficult to apply.
Statement 2: Combining machine learning with human decisions tends to produce inferior model results.
The analyst is accurate with respect to:
A. statement 1 only.
B. statement 2 only.
C. both statements.
D. neither statement.
Concept Checker Answers
1. D Clustering typically involves applying unsupervised learning to a dataset. It involves observing input variables without knowing which dependent variable corresponds to them (e.g., detecting fraud without knowing which transactions are fraudulent).
Regression problems, including penalized regression, and classification problems involve predictions around a dependent variable. These statistical problems can be solved through supervised machine learning.
2. C Overfitting is a concern where highly complex models describe noise or random error rather than true underlying relationships in the data. Overfitting is a particular concern in nonparametric, nonlinear models.
Boosting overweights less frequent observations to train the model to detect these more easily. Bagging involves running a very large number of model subsets to improve its predictive ability. Deep learning differs from classical learning models in that it applies many layers of algorithms into the learning process to identify complex patterns.
3. D Big data is data that arises from large volumes of low-quality, high-frequency, unstructured data.
4. A Unsupervised learning methods can be used in AML detection to identify and learn relevant patterns in client activity.
Money laundering is difficult to define, and financial institutions do not receive adequate feedback from law enforcement agencies on which transactions were truly fraudulent. As a result, it is difficult to use only historical data to teach money-laundering detection algorithms to detect fraudulent activity.
5. A Incorporating human decisions with machine learning can improve data, because systems data can be used to identify a comprehensive set of information about a trader, and create a system that is less complex and more suitable for audit and regulatory purposes.
Financial institutions have difficulty in successfully applying machine learning because of legal complexities of sharing past breach information with developers.
The following is a review of the Current Issues in Financial Markets principles designed to address the learning objectives set forth by GARP. This topic is also covered in:
Central Clearing and Risk Transformation
Topic 76
Exam Focus
This topic emphasizes liquidity risk, as opposed to solvency risk and counterparty risk, as being the primary concern for central counterparties (CCPs) and clearing members. For the exam, focus on the advantages of central clearing as well as the sequencing involved in the five-step CCP loss waterfall, primarily the details of the third and fourth steps. A good understanding of liquidation costs is also essential.
Central Clearing of OTC Transactions