LO 75.1: Describe the process of machine learning and compare machine learning approaches.
Machine learning is a field of artificial intelligence (AI) that uses algorithms which allow computers to learn from data without being explicitly programmed. There are two forms of machine learning: supervised and unsupervised. In supervised machine learning, a statistical model is built in order to predict outcomes based on specific inputs (e.g., predicting GDP growth based on inputs of various macroeconomic variables). In unsupervised machine learning, data analysis is performed to identify patterns without estimating a dependent variable.
Machine learning is important because it can analyze data samples in order to identify patterns and relationships in the data, and can make out-of-sample predictions. Models are then run thousands or millions of times so that their predictive capability improves. In this respect, machine learning is closely tied to the big data revolution. Supervised machine learning can also analyze nonparametric and nonlinear relationships, which are flexible enough to fit any given model, and make inferences about the dependent and independent variables.
Machine Learning Approaches
Although many machine learning approaches exist, they can be applied to three broad classes of statistical problems: regression, classification, and clustering. Both regression and classification can be addressed through supervised machine learning, while clustering follows an unsupervised approach.
2018 Kaplan, Inc.
Topic 75 Cross Reference to GARP Assigned Reading – van Liebergen
1. Regression problems make predictions on quantitative, continuous variables, including inflation and GDP growth. Regressions can involve both linear (e.g., partial least squares) and nonlinear (e.g., penalized regression in which complexity is penalized to improve predictability) learning methods.
2. Classification problems make predictions on discrete dependent variables, such as whether an email is spam or which blood type a person has, where the variable can take on one of a fixed set of class values. Observations may be classified by methods such as support vector machines.
3. Clustering involves observing input variables without including a dependent variable. Examples include anti-money laundering (AML) analysis to detect fraud without knowing in advance which observations are fraudulent. Data can be grouped into clusters, and the outputs from unsupervised learning can then be used as inputs for supervised learning methods.
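The difference between a supervised regression problem and an unsupervised clustering problem can be illustrated with a minimal Python sketch. The data values, the function names (`fit_line`, `k_means_1d`), and the choice of ordinary least squares and k-means are assumptions made for illustration only; they are not methods prescribed by the assigned reading.

```python
from statistics import mean

def fit_line(xs, ys):
    """Supervised regression: ordinary least squares for y = a + b*x.
    The outcomes ys play the role of the dependent variable."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b

def k_means_1d(points, centers, iterations=10):
    """Unsupervised clustering: plain k-means on a single input variable.
    No dependent variable is supplied; groups emerge from the inputs alone."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical data: inputs with known outcomes (regression).
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])          # a = 1.0, b = 2.0

# Hypothetical transaction amounts with no labels (clustering).
centers, groups = k_means_1d([10, 12, 11, 95, 100, 105], [0, 50])
# centers = [11, 100]; the large transactions form their own cluster
```

The cluster assignments produced by `k_means_1d` could then serve as an input variable to a supervised method, in line with item 3 above.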
As mentioned, machine learning can be used to make out-of-sample predictions, for example, predicting borrowers' ability to repay their obligations and borrower default. However, a good predictive model does not also need to be good at explaining or inferring performance. For example, a credit scoring model will make inferences as to why borrowers default, whereas a good predictive model only needs to identify which indicators lead to borrower default.
Other Concepts in Machine Learning
Models that are very complex may describe noise or random error rather than true underlying relationships in the data. This is called overfitting. Overfitting is a particular concern in nonparametric, nonlinear models, which tend to be complex by nature. Models that describe noise will only fit that specific dataset and will not perform well in out-of-sample datasets.
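Overfitting can be seen in a small Python sketch that compares a simple linear fit with a model that memorizes the training data. The data are hypothetical (a true relationship of y = 2x with noise of plus or minus 1 in the training sample), and the memorizing model is a one-nearest-neighbor rule chosen here as an example of a nonparametric method; neither comes from the assigned reading.

```python
from statistics import mean

# Hypothetical data: true relationship y = 2x; training outcomes include noise of +/-1.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [1, 1, 5, 5, 9, 9]
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [2 * x for x in test_x]          # noise-free out-of-sample outcomes

def fit_line(xs, ys):
    """Simple model: ordinary least squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return lambda x: a + b * x

def fit_memorizer(xs, ys):
    """Complex model: returns the outcome of the closest training point,
    so it reproduces every training observation, noise included."""
    def predict(x):
        i = min(range(len(xs)), key=lambda j: abs(x - xs[j]))
        return ys[i]
    return predict

def mse(model, xs, ys):
    """Mean squared prediction error of a model on a dataset."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(name,
          "in-sample MSE:", round(mse(model, train_x, train_y), 3),
          "out-of-sample MSE:", round(mse(model, test_x, test_y), 3))
```

The memorizing model has zero in-sample error because it fits the noise exactly, yet its out-of-sample error is larger than that of the simple line.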
Boosting (or bootstrapping) refers to overweighting scarcer observations to train the model to detect these more easily. For example, overweighting scarcer fraudulent transactions in a dataset can train the model to better detect them. Bagging describes the process of running hundreds or thousands of models on different subsets of the data to improve predictive ability. These models may also be combined with other machine learning models, called an ensemble, in order to further improve their out-of-sample predictive capabilities.
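Both ideas can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions: `class_weights` uses a simple inverse-frequency weighting to overweight scarce observations (it is not an implementation of a full boosting algorithm), and `bagged_predict` averages least squares lines fitted on random subsets of the data. All names and data are hypothetical.

```python
import random
from statistics import mean

def class_weights(labels):
    """Overweight scarcer classes: weight = n / (number of classes * class count)."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

def bagged_predict(xs, ys, x_new, n_models=200, subset_size=4, seed=0):
    """Bagging: fit one model per random subset of the data, then average.
    Assumes the x values are distinct so every subset yields a valid fit."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        chosen = rng.sample(range(len(xs)), subset_size)
        a, b = fit_line([xs[i] for i in chosen], [ys[i] for i in chosen])
        predictions.append(a + b * x_new)
    return mean(predictions), predictions

# Hypothetical dataset in which fraudulent transactions are scarce.
weights = class_weights(["ok"] * 8 + ["fraud"] * 2)   # {"ok": 0.625, "fraud": 2.5}

# Hypothetical noisy data; the bagged prediction averages 200 subset models.
bagged, individual = bagged_predict([0, 1, 2, 3, 4, 5], [1, 1, 5, 5, 9, 9], x_new=6)
```

Averaging across many subset models smooths out the influence of any single noisy observation, which is the motivation for bagging described above.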
Machine learning uses past, in-sample data to make predictions about future, out-of-sample data. As a result, it has been criticized at times for being backward-looking and for making predictions without truly understanding the underlying relationships.
Deep Learning
Deep learning approaches move away from the classic model approaches we have been discussing until now. Whereas classic models focus on well-defined and structured datasets, deep learning essentially mimics the human brain by applying several layers of algorithms to the learning process and converting raw data in order to identify complex patterns. Each algorithm focuses on a particular feature of the data (called a representation), and the layering of these representations allows the model to incorporate a wide range of inputs, including low quality or unstructured data. Importantly, the layers are not designed by engineers, but instead learned by the model from the data.
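The layering of representations can be illustrated with a minimal forward pass through a two-layer network in Python. This is only a sketch: the weights below are hypothetical values fixed by hand, whereas in an actual deep learning model they would be learned from the data, and the training step is omitted entirely. The function names (`dense`, `relu`, `forward`) are likewise assumptions for illustration.

```python
import math

def relu(values):
    """Nonlinear activation: negative values are set to zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(raw, layers):
    """Pass raw inputs through successive layers; each layer's output is a
    new representation that feeds the next layer."""
    representation = raw
    for weights, biases in layers[:-1]:
        representation = relu(dense(representation, weights, biases))
    weights, biases = layers[-1]
    score = dense(representation, weights, biases)[0]
    return 1 / (1 + math.exp(-score))     # squash the final score into (0, 1)

# Hypothetical weights (learned from data in practice, fixed here by hand).
layers = [
    ([[1.0, -1.0, 0.0], [0.0, 1.0, 1.0]], [0.0, 0.0]),   # 3 raw inputs -> 2 representations
    ([[1.0, -1.0]], [0.0]),                               # 2 representations -> 1 output
]

probability = forward([2.0, 1.0, 0.0], layers)   # 0.5 for these inputs and weights
```

Adding further `(weights, biases)` pairs to `layers` deepens the model, which is where both the flexibility and the complexity of deep learning come from.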
For example, deep learning has been used in face recognition and natural language learning models. Models have been complex enough to classify not only the discussion topics, but also the emotions of the people involved. However, deep learning models are extremely complex, often requiring several million or hundreds of millions of data points.
The Application of Machine Learning