LO 47.1: Identify and explain errors in modeling assumptions that can introduce model risk.
Modeling is a critical component in the risk management of an organization. Models help quantify risk and other exposures as well as potential losses. However, models can be complex and are subject to model risk, which includes input errors, errors in assumptions, and errors in interpretation.
Model Complexity
When quantifying the risk of simple financial instruments such as stocks and bonds, model risk is less of a concern. These simple instruments exhibit less volatility in price and sensitivities relative to complex financial instruments, so their market values tend to be good indicators of asset values. However, model risk is a significantly more important consideration when quantifying the risk exposures of complex financial instruments, including instruments with embedded options, exotic over-the-counter (OTC) derivatives, synthetic credit derivatives, and many structured products. For these complex instruments, markets are often illiquid and do not provide sufficient price discovery mechanisms, which puts greater emphasis on models to value instruments, typically through a mark-to-model valuation approach. These models are important not only for valuing instruments and assessing risk exposure, but also for determining the proper hedging strategy.
As financial instruments increase in complexity, so do the models used to value them. More complex models, such as the Black-Scholes-Merton option pricing model, increased the threat of model risk, especially for the more complex derivatives such as interest rate caps and floors, swaptions, and credit and exotic derivatives. As technology advanced, so did the complexity of the models created and used, which in turn increased the reliance on these models. In addition, managers often do not have a solid understanding of the more complex models. When models are difficult to understand, the risk of model errors and of incorrect vetting, interpretation, and oversight increases.
The dangers of relying too heavily on complex models became especially apparent during the 2007-2009 financial crisis. When markets endure a prolonged period of turmoil, models tend to underestimate the volatilities, correlations, and risks of financial instruments, and can overstate values, all of which may lead to sustained losses by market participants. Since models are often used for valuing instruments, a model may show that a strategy is profitable when in fact it is experiencing losses. Following the global credit crisis, model risk became more regulated as the Basel Committee mandated that financial institutions more rigorously assess model risk.
Common Model Errors
Model risk has been apparent over the last several decades through various international crises. A model may be incorrect if it contains incorrect assumptions about a financial instrument's price or risk. One example of model error was the remarkable collapse in 1997 of a hedge fund run by Victor Niederhoffer, a well-known Wall Street trader. The fund's strategy was to write (sell) deep out-of-the-money put options on the S&P 500, based on the assumption that the index's daily volatility would not exceed 5% and that the options would therefore expire worthless. In October 1997, the Asian financial crisis created a contagion effect that impacted North American markets. As a result, market volatilities increased significantly above historical levels. This level of volatility was not priced into the advanced mathematical models used by the fund, which instead assumed a normal distribution of risk and historical correlations. The fund ultimately experienced substantial losses as its equity was completely wiped out.
Losses from model errors can be due to errors in assumptions, carelessness, fraud, or intentional mistakes that undervalue risk or overvalue profit. The six common model errors are as follows:
1. Assuming constant volatility. One of the most common errors in modeling is the assumption that the distribution of asset prices and risk is constant. The 2007–2009 financial crisis showed just how incorrect this assumption can be, when market volatilities not predicted by models increased significantly over a short period of time.
2. Assuming a normal distribution of returns. Market participants frequently make the simplifying assumption in their models that asset returns are normally distributed. Practice has shown, however, that returns typically do not follow a normal distribution, because distributions in fact have fat tails (i.e., unexpectedly large outliers); see the sketch after this list.
3. Underestimating the number of risk factors. Many models assume a single risk factor. A single risk factor may produce accurate prices and hedge ratios for simple products such as a callable bond. For more complex products, including many exotic derivatives (e.g., Bermudan options), models need to incorporate multiple risk factors.
4. Assuming perfect capital markets. Models are generally derived under the assumption that capital markets behave perfectly. Consider a delta hedging strategy that requires active rebalancing based on the assumption that the underlying asset position can be continuously adjusted in response to changes in the derivative's price. This strategy will not be effective if capital markets include imperfections such as limitations on short selling, various costs (e.g., fees and taxes), and a lack of continuous trading in the markets.
5. Assuming adequate liquidity. Models often assume liquid markets for long or short trading of financial products at current prices. During periods of volatility, especially extreme volatility as seen during the recent financial crisis, liquidity can decline or dry up completely.
6. Misapplying a model. Historically, model assumptions have worked well in most world markets but tend to break down during periods of greater uncertainty or volatility. For example, traditional models assuming normality did not work well in many markets, including the United States, Europe, and Japan, in the post-financial-crisis period, which has been characterized by low or negative interest rates and unconventional monetary policies such as quantitative easing. In these markets, models that incorporate other statistical tools work better. Similarly, models that work well for traditional assets could yield incorrect results when complex features such as embedded options are factored in. Another example of misapplying a model is using a model created to value bonds with no embedded options (e.g., a non-callable, non-convertible bond) to value bonds with embedded options (e.g., a callable, convertible bond).
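A quick numerical illustration of the fat-tail point in error 2 (the sketch referenced above): the choices below, including the use of a Student's t distribution with 4 degrees of freedom as the fat-tailed alternative, are illustrative assumptions and not part of the assigned reading.

```python
import numpy as np
from scipy import stats

# Daily return standardized to mean 0, standard deviation 1.
z = -4.0  # a "4-sigma" loss

# Thin-tailed benchmark: standard normal.
p_normal = stats.norm.cdf(z)

# Fat-tailed alternative: Student's t with 4 degrees of freedom,
# rescaled so its variance is also 1 (variance of t(df) is df / (df - 2)).
df = 4
unit_var_scale = np.sqrt((df - 2) / df)
p_fat = stats.t.cdf(z, df=df, scale=unit_var_scale)

print(f"P(4-sigma loss), normal        : {p_normal:.2e}")
print(f"P(4-sigma loss), t(4) fat tail : {p_fat:.2e}")
print(f"Fat-tailed probability is roughly {p_fat / p_normal:.0f}x larger")
```

A model calibrated under the normal assumption would therefore treat such losses as far rarer than a fat-tailed market actually makes them.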

LO 46.4: Explain how to validate the calibration and the discriminatory power of a rating model.
Validating Calibration
The validation process examines the variances between the expected PDs and the actual (observed) default rates.
The Basel Committee (2005a) suggests the following tests for calibration:
• Binomial test.
• Chi-square test (or Hosmer-Lemeshow test).
• Normal test.
• Traffic lights approach.
The binomial test looks at a single rating category at a time, while the chi-square test looks at multiple rating categories at a time. The normal test looks at a single rating category over more than one period, based on a normal distribution of the time-averaged default rates; its two key assumptions are that (1) the mean default rate has minimal variance over time and (2) default events are independent. The traffic lights approach involves backtesting a single rating category over multiple periods. Because each of these tests has shortcomings, the overall conclusion is that no truly strong calibration tests exist at this time.
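As a rough sketch of how the binomial test might be applied to a single rating grade, the example below asks whether an observed default count is consistent with the grade's assigned PD, assuming independent defaults; the grade size, PD, and default count are hypothetical, and this is not the Basel Committee's prescribed implementation.

```python
from scipy import stats

# Hypothetical rating grade: assigned PD and observed one-year outcome.
assigned_pd = 0.02      # model-calibrated probability of default
n_obligors = 500        # obligors in the grade at the start of the year
observed_defaults = 17  # defaults actually observed

# One-sided binomial test: is the observed default count significantly
# higher than the assigned PD implies (assuming independent defaults)?
p_value = stats.binom.sf(observed_defaults - 1, n_obligors, assigned_pd)

print(f"Expected defaults: {n_obligors * assigned_pd:.1f}, observed: {observed_defaults}")
print(f"P(defaults >= {observed_defaults} | PD = {assigned_pd:.0%}) = {p_value:.4f}")
# A small p-value flags the grade as potentially mis-calibrated; the reading
# notes that the independence-of-defaults assumption is a shortcoming of the test.
```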
Validating Discriminatory Power
The validation process is performed ex post using backtesting of defaulting and non-defaulting items. Therefore, the concept of a longer forecast period requires that the forecast period begin further away from t = 0 and from the time the data are collected.
Validating discriminatory power can be done using the following four methods as outlined by the Basel Committee (2005a):
• Migration matrices.
• Accuracy indices (e.g., Lorenz's concentration curves and Gini ratios).
• Classification tests (e.g., binomial test, Type I and II errors, chi-square test, and normality test).
• Statistical tests (e.g., Fisher's r², Wilks' λ, and Hosmer-Lemeshow).
The frequency distribution of errors is key to assessing the model's forecasting reliability. With regard to error rates, validation requires an assessment of error tolerance, its calibration, and its financial impact (e.g., a false positive or Type I error increases losses, and a false negative or Type II error increases opportunity costs).
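One common way to compute an accuracy index is via the area under the ROC curve, since the accuracy ratio (Gini) equals 2 × AUC − 1. The sketch below applies this relationship to invented scores and default flags; the data and the use of scikit-learn are assumptions for illustration only.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical portfolio: higher score = higher assessed default risk.
rng = np.random.default_rng(seed=42)
n = 1000
defaulted = rng.binomial(1, 0.05, size=n)          # 1 = defaulted ex post
# Scores loosely correlated with default status (purely illustrative).
scores = rng.normal(loc=defaulted * 1.2, scale=1.0)

# Area under the ROC curve, then the accuracy ratio (Gini): AR = 2*AUC - 1.
auc = roc_auc_score(defaulted, scores)
accuracy_ratio = 2.0 * auc - 1.0

print(f"AUC = {auc:.3f}, accuracy ratio (Gini) = {accuracy_ratio:.3f}")
# An accuracy ratio near 1 indicates strong discriminatory power;
# near 0, the model is no better than random ranking.
```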
2. Basel Committee on Banking Supervision (2005a), Studies on Validation of Internal Rating Systems, Working Paper No. 14, Basel, Switzerland.
Key Concepts
LO 46.1 To validate a rating model, a financial institution must confirm the reliability of the results produced by the model and that the model still meets the financial institution's operating needs and any regulatory requirements. The tools and approaches to validation are regularly reassessed and revised to stay current with the changing market and operating environment.
Best practices for the roles of internal organizational units in the validation process include active involvement of senior management and the internal audit group. In general, all staff involved in the validation process must have sufficient training to perform their duties properly.
With regard to independence, the validation group must be independent from the groups that are developing and maintaining validation models and the group(s) dealing with credit risk. The validation group should also be independent of the lending group and the rating assignment group. Ultimately, the validation group should not report to any of those groups. Given that validation is mainly done using documentation received from the groups dealing with model development and implementation, the quality of the documentation is important. Controls must be in place to ensure that there is sufficient breadth, transparency, and depth in the documentation provided.
LO 46.2 There are five key areas regarding rating systems that are analyzed during the qualitative validation process: (1) obtaining probabilities of default, (2) completeness, (3) objectivity, (4) acceptance, and (5) consistency.
Quantitative validation comprises the following areas: (1) sample representativeness, (2) discriminatory power, (3) dynamic properties, and (4) calibration.
LO 46.3 Defaults are the key constraint in terms of creating sufficiently large data sets for model development, rating quantification, and validation purposes.
With regard to sample size and sample homogeneity, it is difficult to create samples from a population over a long period using the same lending technology, because lending technology is likely to change over time. Unfortunately, the changes result in less consistency between the data used to create the rating model and the population to which the model is applied.
The time horizon of the data may be problematic because the data should take into account a full credit cycle. If it is less than a full cycle, the estimates will be biased by the favorable or unfavorable stages during the selected period within the cycle.
Validating data quality focuses on the stability of the lending technology and the degree of calibration required to infer sample results to the population.
LO 46.4 Validating calibration looks at the variances between the expected probabilities of default and the actual default rates. Tests of calibration include (1) binomial test, (2) chi-square test (or Hosmer-Lemeshow), (3) normal test, and (4) traffic lights approach.
Validating discriminatory power involves backtesting of defaulting and non-defaulting items. Tests of discriminatory power include (1) statistical tests, (2) migration matrices, (3) accuracy indices, and (4) classification tests.
Concept Checkers
1. Which of the following statements regarding the model validation process is most accurate?
A. The validation process places equal importance on quantitative and qualitative validation.
B. The validation group could be involved with the rating system design and development process.
C. The quantitative validation process involves an analysis of structure and model methodology.
D. The breadth and depth of validation should be commensurate primarily with the dollar value of the loans outstanding.

2. Which of the following areas of quantitative validation would focus on rating systems stability?
A. Calibration.
B. Discriminatory power.
C. Dynamic properties.
D. Sample representativeness.

3. The increasing use of heuristic rating models versus statistical rating models would most likely be covered under which area of qualitative validation?
A. Acceptance.
B. Completeness.
C. Consistency.
D. Objectivity.

4. Which of the following statements regarding the validation of data quality is correct?
A. Data should be created from a full credit cycle.
B. Validating central tendency in the long term is done through normality testing.
C. In practice, it is necessary to create samples from a population over a five-year period using the same lending technology.
D. To make inferences about the population from the samples used in a model, it is necessary to calibrate appropriately and do in-sample testing.

5. Which of the following methods would most likely be used to validate both the calibration and the discriminatory power of a rating model?
A. Accuracy indices.
B. Classification tests.
C. Migration matrices.
D. Traffic lights approach.
Concept Checker Answers
1. B The validation group could be involved with the rating system design and development process as long as sufficient controls are in place to ensure independence. For example, the internal audit group could confirm that the validation group is acting independently.
There is more emphasis on qualitative validation than on quantitative validation. Structure and model methodology is dealt with under qualitative validation, not quantitative validation. The breadth and depth of validation is not primarily focused on the dollar value of the loans outstanding; it takes a broader approach by considering the type of credit portfolios analyzed, the complexity of the financial institution, and the level of market volatility.
2. C Dynamic properties include rating systems stability and attributes of migration matrices. Calibration looks at the relative ability to estimate probability of default (PD). Discriminatory power is the relative ability of a rating model to accurately differentiate between defaulting and non-defaulting entities for a given forecast period. Sample representativeness is demonstrated when a sample from a population is taken and its characteristics match those of the total population.
3. A Heuristic models are more easily accepted since they mirror past experience and the credit assessments tend to be consistent with cultural norms. In contrast, statistical models are less easily accepted given the high technical knowledge demands to understand them and the high complexity that creates challenges when interpreting the output.
Completeness refers to the sufficiency in number of factors used for credit granting purposes, since many default-based models use very few borrower characteristics; in contrast, statistical-based models allow for many borrower characteristics to be used. Consistency refers to models making sense and being appropriate for their intended use. For example, statistical models may produce relationships between variables that are nonsensical, so the process of eliminating such variables increases consistency. Objectivity is achieved when the rating system can clearly define creditworthiness factors with the least amount of interpretation required, choosing between judgment-based versus statistical-based models.
4. A If data is created from less than a full credit cycle, the estimates will be biased by the favorable or unfavorable stages during the selected period within the cycle.
Validating central tendency in the long term is done through backtesting and stress testing. In practice, it is almost impossible to have credit rules and regulations remain stable for even five years of a credit cycle. To make inferences about the population, it is necessary to use out-of-sample testing, whereby the observations are created from the same lending technology but were not included in the development sample.
5. B Classification tests include the binomial test, chi-square test, and normality test. These tests are used to analyze both discriminatory power and calibration.
Accuracy indices and migration matrices are used only for discriminatory power. The traffic lights approach is used only for calibration.
The following is a review of the Operational and Integrated Risk Management principles designed to address the learning objectives set forth by GARP. This topic is also covered in:
Model Risk
Topic 47
Exam Focus
Models are indispensable in modern finance for quantifying and managing asset-liability risk, credit risk, market risk, and many other risks. Models rely on a range of inputs based on a combination of historical data and risk assumptions, and are critical in managing risk exposures and financial positions. However, models rely on the accuracy of those inputs, and errors give rise to model risk. Model risk can range from errors in inputs and assumptions to errors in implementing or incorrectly interpreting a model, and can result in significant losses to market participants. For the exam, be able to identify and explain common model errors, model implementation and valuation issues, and model error mitigation techniques. Also, be familiar with the two case studies discussed related to model risk: Long-Term Capital Management and the London Whale incident.
Sources of Model Risk

LO 46.3: Describe challenges related to data quality and explain steps that can be taken to validate a model's data quality.
Challenges to Data Quality
Strong data quality is crucial when performing quantitative validation. General challenges involved with data quality include (1) completeness, (2) availability, (3) sample representativeness, (4) consistency and integrity, and (5) data cleansing procedures.
Defaults are the key constraint in terms of creating sufficiently large data sets for model development, rating quantification, and validation purposes. As a result, reliability and completeness are important issues. In addition, the definition of default needs to be consistent between the potentially wide variety of data collection sources and the Basel II definition of default.
Sample size and sample homogeneity present challenges as well. Practically speaking, it is difficult to create samples from a population over a long period using the same lending technology. Lending technology refers to the information, rules, and regulations used in credit origination and monitoring. In practice, it is almost impossible to have credit rules and regulations remain stable for even five years of a credit cycle. Changes occur because of technological breakthroughs that allow for more efficient handling of the credit function, market changes and new segments that require significant changes to credit policies, and merger and acquisition activity. Unfortunately, the changes result in less consistency between the data used to create the rating model and the population to which the model is applied.
The time horizon of the data may be problematic because the data should be created from a full credit cycle. If it is less than a full cycle, the estimates will be biased by the favorable or unfavorable stages during the selected period within the cycle.
Other data management issues such as outliers, missing values, and unrepresentative data may create challenges with validation.
Note that data quality involves the use of samples in the model building process. It is not easy to make inferences about the population merely from the samples used in the model. To do so, it is necessary to calibrate appropriately and do out-of-sample testing. Out-of-sample testing refers to observations created from the same lending technology but not included in the development sample.
Validating a Model's Data Quality
Validating data quality focuses on the stability of the lending technology and the degree of calibration required to infer sample results to the population. For example, if the observed in-sample default rate differs from that of the population, then the validation process should confirm that the calibration takes this difference into account. Or, if the lending technology changes due to a merger or acquisition, then the validation process must confirm the corresponding recalibration. The same confirmation of recalibration is required if there are material differences between borrowers' profiles in the sample versus the population.
An incorrect long-term average annual default rate results in an incorrect default probability, so validation must ensure that the long-term default rate is reasonably correct. Statistical central tendency is the average value to which population characteristics converge after many iterations of a given task. In applying central tendency to defaults, given relatively few defaults (i.e., few iterations) in any year during normal periods, it is not usually possible for the validation group to properly validate central tendency for at least 18 months. The time period is dependent on the markets, the nature of the lending facilities, and the characteristics of the customer segments. Validating central tendency in the long term is conducted through backtesting and stress testing.
The validation group should also watch market prices, consider information from the marketing department, and analyze significant transactions to determine appropriate benchmarks against which to compare the financial institution with its direct competitors.

LO 46.2: Compare qualitative and quantitative processes to validate internal ratings, and describe elements of each process.
The goal of qualitative validation is to correctly apply quantitative procedures and to correctly use ratings. Qualitative and quantitative validation are complements although a greater emphasis is placed on qualitative validation given its holistic nature. In other words, neither a positive nor negative conclusion on quantitative validation is sufficient to make an overall conclusion.
Elements of Qualitative Validation
Rating systems design involves the selection of the correct model structure in context of the market segments where the model will be used. There are five key areas regarding rating systems that are analyzed during the qualitative validation process: (1) obtaining probabilities of default, (2) completeness, (3) objectivity, (4) acceptance, and (5) consistency.
Obtaining probabilities of default (PD). Using statistical models created from actual historical data allows for the determination of the PD for separate rating classes through the calibration of results with the historical data. A direct PD calculation is possible with logistic regression, whereas other methods (e.g., linear discriminant analysis) require an adjustment. An ex post validation of the calibration of the model can be done with data obtained during the use of the model. The data would allow continuous monitoring and validation of the default parameter to ensure PDs that are consistent with true economic conditions.
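As a minimal sketch of the point that logistic regression delivers PDs directly, the example below fits such a model to invented borrower data; the features, coefficients, and library choice are assumptions, and the resulting PDs would still need the calibration and validation steps described in the text.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical development sample: two borrower characteristics and a default flag.
rng = np.random.default_rng(seed=7)
n = 2000
leverage = rng.uniform(0.0, 1.0, size=n)           # debt / assets
coverage = rng.uniform(0.5, 10.0, size=n)          # EBIT / interest expense
log_odds = -4.0 + 3.0 * leverage - 0.3 * coverage  # assumed "true" relationship
defaults = rng.binomial(1, 1 / (1 + np.exp(-log_odds)))

X = np.column_stack([leverage, coverage])
model = LogisticRegression(max_iter=1000).fit(X, defaults)

# Logistic regression outputs a probability of default directly for a new borrower.
new_borrower = np.array([[0.7, 2.0]])              # high leverage, weak coverage
pd_estimate = model.predict_proba(new_borrower)[0, 1]
print(f"Estimated PD for the new borrower: {pd_estimate:.2%}")
```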
Completeness of rating system. All relevant information should be considered when determining creditworthiness and the resulting rating. Given that most default risk models include only a few borrower characteristics to determine creditworthiness, the validation process needs to provide assurance over the completeness of factors used for credit granting purposes. Statistical-based models allow for many borrower characteristics to be used, so there needs to be validation of the process of adding variables to the model to have greater coverage of appropriate risk factors.
Objectivity of rating system. Objectivity is achieved when the rating system can clearly define creditworthiness factors with the least amount of interpretation required. A judgment-based rating model would likely be fraught with biases (with low discriminatory power of ratings); therefore, it requires features such as strict (but reasonable) guidelines, proper staff training, and continual benchmarking. A statistical-based rating model analyzes borrower characteristics based on actual data, so it is a much more objective model.
Acceptance of rating system. Acceptance by users (e.g., lenders and analysts) is crucial, so the validation process must provide assurance that the models are easily understood and shared by the users. In that regard, the output from the models should be fairly close to what is expected by the users. In addition, users should be educated as to the key aspects of models, especially statistical-based ones, so that they understand them and can make informed judgments regarding acceptance. Heuristic models (i.e., expert systems) are more easily accepted since they mirror past experience and the credit assessments tend to be consistent with cultural norms. In contrast, fuzzy logic models and artificial neural networks are less easily accepted given the high technical knowledge demands to understand them and the high complexity that creates challenges when interpreting the output.
Consistency of rating system. The validation process must ensure that the models make sense and are appropriate for their intended use. For example, statistical models may produce relationships between variables that are nonsensical, so the process of eliminating such variables increases consistency. The validation process would test such consistency. In contrast, heuristic models do not suffer from the same shortcoming since they are based on real-life experience. Statistical models used in isolation may still result in rating errors due to the mechanical nature of information processing. As a result, even though such models can remain the primary source of assigning ratings, they must be supplemented with a human element to promote the inclusion of all relevant and important information (usually qualitative and beyond the confines of the model) when making credit decisions.
Additionally, the validation process must deal with the continuity of validation processes, which includes periodic analysis of model performance and stability, analysis of model relationships, and comparisons of model outputs versus actual outcomes. In addition, the validation of statistical models must evaluate the completeness of documentation with focus on documenting the statistical foundations. Finally, validation must consider external benchmarks such as how rating systems are used by competitors.
Elements of Quantitative Validation
Quantitative validation comprises the following areas: (1) sample representativeness, (2) discriminatory power, (3) dynamic properties, and (4) calibration.
Sample representativeness. Sample representativeness is demonstrated when a sample from a population is taken and its characteristics match those of the total population. A key problem is that some loan portfolios (in certain niche areas or industries) have very low default rates, which frequently results in an overly small sample of defaulting entities. The validation process would use bootstrap procedures that randomly create samples through an iterative process that combines items from a default group and items from a non-default group. The rating model is reassessed using the new samples; after analyzing a group of statistically created models, should the end result be stable and common among the models, then the reliability of the result is satisfied. If not, instability risk would still persist and further in-depth analysis would be required. Using more homogeneous subsets in the form of cluster analysis, for example, could provide a more stable result. Alternatively, the model could focus on key factors within the subsets or consider alternative calibrations.
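A simplified sketch of the bootstrap stability check described above is shown below; the separate resampling of defaulted and non-defaulted items, the logistic scoring model, and the synthetic data are assumptions rather than the authors' exact procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def bootstrap_coefficients(X, y, n_samples=100, seed=0):
    """Refit a simple scoring model on bootstrap samples that recombine
    defaulted and non-defaulted items, collecting the fitted coefficients."""
    rng = np.random.default_rng(seed)
    default_idx = np.flatnonzero(y == 1)
    good_idx = np.flatnonzero(y == 0)
    coefs = []
    for _ in range(n_samples):
        # Resample each group separately, then combine (as described in the text).
        d = rng.choice(default_idx, size=default_idx.size, replace=True)
        g = rng.choice(good_idx, size=good_idx.size, replace=True)
        idx = np.concatenate([d, g])
        coefs.append(LogisticRegression(max_iter=1000).fit(X[idx], y[idx]).coef_.ravel())
    return np.array(coefs)

# Hypothetical development sample with relatively few defaults.
rng = np.random.default_rng(seed=11)
X = rng.normal(size=(800, 2))
y = rng.binomial(1, 1 / (1 + np.exp(-(-3.5 + 1.5 * X[:, 0]))))

coefs = bootstrap_coefficients(X, y)
print("Coefficient std across bootstrap models:", coefs.std(axis=0))
# Small dispersion suggests a stable, common result; wide dispersion signals
# instability risk and the need for further analysis (e.g., cluster analysis).
```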
Discriminatory power. Discriminatory power is the relative ability of a rating model to accurately differentiate between defaulting and non-defaulting entities for a given forecast period. The forecast period is usually 12 months for PD estimation purposes but is longer for rating validation purposes. It also involves classifying borrowers by risk level on an overall basis or by specific attributes such as industry sector, size, or geographical location.
Dynamic properties. Dynamic properties include rating systems stability and the attributes of migration matrices. In fact, the use of migration matrices assists in determining ratings stability. Migration matrices are introduced after a minimum two-year operational period for the rating model. Ideal attributes of annual migration matrices include (1) ascending order of transition rates to default as rating classes deteriorate, (2) stable ratings over time (e.g., high values on the diagonal and low values off the diagonal), and (3) gradual rating movements as opposed to abrupt and large movements (e.g., migration rates of +/− one class are higher than those of +/− two classes). Should the validation process determine the migration matrices to be stable, then the conclusion is that ratings move slowly given their relative insensitivity to credit cycles and other temporary events.
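The three ideal attributes listed above can be checked mechanically. The sketch below does so for a small hypothetical migration matrix (three grades plus a default state); the matrix values are invented for illustration.

```python
import numpy as np

# Hypothetical annual migration matrix: rows = grade at the start of the year
# (A, B, C), columns = grade at year end (A, B, C, Default); rows sum to 1.
M = np.array([
    [0.92, 0.06, 0.01, 0.01],
    [0.05, 0.88, 0.05, 0.02],
    [0.01, 0.07, 0.84, 0.08],
])

default_rates = M[:, -1]
# (1) Transition rates to default should rise as the rating class deteriorates.
monotone_defaults = np.all(np.diff(default_rates) > 0)
# (2) Stability: high values on the diagonal, low values off the diagonal.
diagonal_dominant = np.all(np.diag(M[:, :-1]) > 0.5)
# (3) Gradual movements: one-notch migration rates exceed two-notch rates.
gradual = (M[0, 1] > M[0, 2]) and (M[2, 1] > M[2, 0])

print(f"Monotone default rates: {monotone_defaults}")
print(f"Diagonal dominance:     {diagonal_dominant}")
print(f"Gradual migrations:     {gradual}")
```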
Calibration. Calibration looks at the relative ability to estimate PD. Validating calibration occurs at a very early stage, and because of the limited usefulness of statistical tools for validating calibration, benchmarking can be used as a supplement to validate estimates of probability of default (PD), loss given default (LGD), and exposure at default (EAD). The benchmarking process compares a financial institution's ratings and estimates to those of other comparable sources; there is flexibility permitted in choosing the most suitable benchmark.
Data Quality

LO 46.1: Explain the process of model validation and describe best practices for the roles of internal organizational units in the validation process.
According to the Basel Committee (2004), a rating system (or rating model) comprises all of the methods, processes, controls, and data collection and IT systems that support the assessment of credit risk, the assignment of internal risk ratings, and the quantification of default and loss estimates.¹
To validate a rating model, a financial institution must confirm the reliability of the results produced by the model and that the model still meets the financial institution's operating needs and any regulatory requirements. The tools and approaches to validation are regularly reassessed and revised to stay current with the changing market and operating environment. The breadth and depth of validation should be consistent with the type of credit portfolios analyzed, the complexity of the financial institution, and the level of market volatility.
The rating model validation process includes a series of formal activities and tools to determine the accuracy of the estimates for the key risk components as well as the models predictive power. The overall validation process can be divided between quantitative and qualitative validation. Quantitative validation includes comparing ex post results of risk measures to ex ante estimates, parameter calibrations, benchmarking, and stress tests. Qualitative validation focuses on non-numerical issues pertaining to model development such as logic, methodology, controls, documentation, and information technology.
The rating model validation process requires confirmation of the method of use within the financial institution. Results must be sufficiently detailed in terms of weaknesses and limitations, in the form of reports that are forwarded regularly to the internal control group and to regulatory agencies. At the same time, there must be a review of any anticipated remedies should the model prove to be weak. A summary of the overall process of model validation is provided in Figure 1.

1. Basel Committee on Banking Supervision (2004 and 2006), International Convergence of Capital Measurement and Capital Standards: A Revised Framework, Basel, Switzerland.
Figure 1: Model Validation Process
Source: Figure 5.1. Fundamental steps in rating systems validation process. Reprinted from Developing, Validating and Using Internal Ratings, by Giacomo De Laurentis, Renato Maino, Luca Molteni, (Hoboken, New Jersey: John Wiley & Sons, 2010), p 239.
Best Practices
The Basel Committee (2004) outlined the following two key requirements regarding corporate governance and oversight:
• All material aspects of the rating and estimation processes must be approved by the bank's board of directors (or a designated committee thereof) and senior management. Those parties must possess a general understanding of the bank's risk rating system and detailed comprehension of its associated management reports.
• Senior management must provide notice to the board of directors (or a designated committee thereof) of material changes or exceptions from established policies that will materially impact the operations of the bank's rating system.
Senior management must also have a good understanding of the rating systems design and operation, and must approve material differences between established procedure and actual practice. Management must also ensure, on an ongoing basis, that the rating system is operating properly. Management and staff in the credit control function must meet regularly to discuss the performance of the rating process, areas needing improvement, and the status of efforts to improve previously identified deficiencies.
In response to these two requirements, best practices for the roles of internal organizational units in the validation process include:
1. Senior management needs to examine the recommendations that arise from the validation process together with analyzing the reports that are prepared by the internal audit group.
2. Smaller financial institutions require, at a minimum, a manager who is appointed to direct and oversee the validation process.
3. The validation group must be independent from the groups that are developing and maintaining validation models and the group(s) dealing with credit risk. The validation group should also be independent of the lending group and the rating assignment group. Ultimately, the validation group should not report to any of those groups.
4. Should it not be feasible for the validation group to be independent from designing and developing rating systems, then the internal audit group should be involved to ensure that the validation group is executing its duties with independence. In such a case, the validation group must be independent of the internal audit group.
5. In general, all staff involved in the validation process must have sufficient training to perform their duties properly.
6. Internal ratings must be discussed when management reports to or meets with the credit control group.
7. The internal audit group must examine the independence of the validation group and ensure that the validation group staff is sufficiently qualified.
8. Given that validation is mainly done using documentation received from the groups dealing with model development and implementation, the quality of the documentation is important. Controls must be in place to ensure that there is sufficient breadth, transparency, and depth in the documentation provided.
A summary of the validation and control processes involving various internal organizational units is provided in Figure 2.
Figure 2: Validation and Control Processes
[Figure 2 is a table mapping models, procedures, and tools to tasks and owners: model development and backtesting (credit risk models development unit); credit risk procedures maintenance (lending units/internal control units); continuous testing of models, processes, and tools performance (lending unit/internal audit); operations maintenance (lending units/IT/internal audit); lending policy application (central and decentralized units/internal control units); and lending policy suitability (validation unit/internal audit). These are organized into basic, second-layer, and third-layer controls, with accountability for supervisory purposes running through the CRO, CLO, and COO up to top management, the surveillance board, and the board of directors.]
Source: Table 5.1. Processes and roles of validation and control of internal rating system. Reprinted from Developing, Validating and Using Internal Ratings, by Giacomo De Laurentis, Renato Maino, Luca Molteni, (Hoboken, New Jersey: John Wiley & Sons, 2010), p. 241.
Comparison of Qualitative and Quantitative Validation Processes

LO 45.6: Explain the importance of multivariate EVT for risk management.
Multivariate EVT is important because we can easily see how extreme values can be dependent on each other. A terrorist attack on oil fields will produce losses for oil companies, but it is likely that the value of most financial assets will also be affected. We can imagine similar relationships between the occurrence of a natural disaster and a decline in financial markets as well as markets for real goods and services.
Multivariate EVT has the same goal as univariate EVT in that the objective is to move from the familiar central-value distributions to methods that estimate extreme events. The added feature is to apply the EVT to more than one random variable at the same time. This introduces the concept of tail dependence, which is the central focus of multivariate EVT. Assumptions of an elliptical distribution and the use of a covariance matrix are of limited use for multivariate EVT.
Modeling multivariate extremes requires the use of copulas. Multivariate EVT says that the limiting distribution of multivariate extreme values will be a member of the family of EV copulas, and we can model multivariate EV dependence by assuming one of these EV copulas. The copulas can have as many dimensions as appropriate, congruent with the number of random variables under consideration. However, an increase in dimensions presents problems. If a researcher has two independent variables and classifies univariate extreme events as those that occur one time in 100, then the researcher should expect to see one multivariate extreme event (i.e., both variables taking extreme values) only one time in 100 × 100 = 10,000 observations. With three independent variables, that number increases to 1,000,000 observations. This drastically reduces the number of multivariate extreme observations to work with and increases the number of parameters to estimate.
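The dimensionality arithmetic in the preceding paragraph can be verified with a quick simulation; the sketch below uses two independent standard normal risk factors, which is an illustrative assumption (real multivariate extremes exhibit the tail dependence that multivariate EVT is designed to capture).

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 1_000_000

# Two independent risk factors; an "extreme" is a 1-in-100 (99th percentile) event.
x = rng.standard_normal(n)
y = rng.standard_normal(n)
qx, qy = np.quantile(x, 0.99), np.quantile(y, 0.99)

# Frequency with which both variables are simultaneously extreme.
joint_extremes = np.mean((x > qx) & (y > qy))
print(f"Joint extreme frequency: {joint_extremes:.6f} (independence benchmark: 0.0001)")
# With tail dependence, the joint frequency would be materially higher than
# this independence benchmark, which is the point of multivariate EVT.
```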
Key Concepts
LO 45.1 Estimating extreme values is important since they can be very costly. The challenge is that since they are rare, many have not even been observed. Thus, it is difficult to model them.
LO 45.2 Extreme value theory (EVT) can be used to model extreme events in financial markets and to compute VaR, as well as expected shortfall.
LO 45.3 The peaks-over-threshold (POT) approach is an application of extreme value theory. It models the values that occur over a given threshold. It assumes that observations beyond the threshold follow a generalized Pareto distribution whose parameters can be estimated.
LO 45.4 The GEV and POT approaches have the same goal and are built on the same general principles of extreme value theory. They even share the same shape (tail) parameter, ξ.
LO 45.5 The parameters of a generalized Pareto distribution (GPD) are the scale parameter β and the shape parameter ξ. Both can be estimated using maximum likelihood techniques.
When applying the generalized Pareto distribution, the researcher must choose a threshold. There is a tradeoff because the threshold must be high enough so that the GPD applies, but it must be low enough so that there are sufficient observations above the threshold to estimate the parameters.
LO 45.6 Multivariate EVT is important because many extreme values are dependent on each other, and elliptical distribution analysis and correlations are not useful in the modeling of extreme values for multivariate distributions. Modeling multivariate extremes requires the use of copulas. Given that more than one random variable is involved, modeling these extremes can be even more challenging because of the rarity of multiple extreme values occurring at the same time.
Concept Checkers
1. According to the Fisher-Tippett theorem, as the sample size n gets large, the distribution of extremes converges to:
A. a normal distribution.
B. a uniform distribution.
C. a generalized Pareto distribution.
D. a generalized extreme value distribution.

2. The peaks-over-threshold approach generally requires:
A. more estimated parameters than the GEV approach and shares one parameter with the GEV.
B. fewer estimated parameters than the GEV approach and shares one parameter with the GEV.
C. more estimated parameters than the GEV approach and does not share any parameters with the GEV approach.
D. fewer estimated parameters than the GEV approach and does not share any parameters with the GEV approach.

3. In setting the threshold in the POT approach, which of the following statements is the most accurate? Setting the threshold relatively high makes the model:
A. more applicable but decreases the number of observations in the modeling procedure.
B. less applicable and decreases the number of observations in the modeling procedure.
C. more applicable but increases the number of observations in the modeling procedure.
D. less applicable but increases the number of observations in the modeling procedure.

4. A researcher using the POT approach observes the following parameter values: β = 0.9, ξ = 0.15, u = 2%, and Nu/n = 4%. The 5% VaR in percentage terms is:
A. 1.034.
B. 1.802.
C. 2.204.
D. 16.559.

5. Given a VaR equal to 2.56, a threshold of 1%, a shape parameter equal to 0.2, and a scale parameter equal to 0.3, what is the expected shortfall?
A. 3.325.
B. 3.526.
C. 3.777.
D. 4.086.
Concept Checker Answers
1. D The Fisher-Tippett theorem says that as the sample size n gets large, the distribution of extremes, denoted Mn, converges to a generalized extreme value (GEV) distribution.
2. B The POT approach generally has fewer parameters, but both the POT and GEV approaches share the tail parameter ξ.
3. A There is a trade-off in setting the threshold. It must be high enough for the appropriate theorems to hold, but if it is set too high, there will not be enough observations to estimate the parameters.
4. B VaR = u + (β/ξ){[(n/Nu)(1 − confidence level)]^(−ξ) − 1} = 2 + (0.9/0.15){[(1/0.04)(1 − 0.95)]^(−0.15) − 1} = 1.802
5. A ES = VaR/(1 − ξ) + (β − ξu)/(1 − ξ) = 2.560/(1 − 0.2) + (0.3 − 0.2 × 1)/(1 − 0.2) = 3.325

The following is a review of the Operational and Integrated Risk Management principles designed to address the learning objectives set forth by GARP. This topic is also covered in:
Validating Rating Models
Topic 46
Exam Focus
This is a specialized and rather detailed topic that deals with rating system validation. There is broad coverage of both qualitative and quantitative validation concepts with greater importance being assigned to qualitative validation. For the exam, focus on best practices as well as the specific elements of qualitative and quantitative validation. Within the realm of quantitative validation, focus specifically on the concepts of calibration and discriminatory power. Note that this material is an extension of the Rating Assignment Methodologies topic from Book 2 (Topic 19).
Model Validation

LO 45.4: Compare and contrast generalized extreme value and POT.
Extreme value theory is the source of both the GEV and POT approaches. These approaches are similar in that they both have a tail parameter, denoted ξ. There is a subtle difference in that GEV theory focuses on the distributions of extremes, whereas POT focuses on the distribution of values that exceed a certain threshold. Although very similar in concept, there are cases where a researcher might choose one over the other. Here are three considerations:
1. GEV requires the estimation of one more parameter than POT. The most popular approaches of the GEV can lead to a loss of useful data relative to the POT.
2. The POT approach requires a choice of threshold, which can introduce additional uncertainty.
3. The nature of the data may make one preferable to the other.
Multivariate EVT

LO 45.5: Evaluate the tradeoffs involved in setting the threshold level when applying the GP distribution.
The Gnedenko-Pickands-Balkema-deHaan (GPBdH) theorem says that as the threshold u gets large, the distribution Fu(x) converges to a generalized Pareto distribution (GPD), such that:

G(x | ξ, β) = 1 − (1 + ξx/β)^(−1/ξ)   if ξ ≠ 0
G(x | ξ, β) = 1 − exp(−x/β)   if ξ = 0

The distribution is defined for the following regions: x ≥ 0 for ξ ≥ 0, and 0 ≤ x ≤ −β/ξ for ξ < 0. In applying the GPD, the researcher must choose a threshold u. The tradeoff is that u must be set high enough for the GPBdH theorem to apply, yet low enough that a sufficient number of observations exceed it to estimate the parameters β and ξ. Once the parameters are estimated, the measures of interest are VaR and the expected shortfall, defined as the expected loss given that the loss exceeds VaR, i.e., ES = E[X | X > VaR]. Because it gives an insight into the distribution of the size of losses greater than the VaR, expected shortfall has become a popular measure to report along with VaR.
The expression for VaR using the POT parameters is given as follows:

VaR = u + (β/ξ){[(n/Nu)(1 − confidence level)]^(−ξ) − 1}

where:
u = threshold (in percentage terms)
n = number of observations
Nu = number of observations that exceed the threshold

The expected shortfall can then be defined as:

ES = VaR/(1 − ξ) + (β − ξu)/(1 − ξ)
Example: Compute VaR and expected shortfall given POT estimates

Assume the following observed parameter values:
β = 0.75
ξ = 0.25
u = 1%
Nu/n = 5%

Compute the 1% VaR in percentage terms and the corresponding expected shortfall measure.
Answer:

VaR = 1 + (0.75/0.25){[(1/0.05)(1 − 0.99)]^(−0.25) − 1} = 2.486%

ES = 2.486/(1 − 0.25) + (0.75 − 0.25 × 1)/(1 − 0.25) = 3.981%

Generalized Extreme Value and Peaks-Over-Threshold

LO 45.3: Describe the peaks-over-threshold (POT) approach.
The peaks-over-threshold (POT) approach is an application of extreme value theory to the distribution of excess losses over a high threshold. The POT approach generally requires fewer parameters than approaches based on extreme value theorems. The POT approach provides the natural way to model values that exceed a high threshold, in the same way that GEV theory provides the natural way to model the maxima or minima of a large sample.
The POT approach begins by defining a random variable X to be the loss. We define u as the threshold value for positive values of x, and the distribution of excess losses over our threshold u as:
Fu(x) = P{X − u ≤ x | X > u} = [F(x + u) − F(u)] / [1 − F(u)]
This is the conditional distribution of the excess loss (the amount by which X exceeds the threshold u), given that the threshold is exceeded. The parent distribution of X can be normal or lognormal; however, it will usually be unknown.
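For intuition, the conditional excess distribution Fu(x) can be evaluated directly when a parent distribution is assumed; the sketch below uses a standard normal parent purely for illustration (in practice the parent distribution is usually unknown, as noted above).

```python
from scipy import stats

# Illustrative parent distribution: standard normal losses (normally unknown in practice).
F = stats.norm.cdf
u = 2.0   # threshold
x = 0.5   # excess over the threshold

# Conditional distribution of the excess loss: Fu(x) = [F(x + u) - F(u)] / [1 - F(u)]
Fu_x = (F(x + u) - F(u)) / (1.0 - F(u))
print(f"P(X - u <= {x} | X > {u}) = {Fu_x:.4f}")
```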
Generalized Pareto Distribution

LO 45.2: Describe extreme value theory (EVT) and its use in risk management.
Extreme value theory (EVT) is a branch of applied statistics that has been developed to address problems associated with extreme outcomes. EVT focuses on the unique aspects of extreme values and is different from central tendency statistics, in which the central-limit theorem plays an important role. Extreme value theorems provide a template for estimating the parameters used to describe extreme movements.
One approach for estimating parameters is the Fisher-Tippett theorem (1928). According to this theorem, as the sample size n gets large, the distribution of extremes, denoted Mn, converges to the following distribution, known as the generalized extreme value (GEV) distribution:
F(x | ξ, μ, σ) = exp{−[1 + ξ(x − μ)/σ]^(−1/ξ)}   if ξ ≠ 0

F(x | ξ, μ, σ) = exp{−exp[−(x − μ)/σ]}   if ξ = 0

For these formulas, the following restriction holds for the random variable X: 1 + ξ(x − μ)/σ > 0.
The parameters μ and σ are the location parameter and scale parameter, respectively, of the limiting distribution. Although related to the mean and variance, they are not the same. The symbol ξ is the tail index and indicates the shape (or heaviness) of the tail of the limiting distribution. There are three general cases of the GEV distribution:
1. ξ > 0: the GEV becomes a Frechet distribution, and the tails are heavy, as is the case for the t-distribution and Pareto distributions.
2. ξ = 0: the GEV becomes the Gumbel distribution, and the tails are light, as is the case for the normal and lognormal distributions.
3. ξ < 0: the GEV becomes the Weibull distribution, and the tails are lighter than those of a normal distribution.

Distributions with ξ < 0 rarely arise in financial applications, so in practice the choice is between ξ > 0 and ξ = 0. Therefore, one practical consideration the researcher faces is whether to assume either ξ > 0 or ξ = 0 and apply the respective Frechet or Gumbel distribution and its corresponding estimation procedures. There are three basic ways of making this choice:
1. The researcher is confident of the parent distribution. If the researcher is confident it is a t-distribution, for example, then the researcher should assume ξ > 0.
2. The researcher applies a statistical test and cannot reject the hypothesis ξ = 0. In this case, the researcher uses the assumption ξ = 0.
3. The researcher may wish to be conservative and assume ξ > 0 to avoid model risk.
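The three cases can be compared numerically with scipy's genextreme distribution; note that scipy's shape parameter c corresponds to −ξ in the notation above, an easy sign convention to trip over. The parameter values below are illustrative assumptions.

```python
from scipy import stats

# scipy parameterizes the GEV with c = -xi relative to the notation in the text.
for xi, name in [(0.3, "Frechet (xi > 0, heavy tails)"),
                 (0.0, "Gumbel (xi = 0, light tails)"),
                 (-0.3, "Weibull (xi < 0, bounded upper tail)")]:
    dist = stats.genextreme(c=-xi, loc=0.0, scale=1.0)
    # Probability that the block maximum exceeds 5, as a tail-heaviness comparison.
    print(f"{name}: P(M > 5) = {dist.sf(5.0):.2e}")
```

The Frechet case produces a visibly larger exceedance probability than the Gumbel case, while the Weibull case has a bounded upper tail, mirroring the three cases described above.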
Peaks-Over-Threshold