LO 46.3: Describe challenges related to data quality and explain steps that can be taken to validate a model's data quality.
Challenges to Data Quality
Strong data quality is crucial when performing quantitative validation. General challenges involved with data quality include (1) completeness, (2) availability, (3) sample representativeness, (4) consistency and integrity, and (5) data cleansing procedures.
2018 Kaplan, Inc.
Topic 46 Cross Reference to GARP Assigned Reading – De Laurentis, Maino, and Molteni, Chapter 5
Defaults are the key constraint in terms of creating sufficiently large data sets for model development, rating quantification, and validation purposes. As a result, reliability and completeness are important issues. In addition, the definition of default needs to be consistent between the potentially wide variety of data collection sources and the Basel II definition of default.
Sample size and sample homogeneity present challenges as well. Practically speaking, it is difficult to create samples from a population over a long period using the same lending technology. Lending technology refers to the information, rules, and regulations used in credit origination and monitoring. In practice, it is almost impossible to have credit rules and regulations remain stable for even five years of a credit cycle. Changes occur because of technological breakthroughs that allow for more efficient handling of the credit function, market changes and new segments that require significant changes to credit policies, and merger and acquisition activity. Unfortunately, the changes result in less consistency between the data used to create the rating model and the population to which the model is applied.
The time horizon of the data may be problematic because the data should be created from a full credit cycle. If it is less than a full cycle, the estimates will be biased by the favorable or unfavorable stages during the selected period within the cycle.
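The bias from a partial cycle can be illustrated with a minimal sketch. The annual default rates below are hypothetical figures chosen only for illustration, and `average_default_rate` is an assumed helper name, not something taken from the reading.

```python
from statistics import mean

# Hypothetical annual default rates over one credit cycle (illustrative only).
annual_default_rates = {
    2003: 0.010, 2004: 0.008, 2005: 0.007, 2006: 0.009,  # favorable stage
    2007: 0.015, 2008: 0.035, 2009: 0.040, 2010: 0.020,  # unfavorable stage
}


def average_default_rate(rates, start_year, end_year):
    """Simple average of the annual default rates between two years, inclusive."""
    window = [rate for year, rate in rates.items() if start_year <= year <= end_year]
    if not window:
        raise ValueError("no observations in the requested window")
    return mean(window)


full_cycle = average_default_rate(annual_default_rates, 2003, 2010)
favorable_only = average_default_rate(annual_default_rates, 2003, 2006)
unfavorable_only = average_default_rate(annual_default_rates, 2007, 2010)

print(f"full cycle:       {full_cycle:.4f}")
print(f"favorable only:   {favorable_only:.4f}")
print(f"unfavorable only: {unfavorable_only:.4f}")
```

With these assumed figures, an estimate built only on the favorable years understates the full-cycle average, and one built only on the unfavorable years overstates it.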
Other data management issues such as outliers, missing values, and unrepresentative data may create challenges with validation.
Note that data quality involves the use of samples in the model building process. It is not easy to make inferences about the population merely from the samples used in the model. To do so, it is necessary to calibrate appropriately and perform out-of-sample testing. Out-of-sample testing uses observations created from the same lending technology but not included in the development sample.
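One way to set aside out-of-sample observations is sketched below. It assumes every record already comes from the same lending technology; the record layout, the 30% hold-out fraction, and the function names are hypothetical choices made for illustration.

```python
import random


def split_development_holdout(observations, holdout_fraction=0.3, seed=42):
    """Randomly split observations into a development sample and a hold-out sample."""
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")
    shuffled = list(observations)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def default_rate(observations):
    """Share of observations flagged as defaulted."""
    return sum(obs["defaulted"] for obs in observations) / len(observations)


# Hypothetical borrowers from a single lending technology; every 20th one defaults.
borrowers = [{"id": i, "defaulted": i % 20 == 0} for i in range(1000)]

development, holdout = split_development_holdout(borrowers)
print(f"development default rate: {default_rate(development):.4f}")
print(f"hold-out default rate:    {default_rate(holdout):.4f}")
```

The model would be built on `development` only, and its performance then checked on `holdout`, which it has never seen.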
Validating a Model's Data Quality
Validating data quality focuses on the stability of the lending technology and the degree of calibration required to extend sample results to the population. For example, if the observed in-sample default rate differs from that of the population, then the validation process should confirm that the calibration takes the difference into account. Or, if the lending technology changes due to a merger or acquisition, then the validation process must confirm the corresponding recalibration. The same confirmation of recalibration is required if there are material differences between borrowers' profiles in the sample versus the population.
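The reading does not prescribe a specific calibration formula. As one common possibility, the sketch below rescales the odds of default by the ratio of population to sample default rates (a standard prior-probability correction). The function name and the example figures are assumptions for illustration.

```python
def recalibrate_pd(sample_pd, sample_default_rate, population_default_rate):
    """Adjust a PD estimated on a sample so it reflects the population default rate.

    Defaults are reweighted by (population rate / sample rate) and non-defaults
    by the corresponding ratio of non-default rates, then renormalized.
    """
    for name, value in (("sample_default_rate", sample_default_rate),
                        ("population_default_rate", population_default_rate)):
        if not 0 < value < 1:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    if not 0 <= sample_pd <= 1:
        raise ValueError("sample_pd must be between 0 and 1")

    default_weight = population_default_rate / sample_default_rate
    nondefault_weight = (1 - population_default_rate) / (1 - sample_default_rate)
    numerator = sample_pd * default_weight
    return numerator / (numerator + (1 - sample_pd) * nondefault_weight)


# Hypothetical case: the development sample has a 10% default rate,
# while the population default rate is 2%.
adjusted = recalibrate_pd(sample_pd=0.10, sample_default_rate=0.10,
                          population_default_rate=0.02)
print(f"recalibrated PD: {adjusted:.4f}")
```

A borrower whose sample-based PD equals the sample default rate is mapped to the population default rate, and the adjustment leaves PDs unchanged when the two rates already agree. A validator would check that some adjustment of this kind has been applied and documented.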
An incorrect long-term average annual default rate results in an incorrect default probability, so validation must ensure that the long-term default rate is reasonably correct. Statistical central tendency is the average value to which population characteristics converge after many iterations of a given task. In applying central tendency to defaults, given relatively few defaults (i.e., few iterations) in any year during normal periods, it is not usually possible for the validation group to properly validate central tendency for at least 18 months. The time period is dependent on the markets, the nature of the lending facilities, and the characteristics of the customer segments. Validating central tendency in the long term is conducted through backtesting and stress testing.
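A simple backtest of a default probability can be sketched with a binomial check. This is only one possible approach, and it assumes defaults are independent across borrowers, which tends to understate the variability of default counts when defaults are correlated. The function name and the portfolio figures are hypothetical.

```python
from math import comb


def prob_at_least(defaults, borrowers, pd):
    """P(number of defaults >= defaults) for independent borrowers with the same PD."""
    if not 0 <= pd <= 1:
        raise ValueError("pd must be between 0 and 1")
    if not 0 <= defaults <= borrowers:
        raise ValueError("defaults must be between 0 and borrowers")
    # Sum the binomial probabilities below the observed count, then take the complement.
    prob_fewer = sum(
        comb(borrowers, k) * pd**k * (1 - pd) ** (borrowers - k)
        for k in range(defaults)
    )
    return max(0.0, 1.0 - prob_fewer)


# Hypothetical portfolio: 500 borrowers, assumed long-term PD of 1%,
# and 12 defaults observed over the year.
p_value = prob_at_least(defaults=12, borrowers=500, pd=0.01)
print(f"probability of 12 or more defaults if the PD is correct: {p_value:.4f}")
```

A very small probability suggests that the observed defaults are hard to reconcile with the assumed default rate. With few defaults in any single year, one year of data gives limited evidence either way, which is why the validation of central tendency takes time.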
The validation group should also watch market prices, consider information from the marketing department, and analyze significant transactions to determine appropriate benchmarks against which to compare the financial institution with its direct competitors.