Net reclassification improvement (NRI) is an index that attempts to quantify how well a new model reclassifies subjects, either appropriately or inappropriately, compared with an old model.[1] While the c-statistic, or area under the ROC curve (AUC), has been the standard metric for quantifying improvements in prediction over the last few decades, several studies have analyzed its limitations, including a lack of clinical relevance and difficulty in interpreting small changes.[2][3] These limitations are illustrated by the example of HDL and the Framingham Risk Score (FRS). When models with and without HDL were compared using AUC, HDL was not found to have a statistically significant effect on FRS. However, when analyzed in terms of outcomes, HDL was found to be a significant predictor of heart disease and thus should affect FRS.[4] To overcome this limitation, the concept of reclassification, that is, how well a new model correctly reclassifies cases, was introduced through the NRI.[5]
Basic Concept
NRI attempts to quantify how well a new model correctly reclassifies subjects. Typically the comparison is between an original model (e.g., hip fractures as a function of age and sex) and a new model consisting of the original model plus one additional component (e.g., hip fractures as a function of age, sex, and a genetic or proteomic biomarker). NRI has two components, one for subjects with events and one for subjects without events. Subjects without events who are correctly reclassified as lower risk are assigned +1, as are subjects with events who are correctly reclassified as higher risk. Subjects without events who are incorrectly reclassified as higher risk are assigned -1, as are subjects with events who are reclassified as lower risk. Subjects who remain in the same risk category are assigned 0. Within each group, the scores are summed and divided by the number of subjects in that group. The sum of these two values is the NRI.
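The scoring rule can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `categorical_nri`, its arguments, and the toy data are all hypothetical, and risk categories are assumed to be coded as ordered integers where a larger value means higher risk.

```python
def categorical_nri(old_risk, new_risk, event):
    """Categorical NRI from per-subject risk categories.

    old_risk, new_risk: ordinal categories (assumed: larger = higher risk).
    event: truthy if the subject had the event.
    Returns (nri_events, nri_nonevents, nri).
    """
    score_events = score_nonevents = 0
    n_events = n_nonevents = 0
    for old, new, had_event in zip(old_risk, new_risk, event):
        # +1 if the subject moved up, -1 if moved down, 0 if unchanged
        move = (new > old) - (new < old)
        if had_event:
            n_events += 1
            score_events += move      # moving up is correct for events
        else:
            n_nonevents += 1
            score_nonevents -= move   # moving down is correct for non-events
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("need at least one event and one non-event")
    nri_e = score_events / n_events
    nri_ne = score_nonevents / n_nonevents
    return nri_e, nri_ne, nri_e + nri_ne


# Hypothetical toy data: 0 = low risk, 1 = high risk
old = [0, 0, 1, 1, 0, 1]
new = [1, 0, 1, 0, 0, 1]
evt = [1, 1, 1, 0, 0, 0]
nri_e, nri_ne, nri = categorical_nri(old, new, evt)
print(nri_e, nri_ne, nri)
```

In the toy data, one of three subjects with events moves up and one of three subjects without events moves down, so each component is 1/3.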
Example
Test 2 | Outcome | Test 1 abnormal | Test 1 normal | Total, split | Total
---|---|---|---|---|---
Abnormal | Event | 18 | 4 | 22 | 28
Abnormal | Non-event | 2 | 4 | 6 |
Normal | Event | 2 | 6 | 8 | 72
Normal | Non-event | 8 | 56 | 64 |
Total, split | Event | 20 | 10 | 30 |
Total, split | Non-event | 10 | 60 | 70 |
Total | | 30 | 70 | | 100
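In a perfect test, all subjects with events would be classified as abnormal and all subjects without events would be classified as normal. Both tests correctly classify 18 subjects with events (abnormal on both) and 56 subjects without events (normal on both), and both incorrectly classify 6 subjects with events (normal on both) and 2 subjects without events (abnormal on both). Test 2 correctly reclassifies 4 subjects with events (normal to abnormal) and 8 subjects without events (abnormal to normal); it incorrectly reclassifies 2 subjects with events (abnormal to normal) and 4 subjects without events (normal to abnormal). Therefore NRIe = (4-2)/30 ≈ 0.067 and NRIne = (8-4)/70 ≈ 0.057. The NRI is their sum, approximately 0.12.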
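The arithmetic of this example can be checked with a short Python sketch. The counts are taken from the table above; the dictionary layout and variable names are illustrative choices, not part of any standard package.

```python
# Counts from the example; keys are (test 1 result, test 2 result).
events = {
    ("abnormal", "abnormal"): 18,
    ("normal", "abnormal"): 4,    # correctly reclassified upward
    ("abnormal", "normal"): 2,    # incorrectly reclassified downward
    ("normal", "normal"): 6,
}
non_events = {
    ("abnormal", "abnormal"): 2,
    ("normal", "abnormal"): 4,    # incorrectly reclassified upward
    ("abnormal", "normal"): 8,    # correctly reclassified downward
    ("normal", "normal"): 56,
}

UP = ("normal", "abnormal")
DOWN = ("abnormal", "normal")

# Events: net proportion moving up. Non-events: net proportion moving down.
nri_events = (events[UP] - events[DOWN]) / sum(events.values())
nri_non_events = (non_events[DOWN] - non_events[UP]) / sum(non_events.values())
nri = nri_events + nri_non_events

print(f"NRIe  = {nri_events:.3f}")      # 0.067
print(f"NRIne = {nri_non_events:.3f}")  # 0.057
print(f"NRI   = {nri:.3f}")             # 0.124
```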
Continuous Net Reclassification Improvement
Continuous net reclassification improvement is a category-free and more objective form of net reclassification improvement: it does not depend on a choice of risk categories. Furthermore, continuous NRI is less affected by event rates.[6]
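A minimal Python sketch of the category-free version follows. It assumes the usual definition in which any increase in predicted probability counts as an upward move and any decrease as a downward move. The function name `continuous_nri` and the toy probabilities are hypothetical.

```python
def continuous_nri(old_prob, new_prob, event):
    """Category-free NRI: any change in predicted risk counts as a move.

    old_prob, new_prob: predicted probabilities from the old and new models.
    event: truthy if the subject had the event.
    """
    up_e = down_e = n_e = 0
    up_ne = down_ne = n_ne = 0
    for p_old, p_new, had_event in zip(old_prob, new_prob, event):
        if had_event:
            n_e += 1
            up_e += p_new > p_old
            down_e += p_new < p_old
        else:
            n_ne += 1
            up_ne += p_new > p_old
            down_ne += p_new < p_old
    if n_e == 0 or n_ne == 0:
        raise ValueError("need at least one event and one non-event")
    # Events should move up; non-events should move down.
    return (up_e - down_e) / n_e + (down_ne - up_ne) / n_ne


# Hypothetical predicted probabilities for six subjects
old_p = [0.2, 0.4, 0.6, 0.3, 0.5, 0.1]
new_p = [0.3, 0.5, 0.5, 0.2, 0.5, 0.2]
evt = [1, 1, 1, 0, 0, 0]
print(continuous_nri(old_p, new_p, evt))
```

In the toy data, two of three events move up and one moves down (net 1/3), while the non-events contribute one upward and one downward move (net 0).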
Disconfirmation
NRI has been demonstrated to be invalid. Specifically, the NRI is likely to be positive even for uninformative markers. This is not the case for other metrics such as the area under the curve (AUC), the Brier score, or net benefit.[7]
R Packages for Statistical Coding
PredictABEL: an R package for the assessment of risk prediction models.[8]
survIDINRI: IDI and NRI for comparing competing risk prediction models with censored survival data.[9]
nricens: calculates the net reclassification improvement (NRI) for risk prediction models with time-to-event and binary data.[10]
References
- ^ Leening MJG, Vedder MM, Witteman JCM, Pencina MJ, Steyerberg EW. Net reclassification improvement: computation, interpretation, and controversies: a literature review and clinician’s guide. Ann Intern Med. 2014;160(2):122-131.
- ^ Cook NR. Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation. 2007;115(7):928-935.
- ^ Pencina MJ, D’Agostino RB, Pencina KM, Janssens ACJW, Greenland P. Interpreting incremental value of markers added to risk prediction models. Am J Epidemiol. 2012;176(6):473-481.
- ^ Steyerberg EW, Van Calster B, Pencina MJ. Performance Measures for Prediction Models and Markers: Evaluation of Predictions and Classifications. Revista Española de Cardiología (English Edition). 2011;64(9):788-794.
- ^ Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008;27(2):157-172; discussion 207-212.
- ^ Pencina, M.J., D'Agostino Sr, R.B. and Steyerberg, E.W., 2011. Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Statistics in medicine, 30(1), pp.11-21.
- ^ Pepe, M. S., Fan, J., Feng, Z., Gerds, T., & Hilden, J. (2015). The net reclassification index (NRI): a misleading measure of prediction improvement even with independent test data sets. Statistics in biosciences, 7(2), 282-295.
- ^ Kundu S, Aulchenko YS, van Duijn CM, Janssens AC. PredictABEL: an R package for the assessment of risk prediction models. European journal of epidemiology. 2011 Apr;26(4):261-4.
- ^ Uno, H. and Cai, T., 2013. survIDINRI: IDI and NRI for comparing competing risk prediction models with censored survival data. R. Package Version, p.1.
- ^ Inoue, Eisuke (2018-05-30), nricens: NRI for Risk Prediction Models with Time to Event and Binary Response Data, retrieved 2022-04-06