Section 6. Economic security
https://doi.org/10.29013/EJEMS-19-4-61-65
Wang Zongdi, Hong Kong International School, China E-mail: [email protected]
DEVELOPMENT OF A PREDICTIVE MODEL FOR DENIAL OF HOME MORTGAGE
Abstract
Objective: This study aims to build a predictive model for the denial of home mortgage applications in Washington State using a logistic regression model.
Methods: A public database was used in this study. A logistic regression model was fitted, and the area under the curve, optimal cutoff point, misclassification error, sensitivity and specificity were calculated.
Results: A total of 49,324 (19.3%) of 255,379 home mortgage applications were denied. According to the logistic regression, refinancing applications had 339.1% higher odds of denial and home improvement applications had 291.8% higher odds. Black applicants had 77.8% higher odds of denial, Asian applicants 33.5%, Hispanic applicants 36.3% and applicants of other races 62.3%. FHA, FSA/RHS and VA loans were also more likely to be denied. Applicants without co-applicants had 59.1% higher odds of denial.
The area under the curve was 0.7052. The optimal cutoff point was 0.459. The misclassification error was 0.1905. The sensitivity was about 5.4% and the specificity was 99.0%.
Conclusions: In this study, we identified several important predictors of home mortgage denial in Washington State in 2016, such as race and mortgage type.
Keywords:
1. Introduction
The five most common reasons for the denial of a home mortgage loan application are poor credit history, insufficient income or asset documentation, a down payment that is too small, problems with the property, and inadequate employment history.
Recent news articles suggest that the significantly higher mortgage denial rates for Black and Hispanic borrowers establish the presence of racial discrimination in mortgage lending.
This study aims to build a predictive model for the denial of home mortgage applications in Washington State using a logistic regression model.
2. Data and Methods:
Data:
The data set contains 466,566 observations of Washington State home loans. Variables include demographic information, area-specific data, loan status, property type, loan type, loan purpose and originating agency. The data are available at: https://www.kaggle.com/miker400/washington-state-home-mortgage-hdma2016.
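Methods:
A logistic regression model was used to predict denial. Model performance was assessed with the area under the curve, the optimal cutoff point, the misclassification error, sensitivity and specificity.
Optimal cutoff: the probability threshold for binary classification that maximizes accuracy.
Misclassification error: the proportion of all events that were incorrectly classified, for a given probability cutoff score.
Sensitivity: the probability that a test result will be positive when the disease is present (true positive rate, expressed as a percentage). In this study, a positive result is a predicted denial and the condition is an actual denial.
Specificity: the probability that a test result will be negative when the disease is not present (true negative rate, expressed as a percentage).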
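The definitions above can be sketched in code. The following is a minimal illustration and not the procedure used in this study: the function names, the toy labels and scores, and the grid of candidate cutoffs are all assumptions, with a denial coded as 1.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], scores: Sequence[float],
                     cutoff: float) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) when scores >= cutoff are predicted as 1 (denied)."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= cutoff else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def misclassification_error(y_true, scores, cutoff) -> float:
    """Share of all observations that are classified incorrectly."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, cutoff)
    return (fp + fn) / (tp + fp + tn + fn)


def sensitivity(y_true, scores, cutoff) -> float:
    """True positive rate: share of actual denials predicted as denials."""
    tp, _, _, fn = confusion_counts(y_true, scores, cutoff)
    return tp / (tp + fn)


def specificity(y_true, scores, cutoff) -> float:
    """True negative rate: share of actual approvals predicted as approvals."""
    _, fp, tn, _ = confusion_counts(y_true, scores, cutoff)
    return tn / (tn + fp)


def optimal_cutoff(y_true, scores, candidates) -> float:
    """Candidate cutoff with the lowest misclassification error (first one on ties)."""
    return min(candidates, key=lambda c: misclassification_error(y_true, scores, c))


# Toy data, hypothetical and unrelated to the mortgage data set.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.80]
candidates = [i / 100 for i in range(1, 100)]

best = optimal_cutoff(y_true, scores, candidates)
print("optimal cutoff:", best)
print("misclassification error:", misclassification_error(y_true, scores, best))
print("sensitivity:", sensitivity(y_true, scores, best))
print("specificity:", specificity(y_true, scores, best))
```

Because accuracy equals one minus the misclassification error, minimizing the error over the candidate grid is the same as maximizing accuracy.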
3. Results
A total of 49,324 (19.3%) of 255,379 home mortgage applications were denied.
Figure 1. Matrix of correlations between variables
Table 1.- Logistic Regression for Denial of Home Mortgage
Variable                          Estimate  Std. Error  z value   Pr(>|z|)
(Intercept)                       -1.219    0.075       -16.257   < 2e-16 ***
tract to msamd income             -0.003    0.000       -9.400    < 2e-16 ***
population                        0.000     0.000       -6.016    0.000 ***
minority_population               0.002     0.001       3.593     0.000 ***
number of owner occupied units    0.000     0.000       -0.476    0.634
number of 1 to 4 family units     0.000     0.000       5.633     0.000 ***
loan amount 000s                  0.000     0.000       1.760     0.078
hud median family income          0.000     0.000       -20.314   < 2e-16 ***
applicant income 000s             -0.001    0.000       -10.018   < 2e-16 ***
Type_FHA                          0.523     0.024       22.182    < 2e-16 ***
Type_FSARHS                       0.456     0.085       5.364     0.000 ***
Type_VA                           0.137     0.027       5.115     0.000 ***
Hom imp                           1.366     0.031       44.105    < 2e-16 ***
Refinancing                       1.480     0.018       81.418    < 2e-16 ***
No co app                         0.465     0.016       29.803    < 2e-16 ***
Male                              0.001     0.017       0.032     0.975
Black                             0.575     0.042       13.627    < 2e-16 ***
Asian                             0.289     0.026       10.954    < 2e-16 ***
Other race                        0.485     0.047       10.359    < 2e-16 ***
Hispanic                          0.310     0.031       9.925     < 2e-16 ***
According to the logistic regression, refinancing applications had 339.1% higher odds of denial and home improvement applications had 291.8% higher odds. Black applicants had 77.8% higher odds of denial, Asian applicants 33.5%, Hispanic applicants 36.3% and applicants of other races 62.3%. FHA, FSA/RHS and VA loans were also more likely to be denied. Applicants without co-applicants had 59.1% higher odds of denial.
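These percentages describe changes in odds rather than in probability. As a minimal sketch of the difference, the snippet below applies the refinancing odds ratio of 4.391 to a baseline denial probability. The function name `shift_probability` is hypothetical, and using the overall denial rate of 19.3% as the baseline is an assumption made only for illustration; the denial rate of the reference group is not reported.

```python
def shift_probability(baseline_prob: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability and return the new probability."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Assumption: the overall denial rate stands in for the reference group's rate.
baseline = 0.193
refinancing_or = 4.391  # odds ratio reported for refinancing

p_refinancing = shift_probability(baseline, refinancing_or)
print(f"baseline probability: {baseline:.3f}")
print(f"probability after applying the odds ratio: {p_refinancing:.3f}")
print(f"ratio of probabilities: {p_refinancing / baseline:.2f}")
```

Under this assumed baseline, the ratio of probabilities is noticeably smaller than the odds ratio, which is why the reported percentages are best read as increases in odds.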
Table 2.- Odds Ratio According to Logistic Regression
Variable                          OR      Risk Increase
Refinancing                       4.391   3.391
Hom imp                           3.918   2.918
Black                             1.778   0.778
Type_FHA                          1.687   0.687
Other race                        1.623   0.623
No co app                         1.591   0.591
Type_FSARHS                       1.578   0.578
Hispanic                          1.363   0.363
Asian                             1.335   0.335
Type_VA                           1.146   0.146
minority_population               1.002   0.002
Male                              1.001   0.001
number of 1 to 4 family units     1.000   0.000
loan amount 000s                  1.000   0.000
hud median family income          1.000   0.000
number of owner occupied units    1.000   0.000
population                        1.000   0.000
applicant income 000s             0.999   -0.001
tract to msamd income             0.997   -0.003
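The odds ratios in this table follow from the regression estimates by exponentiation, and the risk increase is the odds ratio minus one. The sketch below illustrates this for a few variables; the helper name `odds_ratio` is hypothetical, and because the published estimates are rounded to three decimals, the recomputed values can differ from the table in the last digit.

```python
import math


def odds_ratio(beta: float) -> float:
    """Odds ratio implied by a logistic regression coefficient (log-odds)."""
    return math.exp(beta)


# Estimates copied from the logistic regression table (rounded as published).
estimates = {
    "Refinancing": 1.480,
    "Hom imp": 1.366,
    "Black": 0.575,
    "No co app": 0.465,
}

for name, beta in estimates.items():
    ratio = odds_ratio(beta)
    increase = ratio - 1.0  # reported in the table as "Risk Increase"
    print(f"{name}: OR = {ratio:.2f}, increase in odds = {increase:.0%}")
```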
Figure 2. Odds Ratio (blue) and Risk Increase (red) According to Logistic Regression
Figure 3. ROC in testing sample for Logistic Regression
The area under the curve was 0.7052. The optimal cutoff point was 0.459. The misclassification error was 0.1905. The sensitivity was about 5.4% and the specificity was 99.0%.
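As a rough consistency check on these figures, the misclassification error can be recomputed from the sensitivity, the specificity and the share of denied applications. The sketch below assumes that the denial rate in the testing sample equals the overall rate of 19.3%, which is not stated in the text; the function name `expected_error` is hypothetical.

```python
def expected_error(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Misclassification error implied by prevalence, sensitivity and specificity."""
    missed_denials = prevalence * (1.0 - sensitivity)           # false negatives
    false_alarms = (1.0 - prevalence) * (1.0 - specificity)     # false positives
    return missed_denials + false_alarms


# Assumption: the testing sample has the same denial rate as the full data (19.3%).
error = expected_error(prevalence=0.193, sensitivity=0.054, specificity=0.990)
print(f"implied misclassification error: {error:.4f}")
```

Under this assumption the implied error is close to the reported 0.1905, and almost all of it comes from denied applications that the model does not flag.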
4. Discussion
A total of 49,324 (19.3%) of 255,379 home mortgage applications were denied. According to the logistic regression, refinancing applications had 339.1% higher odds of denial and home improvement applications had 291.8% higher odds. Black applicants had 77.8% higher odds of denial, Asian applicants 33.5%, Hispanic applicants 36.3% and applicants of other races 62.3%. FHA, FSA/RHS and VA loans were also more likely to be denied. Applicants without co-applicants had 59.1% higher odds of denial.
The area under the curve was 0.7052. The optimal cutoff point was 0.459. The misclassification error was 0.1905. The sensitivity was about 5.4% and the specificity was 99.0%, so at this cutoff the model identified only a small share of the applications that were actually denied.
In this study, we identified several important predictors of home mortgage denial in Washington State in 2016, such as race and mortgage type.