Crime rate prediction using house properties via artificial neural network versus linear regression models

Wang Yuzhe

Section 7. Population economics

Wang Yuzhe, Cushing Academy


Abstract:

Objective: This study aimed to build a predictive model for crime rate based on 13 house features using artificial neural network versus linear regression models.

Methods: The Boston housing data set, publicly available at https://archive.ics.uci.edu/ml/datasets/Housing, was used for this study. Per capita crime rate by town was the outcome of interest, and the thirteen other features were used as predictors, namely: 1) proportion of residential land zoned for lots over 25,000 sq. ft., 2) proportion of non-retail business acres per town, 3) Charles River dummy variable (= 1 if tract bounds river; 0 otherwise), 4) nitric oxides concentration (parts per 10 million), 5) average number of rooms per dwelling, 6) proportion of owner-occupied units built prior to 1940, 7) weighted distances to five Boston employment centers, 8) index of accessibility to radial highways, 9) full-value property-tax rate per $10,000, 10) pupil-teacher ratio by town, 11) 1000(Bk - 0.63)^2, where Bk is the proportion of blacks by town, 12) % lower status of the population, and 13) median value of owner-occupied homes in $1000's. All records were randomly assigned to one of two groups: a training sample (75%) and a testing sample (25%). Two models were built on the training sample: an artificial neural network and a linear regression. The neural network has an input layer with 13 inputs, two hidden layers with 5 and 3 neurons, and an output layer with a single output. Mean squared errors (MSE) were calculated and compared between the two models. Cross validation was conducted using a loop for the neural network and the cv.glm function in the boot package for the linear model. The R package "neuralnet" was used for the neural network analysis.

Results: On the testing sample, the MSE was 93.5 for the linear regression and 64.7 for the artificial neural network, so the neural network clearly performed better. In cross validation, the average MSE for the neural network (37.0) was lower than that of the linear model (43.12), although the MSEs varied to a certain degree across the cross-validation runs. This variation may depend on how the data were split or on the random initialization of the weights in the net.

Conclusions: In this study, we built a predictive model for crime rate using a neural network and compared its performance with a more popular approach, linear regression. This study suggests that it is possible to develop a reproducible and transportable predictive instrument for crime rate using commonly available housing features.

Keywords: crime rate, prediction model, linear regression, neural network.

1. Introduction

Based on a review of the extant literature and discussions with various officials at all jurisdiction levels across the country, it is highly doubtful that any serious, systematic forecasting of crime rates is done anywhere. It is safe to say that the current approach to forecasting crime, insofar as it exists, is extremely crude, for example, mapping crimes by police precinct or beat, and then assigning more resources to the areas with the most hits in the past.

Public safety is the most important metric for elected officials, especially at the local level, and allocating scarce crime-fighting resources efficiently is an essential element of achieving this goal [1]. Several potential reasons for this failure come to mind. The first is that existing tools may simply be insufficient to provide meaningful forecasts. Technical forecasting, using economic models, computer technology, and mapping tools, is a modern phenomenon [2], and these methods are unproven, occasionally difficult to interpret, and occasionally expensive to set up and operate, especially for communities with tight budgets.

An artificial neural network (ANN), often just called a "neural network" (NN), is a mathematical or computational model based on biological neural networks; in other words, it is an emulation of a biological neural system. This type of model has been used in other areas, such as medicine. To the best of our knowledge, however, no study in the literature has integrated commonly available housing features using an artificial neural network to predict crime rates. We compared the performance of the artificial neural network with that of linear regression in terms of predictive ability.

2. Data and methods

The Boston housing data set, publicly available at https://archive.ics.uci.edu/ml/datasets/Housing, was used for this study. Per capita crime rate by town was the outcome of interest, and the thirteen other features were used as predictors, namely: 1) proportion of residential land zoned for lots over 25,000 sq. ft., 2) proportion of non-retail business acres per town, 3) Charles River dummy variable (= 1 if tract bounds river; 0 otherwise), 4) nitric oxides concentration (parts per 10 million), 5) average number of rooms per dwelling, 6) proportion of owner-occupied units built prior to 1940, 7) weighted distances to five Boston employment centers, 8) index of accessibility to radial highways, 9) full-value property-tax rate per $10,000, 10) pupil-teacher ratio by town, 11) 1000(Bk - 0.63)^2, where Bk is the proportion of blacks by town, 12) % lower status of the population, and 13) median value of owner-occupied homes in $1000's. All records were randomly assigned to one of two groups: a training sample (75%) and a testing sample (25%). Two models were built on the training sample: an artificial neural network and a linear regression. The neural network has an input layer with 13 inputs, two hidden layers with 5 and 3 neurons, and an output layer with a single output. Mean squared errors (MSE) were calculated and compared between the two models. Cross validation was conducted using a loop for the neural network and the cv.glm function in the boot package for the linear model. The R package "neuralnet" was used for the neural network analysis.

Variables:

1. CRIM: per capita crime rate by town
2. ZN: proportion of residential land zoned for lots over 25,000 sq. ft.
3. INDUS: proportion of non-retail business acres per town
4. CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
5. NOX: nitric oxides concentration (parts per 10 million)
6. RM: average number of rooms per dwelling
7. AGE: proportion of owner-occupied units built prior to 1940
8. DIS: weighted distances to five Boston employment centers
9. RAD: index of accessibility to radial highways
10. TAX: full-value property-tax rate per $10,000
11. PTRATIO: pupil-teacher ratio by town
12. B: 1000(Bk - 0.63)^2, where Bk is the proportion of blacks by town
13. LSTAT: % lower status of the population
14. MEDV: median value of owner-occupied homes in $1000's
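The random 75%/25% assignment described in this section can be sketched as follows. This is a minimal Python illustration rather than the R code used in the study: the function name `split_indices`, the seed value, and the choice to round the training size up (which reproduces the group sizes of 380 and 126 reported in Table 2) are all assumptions.

```python
import math
import random

N_RECORDS = 506        # number of records in the Boston housing data set
TRAIN_FRACTION = 0.75  # share of records assigned to the training sample


def split_indices(n_records, train_fraction, seed):
    """Randomly partition record indices into training and testing lists."""
    indices = list(range(n_records))
    # The seed is an arbitrary, hypothetical choice made for reproducibility.
    random.Random(seed).shuffle(indices)
    # Rounding up is an assumption: 0.75 * 506 = 379.5, and Table 2 reports 380.
    n_train = math.ceil(train_fraction * n_records)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(N_RECORDS, TRAIN_FRACTION, seed=42)
print(len(train_idx), len(test_idx))  # prints: 380 126
```

Because the assignment is random, a different seed gives a different split, which is one reason the test-set MSE can vary from run to run.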

3. Results

The per capita crime rate by town was 3.21 in the training group and 4.84 in the testing group; overall it was 3.61.

Table 2. Crime Rate And House Properties In Training And Testing Groups

| Variable | Training (N=380) Mean | Training Std Dev | Training Min | Training Max | Testing (N=126) Mean | Testing Std Dev | Testing Min | Testing Max | Overall (N=506) Mean | Overall Std Dev | Overall Min | Overall Max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CRIM | 3.21 | 7.44 | 0.01 | 73.53 | 4.84 | 11.36 | 0.01 | 88.98 | 3.61 | 8.6 | 0.01 | 88.98 |
| ZN | 12.52 | 24.49 | 0 | 100 | 7.87 | 19.05 | 0 | 95 | 11.36 | 23.32 | 0 | 100 |
| INDUS | 10.84 | 6.89 | 0.46 | 27.74 | 12.02 | 6.72 | 0.74 | 27.74 | 11.14 | 6.86 | 0.46 | 27.74 |
| CHAS | 7% | | 0 | 1 | 6% | | 0 | 1 | 7% | | 0 | 1 |
| NOX | 0.55 | 0.12 | 0.39 | 0.87 | 0.56 | 0.11 | 0.4 | 0.87 | 0.55 | 0.12 | 0.39 | 0.87 |
| RM | 6.28 | 0.72 | 3.56 | 8.78 | 6.31 | 0.66 | 4.93 | 8.4 | 6.28 | 0.7 | 3.56 | 8.78 |
| AGE | 68.47 | 27.65 | 6.2 | 100 | 68.9 | 29.73 | 2.9 | 100 | 68.57 | 28.15 | 2.9 | 100 |
| DIS | 3.87 | 2.17 | 1.14 | 12.13 | 3.56 | 1.9 | 1.13 | 9.19 | 3.8 | 2.11 | 1.13 | 12.13 |
| RAD | 9.19 | 8.51 | 1 | 24 | 10.63 | 9.24 | 1 | 24 | 9.55 | 8.71 | 1 | 24 |
| TAX | 401.11 | 164.98 | 187 | 711 | 429.72 | 177.79 | 188 | 711 | 408.24 | 168.54 | 187 | 711 |
| PTRATIO | 18.47 | 2.16 | 12.6 | 22 | 18.43 | 2.19 | 13 | 21.2 | 18.46 | 2.16 | 12.6 | 22 |
| B | 363.47 | 81.93 | 2.52 | 396.9 | 336.17 | 112.95 | 0.32 | 396.9 | 356.67 | 91.29 | 0.32 | 396.9 |
| LSTAT | 12.55 | 7.23 | 1.73 | 37.97 | 12.97 | 6.9 | 2.96 | 30.81 | 12.65 | 7.14 | 1.73 | 37.97 |
| MEDV | 22.54 | 8.98 | 5 | 50 | 22.51 | 9.86 | 5 | 50 | 22.53 | 9.2 | 5 | 50 |

Proportion of residential land zoned for lots over 25,000 sq. ft., nitric oxides concentration (parts per 10 million), weighted distances to five Boston employment centers, index of accessibility to radial highways, and 1000(Bk - 0.63)^2, where Bk is the proportion of blacks by town, were significant predictors of crime rate per capita by town (p < 0.05).

Table 3. Linear Regression Model To Predict Crime Rate Per Capita By Town

| Variable | Estimate | Std. Error | t value | Pr(>\|t\|) | Significance |
|---|---|---|---|---|---|
| ZN | 0.04 | 0.02 | 2.35 | 0.019 | * |
| INDUS | -0.08 | 0.08 | -1.10 | 0.273 | |
| CHAS | -0.49 | 1.04 | -0.47 | 0.641 | |
| NOX | -10.25 | 4.79 | -2.14 | 0.033 | * |
| RM | -0.56 | 0.53 | -1.05 | 0.292 | |
| AGE | 0.01 | 0.02 | 0.69 | 0.492 | |
| DIS | -0.77 | 0.25 | -3.08 | 0.002 | ** |
| RAD | 0.51 | 0.08 | 6.24 | 0.000 | *** |
| TAX | 0.00 | 0.00 | -0.48 | 0.629 | |
| PTRATIO | -0.12 | 0.17 | -0.72 | 0.474 | |
| B | -0.02 | 0.00 | -5.96 | 0.000 | *** |
| LSTAT | 0.12 | 0.07 | 1.78 | 0.077 | . |
| MEDV | -0.10 | 0.05 | -1.79 | 0.074 | . |

***: p < 0.001; **: p < 0.01; *: p < 0.05; .: p < 0.10
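The relationship between the columns of Table 3 can be checked with a short calculation: each t value is the estimate divided by its standard error, and the p value is the two-sided tail probability of that statistic. The sketch below is illustrative only. It uses a standard normal approximation to the t distribution (an assumption that is reasonable with several hundred residual degrees of freedom), and because it starts from the rounded table entries it reproduces the reported values only approximately.

```python
import math


def t_value(estimate, std_error):
    """t statistic of a regression coefficient: estimate / standard error."""
    return estimate / std_error


def two_sided_p_normal(t):
    """Two-sided p value under a standard normal approximation (assumption)."""
    # erfc(|t| / sqrt(2)) equals 2 * (1 - Phi(|t|)) for the standard normal CDF Phi.
    return math.erfc(abs(t) / math.sqrt(2))


# Rounded estimates and standard errors taken from Table 3.
t_nox = t_value(-10.25, 4.79)  # Table 3 reports -2.14
t_dis = t_value(-0.77, 0.25)   # Table 3 reports -3.08
p_nox = two_sided_p_normal(t_nox)
p_dis = two_sided_p_normal(t_dis)
print(round(t_nox, 2), round(t_dis, 2))
```

For rows whose estimate and standard error are both small, such as ZN (0.04 and 0.02), the rounded entries are too coarse to recover the reported t value.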

Figure 1. Artificial Neural Network For Crime Rate Per Capita By Town

The black lines show the connections between each layer and the weights on each connection, while the blue lines show the bias term added in each step. The bias can be thought of as the intercept of a linear model. The net is essentially a black box, so we cannot say much about the fitting, the weights, and the model. Suffice it to say that the training algorithm has converged and therefore the model is ready to be used.
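To make the role of the weights and bias terms concrete, the following sketch runs one forward pass through a network with the 13-5-3-1 layout described in the methods. It is a hedged illustration, not the fitted model: the weights are random placeholders, and the logistic activation in the hidden layers and the linear output unit are assumptions. The helper names `init_weights`, `forward`, and `count_parameters` are hypothetical.

```python
import math
import random

LAYER_SIZES = [13, 5, 3, 1]  # inputs, two hidden layers, single output


def init_weights(sizes, seed=0):
    """One weight row per neuron; element 0 of each row is the bias term."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(layers, features):
    """Propagate one record through the net and return the predicted value."""
    activations = list(features)
    for position, layer in enumerate(layers):
        is_output = position == len(layers) - 1
        next_activations = []
        for row in layer:
            # The bias row[0] acts like the intercept of a linear model.
            z = row[0] + sum(w * a for w, a in zip(row[1:], activations))
            next_activations.append(z if is_output else logistic(z))
        activations = next_activations
    return activations[0]


def count_parameters(layers):
    return sum(len(row) for layer in layers for row in layer)


layers = init_weights(LAYER_SIZES)
prediction = forward(layers, [0.5] * 13)  # placeholder inputs scaled to [0, 1]
print(count_parameters(layers))  # (13+1)*5 + (5+1)*3 + (3+1)*1 = 92
```

Even this small network has 92 free parameters, which helps explain why the individual weights are hard to interpret compared with the 13 slope coefficients of the linear regression.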


Figure 2. Real vs Predicted Crime Rate In Artificial Neural Network And Linear Regression Model

Figure 3. MSE for Artificial Neural Network for Testing Group

By visually inspecting the plot, we can see that the predictions made by the neural network are, in general, more concentrated around the line than those made by the linear model (a perfect alignment with the line would indicate an MSE of 0 and thus an ideal prediction).
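The MSE used to compare the two models is the average of the squared differences between observed and predicted crime rates. A minimal sketch follows; the function name `mean_squared_error` and the toy numbers are illustrative and are not values from the study.

```python
def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Predictions that fall exactly on the line give an MSE of 0.
perfect = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
# Errors of 3 and 4 give (9 + 16) / 2 = 12.5.
imperfect = mean_squared_error([0.0, 0.0], [3.0, 4.0])
print(perfect, imperfect)  # prints: 0.0 12.5
```

Because the errors are squared, a few badly predicted towns with very high crime rates can dominate the MSE.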

Cross validation is another very important step in building predictive models. In cross validation, the average MSE for the neural network (40.6) was lower than that of the linear model (43.18), although there seems to be a certain degree of variation in the MSEs of the cross validation. This may depend on the splitting of the data or the random initialization of the weights in the net.
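A cross-validation loop of the kind mentioned for the neural network can be sketched as below. This is an assumption-laden outline rather than the procedure actually used: the number of folds is not stated in the text, so k = 10 is a placeholder, and a trivial mean predictor stands in for the fitted models so that the example stays self-contained. All function names are hypothetical.

```python
import random


def k_fold_indices(n_records, k, seed=0):
    """Shuffle record indices and deal them into k disjoint folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_mse(records, fit, predict, k=10, seed=0):
    """Return one held-out MSE per fold; records are (features, target) pairs."""
    fold_mses = []
    for held_out in k_fold_indices(len(records), k, seed):
        held_out_set = set(held_out)
        training = [r for i, r in enumerate(records) if i not in held_out_set]
        model = fit(training)
        squared_errors = [
            (records[i][1] - predict(model, records[i][0])) ** 2 for i in held_out
        ]
        fold_mses.append(sum(squared_errors) / len(squared_errors))
    return fold_mses


# Placeholder model (assumption): always predicts the mean training target.
def fit_mean(training):
    return sum(target for _, target in training) / len(training)


def predict_mean(model, features):
    return model


toy_records = [([float(i)], float(i % 7)) for i in range(50)]  # synthetic data
fold_mses = cross_validated_mse(toy_records, fit_mean, predict_mean, k=10)
average_mse = sum(fold_mses) / len(fold_mses)
print(len(fold_mses))  # prints: 10
```

Looking at the spread of the per-fold values, and not only their average, shows how much the result depends on the particular split.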

4. Discussion

Crime predictions can be developed through both qualitative and quantitative methods. Qualitative approaches to forecasting crime [3], such as environmental scanning, scenario writing, or Delphi groups, are particularly useful in identifying the future nature of criminal activity. In contrast, quantitative methods are used to predict the future scope of crime, and more specifically, crime rates. A common quantitative method for developing forecasts is to extrapolate annual crime rate trends developed through time series models. This approach also involves correlating past crime trends with factors that will influence the future scope of crime, in particular demographic and macro-economic variables.

In this study, we built a predictive model for crime rate using a neural network and compared its performance with a more popular approach, linear regression. This study suggests that it is possible to develop a reproducible and transportable predictive instrument for crime rate using commonly available housing features.

According to the linear regression, proportion of residential land zoned for lots over 25,000 sq. ft., nitric oxides concentration (parts per 10 million), weighted distances to five Boston employment centers, index of accessibility to radial highways, and 1000(Bk - 0.63)^2, where Bk is the proportion of blacks by town, were significant predictors of crime rate per capita by town.

This study has limitations. One of them is associated with the artificial neural network method. This method uses deep machine learning to explore the nonlinear association between crime rate and house properties; however, the nonlinearity makes it very hard to interpret the results, especially the association between the crime rate and individual predictors. In addition, other predictors of crime rate were not available in this database.

In conclusion, we used both an artificial neural network and a linear regression model to predict the crime rate per capita by town. We found that the artificial neural network performed better than linear regression, which is the traditional method for building a predictive model. We believe that deep machine learning could be used in crime rate prediction. This might help improve public safety in the future through better resource allocation.

References:

1. Todd M. Henderson et al. Predicting Crime. University of Chicago Law School Chicago Unbound. 2008.


2. Olligschlaeger A. M. Artificial Neural Networks and Crime Mapping, in D. Weisburd and T. McEwen, eds., Crime Mapping, Crime Prevention, Crime Prevention Studies 8 (1998).

3. Stephen Schneider et al. Predicting Crime: A Review of the Research. Summary Report. Research and Statistics Division 2002.
