https://doi.org/10.29013/EJEMS-21-3-34-39
Xiaochuan Luo, RDF International School, China E-mail: [email protected]
DEVELOPMENT OF A PREDICTIVE MODEL FOR AUTISM AMONG CHILDREN
Abstract
Objective: This study aims to 1) examine the predictors of autism and 2) build a predictive model for autism using logistic regression.
Methods: Data from the 2017 National Survey of Children's Health (NSCH) were used for this study. The NSCH is conducted by the U.S. Census Bureau for the U.S. Department of Health and Human Services' (HHS) Health Resources and Services Administration's (HRSA) Maternal and Child Health Bureau (MCHB). It is designed to provide national and state-level information about the physical and emotional health and wellbeing of children under the age of 18 living in mailable residential housing units in the United States, their families and their communities, as well as information about the prevalence and impact of children with special health care needs.
All the participants who were eligible were randomly assigned into 2 groups: training sample and testing sample. A logistic regression model was built using training sample. Receiver operating characteristic (ROC) was calculated.
Results: About 2.82% of the 19,047 children had autism: 4.29% of the 9,789 male children and 1.28% of the 9,258 female children.
According to the logistic regression, each one-year increase in the first adult's age was associated with higher odds of the child having autism (OR=1.018). When the first adult had worse mental health, the child was more likely to have autism (OR=1.245).
Each one-year increase in the child's age was associated with higher odds of autism (OR=1.046). Female children were less likely to have autism (OR=0.292). Children with normal birth weight were less likely to have autism (OR=0.453).
Children in families that found it hard to cover basics like food or housing were more likely to have autism (OR=1.264). Children who lived with a mentally ill person were less likely to have autism (OR=0.488). Children who lived with someone with an alcohol or drug problem were more likely to have autism (OR=1.650).
The area under the curve was 0.7263. The optimal cutoff was 0.288. The misclassification error was 0.027. The sensitivity was about 1.16% and the specificity was 99.95%.
Conclusions: In this study, we identified important predictors of autism among children, for example the child's age and sex, and the mental health and alcohol/drug problems of adults.
Keywords: Autism, predictive model, children, mental health, logistic regression.
1. Introduction
About 1 in 6 (17%) children aged 3-17 years were diagnosed with a developmental disability, as reported by parents, during a study period of 2009-2017. These included autism, attention-deficit/hyperactivity disorder, blindness, and cerebral palsy, among others [1]. About 1 in 54 children has been identified with autism spectrum disorder (ASD) according to estimates from CDC's Autism and Developmental Disabilities Monitoring (ADDM) Network [2].
In this study, we aim to 1) examine the predictors of autism among children and 2) build a predictive model for autism using logistic regression.
2. Data and Methods:
Data:
Data from the 2017 National Survey of Children's Health (NSCH) were used for this study. The NSCH is conducted by the U.S. Census Bureau for the U.S. Department of Health and Human Services' (HHS) Health Resources and Services Administration's (HRSA) Maternal and Child Health Bureau (MCHB). It is designed to provide national and state-level information about the physical and emotional health and wellbeing of children under the age of 18 living in mailable residential housing units in the United States, their families and their communities, as well as information about the prevalence and impact of children with special health care needs.
Models:
We used logistic regression models to calculate the predicted risk. Logistic regression belongs to a category of statistical models called generalized linear models, and it allows one to predict a discrete outcome from a set of variables that may be continuous, discrete, dichotomous, or a combination of these. Typically, the dependent variable is dichotomous and the independent variables are either categorical or continuous.
The logistic regression model can be expressed with the formula:
$$\ln\left(\frac{P}{1-P}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$$
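As an illustration of the model above, the following is a minimal sketch in plain Python that fits a logistic regression on a random training split of synthetic data. It is not the code used in this study: the synthetic data, the 70/30 split, the gradient-ascent fitting routine, and all function names (`sigmoid`, `predict_proba`, `fit_logistic`) are assumptions made for the example.

```python
import math
import random

def sigmoid(z):
    """Inverse of the logit: maps a log-odds value z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(beta, x):
    """P(outcome = 1 | x) for coefficients beta = [b0, b1, ..., bn]."""
    z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return sigmoid(z)

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Estimate coefficients by batch gradient ascent on the log-likelihood."""
    n_features = len(X[0])
    beta = [0.0] * (n_features + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(beta, xi)
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Synthetic data with one predictor (hypothetical; not NSCH data).
random.seed(0)
true_beta = [-1.0, 2.0]
X = [[random.uniform(-2, 2)] for _ in range(500)]
y = [1 if random.random() < predict_proba(true_beta, x) else 0 for x in X]

# Random assignment to a training and a testing sample (70/30 is an assumption).
indices = list(range(len(X)))
random.shuffle(indices)
train_idx, test_idx = indices[:350], indices[350:]

beta_hat = fit_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
test_probs = [predict_proba(beta_hat, X[i]) for i in test_idx]
```

The predicted probabilities in `test_probs` are the kind of held-out scores that the evaluation measures in the next subsection operate on.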
Model evaluation:
Discriminatory ability, that is, the capacity of the model to separate cases from non-cases, was assessed using receiver operating characteristic (ROC) curve analysis; an area under the curve (AUC) of 1.0 indicates perfect discrimination and 0.5 indicates random discrimination. ROC curves are commonly used to summarize the diagnostic accuracy of risk models and to assess the improvements gained from adding other risk factors to such models. Sensitivity, specificity, and accuracy were also calculated and compared. For all these measures, statistical tests exist to determine whether one model exceeds another in discrimination ability.
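The area under the ROC curve can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. A minimal sketch of that pairwise computation follows; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (case, non-case) pairs in which the case
    has the higher score; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 2 non-cases and 2 cases with predicted risks.
labels = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, risks)
```

The double loop is quadratic in the sample size, which is acceptable for a sketch; rank-based formulas give the same value more efficiently on large samples.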
Optimal cutoff: the probability cutoff for binary classification that maximizes accuracy.
Misclassification error: the proportion of all observations that were incorrectly classified at a given probability cutoff.
Sensitivity: probability that a test result will be positive when the disease is present (true positive rate, expressed as a percentage).
Specificity: probability that a test result will be negative when the disease is not present (true negative rate, expressed as a percentage).
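The four cutoff-based measures defined above can be sketched together as follows. This is an illustrative implementation only: the function names, the grid of candidate cutoffs (0.01 to 0.99), the rule that a probability at or above the cutoff is classified as a case, and the toy data are all assumptions.

```python
def confusion_counts(y_true, probs, cutoff):
    """Return (tp, fp, tn, fn), classifying p >= cutoff as a case."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, probs):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def misclassification_error(y_true, probs, cutoff):
    """Share of all observations classified incorrectly at this cutoff."""
    tp, fp, tn, fn = confusion_counts(y_true, probs, cutoff)
    return (fp + fn) / (tp + fp + tn + fn)

def sensitivity(y_true, probs, cutoff):
    """True positive rate: tp / (tp + fn)."""
    tp, fp, tn, fn = confusion_counts(y_true, probs, cutoff)
    return tp / (tp + fn)

def specificity(y_true, probs, cutoff):
    """True negative rate: tn / (tn + fp)."""
    tp, fp, tn, fn = confusion_counts(y_true, probs, cutoff)
    return tn / (tn + fp)

def optimal_cutoff(y_true, probs, grid=None):
    """First cutoff on the grid with the lowest misclassification error,
    which is the same as the highest accuracy."""
    if grid is None:
        grid = [i / 100 for i in range(1, 100)]
    return min(grid, key=lambda c: misclassification_error(y_true, probs, c))

# Toy data (hypothetical): three non-cases followed by two cases.
y_true = [0, 0, 0, 1, 1]
probs = [0.05, 0.10, 0.20, 0.40, 0.60]
best = optimal_cutoff(y_true, probs)
```

Evaluating `sensitivity` and `specificity` at several cutoffs on the same predictions produces a cutoff table of the kind reported in the Results.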
Variables:
3. Results
About 2.82% of the 19,047 children had autism: 4.29% of the 9,789 male children and 1.28% of the 9,258 female children.
The correlations between variables are displayed as a corrgram, a graphical representation of the cells of a correlation matrix. The idea is to display the pattern of correlations in terms of their signs and magnitudes using visual thinning and correlation-based variable ordering. The cells of the matrix can also be shaded or colored to show the correlation value. Positive correlations are shown in blue and negative correlations in red; the darker the hue, the greater the magnitude of the correlation.
Figure 1. Matrix of correlations between variables
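The matrix behind a corrgram such as Figure 1 holds one correlation coefficient per pair of variables. The sketch below computes such a matrix assuming Pearson correlation, which the text does not specify; the function names and the toy columns are hypothetical and do not come from the NSCH data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlation_matrix(columns):
    """Nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}

# Toy columns (hypothetical values).
data = {
    "child_age": [3, 7, 10, 14, 17],
    "adult_age": [30, 35, 41, 44, 52],
    "score": [5, 4, 3, 2, 1],
}
corr = correlation_matrix(data)
```

In a corrgram, the sign of each entry determines the color of its cell and the absolute value determines the depth of the hue.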
According to the logistic regression, each one-year increase in the first adult's age was associated with higher odds of the child having autism (OR=1.018). When the first adult had worse mental health, the child was more likely to have autism (OR=1.245).
Each one-year increase in the child's age was associated with higher odds of autism (OR=1.046). Female children were less likely to have autism (OR=0.292). Children with normal birth weight were less likely to have autism (OR=0.453).
Children in families that found it hard to cover basics like food or housing were more likely to have autism (OR=1.264). Children who lived with a mentally ill person were less likely to have autism (OR=0.488). Children who lived with someone with an alcohol or drug problem were more likely to have autism (OR=1.650).
Table 1. - Logistic Regression
Variable         Estimate   Std. Error   z value   Pr(>|z|)   Signif.
(Intercept)         0.912        1.350     0.676      0.499
HHCOUNT             0.034        0.074     0.461      0.645
A1_SEX             -0.238        0.135    -1.758      0.079
A1_BORN            -0.107        0.208    -0.514      0.607
A1_GRADE           -0.013        0.035    -0.355      0.723
A1_MARITAL         -0.002        0.062    -0.033      0.974
A1_AGE              0.017        0.008     2.261      0.024   *
A1_PHYSHEALTH       0.000        0.086    -0.006      0.996
A1_MENTHEALTH       0.219        0.085     2.568      0.010   *
SC_AGE_YEARS        0.045        0.015     2.973      0.003   **
SC_SEX             -1.229        0.145    -8.488     <2e-16   ***
SC_RACE_R           0.026        0.034     0.777      0.437
AGEPOS4            -0.152        0.089    -1.715      0.086
SC_HISPANIC_R      -0.080        0.199    -0.402      0.687
BIRTHWT_L          -0.791        0.174    -4.551      0.000   ***
ACE1                0.234        0.076     3.064      0.002   **
ACE3               -0.039        0.177    -0.218      0.827
ACE4               -0.299        0.306    -0.976      0.329
ACE5                0.001        0.261     0.003      0.997
ACE6               -0.155        0.259    -0.601      0.548
ACE7               -0.460        0.252    -1.824      0.068
ACE8               -0.716        0.189    -3.789      0.000   ***
ACE9                0.501        0.236     2.117      0.034   *
ACE10              -0.136        0.284    -0.480      0.631
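The odds ratios quoted in the text correspond to the exponentiated estimates in Table 1, since in a logistic regression OR = exp(estimate). The short sketch below reproduces them from the tabulated values for the statistically significant terms. The Wald 95% interval, exp(estimate ± 1.96 × standard error), is a standard companion calculation added here as an assumption; it is not reported in the study. Differences in the third decimal place can arise because the table shows rounded coefficients.

```python
import math

# (estimate, standard error) copied from Table 1, significant terms only.
table1 = {
    "A1_AGE": (0.017, 0.008),
    "A1_MENTHEALTH": (0.219, 0.085),
    "SC_AGE_YEARS": (0.045, 0.015),
    "SC_SEX": (-1.229, 0.145),
    "BIRTHWT_L": (-0.791, 0.174),
    "ACE1": (0.234, 0.076),
    "ACE8": (-0.716, 0.189),
    "ACE9": (0.501, 0.236),
}

def odds_ratio(estimate):
    """Odds ratio for a one-unit increase in the predictor."""
    return math.exp(estimate)

def wald_ci(estimate, std_error, z=1.96):
    """Approximate 95% Wald interval for the odds ratio (not from the paper)."""
    return (math.exp(estimate - z * std_error),
            math.exp(estimate + z * std_error))

for name, (est, se) in table1.items():
    low, high = wald_ci(est, se)
    print(f"{name}: OR={odds_ratio(est):.3f}, 95% CI=({low:.3f}, {high:.3f})")
```

An odds ratio below 1 (for example SC_SEX) corresponds to a negative estimate, and an odds ratio above 1 (for example ACE9) to a positive one.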
Figure 2. Odds Ratio Figure
Figure 3. ROC in testing sample for Logistic Regression
The area under the curve was 0.7263. The optimal cutoff was 0.288. The misclassification error was 0.027. The sensitivity was about 1.16% and the specificity was 99.95%.
Table 2.
Cut-off Sensitivity Specificity
0.1 14.39% 96.8%
0.3 0.7% 99.9%
0.5 0% 100%
4. Discussion
About 2.82% of the 19,047 children had autism: 4.29% of the 9,789 male children and 1.28% of the 9,258 female children.
According to the logistic regression, each one-year increase in the first adult's age was associated with higher odds of the child having autism (OR=1.018). When the first adult had worse mental health, the child was more likely to have autism (OR=1.245).
Each one-year increase in the child's age was associated with higher odds of autism (OR=1.046). Female children were less likely to have autism (OR=0.292). Children with normal birth weight were less likely to have autism (OR=0.453).
Children in families that found it hard to cover basics like food or housing were more likely to have autism (OR=1.264). Children who lived with a mentally ill person were less likely to have autism (OR=0.488). Children who lived with someone with an alcohol or drug problem were more likely to have autism (OR=1.650).
The area under the curve was 0.7263. The optimal cutoff was 0.288. The misclassification error was 0.027. The sensitivity was about 1.16% and the specificity was 99.95%.
Various studies, together with anecdotal evidence, suggest that the ratio of autistic males to females ranges from 2:1 to 16:1. The most up-to-date estimate is 3:1 [3]. Some factors that increase the risk of developing ASD include having a sibling with ASD, having older parents, having certain genetic conditions (for example, people with Down syndrome, fragile X syndrome, or Rett syndrome are more likely than others to have ASD), and being born with a very low birth weight [4].
Conclusions: In this study, we identified important predictors of autism among children, for example the child's age and sex, and the mental health and alcohol/drug problems of adults.
References:
1. Increase in Developmental Disabilities Among Children in the United States.
2. Prevalence of Autism Spectrum Disorder Among Children Aged 8 Years - Autism and Developmental Disabilities Monitoring Network, 11 Sites, United States, 2016.
3. Gender and autism. URL: https://www.autism.org.uk/about/what-is/gender.aspx
4. Autism Spectrum Disorder. URL: https://www.nimh.nih.gov/health/publications/autism-spectrum-disorder/index.shtml
5. Tabachnick B. and Fidell L. Using Multivariate Statistics (4th Ed.). Needham Heights, MA: Allyn & Bacon, 2001.
6. Stat Soft, Electronic Statistics Textbook. URL: http://www.statsoft.com/textbook/stathome.html.
7. Stokes M., Davis C. S. Categorical Data Analysis Using the SAS System, SAS Institute Inc., 1995.