
Advanced models

Cleaning the dataset

Jerry’s section

We will analyze the data using the bootstrap method. The bootstrap is a resampling technique: by repeatedly sampling our dataset with replacement, we can estimate summary statistics, such as the mean income of our population, and the uncertainty around them.

We chose to resample 1,000 times with replacement.
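
To make the resampling idea concrete, here is a minimal sketch of a bootstrap of the mean, written by hand in base R. It assumes a simulated `income` vector as a stand-in for the income column of `df3`; the variable names and the simulated distribution are illustrative only and are not part of the original analysis.

```r
# Hypothetical stand-in data; the real analysis would use df3$income.
set.seed(15)
income <- rlnorm(500, meanlog = 10.5, sdlog = 0.6)

R <- 1000  # number of bootstrap resamples, as chosen in the text

# Each replicate draws a sample of the same size with replacement
# and records its mean.
boot_means <- replicate(R, mean(sample(income, replace = TRUE)))

original <- mean(income)                 # statistic on the full sample
bias     <- mean(boot_means) - original  # bootstrap estimate of bias
std_err  <- sd(boot_means)               # bootstrap standard error

c(original = original, bias = bias, std.error = std_err)
```

The three quantities computed at the end are the same kind of summary that `boot()` prints as `original`, `bias`, and `std. error`.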

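library(boot)  # provides cv.glm() and boot()
# Logistic regression of the binary outcome on income
model <- glm(bi_var ~ income, data = df3, family = "binomial")
set.seed(15)
# 10-fold cross-validation estimate of prediction error (raw, unadjusted)
cv.error <- cv.glm(df3, model, K = 10)$delta[1]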
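
As a rough illustration of what the cross-validation call returns, the sketch below runs `cv.glm()` on simulated data. The data frame `sim` and the cost function `misclass` are assumptions made for this example, not objects from the original analysis. By default `cv.glm()` uses mean squared error between the 0/1 response and the fitted probability as its cost; supplying a misclassification cost instead gives an error rate that is often easier to read for a binary outcome.

```r
library(boot)

# Hypothetical stand-in for df3: one predictor and a binary response.
set.seed(15)
n <- 400
sim <- data.frame(income = rnorm(n))
sim$bi_var <- rbinom(n, 1, plogis(-0.5 + 1.2 * sim$income))

fit <- glm(bi_var ~ income, data = sim, family = "binomial")

# Assumed cost function: share of observations misclassified at a 0.5 cutoff.
misclass <- function(y, p_hat) mean(abs(y - p_hat) > 0.5)

# delta[1] is the raw K-fold estimate; delta[2] is a bias-adjusted version.
cv_mse <- cv.glm(sim, fit, K = 10)$delta[1]                   # default cost
cv_err <- cv.glm(sim, fit, cost = misclass, K = 10)$delta[1]  # error rate
```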

Bootstrap Methods

## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = df3, statistic = boot.fn, R = 1000)
## 
## 
## Bootstrap Statistics :
##       original        bias    std. error
## t1*  0.2305237  0.0030803643   0.1313720
## t2*  0.6350395  0.0003987870   0.1809877
## t3*  0.5735806 -0.0003047392   0.1736144
## t4*  1.0798860  0.0067339494   0.1905403
## t5*  0.8734506  0.0025102385   0.2034412
## t6*  1.3616228 -0.0016798272   0.1854858
## t7*  1.3470390  0.0037220630   0.2097098
## t8*  1.3605651  0.0040350529   0.2186558
## t9*  1.1630967  0.0085418295   0.2374484
## t10* 0.6804432 -0.0007132837   0.1709423
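
The definition of `boot.fn` is not shown above, so the following is only a sketch of how a statistic function for `boot()` is typically written and how the printed columns can be recomputed. It assumes a hypothetical function `boot_fn` that refits a logistic regression on the resampled rows and returns its coefficients, with simulated data `sim` and a made-up second predictor `age`. In the printed table, `original` is the statistic on the full data, `bias` is the mean of the bootstrap replicates minus `original`, and `std. error` is the standard deviation of the replicates.

```r
library(boot)

# Hypothetical stand-in data; predictors and coefficients are made up.
set.seed(15)
n <- 300
sim <- data.frame(income = rnorm(n), age = rnorm(n))
sim$bi_var <- rbinom(n, 1, plogis(0.2 + 0.6 * sim$income + 0.5 * sim$age))

# Assumed statistic function: boot() passes the data and a vector of
# resampled row indices; we refit the model and return the coefficients.
boot_fn <- function(data, index) {
  coef(glm(bi_var ~ income + age, data = data[index, ], family = "binomial"))
}

b <- boot(data = sim, statistic = boot_fn, R = 200)

# Recompute the three printed columns from the boot object.
original <- b$t0                     # statistics on the full data
bias     <- colMeans(b$t) - b$t0     # mean of replicates minus original
std_err  <- apply(b$t, 2, sd)        # standard deviation of replicates
cbind(original, bias, std.error = std_err)
```

Each row `t1*`, `t2*`, and so on in the output corresponds to one element of the vector returned by the statistic function, in order.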