# Number of iterations in BFGS optimization: 5 # Count model coefficients (poisson with log link): # zeroinfl(formula = Response ~ Trt | 1, data = data.zip) Model.zip = zeroinfl(Response ~ Trt|1, data = data.zip) # hurdle and zeroinfl functions by Achim Zeileis # Political Science Computational Laboratory # Classes and Methods for R developed in the
First, we fit a model where we assume that the probability of zero is the same for both treatments (with ~ Trt|1). We load the pscl package to fit the zero-inflated model. SE = predict(model.nb.2, newdata = nd, type = "response", se.fit = T)$se.fit) Mean = predict(model.nb.2, newdata = nd, type = "response"), 1 - pchisq(summary(model.nb.2)$deviance, summary(model.nb.2)$df.residual)Īnd does not predict the correct means. The negative binomial model does not fit the data. # Residual deviance: 257.44 on 198 degrees of freedom # Null deviance: 296.90 on 199 degrees of freedom # (Dispersion parameter for Negative Binomial(1.6698) family taken to be 1) # glm.nb(formula = Response ~ Trt, data = data.zip, init.theta = 1.66980196, model.nb.2 = glm.nb(Response ~ Trt, data = data.zip) SE = predict(model.pois.3, newdata = nd, type = "response", se.fit = T)$se.fit) Mean = predict(model.pois.3, newdata = nd, type = "response"), The Poisson model also does not predict the correct mean counts. 1 - pchisq(summary(model.pois.3)$deviance, summary(model.pois.3)$df.residual ) # Residual deviance: 724.96 on 198 degrees of freedom # Null deviance: 901.13 on 199 degrees of freedom
# glm(formula = Response ~ Trt, family = poisson, data = data.zip) model.pois.3 = glm(Response ~ Trt, data = data.zip, family = poisson) The SE estimates are lower for the Poisson model than for the negative binomial model, which increases the likelihood of incorrectly detecting a significant treatment effect in the Poisson model. Here you see the ‘danger’ of ignoring overdispersion in the Poisson model. SE = predict(model.nb, newdata = nd, type="response", se.fit = T)$se.fit) Mean = predict(model.nb, newdata = nd, type = "response"), The GOF test indicates that the negative binomial model fits the data. 1 - pchisq(summary(model.nb)$deviance, summary(model.nb)$df.residual) The model estimates the dispersion parameter at about the value that we set for theta (i.e., 5) when generating random variates. # Residual deviance: 211.37 on 198 degrees of freedom # Null deviance: 278.72 on 199 degrees of freedom # (Dispersion parameter for Negative Binomial(4.7962) family taken to be 1) # glm.nb(formula = Response ~ Trt, data = data.nb, init.theta = 4.796214, Negative binomial model model.nb = glm.nb(Response ~ Trt, data = data.nb) Make a note of the SEs in this output because we will come back to this after we run a GLM based on a negative binomial error distribution. SE = predict(model.pois.2, newdata = nd, type = "response", se.fit = T)$se.fit) Mean = predict(model.pois.2, newdata = nd, type = "response"), Nonetheless, let’s take a look at the predicted values. 1 - pchisq(summary(model.pois.2)$deviance, summary(model.pois.2)$df.residual) # Number of Fisher Scoring iterations: 5Īs expected, the Poisson model does not fit the data (p < 0.05). # Residual deviance: 510.72 on 198 degrees of freedom # Null deviance: 681.61 on 199 degrees of freedom # (Dispersion parameter for poisson family taken to be 1) # glm(formula = Response ~ Trt, family = poisson, data = data.nb) model.pois.2 = glm(Response ~ Trt, data = data.nb, family = poisson) Remember that we are trying to simulate the steps you would take if you did not know the properties of the distribution (beyond knowing that you have integers bound at zero and infinity). We first test if a Poisson model fits this data. SE = predict(model.pois, newdata=nd, type="response", se.fit = TRUE)$se.fit)īecause we used a large sample size, the predicted means are similar to the expected means of 10 and 5. Mean = predict(model.pois, newdata=nd, type="response"), You will still want to use the model to predict mean counts for each treatment and standard errors for each parameter. If this were your actual data, you could breathe a sigh of relief because you could stop here. # glm(formula = Response ~ Trt, family = poisson, data = data.pois) model.pois = glm(Response ~ Trt, data = data.pois, family = poisson) Now, we run the GLM and set the error distribution to Poisson.