Computing Profile Confidence Intervals for a Regression Line
=====================================================
In this article, we will explore how to compute profile confidence intervals for a regression line. We start by simulating some data and fitting a Poisson regression model. Then we compute profile 95% CIs with the confint() function in R, use them to draw a band around the fitted line, and compare that band with the 95% CI computed from the standard error (SE). Finally, we discuss why the band built from the profile CIs is so wide and how to improve it.
Introduction
Profile confidence intervals are likelihood-based confidence intervals for a single model parameter. They are called “profile” because they are built from the profile likelihood: the likelihood viewed as a function of the parameter of interest, with all other parameters re-estimated by maximum likelihood at each candidate value. The interval contains every value of the parameter whose profile likelihood is not significantly lower than the overall maximum. In this article, we will focus on computing profile confidence intervals for regression models.
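As a sketch of the standard definition (the symbols are notation introduced here for illustration and do not appear in the code that follows): write $\beta_j$ for the parameter of interest, $\beta_{-j}$ for the remaining parameters, $\ell$ for the log-likelihood, and $\hat\beta$ for the full maximum likelihood estimate. The profile log-likelihood and the 95% interval derived from it are then

```latex
\ell_p(\beta_j) = \max_{\beta_{-j}} \ell(\beta_j, \beta_{-j}),
\qquad
\mathrm{CI}_{0.95}(\beta_j) =
  \left\{ \beta_j : 2\left[\ell(\hat\beta) - \ell_p(\beta_j)\right] \le \chi^2_{1,\,0.95} \right\}
```

where $\chi^2_{1,\,0.95} \approx 3.84$ is the 95% quantile of the chi-squared distribution with one degree of freedom. Note that the interval is defined for one parameter at a time.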
Simulating Data
To get started, let’s simulate some data using R. We’ll use the rpois() function to draw Poisson counts whose mean equals the expected values from our regression model.
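set.seed(123)  # arbitrary seed so the simulation is reproducible
n <- 50        # sample size
beta0 <- 2     # true intercept on the log scale
beta1 <- 0.32  # true slope on the log scale
x <- runif(n = n, min = 0, max = 5)
mu <- exp(beta0 + beta1 * x)  # expected counts via the inverse log link
y <- rpois(n = n, lambda = mu)
data <- data.frame(x = x, y = y)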
# Plot the data
plot(data$x, data$y)
Applying a Poisson Regression Model
Next, let’s apply a Poisson regression model to our data using the glm() function in R.
# Fit the model
glm.pois <- glm(formula = y ~ x, data = data, family = poisson(link = "log"))
# Print the summary of the model
summary(glm.pois)
Computing Profile Confidence Intervals
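Now that we have our regression model fitted, let’s compute the profile 95% CIs using the confint() function in R. Note that confint() returns one interval for the intercept and one for the slope, not an interval for the fitted line.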
# Compute the profile confidence intervals
pCI <- confint(glm.pois)
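# Sort the x values so the band can be drawn from left to right
nice.xs <- sort(data$x)
# Column 1: curve from the lower limits of both coefficients
# Column 2: curve from the upper limits of both coefficients
# (this pairs the parameter-wise limits; it is not a CI for the line itself)
pred.pCI <- apply(t(pCI), 1, FUN = function(b) {exp(b[1] + b[2] * nice.xs)})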
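The band above is built from the limits of the two coefficient intervals. A likelihood-based interval for the line itself at a given point x0 can be sketched by shifting x so that the intercept becomes the linear predictor at x0, then profiling that intercept. This is a minimal sketch under stated assumptions: the data are simulated again so the block runs on its own, and the seed, the grid, and the helper name profile_ci_at() are made up for this example.

```r
# Simulated data mirroring the setup in this article (seed is arbitrary)
set.seed(1)
n <- 50
x <- runif(n, min = 0, max = 5)
y <- rpois(n, lambda = exp(2 + 0.32 * x))
dat <- data.frame(x = x, y = y)

# Hypothetical helper: profile 95% CI for the mean response at x0
profile_ci_at <- function(x0, dat) {
  # After shifting x by x0, the intercept equals log(mu) at x0
  dat$xc <- dat$x - x0
  fit0 <- glm(y ~ xc, data = dat, family = poisson(link = "log"))
  ci <- suppressMessages(confint(fit0))["(Intercept)", ]
  exp(as.numeric(ci))  # back-transform to the response scale
}

# Evaluate the interval on a coarse grid of x values
x.grid <- seq(0, 5, by = 0.5)
band <- t(sapply(x.grid, profile_ci_at, dat = dat))
colnames(band) <- c("lwr", "upr")
band <- data.frame(x = x.grid, band)
band
```

Each grid point needs its own model fit and profile, so this is slower than the other approaches, but every interval is a genuine profile CI for the mean at that x.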
Computing Confidence Intervals Using the Standard Error (SE)
Next, let’s compute the 95% CI using the standard error (SE). We’ll start by predicting the values of our model on the training data using the predict() function in R.
# bind_cols(), as_tibble() and mutate() used below come from dplyr
library(dplyr)
# Predict the values on the training data
new.dat <- data.frame(phat = predict(glm.pois, type = "response"),
                      x = data$x)
# Sort the data by x
new.dat <- new.dat[with(new.dat, order(x)), ]
# Get the inverse link function (exp() for the log link)
ilink <- family(glm.pois)$linkinv
# Add fit and se.fit on the link scale
new.dat <- bind_cols(new.dat,
setNames(as_tibble(predict(glm.pois, new.dat, se.fit = TRUE)[1:2]),
c('fit_link', 'se_link')))
# Compute approximate 95% CIs: fit +/- 2 SE on the link scale, then back-transform
new.dat <- mutate(new.dat,
fit = ilink(fit_link),
right_upr = ilink(fit_link + (2 * se_link)),
right_lwr = ilink(fit_link - (2 * se_link)))
Plotting the Results
Finally, let’s plot our results using the ggplot() function in R.
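library(ggplot2)
# Plot the fitted values with the SE-based band (green)
# and the band built from the profile CIs (orange)
ggplot(data = new.dat, aes(x = x, y = phat)) +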
geom_point() +
geom_ribbon(aes(x = x, ymin = right_lwr, ymax = right_upr),
fill = "darkgreen", alpha = 0.3) +
geom_ribbon(aes(x = nice.xs, ymin = pred.pCI[, 1], ymax = pred.pCI[, 2]),
fill = "darkorange", alpha = 0.3) +
theme_classic(base_size = 15)
Discussion
Now that we’ve computed both bands, let’s discuss why the one built from the profile CIs is so wide.
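The reason is the correlation between the intercept and slope estimates. confint() profiles each coefficient separately, so its output says nothing about how the two estimates vary together. With all x values positive, the estimates are negatively correlated: a higher intercept goes with a lower slope. Our band pairs the lower limit of the intercept with the lower limit of the slope, and the upper limit with the upper limit, which are combinations the data make very unlikely. The profile intervals for the individual coefficients are fine; it is the way we combined them into a band that ignores the correlation and inflates the result.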
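The correlation can be read off the fitted model. The following is a minimal sketch on freshly simulated data that mirrors our setup; the seed and the object names fit, V and rho are assumptions for this example.

```r
# Simulated data mirroring the setup in this article (seed is arbitrary)
set.seed(1)
n <- 50
x <- runif(n, min = 0, max = 5)
y <- rpois(n, lambda = exp(2 + 0.32 * x))
fit <- glm(y ~ x, family = poisson(link = "log"))

V <- vcov(fit)           # estimated covariance matrix of (intercept, slope)
rho <- cov2cor(V)[1, 2]  # correlation between the two estimates
rho                      # negative when all x values are positive
```

A strongly negative value means that "both low" or "both high" coefficient pairs, which is exactly what the orange band uses, lie far from the region supported by the data.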
To improve this, we need an interval for the fitted line itself rather than for each coefficient on its own. One approach is bootstrap resampling: we resample the data with replacement, refit the model to each resampled dataset, and take pointwise percentiles of the refitted curves. Because the intercept and slope are re-estimated together in every resample, their correlation is carried through to the band automatically.
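A percentile bootstrap of the fitted line could look roughly like the sketch below. It resamples rows (a case bootstrap), which is only one of several bootstrap schemes; the seed, the number of resamples B, and the prediction grid are arbitrary choices made for this example.

```r
# Simulated data mirroring the setup in this article (seed is arbitrary)
set.seed(1)
n <- 50
x <- runif(n, min = 0, max = 5)
y <- rpois(n, lambda = exp(2 + 0.32 * x))
dat <- data.frame(x = x, y = y)
x.grid <- data.frame(x = seq(0, 5, length.out = 25))

B <- 500  # number of bootstrap resamples (arbitrary)
boot.fits <- replicate(B, {
  idx <- sample(nrow(dat), replace = TRUE)  # resample rows with replacement
  fit.b <- glm(y ~ x, data = dat[idx, ], family = poisson(link = "log"))
  predict(fit.b, newdata = x.grid, type = "response")
})

# boot.fits has one row per grid point and one column per resample;
# pointwise 2.5% and 97.5% quantiles give the band
boot.band <- data.frame(
  x   = x.grid$x,
  lwr = apply(boot.fits, 1, quantile, probs = 0.025),
  upr = apply(boot.fits, 1, quantile, probs = 0.975)
)
head(boot.band)
```

The columns lwr and upr can be passed to geom_ribbon() in the same way as the other bands.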
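Another approach is to use the variance-covariance matrix of the coefficients, which the vcov() function in R returns. The standard error of the linear predictor at a given x combines both variances and the covariance between intercept and slope. This is what predict() computes when se.fit = TRUE, so the SE-based band we drew earlier already takes the correlation into account, which is why it is so much narrower.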
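To make the role of vcov() concrete, here is a minimal sketch that computes the standard error of the linear predictor by hand and turns it into a Wald-type band. The data are simulated again so the block is self-contained; the seed, the grid, and the object names are assumptions for this example.

```r
# Simulated data mirroring the setup in this article (seed is arbitrary)
set.seed(1)
n <- 50
x <- runif(n, min = 0, max = 5)
y <- rpois(n, lambda = exp(2 + 0.32 * x))
fit <- glm(y ~ x, family = poisson(link = "log"))

x.grid <- data.frame(x = seq(0, 5, length.out = 25))
X <- cbind(1, x.grid$x)       # one design row (1, x0) per grid point
eta <- drop(X %*% coef(fit))  # linear predictor on the link scale
# Variance of the linear predictor at each x0 is x0' V x0,
# which includes the intercept-slope covariance term
se.eta <- sqrt(rowSums((X %*% vcov(fit)) * X))

z <- qnorm(0.975)             # exact normal quantile instead of 2
wald.band <- data.frame(x   = x.grid$x,
                        lwr = exp(eta - z * se.eta),
                        upr = exp(eta + z * se.eta))
head(wald.band)
```

The hand-computed se.eta should agree with the se.fit element returned by predict(fit, newdata = x.grid, se.fit = TRUE).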
Conclusion
In conclusion, computing profile confidence intervals for a regression line can be a bit tricky. confint() gives a separate interval for each coefficient, and combining their limits into a band ignores the correlation between the intercept and slope, which makes the band wider than necessary. Methods that target the fitted line itself, such as bootstrap resampling or standard errors derived from the variance-covariance matrix, avoid this problem.
I hope this article has been helpful in understanding how to compute profile confidence intervals for a regression line. If you have any questions or need further clarification, please don’t hesitate to ask!
Last modified on 2025-04-24