If we used the same sampling method to select different samples and computed an interval estimate for each sample, we would expect the true population parameter (\(\beta_1\)) to fall within the interval estimates 95% of the time.
Confidence interval for \(\beta_1\)
How do we calculate the confidence interval for the slope?
\[\hat\beta_1\pm t^*SE_{\hat\beta_1}\]
How do we calculate it in R?
With the confint function, applied to a fitted model such as this one:
mod <- lm(leaf_length ~ leaf_width, data = magnolia_data)
summary(mod)
Call:
lm(formula = leaf_length ~ leaf_width, data = magnolia_data)
Residuals:
Min 1Q Median 3Q Max
-12.4544 -3.2196 -0.0287 3.1761 12.6086
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.8362 1.3956 8.481 2.36e-13 ***
leaf_width 0.4386 0.1552 2.826 0.00571 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 5.327 on 98 degrees of freedom
Multiple R-squared: 0.07537, Adjusted R-squared: 0.06593
F-statistic: 7.988 on 1 and 98 DF, p-value: 0.005707
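As a rough check, the interval for the slope can be rebuilt by hand from the summary output above, using the slope estimate (0.4386), its standard error (0.1552), and the 98 residual degrees of freedom. This is only a sketch: the variable names are made up for illustration, and the confint call in the final comment assumes the fitted object mod is still in the workspace.

```r
# Values copied from the summary(mod) output above
estimate <- 0.4386   # slope estimate for leaf_width
se       <- 0.1552   # standard error of the slope
df       <- 98       # residual degrees of freedom (n - 2)

# Critical value t* for a 95% interval from the t distribution
t_star <- qt(0.975, df = df)

# estimate +/- t* times SE
ci <- estimate + c(-1, 1) * t_star * se
ci

# With the fitted model available, this should agree up to rounding:
# confint(mod, "leaf_width", level = 0.95)
```

The interval works out to roughly 0.13 to 0.75. It excludes 0, which is consistent with the small p-value reported for leaf_width.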
There are ✌️ other types of confidence intervals we may want to calculate
The confidence interval for the mean response in \(y\) for a given \(x^*\) value
The confidence interval for an individual response \(y\) for a given \(x^*\) value (often called a prediction interval)
Why are these different? Which do you think is easier to estimate? It is harder to predict one response than to predict a mean response. What does this mean in terms of the standard error?
The SE for the prediction interval is going to be larger than the SE for the mean response.
Confidence intervals
confidence interval for \(\mu_y\) and prediction interval
\[ \hat{y}\pm t^* SE\]
\(\hat{y}\) is the predicted \(y\) for a given \(x^*\)
\(t^*\) is the critical value for the \(t_{n-2}\) density curve
\(SE\) takes ✌️ different values depending on which interval you’re interested in
\(SE_{\hat\mu}\)
\(SE_{\hat{y}}\)
Which will be larger?
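For reference, the standard textbook forms of the two standard errors in simple linear regression are written out here. The symbols \(\hat\sigma_\epsilon\) (the residual standard error), \(\bar{x}\) (the mean of the observed \(x\) values), and \(n\) (the sample size) are assumed notation rather than taken from these slides.

\[SE_{\hat\mu} = \hat\sigma_\epsilon\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}\]
\[SE_{\hat{y}} = \hat\sigma_\epsilon\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}\]

The extra 1 under the square root accounts for the variability of a single observation around the mean response, so \(SE_{\hat{y}}\) is always the larger of the two.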
You are interested in the predicted price, on average, for Porsches that have previously been driven 50,000 miles. Calculate this value with an appropriate confidence interval.
You are interested in the predicted price for a particular Porsche that has previously been driven 40,000 miles. Calculate this value with an appropriate confidence interval.
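One way to approach both questions in R is the predict function with its interval argument. The sketch below runs on simulated stand-in data, not the real Porsche data. The data frame porsche_sim, the column names price and mileage, and the units (thousands of dollars, thousands of miles) are all hypothetical and would need to be swapped for the actual dataset.

```r
# Simulated stand-in data: NOT the real Porsche data.
# price (in $1000s) and mileage (in 1000s of miles) are hypothetical names.
set.seed(1)
porsche_sim <- data.frame(mileage = runif(30, min = 5, max = 90))
porsche_sim$price <- 70 - 0.6 * porsche_sim$mileage + rnorm(30, sd = 7)

fit <- lm(price ~ mileage, data = porsche_sim)

# Interval for the MEAN price of cars with 50 (thousand) miles
ci_mean <- predict(fit, newdata = data.frame(mileage = 50),
                   interval = "confidence", level = 0.95)

# Interval for ONE particular car with 40 (thousand) miles
pi_one <- predict(fit, newdata = data.frame(mileage = 40),
                  interval = "prediction", level = 0.95)

ci_mean
pi_one
```

Each call returns a matrix with columns fit, lwr, and upr. The prediction interval comes out wider than the confidence interval because it uses the larger standard error for an individual response.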