Lucy D’Agostino McGowan
Use the glimpse() function to see all of your variables and their types
Rows: 30
Columns: 3
$ Price <dbl> 69.4, 56.9, 49.9, 47.4, 42.9, 36.9, 83.0, 72.9, 69.9, 67.9, 66…
$ Age <int> 3, 3, 2, 4, 4, 6, 0, 0, 2, 0, 2, 2, 4, 3, 10, 11, 4, 4, 10, 3,…
$ Mileage <dbl> 21.50, 43.00, 19.90, 36.00, 44.00, 49.80, 1.30, 0.67, 13.40, 9…
fct: “factor”, a type of categorical variable
Use the glimpse() function to see all of your variables and their types
Rows: 87
Columns: 5
$ name <chr> "Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia Or…
$ height <int> 172, 167, 96, 202, 150, 178, 165, 97, 183, 182, 188, 180, 2…
$ mass <dbl> 77.0, 75.0, 32.0, 136.0, 49.0, 120.0, 75.0, 32.0, 84.0, 77.…
$ hair_color <chr> "blond", NA, NA, "none", "brown", "brown, grey", "brown", N…
$ skin_color <chr> "fair", "gold", "white, blue", "white", "light", "light", "…
chr: “character”, a type of categorical variable
An indicator variable uses two values, usually 0 and 1, to indicate whether a data case does (1) or does not (0) belong to a specific category.
What does this line of code do?
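An indicator variable of the kind defined above can be built in a single line. The following is a minimal sketch on made-up data: the data frame `diamonds_toy` and its values are hypothetical and only borrow the `Color` and `TotalPrice` column names from the Diamonds example.

```r
# Made-up data (hypothetical values): a categorical Color variable.
diamonds_toy <- data.frame(
  Color = c("D", "E", "J", "D", "J", "E"),
  TotalPrice = c(5600, 4300, 1900, 5500, 2000, 4400)
)

# ColorD is 1 when the diamond is color D and 0 otherwise.
diamonds_toy$ColorD <- ifelse(diamonds_toy$Color == "D", 1, 0)

diamonds_toy$ColorD  # 1 0 0 1 0 0
```

The same pattern, repeated once per category, gives one indicator column for each color.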
What if I wanted to model the relationship between TotalPrice and Color?
Why is ColorJ NA?
Call:
lm(formula = TotalPrice ~ ColorD + ColorE + ColorF + ColorG +
    ColorH + ColorI + ColorJ, data = Diamonds)

Coefficients:
(Intercept)       ColorD       ColorE       ColorF       ColorG       ColorH
       1936         3632         2423         7224         7623         6732
     ColorI       ColorJ
       5704           NA
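One way to see where the NA comes from is to check that the full set of indicators always sums to 1, which is exactly the intercept column, so the last indicator carries no new information. A minimal sketch on made-up data follows; the data frame `toy` and its prices are hypothetical, and only three colors are used to keep it short.

```r
# Made-up data (hypothetical values) with three colors.
toy <- data.frame(
  Color = c("D", "D", "E", "E", "J", "J"),
  TotalPrice = c(5500, 5600, 4300, 4400, 1900, 2000)
)

# One indicator per category: k = 3 categories, 3 indicators.
toy$ColorD <- as.integer(toy$Color == "D")
toy$ColorE <- as.integer(toy$Color == "E")
toy$ColorJ <- as.integer(toy$Color == "J")

# Every row sums to 1, duplicating the intercept column.
rowSums(toy[, c("ColorD", "ColorE", "ColorJ")])

# With all three indicators plus an intercept, lm() cannot estimate the last one.
fit_all <- lm(TotalPrice ~ ColorD + ColorE + ColorJ, data = toy)
coef(fit_all)  # ColorJ is NA
```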
For a variable with k categories, always include k-1 indicator variables.
What is the reference category?
Call:
lm(formula = TotalPrice ~ ColorD + ColorE + ColorF + ColorG +
    ColorH + ColorI, data = Diamonds)

Coefficients:
(Intercept)       ColorD       ColorE       ColorF       ColorG       ColorH
       1936         3632         2423         7224         7623         6732
     ColorI
       5704
A diamond of color D has an expected total price 3632 higher than a diamond of color J.
A diamond of color E has an expected total price 2423 higher than a diamond of color J.
What about color F?
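When the indicators for one categorical variable are the only predictors, these interpretations can be checked against group means: the intercept is the mean of the reference category, and each coefficient is that category's mean minus the reference mean. A minimal sketch on made-up data, where `toy` and its prices are hypothetical:

```r
# Made-up data (hypothetical values); J plays the role of the reference category.
toy <- data.frame(
  Color = c("D", "D", "E", "E", "J", "J"),
  TotalPrice = c(5500, 5600, 4300, 4400, 1900, 2000)
)
toy$ColorD <- as.integer(toy$Color == "D")
toy$ColorE <- as.integer(toy$Color == "E")

# k = 3 categories, so k - 1 = 2 indicators; J is left out.
fit <- lm(TotalPrice ~ ColorD + ColorE, data = toy)

# Mean price within each color.
group_means <- tapply(toy$TotalPrice, toy$Color, mean)

coef(fit)[["(Intercept)"]]               # equals the mean for J
coef(fit)[["ColorD"]]                    # equals mean(D) - mean(J)
group_means[["D"]] - group_means[["J"]]
```

This equivalence holds only for a model with a single categorical predictor; once other variables are added, the coefficients become adjusted differences.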
Call:
lm(formula = TotalPrice ~ Color, data = Diamonds)

Coefficients:
(Intercept)       ColorE       ColorF       ColorG       ColorH       ColorI
       5569        -1209         3592         3990         3100         2071
     ColorJ
      -3632
What is the reference category?
How would you interpret the coefficient for color E now?
Call:
lm(formula = TotalPrice ~ Color, data = Diamonds)

Coefficients:
(Intercept)       ColorD       ColorE       ColorF       ColorG       ColorH
       1936         3632         2423         7224         7623         6732
     ColorI
       5704
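The same formula, TotalPrice ~ Color, gives different coefficients depending on which level of the factor comes first, because lm() treats the first level as the reference. One way to change it is base R's relevel(); whether the slides used relevel() or another approach is an assumption here. A minimal sketch on made-up data, where `toy` and its prices are hypothetical:

```r
# Made-up data (hypothetical values); Color is stored as a factor.
toy <- data.frame(
  Color = factor(c("D", "D", "E", "E", "J", "J")),
  TotalPrice = c(5500, 5600, 4300, 4400, 1900, 2000)
)

# By default the levels are alphabetical, so D is the reference.
levels(toy$Color)[1]  # "D"
fit_d <- lm(TotalPrice ~ Color, data = toy)
names(coef(fit_d))    # "(Intercept)" "ColorE" "ColorJ"

# Move J to the front so that it becomes the reference category.
toy$Color <- relevel(toy$Color, ref = "J")
fit_j <- lm(TotalPrice ~ Color, data = toy)
names(coef(fit_j))    # "(Intercept)" "ColorD" "ColorE"
```

Switching the reference flips the sign of the D versus J comparison, which matches the two outputs above: 3632 for ColorD with J as the reference, and -3632 for ColorJ with D as the reference.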
What is the reference category?
Call:
lm(formula = Pulse ~ Emergency, data = ICU)

Coefficients:
(Intercept)    Emergency
      91.11        10.63
Application Exercise
… Diamonds dataset?
… Clarity variable in the Diamonds data?
Fit a model with TotalPrice as the outcome and Clarity as the explanatory variable.
Change the reference category to SI1 and refit the model.
Add Depth to your model. How do you interpret the coefficient for this parameter?