Lucy D’Agostino McGowan
Application Exercise
Open the file labeled data-exploration.qmd.
Read the data set into an object called dat.
Examine the data. How many observations are there? How many variables? Add the description to your file.
Examine the data dictionary by opening the file called data-dictionary.csv.
Find the outcome variable (the retention rate) and the variable of interest specific to your group. Create a plot to visualize the relationship between these variables. Add a description to your file.
Does the data look like it needs a transformation? If so, apply one and examine the plot again. Describe this in your file.
08:00
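A minimal sketch of the exploration steps, assuming base R only. The course data file is not named here, so the built-in mtcars data stands in for it, and hp and mpg are placeholders for your variable of interest and the outcome.

```r
# Stand-in data: mtcars written to a temporary CSV so the read step can be
# shown. This is an assumption; replace `path` with your own data file.
path <- tempfile(fileext = ".csv")
write.csv(mtcars, path, row.names = FALSE)

# Read the data set into an object called dat
dat <- read.csv(path)

# How many observations (rows) and variables (columns) are there?
n_obs <- nrow(dat)
n_vars <- ncol(dat)

# Plot the variable of interest (x-axis) against the outcome (y-axis).
# hp and mpg are placeholder column names.
plot(dat$hp, dat$mpg, xlab = "hp", ylab = "mpg")

# If the variable looks skewed or the relationship looks curved, try a
# transformation (a log here) and examine the plot again.
dat$log_hp <- log(dat$hp)
plot(dat$log_hp, dat$mpg, xlab = "log(hp)", ylab = "mpg")
```

Whether a log is the right transformation depends on your variable; it is used here only as one common option.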
We can build a table to examine the distribution of our variables.
| Characteristic | N = 32¹ |
|---|---|
| mpg | 19.2 (15.4, 22.8) |
| cyl | |
| 4 | 11 (34%) |
| 6 | 7 (22%) |
| 8 | 14 (44%) |
| disp | 196 (121, 326) |
| hp | 123 (96, 180) |
| drat | 3.70 (3.08, 3.92) |
| wt | 3.33 (2.58, 3.61) |
| qsec | 17.71 (16.89, 18.90) |
| vs | 14 (44%) |
| am | 13 (41%) |
| gear | |
| 3 | 15 (47%) |
| 4 | 12 (38%) |
| 5 | 5 (16%) |
| carb | |
| 1 | 7 (22%) |
| 2 | 10 (31%) |
| 3 | 3 (9.4%) |
| 4 | 10 (31%) |
| 6 | 1 (3.1%) |
| 8 | 1 (3.1%) |

¹ Median (IQR); n (%)
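The numbers in a table like this can be reproduced by hand, which is a useful check. The sketch below uses base R on the built-in mtcars data. The slides do not say which package rendered the table, so the gtsummary call at the end is an assumption and only runs if that package is installed.

```r
# Median and quartiles for a continuous variable
mpg_summary <- quantile(mtcars$mpg, probs = c(0.5, 0.25, 0.75))

# Counts and percentages for a categorical variable
cyl_n <- table(mtcars$cyl)
cyl_pct <- round(100 * prop.table(cyl_n))

# Move a variable of interest (hp is a placeholder) to the first column,
# so that it is listed first in any table built from the data
dat <- mtcars[, c("hp", setdiff(names(mtcars), "hp"))]

# Assumption: gtsummary is one package that builds such a table in one call
if (requireNamespace("gtsummary", quietly = TRUE)) {
  tbl <- gtsummary::tbl_summary(dat)
}
```

Reordering the columns of the data before building the table is one simple way to control which variable is listed first.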
Application Exercise
Use the table-one chunk to examine a table of your variables. Move your variable of interest to the top of the list so that it is the first one rendered in the table.
03:00
The ggdag package can help us display the causal assumptions you drew last week. There are three steps:
Specify your causal assumptions with the dagify function
Use the ggdag function to plot them
Use the ggdag_adjustment_set function to determine what you need to add to your final model.
library(ggdag)
dag <- dagify(
  # one equation per node: effect ~ its direct causes
  exposure ~ variable1 + variable2 + variable3 + variable4,
  outcome ~ exposure + variable1,
  variable1 ~ variable3,
  variable2 ~ variable3,
  # mark the exposure, the outcome, and any unmeasured (latent) variables
  exposure = "exposure",
  outcome = "outcome",
  latent = "variable3",
  # display labels for the nodes
  labels = c(
    variable1 = "Variable 1",
    variable2 = "Variable 2",
    variable3 = "Variable 3",
    variable4 = "Variable 4",
    exposure = "Exposure",
    outcome = "Outcome"
  )
)
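Steps two and three can be sketched as follows. The block repeats a shortened version of the example DAG (without labels) so that it runs on its own; it assumes the ggdag package is installed, and the object names p_dag and p_adjust are made up for illustration.

```r
library(ggdag)

# Step 1: specify the causal assumptions (same structure as the example DAG)
dag <- dagify(
  exposure ~ variable1 + variable2 + variable3 + variable4,
  outcome ~ exposure + variable1,
  variable1 ~ variable3,
  variable2 ~ variable3,
  exposure = "exposure",
  outcome = "outcome",
  latent = "variable3"
)

# Step 2: plot the causal diagram
p_dag <- ggdag(dag)

# Step 3: plot the adjustment set(s) for the exposure-outcome effect
p_adjust <- ggdag_adjustment_set(dag)

# Printing the objects draws the figures
p_dag
p_adjust
```

In this example every back-door path from the exposure to the outcome runs through variable1, so variable1 is the variable to adjust for, even though variable3 is unmeasured.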
Application Exercise
Open the data-dictionary.csv file and map the available variables to the names in the equations you developed for homework 2.
Open causal-assumptions.qmd.
Fill in the dagify code chunk after deleting the # add your equations here comment. Make sure to separate each equation with a comma.
Run the ggdag chunk to create the causal diagram.
Run the adjustment_set chunk to see what variables you need to adjust for.
20:00
Loose ends
Use the Git Panel on the top right to pull in new data. Raise your hand if this doesn’t work.
Open data-exploration.qmd. Render the document. Do you see a figure of your variable on the x-axis and the outcome on the y-axis? If not, create that figure and re-render the document.
[…] data-exploration.qmd? If not, fill it in.
Open causal-assumptions.qmd. Render the document. Do you see two figures, one with the causal diagram and one showing the adjustment set? If not, be sure that you have set eval: true in all of the chunks.
Note: if you are getting an error, raise your hand so I can come help out.
05:00
Application Exercise
Open data-dictionary.csv and examine the available variables. Are there any that you didn’t include in your causal diagram that maybe should be included? Add them now.
Pairs:
Update the causal-assumptions.qmd file with these changes and observe your final adjustment set.
20:00
Application Exercise
final-model.qmd
20:00
Application Exercise
Open the sensitivity-data.qmd file. This is a sensitivity analysis for including athletic data and US News Ranking data.
Examine athletics_dat. How many observations are there? Fill in the explanation with this number.
Adjust for bball_power_rating. Does this change your result? Add an explanation of what you see.
Examine usnews_dat. How many observations are there? Fill in the explanation with this number.
Adjust for usnews_ranking. Does this change your result? Add an explanation of what you see.
10:00
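The general shape of such a sensitivity check can be sketched as follows. The model in final-model.qmd is not shown in these slides, so the linear model, the mtcars stand-in data, and all variable names here are assumptions for illustration only.

```r
# Placeholder data; in the exercise this would be athletics_dat or usnews_dat
dat <- mtcars

# Main model: outcome ~ exposure + adjustment variable (placeholder names)
fit_main <- lm(mpg ~ hp + wt, data = dat)

# Sensitivity model: the same model with one extra variable added
fit_sens <- lm(mpg ~ hp + wt + qsec, data = dat)

# Number of observations used in each fit
n_main <- nobs(fit_main)
n_sens <- nobs(fit_sens)

# Exposure coefficient with and without the extra variable
est <- c(
  main = unname(coef(fit_main)["hp"]),
  sensitivity = unname(coef(fit_sens)["hp"])
)
est
```

If the two estimates are close, the result is not very sensitive to the extra variable. If the number of observations drops between fits, part of any difference may come from the smaller sample rather than from the adjustment itself.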
Application Exercise
Let’s put this all together!
index.qmd
10:00