This repository contains the material for the introductory seminar on Statistical Causal Inference (SCI). The seminar introduces software engineering researchers with a background in quantitative data analysis to a causal framework for inferential statistics proposed by Judea Pearl and Richard McElreath.
| Version | Date | Occasion | Contributors |
|---|---|---|---|
| v1.0 | 2024-10-18 | Research visit at UPC, Barcelona | Julian Frattini |
| v2.0 | 2025-04-28 | Tutorial at the RE'25 conference | Julian Frattini, Hans-Martin Heyn, Robert Feldt, Richard Torkar |
The repository contains the following directories and files.
├── publicity : advertising material for the seminar
│   ├── banners : image files (from PowerPoint slides) for social media posts
│   ├── debriefing : summary of tutorial instances
│   └── screenshots : images from the slides for tutorial applications
├── slides : PowerPoint presentations for teaching the tutorial/seminars
│   ├── pdf : animation-free export of the presentations to PDF
│   ├── intro-bda4sci.pptx : complete introduction to both SCI and BDA (from v1.0)
│   └── intro-sci.pptx : focused introduction to SCI (from v2.0)
├── src : source code to follow along the examples
│   ├── basics : description of fundamental concepts
│   │   ├── regression.Rmd : demonstration of the basic statistical analysis tool
│   │   └── simulations.Rmd : demonstration of ground truth simulations
│   ├── bda : implementations of BDA concepts and techniques
│   │   ├── brms : code snippets using the brms package
│   │   │   └── bda-complete.Rmd : complete example of a simple Bayesian regression model
│   │   └── rethinking : code snippets using the rethinking package
│   │       ├── prior-predictive-checks.Rmd : demonstration of prior predictive checks
│   │       └── model-notation.Rmd : demonstration of statistical model specification
│   ├── exercises : collections of exercises to test the acquired skills
│   │   └── exercise-d-separation.Rmd : collection of exercises in identifying adjustment sets
│   ├── sci : implementations of SCI concepts and techniques
│   │   ├── associations : explanation of the fundamental relationships between three variables
│   │   │   ├── collider.Rmd : demonstration of a common effect
│   │   │   ├── confounder.Rmd : demonstration of a common cause
│   │   │   └── mediator.Rmd : demonstration of a pipe
│   │   ├── dag.Rmd : demonstration of causal modeling with directed, acyclic graphs
│   │   └── model-comparison.Rmd : demonstration of model comparison to identify appropriate causal models
│   └── util : utility files and scripts with reused functions
│       └── extract-coefficients.R : script to extract all coefficient distributions from two models
└── sci-intro.Rproj : project file to open the project in RStudio
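
To give a flavor of what a "common cause" demonstration can look like, here is a minimal base-R sketch of a ground truth simulation with a confounder. It is not taken from `confounder.Rmd`; the variable names (`z`, `x`, `y`), the sample size, and the effect sizes are assumptions chosen purely for illustration, and it uses a plain `lm()` rather than the Bayesian models used in the notebooks.

```r
# Illustrative sketch only: names and effect sizes are assumptions,
# not the content of the notebooks in this repository.
set.seed(42)
n <- 1000

z <- rnorm(n)          # common cause (confounder)
x <- z + rnorm(n)      # exposure, caused by z
y <- 2 * z + rnorm(n)  # outcome, caused by z only (true effect of x on y is 0)

# Regressing y on x alone picks up the association that flows through z.
naive <- lm(y ~ x)

# Adjusting for the common cause z blocks that non-causal path.
adjusted <- lm(y ~ x + z)

naive_effect <- unname(coef(naive)["x"])
adjusted_effect <- unname(coef(adjusted)["x"])
print(c(naive = naive_effect, adjusted = adjusted_effect))
```

In this setup the naive estimate is clearly non-zero although `x` has no causal effect on `y`, while the adjusted estimate is close to zero. This is the kind of contrast that a ground truth simulation makes visible.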
To run the R scripts and Rmd notebooks in the src folder, make sure that R (version > 4.0) and a suitable IDE such as RStudio are installed on your machine.
Then, complete the following steps:
- Install the C++ toolchain by following the instructions for Windows, macOS, or Linux, respectively.
- Restart RStudio and follow the instructions starting with the Installation of RStan.
- Install the latest version of `stan` by running the following commands:

      install.packages("devtools")
      devtools::install_github("stan-dev/cmdstanr")
      cmdstanr::install_cmdstan()

- Install all required packages via `install.packages(c("tidyverse", "ggdag", "brms", "marginaleffects", "patchwork"))`.
- Create a folder called fits within src/ so that `brms` has a location to store all fitted Bayesian models.
- Open the `sci-intro.Rproj` file with RStudio, which will set up the environment correctly.
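
The setup steps above can be checked with a short script. The following is a minimal sketch, not part of the repository: the helper `missing_packages()` and the variable names are hypothetical, and the script assumes that the working directory is the project root (as it is after opening `sci-intro.Rproj`).

```r
# Hypothetical setup check: not part of the repository.
required_packages <- c(
  "tidyverse", "ggdag", "brms", "marginaleffects", "patchwork", "cmdstanr"
)

# Return the names of all packages in `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# The scripts expect an R version newer than 4.0.
if (!(getRversion() > "4.0")) {
  message("R ", getRversion(), " detected; please upgrade to a version > 4.0.")
}

# List the packages that still need to be installed.
missing <- missing_packages(required_packages)
if (length(missing) > 0) {
  message("Missing packages: ", paste(missing, collapse = ", "))
} else {
  message("All required packages are installed.")
}

# Create src/fits (relative to the working directory) if it does not exist yet.
fits_dir <- file.path("src", "fits")
dir.create(fits_dir, recursive = TRUE, showWarnings = FALSE)
```

Running the script repeatedly is harmless: `dir.create()` with `showWarnings = FALSE` silently does nothing when the folder already exists.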
Copyright © 2024 Julian Frattini. This work is licensed under the Apache-2.0 License.
