Appendix
This appendix collects technical reference material that supports the Foundations Track.
It documents the project structure, environment setup, rendering workflow, and reusable patterns used throughout the guide.
Project Structure Overview
A standard CDI Quarto-first project looks like this:
data-science/
├── index.qmd
├── 00-preface.qmd
├── 01-setting-up-environment.qmd
├── ...
├── data/
├── figures/
├── cdi_viz/
├── scripts/
│   └── bash/
├── docs/
├── _quarto.yml
└── requirements.txt
Key Directories
data/
Stores raw and cleaned datasets.
figures/
Stores saved plots generated during rendering.
cdi_viz/
Contains reusable visualization utilities (e.g., theme.py).
scripts/bash/
Contains helper scripts such as setup-env.sh and build.sh.
docs/
Contains the rendered Quarto website (GitHub Pages output).
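A quick way to confirm that a checkout matches this layout is to test for each directory from Python. The sketch below is only an illustration: the helper name missing_project_dirs is hypothetical and not part of the CDI project, and docs/ will legitimately be absent until the book has been rendered at least once.

```python
from pathlib import Path

# Directory names taken from the layout described above.
EXPECTED_DIRS = ["data", "figures", "cdi_viz", "scripts/bash", "docs"]


def missing_project_dirs(project_root="."):
    """Return the expected directories that do not exist under project_root.

    Hypothetical helper for illustration; not part of the CDI project.
    """
    root = Path(project_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_project_dirs(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

Run from the project root, this prints either the missing directories or a confirmation that all of them exist.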
Environment Setup Reference
This project uses a local virtual environment (.venv) for reproducibility.
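Whether the running interpreter belongs to a virtual environment can be checked from Python itself. The sketch below assumes the environment was created with python3 -m venv, as in the commands in this section; the function name in_virtual_environment is illustrative only.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtual_environment():
        print("Virtual environment active:", sys.prefix)
    else:
        print("No virtual environment detected; activate .venv first.")
```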
Create environment manually
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Render the book
quarto render
Or use the helper script:
bash scripts/bash/build.sh
Quarto Workflow Summary
The workflow for this guide is:
Python code
↓
Quarto chapter (.qmd)
↓
quarto render
↓
docs/ (static site)
All figures and outputs are generated during rendering.
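The last two steps of this workflow can also be driven from Python, which is sometimes convenient in automation. The following is a minimal sketch, assuming the quarto CLI is installed and that the project writes its site to docs/ as described above; the function name render_book is hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def render_book(project_root=".", output_dir="docs"):
    """Run `quarto render` and report whether the site index was produced.

    Hypothetical wrapper for illustration; not part of the CDI project.
    """
    if shutil.which("quarto") is None:
        raise RuntimeError("The quarto CLI was not found on PATH.")
    # check=True raises CalledProcessError if rendering fails.
    subprocess.run(["quarto", "render"], cwd=project_root, check=True)
    return (Path(project_root) / output_dir / "index.html").is_file()
```

Calling render_book(".") from the project root returns True when docs/index.html exists after a successful render.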
Reusable Pandas Patterns
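The snippets in this section assume that a DataFrame named df already exists. To try them out, a small frame can be built by hand. In the sketch below the column names and values are made up for illustration, and the patterns are applied in an order that suits the toy data rather than the order in which they are listed.

```python
import pandas as pd

# Toy data for illustration only; column names and values are made up.
df = pd.DataFrame({
    "group_col": ["a", "a", "b", "b"],
    "value_col": [1.0, 3.0, float("nan"), 5.0],
})

# Handle missing values: fill gaps with the column median.
df["value_col"] = df["value_col"].fillna(df["value_col"].median())

# Convert dtype: store the grouping column as a categorical.
df["group_col"] = df["group_col"].astype("category")

# Filter rows, then select columns.
subset = df[df["value_col"] > 2][["group_col", "value_col"]]

# Group and aggregate using named aggregation.
summary = df.groupby("group_col", observed=True).agg(
    mean_value=("value_col", "mean")
)
print(summary)
```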
Select columns
df[["col1", "col2"]]
Filter rows
df[df["col"] > value]
Group and aggregate
df.groupby("group_col").agg(
    metric=("value_col", "mean")
)
Handle missing values
df["col"] = df["col"].fillna(df["col"].median())
Convert dtype
df["category_col"] = df["category_col"].astype("category")
Plot Saving Pattern
All plots are saved using the CDI visualization utility:
from cdi_viz.theme import show_and_save_mpl
show_and_save_mpl(fig)
This ensures figures are saved consistently into the figures/ directory.
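This appendix does not show how show_and_save_mpl is implemented, and its full signature is not given here. As a rough illustration of the idea only, a stand-in helper (hypothetically named save_figure, with assumed parameters name, out_dir, and dpi) might save a Matplotlib figure into figures/ like this:

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def save_figure(fig, name, out_dir="figures", dpi=150):
    """Hypothetical stand-in: write fig to out_dir/name.png and return the path."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    target = out_path / f"{name}.png"
    fig.savefig(target, dpi=dpi, bbox_inches="tight")
    return target


# Example usage with a made-up line plot.
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_xlabel("x")
ax.set_ylabel("y")
saved = save_figure(fig, "example-line")
plt.close(fig)
print(saved)
```

Routing every save through one function keeps the output directory, file format, and resolution in a single place, which is the consistency the CDI utility is described as providing.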
Reproducibility Checklist
Before finalizing any chapter:
- Confirm no missing values remain (unless justified)
- Confirm no unintended duplicates
- Confirm correct data types
- Confirm figures render correctly
- Rebuild the book (quarto render)
- Open docs/index.html to verify output
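The data-related items in this checklist (missing values, duplicates, data types) can be partly automated. The sketch below is one possible approach, not a CDI utility: the function name check_dataframe and the expected_dtypes mapping are assumptions, and the sample frame is made up.

```python
import pandas as pd


def check_dataframe(df, expected_dtypes):
    """Return a list of human-readable problems found in df.

    Hypothetical helper: expected_dtypes maps column names to dtype strings.
    """
    problems = []

    # Missing values, reported per column.
    missing = df.isna().sum()
    for col, count in missing[missing > 0].items():
        problems.append(f"{col}: {count} missing value(s)")

    # Fully duplicated rows.
    dup_count = int(df.duplicated().sum())
    if dup_count:
        problems.append(f"{dup_count} duplicate row(s)")

    # Data types that differ from what the chapter expects.
    for col, expected in expected_dtypes.items():
        actual = str(df[col].dtype)
        if actual != expected:
            problems.append(f"{col}: dtype {actual}, expected {expected}")

    return problems


# Example usage with made-up data.
sample = pd.DataFrame({"name": ["x", "x"], "score": [1.0, 1.0]})
for problem in check_dataframe(sample, {"score": "float64"}):
    print(problem)
```

An empty list means the frame passed all three checks; any justified exception, such as intentionally retained missing values, still needs a human decision.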
Closing Note
The goal of this Foundations Track is not just to teach syntax.
It is to teach structure, discipline, and reproducibility.
The same workflow you used here can be applied to:
- New datasets
- New domains
- Larger analytical projects
- Future CDI guides