Setting Up Your Environment

Published: Jun 2026

  • ID: DS-L01
  • Type: Foundation
  • Audience: Beginner / Intermediate
  • Theme: Reproducibility starts before analysis

This guide is Python-first and Quarto-first.

In CDI, environment setup is not a technical formality — it is part of the analytical process.

If your environment is inconsistent, your results can be difficult to reproduce and hard to trust.

You will write analysis code inside Quarto chapter files (.qmd), but the setup commands in this lesson are meant to be run manually in a terminal.


What you are setting up

By the end of this lesson you will have:

  • Python installed and accessible from the terminal
  • Quarto installed and working
  • a manually prepared requirements.txt file
  • a project virtual environment (.venv)
  • required packages installed
  • a working build pipeline that renders into docs/

Required software

Python (3.10 or newer)

Check if Python is available:

python3 --version

If this fails, install Python:

  • macOS: brew install python
  • Windows: Install from python.org and enable “Add to PATH”
  • Linux: Use your package manager, such as apt, dnf, or pacman
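
The version requirement can also be checked from inside Python, which helps on systems where the interpreter is not named python3. The following is a minimal sketch; the helper name meets_minimum is an illustrative assumption and not part of the project.

```python
import sys

# Minimum version stated in this lesson.
MIN_VERSION = (3, 10)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


print("Running Python", sys.version.split()[0])
print("Meets minimum:", meets_minimum())
```

Only the major and minor numbers are compared, so any patch release of 3.10 or later passes.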

Quarto

Quarto renders the book and executes code.

Check installation:

quarto --version

If Quarto is missing, install it.

On macOS with Homebrew:

brew install --cask quarto

After installation, confirm that Quarto is available:

quarto --version

Windows and Linux users can install Quarto from the official Quarto website, then rerun the version check.

Quarto should be installed before rendering the book.
The Python environment controls the analysis packages, while Quarto controls document rendering.
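
Both tools need to be reachable from the terminal before anything else works. As a rough sketch, the check below looks each command up on PATH using the standard library; the function name find_missing_tools is hypothetical, and the command names are the ones used in this lesson.

```python
import shutil


def find_missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


# Command names as used in this lesson; on Windows the interpreter
# is often called "python" rather than "python3".
missing = find_missing_tools(["python3", "quarto"])
print("Missing tools:", missing or "none")
```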


Project environment

This project uses a local virtual environment to ensure reproducibility.

The environment setup has two parts, requirements.txt and the setup script, which together produce .venv/:

requirements.txt
        ↓
scripts/bash/setup-env.sh
        ↓
.venv/

The requirements.txt file defines the Python packages required by the project.

The setup script creates the environment and installs those packages.


Prepare requirements.txt

Before running the setup script, create a file named requirements.txt in the project root.

Use this starter version:

pandas
numpy
matplotlib
seaborn
scikit-learn
ipykernel

This file should be prepared manually so that learners understand which packages the project depends on.

In CDI systems, requirements.txt is part of reproducibility.
It records the packages needed to rerun the workflow on another machine.
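
To see what a requirements file declares, it can be read as plain text. The sketch below assumes the simple layout of the starter file (one requirement per line, with optional blank lines and lines starting with #); the helper read_requirements is illustrative and does not handle the full pip requirements syntax.

```python
from pathlib import Path


def read_requirements(path="requirements.txt"):
    """Return the non-empty, non-comment lines of a requirements file."""
    entries = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        entry = line.strip()
        # Skip blank lines and whole-line comments.
        if entry and not entry.startswith("#"):
            entries.append(entry)
    return entries
```

Called on the starter file above, read_requirements() would return the six package names in the order they are listed.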


Use the setup script

From the project root:

bash scripts/bash/setup-env.sh

Activate the environment:

source .venv/bin/activate

Verify:

which python
python --version

The path should point to .venv.

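If you are on Windows, use the matching activation command instead:

  • PowerShell: .venv\Scripts\Activate.ps1
  • cmd: .venv\Scripts\activate.bat

To verify on Windows, replace which python with where python (cmd) or Get-Command python (PowerShell).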
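
On any platform, the active interpreter can also be checked from Python itself. This is a minimal sketch that assumes the environment was created with the standard venv module; the function name in_virtualenv is hypothetical.

```python
import sys
from pathlib import Path


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Environment folder:", Path(sys.prefix).name)
print("Virtual environment active:", in_virtualenv())
```

With the project environment active, the environment folder printed should be .venv.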

What the script does

The setup script:

  • checks that requirements.txt exists
  • creates .venv/
  • activates the virtual environment inside the script's own shell (which is why you activate it again in your terminal afterward)
  • upgrades pip
  • installs packages from requirements.txt
  • registers a Jupyter kernel named CDI Data Science

The CDI workflow primarily uses Quarto rendering, not notebooks. The Jupyter kernel is included because Quarto can use it to execute Python code.
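
The steps listed above can be sketched in Python as a list of commands. This is not the contents of setup-env.sh, only an illustration of the same sequence under stated assumptions: the function name plan_setup and the kernel name cdi-data-science are hypothetical, and calling the environment's own interpreter stands in for the activation step.

```python
import os
import sys
from pathlib import Path


def plan_setup(project_root="."):
    """Return the command for each setup step, without running anything."""
    root = Path(project_root)
    requirements = root / "requirements.txt"
    if not requirements.is_file():
        raise FileNotFoundError(f"{requirements} not found")

    venv_dir = root / ".venv"
    # Using the environment's interpreter directly replaces activation.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    venv_python = str(venv_dir / bin_dir / exe)

    return [
        [sys.executable, "-m", "venv", str(venv_dir)],
        [venv_python, "-m", "pip", "install", "--upgrade", "pip"],
        [venv_python, "-m", "pip", "install", "-r", str(requirements)],
        [venv_python, "-m", "ipykernel", "install", "--user",
         "--name", "cdi-data-science",  # hypothetical kernel name
         "--display-name", "CDI Data Science"],
    ]
```

Each command in the returned list could then be passed, in order, to subprocess.run(command, check=True). Returning the plan first makes it possible to inspect the steps before anything is installed.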


Render the book

With the environment active:

bash scripts/bash/build.sh

This runs quarto render and outputs the rendered site to docs/.


Sanity check

Open the generated site.

macOS

open docs/index.html

Linux

xdg-open docs/index.html

Windows

start docs\index.html

If the site loads and navigation works, your environment is correctly configured.
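
As a cross-platform alternative to the three commands above, the rendered page can be opened from Python with the standard webbrowser module. This is a minimal sketch; the helper names site_uri and open_site are illustrative, and the docs/ location is taken from the build step in this lesson.

```python
import webbrowser
from pathlib import Path


def site_uri(docs_dir="docs"):
    """Return a file:// URI for the rendered index page, or raise if missing."""
    index = Path(docs_dir) / "index.html"
    if not index.is_file():
        raise FileNotFoundError(f"{index} not found; run the build first")
    return index.resolve().as_uri()


def open_site(docs_dir="docs"):
    """Open the rendered site in the default browser."""
    return webbrowser.open(site_uri(docs_dir))
```

Calling open_site() from the project root opens docs/index.html. A FileNotFoundError means the build has not produced output yet.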


Development mode

You can also preview the book while editing:

quarto preview

This launches a local live preview that reloads when files change.


CDI Insight

Environment setup is the first step in reproducibility.

Before analyzing data, you must ensure:

  • your tools are consistent
  • your dependencies are documented
  • your environment can be recreated
  • your workflow is repeatable

This is what allows your results to be trusted.
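
One way to check that documented dependencies match the active environment is to look up each package's installed version. The sketch below uses the standard importlib.metadata module; the function name installed_versions is hypothetical, and the package list is the starter requirements.txt from this lesson.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names from the starter requirements.txt in this lesson.
starter = ["pandas", "numpy", "matplotlib", "seaborn", "scikit-learn", "ipykernel"]
for name, version in installed_versions(starter).items():
    print(f"{name}: {version or 'not installed'}")
```

Run with the project environment active, any line reporting "not installed" suggests the setup script should be rerun.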


Summary

In this lesson, you prepared the project environment.

You now have:

  • Python and Quarto available
  • a manually prepared requirements.txt file
  • a reproducible .venv environment
  • required Python packages installed
  • a working Quarto book build

You are now ready to begin the data workflow.


Looking Ahead

In the next chapter, we load and inspect the first example dataset. This begins the reusable tidy-table workflow that supports analysis across CDI pathways.