Welcome to CDI Data Science Foundations in Python
Welcome to Complex Data Insights (CDI), a practical learning pathway designed to teach real data skills through clear explanations, structured lessons, and hands-on practice. This guide focuses entirely on Python, one of the most widely used languages in data analytics, machine learning, and scientific computing.
Whether you are new to data science or returning to data work, this guide will help you build skills progressively — from foundational concepts to professional workflows.
What You Will Learn
Across this guide, you will learn how to:
- Think analytically about data problems
- Load, clean, and structure datasets
- Explore data through visualization
- Apply essential statistical reasoning
- Prepare data for machine learning
- Work with real-world datasets
- Build reproducible, end-to-end workflows
Each lesson builds on the previous one, forming a clear and logical progression.
Course Structure
The CDI Data Science Foundations path is divided into two parts.
Free Track (Lessons 01–06)
The free track introduces the core foundations of data science with Python, including:
- Environment setup and workflow
- Loading and inspecting datasets
- Data cleaning fundamentals
- Introductory data wrangling
- Visualization basics
- Summary statistics and insight generation
These lessons build confidence with essential data tools and concepts.
How to Use This Guide
This guide is designed to be:
- Self-paced — learn at your own speed
- Hands-on — practice with real code and datasets
- Modular — revisit lessons as needed
- Accessible — explained step by step, with context
Lesson Workflow
Each lesson follows a consistent structure:
- Concept explanation
- Guided code examples
- Output interpretation
- Short exercises
- Key takeaways
- Next steps
Data and Project Structure
Throughout the guide:
- All datasets are stored in a data/ directory
- Lesson 01 generates the first dataset (iris.csv)
- Additional datasets are introduced as needed in later lessons
This consistent structure supports reproducibility and clarity.
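As a rough illustration of this layout, the sketch below sets up the data/ directory and builds the path to iris.csv using only the standard library. It assumes data/ lives in the current working directory, and the `dataset_path` helper is a hypothetical name introduced here for illustration. It does not create iris.csv itself, since that is the job of Lesson 01.

```python
from pathlib import Path

# Assumption: the guide's data/ directory sits in the current working directory.
DATA_DIR = Path("data")


def dataset_path(name: str) -> Path:
    """Return the path of a dataset file inside data/ (hypothetical helper)."""
    return DATA_DIR / name


# Create data/ if it is missing; this does nothing when it already exists.
DATA_DIR.mkdir(parents=True, exist_ok=True)

iris_path = dataset_path("iris.csv")
if iris_path.exists():
    print(f"Found {iris_path}")
else:
    print(f"{iris_path} not found yet; Lesson 01 generates it")
```

Building every path from a single `DATA_DIR` constant means a later lesson can locate its dataset the same way, for example `dataset_path("iris.csv")`, without hard-coding folder names in several places.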