A Learning Map & Portfolio Guide

This appendix provides a clear, learner-focused overview of the entire CDI Data Science Foundations journey.

Use it to:

  • understand how lessons connect,
  • review what you’ve learned,
  • and prepare your work for portfolios, interviews, or real projects.

This appendix summarizes your journey — it does not introduce new material.


A.1 Learning Structure at a Glance

The guide is structured in two phases:

  • Foundations (Free Track) — core skills and workflow setup
  • Applied & Advanced (Premium Track) — real-world analysis, modeling, and deployment

Each lesson builds deliberately on the previous ones.


A.2 Foundations Track — Core Skills (Lessons 1–6)

These lessons establish the technical environment and essential data science workflow.

Lesson | Focus                         | What You Learn
1      | Environment Setup             | Install Python, required libraries, and confirm your working environment
2      | Load & Explore a Dataset      | Inspect dataset structure, variables, and initial patterns
3      | Data Cleaning                 | Handle missing values, inconsistencies, and basic corrections
4      | Data Wrangling Basics         | Filtering, reshaping, joining, and transforming data
5      | Visualization Basics          | Create clear, interpretable plots
6      | Summary Statistics & Insights | Describe distributions and extract baseline insights

By the end of this phase, you can confidently move from raw data to structured, interpretable information.
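
As a reminder of what "raw data to structured, interpretable information" looks like in code, here is a minimal sketch that uses only the Python standard library. The inline CSV, the column names (`city`, `temp_c`), and the helper functions are hypothetical stand-ins rather than material from the lessons, and dropping rows with missing values is just one of several reasonable cleaning choices.

```python
import csv
import io
import statistics

# Hypothetical raw data: a tiny inline CSV standing in for a real file.
RAW_CSV = """city,temp_c
Oslo,4.0
Lima,19.5
Cairo,
Pune,27.0
Oslo,6.0
"""


def load_rows(text):
    """Load CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Drop rows with a missing temperature and convert the rest to float."""
    cleaned = []
    for row in rows:
        value = row["temp_c"].strip()
        if value == "":
            continue  # missing value: skipped here; imputation is another option
        cleaned.append({"city": row["city"], "temp_c": float(value)})
    return cleaned


def summarize(rows):
    """Return baseline summary statistics for the temperature column."""
    temps = [row["temp_c"] for row in rows]
    return {
        "count": len(temps),
        "mean": statistics.mean(temps),
        "median": statistics.median(temps),
        "min": min(temps),
        "max": max(temps),
    }


rows = load_rows(RAW_CSV)      # load and inspect
clean = clean_rows(rows)       # clean
summary = summarize(clean)     # describe
print(summary)
```

In a real project the same three steps would usually run on a file loaded with a data-frame library, but the order (load, clean, summarize) stays the same.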


A.3 Applied & Advanced Track — Real Projects (Lessons 7–19)

These lessons focus on real-world complexity, modeling, and deployment.

Lesson | Focus                              | What You Learn
7      | Intermediate Wrangling             | Advanced joins, reshaping, and cleanup strategies
8      | Real-World Data Problems           | Messy data, edge cases, and practical fixes
9      | Advanced Exploratory Data Analysis | Multivariate analysis and deeper pattern discovery
10     | Advanced Visualization             | Layered, faceted, and story-driven visuals
11     | Advanced Statistical Analysis      | Hypothesis testing, confidence intervals, effect sizes
12     | Feature Engineering                | Transformations, interactions, and encoding
13     | Machine Learning Basics            | Train/test splits and baseline models
14     | Classification Models              | Logistic regression, random forests, ROC/AUC
15     | Regression Models                  | Predicting numeric outcomes and diagnostics
16     | Model Evaluation                   | Metrics, validation, and honest assessment
17     | Hyperparameter Tuning              | Improving models through systematic search
18     | End-to-End ML Project              | Complete workflow from data to predictions
19     | Model Deployment                   | Saving models and exposing predictions via an application

By the end of this track, you have completed a full, production-style data science pipeline.


A.4 End-to-End Project Workflow

Across the lessons, you’ve constructed a professional workflow:

  1. Environment Setup — ensured tools and libraries work correctly before analysis begins
  2. Data Understanding — loaded, explored, and described real datasets
  3. Data Cleaning & Wrangling — resolved missing values, inconsistencies, and structure issues
  4. Exploratory Analysis & Visualization — discovered patterns and communicated insights visually
  5. Statistical Reasoning — used formal methods to support conclusions
  6. Feature Engineering — transformed raw data into model-ready inputs
  7. Machine Learning Modeling — built classification and regression models
  8. Evaluation & Tuning — compared models and improved performance responsibly
  9. Deployment — saved models and exposed predictions through an application

This mirrors how real data science projects are executed in practice.
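
The modeling, evaluation, and deployment steps above can be compressed into a short sketch. This is an illustration under stated assumptions, not the project code from the lessons: the data is synthetic, the model is a hand-written least-squares line, and "deployment" is reduced to a `pickle` round trip. It uses only the standard library so it runs without extra installs.

```python
import pickle
import random
import statistics

# Synthetic data (an assumption for illustration): y is roughly 3x + 2 plus noise.
random.seed(0)
xs = [i / 10 for i in range(100)]
data = [(x, 3.0 * x + 2.0 + random.gauss(0, 0.5)) for x in xs]

# Train/test split: shuffle, then hold out 20% for evaluation.
random.shuffle(data)
split = int(0.8 * len(data))
train, test = data[:split], data[split:]


def fit_line(points):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x = statistics.mean(x for x, _ in points)
    mean_y = statistics.mean(y for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, x):
    """Apply a fitted line to a single input value."""
    return model["slope"] * x + model["intercept"]


def mean_absolute_error(model, points):
    """Average absolute gap between predictions and observed values."""
    return statistics.mean(abs(predict(model, x) - y) for x, y in points)


# Baseline model: always predict the mean of the training targets.
baseline = {"slope": 0.0, "intercept": statistics.mean(y for _, y in train)}
model = fit_line(train)

# Evaluate both models on the held-out test set only.
baseline_mae = mean_absolute_error(baseline, test)
model_mae = mean_absolute_error(model, test)
print(f"baseline MAE: {baseline_mae:.3f}, fitted MAE: {model_mae:.3f}")

# Deployment stand-in: serialize the model, reload it, and reuse it.
restored = pickle.loads(pickle.dumps(model))
print(predict(restored, 5.0))
```

Comparing against a baseline on held-out data is the habit that matters here. A model is only worth saving if it beats the simplest alternative on data it has not seen.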


A.5 Portfolio Readiness Checklist

A.5.1 Environment & Reproducibility

A.5.2 Data Preparation

A.5.3 Analysis & Visualization

A.5.4 Modeling

A.5.5 Evaluation & Improvement

A.5.6 Deployment


A.6 Final Note

Completing this guide means you can now:

  • set up a working data science environment,
  • take a project from raw data to a deployed model,
  • reason statistically about results,
  • build and evaluate machine learning models,
  • and communicate insights clearly.

These are real, transferable professional skills.