
Data Science Curriculum

12-Week Course Curriculum

Here’s a 3-month Cloud Computing curriculum that progresses from basics to advanced, with weekly topics, monthly tests, assignments, and projects, aligned to current industry standards (AWS-centric, with optional Azure/GCP exposure).

Assumptions :
4–5 hours per day, 5-day weeks → 12 weeks ≈ 3 months.
Platform :
AWS (students can optionally add Azure/GCP later).
Tools :
AWS free tier, EC2, S3, RDS, Lambda, IAM, VPC, CloudWatch, Terraform (optional), GitHub/GitLab.
Outcome :
Junior Cloud Engineer / Cloud Associate / DevOps-ready.

Overall Assessment Plan
Weekly assignments :
2–3 hands-on labs (e.g., deploy EC2, S3, IAM policies, autoscaling).
Monthly tests :
1-hour MCQ + practical (e.g., “fix this broken VPC configuration”).
Monthly projects :
End-to-end cloud deployments (e.g., web app, serverless workflow).


Month 1 – Foundations: Python, SQL, Statistics & EDA (Weeks 1–4)

Weeks & Topics
Daily Task (4-5 hrs)
Assignments
Milestones
Week 1: Intro to data science & tools

What is data science, roles (analyst, scientist, engineer), data pipeline (collect → clean → analyze → model → deploy), Python & SQL setup, Jupyter/Colab.

1) Install Python, pandas, NumPy, Matplotlib, Jupyter; 2) Write a “Hello Data” notebook loading a CSV and printing basic info.
Environment setup + first notebook
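As an illustration of the “Hello Data” notebook, here is a minimal sketch. The inline CSV text and its column names are hypothetical stand-ins for whatever file the student actually loads; only the pandas calls themselves are standard.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a real file such as "data.csv".
CSV_TEXT = """name,age,city
Asha,29,Pune
Ben,41,Leeds
Chen,35,Taipei
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(CSV_TEXT))

print(df.head())      # first rows of the table
print(df.shape)       # (number of rows, number of columns)
print(df.dtypes)      # column types inferred by pandas
df.info()             # column names, non-null counts, memory usage
print(df.describe())  # summary statistics for numeric columns
```

With a real file, the `io.StringIO(...)` argument would simply be replaced by the file path.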
Week 2: Python for data

Variables, functions, lists/dicts, list comprehensions, lambda, file I/O, working with CSV/JSON, pandas Series/DataFrame basics.

1) Clean a messy CSV (handle missing values, types); 2) Compute basic stats (mean, median, std) manually + with pandas.
Cleaned dataset + stats notebook.
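The cleaning and statistics tasks could look roughly like the sketch below. The messy data is invented for illustration, and dropping rows with missing prices is just one possible choice (imputation is another). The manual standard deviation uses the sample formula (divide by n - 1), which is what pandas `Series.std()` computes by default.

```python
import io
import math

import pandas as pd

# Hypothetical messy data: stray whitespace, a text value in a numeric
# column, and an empty cell.
RAW = """product,price
 pen,10
book,unknown
bag,30
lamp,
desk,50
"""

df = pd.read_csv(io.StringIO(RAW))

# Fix types: strip whitespace from names, coerce prices to numbers.
# Values that cannot be parsed become NaN.
df["product"] = df["product"].str.strip()
df["price"] = pd.to_numeric(df["price"], errors="coerce")

# Handle missing values: here we drop rows without a usable price.
clean = df.dropna(subset=["price"]).reset_index(drop=True)

# Manual statistics.
values = clean["price"].tolist()
n = len(values)
mean = sum(values) / n
ordered = sorted(values)
mid = n // 2
median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

# The same statistics with pandas, for comparison.
print(mean, clean["price"].mean())
print(median, clean["price"].median())
print(std, clean["price"].std())
```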
Week 3: SQL basics for data

Relational databases, SELECT, WHERE, GROUP BY, JOINs, subqueries, aggregations, basic window functions.

1) Install SQLite / use online DB; 2) Write 10+ queries on a sample DB (sales, students, orders).
SQL query set + results.
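A small sketch of the kind of queries this week covers, run against an in-memory SQLite database through Python's built-in `sqlite3` module. The `customers` and `orders` tables and their rows are hypothetical sample data. Window functions are left out here because they need a reasonably recent SQLite build.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 25.0), (3, 2, 40.0);
""")

# JOIN + GROUP BY + aggregation: total spend per customer with orders.
totals = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY total DESC
""").fetchall()

# Subquery: customers who have not placed any order.
no_orders = conn.execute("""
    SELECT name
    FROM customers
    WHERE id NOT IN (SELECT customer_id FROM orders)
""").fetchall()

print(totals)
print(no_orders)
conn.close()
```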
Week 4: Statistics & EDA I

Descriptive stats (mean, median, variance, distribution), correlation, histograms, boxplots, outlier detection, Monthly Test 1 (Python + SQL + stats/EDA).

1) Perform EDA on a real dataset (e.g., Titanic, housing, sales); 2) Create 5–7 visualizations with Matplotlib/Seaborn.
Project 1: “EDA & Insights Notebook” – A comprehensive notebook: data loading → cleaning → EDA → visualizations → key insights + a short written report (1–2 pages).
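Two of the EDA steps listed for this week, outlier detection and correlation, can be sketched as follows. The 1.5 × IQR rule is one common convention for flagging outliers, not the only one, and the small housing-style table is made up for the example.

```python
import pandas as pd

# Hypothetical data with one unusually large area value.
df = pd.DataFrame({
    "area": [50, 60, 70, 80, 90, 100, 110, 400],
    "price": [100, 120, 140, 160, 180, 200, 220, 230],
})

# Interquartile range (IQR) fences for the "area" column.
q1 = df["area"].quantile(0.25)
q3 = df["area"].quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Rows falling outside the fences are flagged as outliers.
outliers = df[(df["area"] < lower) | (df["area"] > upper)]

# Pearson correlation between the two columns.
corr = df["area"].corr(df["price"])

print(outliers)
print(round(corr, 3))
```

A boxplot of the same column draws these fences as whiskers, which is why boxplots and the IQR rule are usually taught together.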

Month 2 – Machine Learning Fundamentals & Modeling (Weeks 5–8)

Weeks & Topics
Daily Task (4-5 hrs)
Assignments
Milestones
Week 5: ML basics & supervised learning

ML types (supervised/unsupervised), train/test split, overfitting/underfitting, bias-variance trade-off, linear regression, logistic regression.

1) Implement linear/logistic regression from scratch (concept) + scikit-learn; 2) Compare models on a small dataset.
Regression/classification notebook.
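To make the “from scratch + scikit-learn” comparison concrete, here is a minimal sketch for the simplest case: one-feature linear regression, solved with the closed-form least-squares formulas and checked against `LinearRegression`. The synthetic data (true slope 3, intercept 2) is an assumption chosen for illustration. Logistic regression has no closed form and would need an iterative method such as gradient descent.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y = 3x + 2 plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(0, 1, size=100)

# From scratch: closed-form least squares for a single feature.
x_centered = x - x.mean()
slope = np.sum(x_centered * (y - y.mean())) / np.sum(x_centered ** 2)
intercept = y.mean() - slope * x.mean()

# scikit-learn expects a 2-D feature matrix, hence the reshape.
model = LinearRegression().fit(x.reshape(-1, 1), y)

print(slope, model.coef_[0])
print(intercept, model.intercept_)
```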
Week 6: Classification & evaluation

kNN, decision trees, random forests, SVM (concept), metrics: accuracy, precision, recall, F1, ROC-AUC, confusion matrix.

1) Build a classifier on an imbalanced dataset; 2) Tune threshold to optimize precision/recall.
Classification report + plots
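The threshold-tuning task might be approached along these lines. The dataset is synthetic (roughly 90% negative, 10% positive), and logistic regression and the three candidate thresholds are arbitrary choices for the sketch. Raising the threshold can only shrink the set of predicted positives, so recall never goes up, while precision typically (though not always) improves.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced dataset: about 10% of samples are positive.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Probability of the positive class for each test sample.
proba = clf.predict_proba(X_test)[:, 1]

# Evaluate precision and recall at several decision thresholds.
results = {}
for threshold in (0.3, 0.5, 0.7):
    pred = (proba >= threshold).astype(int)
    precision = precision_score(y_test, pred, zero_division=0)
    recall = recall_score(y_test, pred, zero_division=0)
    results[threshold] = (precision, recall)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```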
Week 7: Unsupervised learning & feature engineering

Clustering (k-means, hierarchical), dimensionality reduction (PCA), feature scaling, encoding, handling missing data, feature selection.

1) Cluster customers into segments; 2) Apply PCA and visualize reduced dimensions.
Clustering + PCA notebook.
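A compact sketch of the customer segmentation and PCA tasks. The two customer groups and their three features (annual spend, visits per month, average basket) are simulated, and the number of clusters is fixed at two because the data was generated that way. With real data, choosing k is part of the exercise.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical customers: [annual_spend, visits_per_month, avg_basket].
rng = np.random.default_rng(0)
low = rng.normal([200, 2, 20], [30, 0.5, 5], size=(50, 3))
high = rng.normal([1000, 10, 80], [100, 1.5, 10], size=(50, 3))
X = np.vstack([low, high])

# Scale first: k-means and PCA are both sensitive to feature scale.
X_scaled = StandardScaler().fit_transform(X)

# Cluster into two segments.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Reduce to two dimensions for plotting.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

print(np.bincount(labels))            # customers per segment
print(pca.explained_variance_ratio_)  # variance captured by each component
```

The columns of `X_2d` can then be passed to a scatter plot, colored by `labels`, to visualize the segments.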
Week 8: Model tuning & pipelines

Hyperparameter tuning (GridSearch/RandomSearch), cross-validation, scikit-learn pipelines, model persistence (pickle/joblib), Monthly Test 2 (ML algorithms + evaluation).

1) Tune a random forest / XGBoost model; 2) Build a pipeline (preprocess → model → evaluate).
Project 2: “Predictive Modeling Project” – A complete supervised ML project (e.g., churn prediction, house price prediction, fraud detection) with: data cleaning, EDA, model building, evaluation, hyperparameter tuning, and a short business-focused report.
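The preprocess → model → evaluate flow, together with grid search and model persistence, can be sketched as below. The Iris dataset, the random forest, and the small parameter grid are assumptions picked to keep the example quick. The same structure applies to other estimators.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Pipeline: preprocessing and model are fitted together, so each
# cross-validation fold scales using only its own training data.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", RandomForestClassifier(random_state=0)),
])

# Parameters of a pipeline step are addressed as "<step>__<param>".
param_grid = {
    "model__n_estimators": [50, 100],
    "model__max_depth": [2, None],
}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print(search.best_params_, test_accuracy)

# Persist the tuned pipeline and load it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.joblib")
    joblib.dump(search.best_estimator_, path)
    restored = joblib.load(path)

same_predictions = bool((restored.predict(X_test) == search.predict(X_test)).all())
print(same_predictions)
```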

Month 3 – Advanced Analytics, Big Data Basics & Capstone (Weeks 9–12)

Weeks & Topics
Daily Task (4-5 hrs)
Assignments
Milestones
Week 9: Time series & forecasting

Time series components (trend, seasonality), moving average, ARIMA (concept), simple forecasting with scikit-learn/Prophet.

1) Load a time series dataset (sales, stock, weather); 2) Build a simple forecasting model and plot forecast.
Time series notebook + forecast plot.
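As a starting point for the forecasting task, here is a sketch of the moving-average approach named in the topics. The daily sales figures are invented, and the window of 3 is an arbitrary choice. The forecast for the next period is simply the mean of the last `window` observations.

```python
import pandas as pd

# Hypothetical daily sales with a steady upward trend.
sales = pd.Series(
    [100, 110, 120, 130, 140, 150],
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
)

window = 3

# Rolling mean: the first (window - 1) values are NaN because there
# are not yet enough observations to fill the window.
moving_avg = sales.rolling(window=window).mean()

# Naive one-step forecast: mean of the last `window` observations.
forecast = sales.iloc[-window:].mean()

print(moving_avg)
print(forecast)
```

On a trending series like this one, the moving-average forecast lags behind the trend (it predicts 140 when the pattern suggests 160), which is a useful motivation for models such as ARIMA that handle trend explicitly.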
Week 10: Intro to big data & Spark (concept)

Big data concepts (volume, velocity, variety), Hadoop/Spark basics, PySpark DataFrame API (concept + small demos), the idea of distributed computing.

1) Run a small PySpark job on a sample dataset; 2) Compare performance with pandas on larger data (if possible).
PySpark demo + notes.
Week 11: Data storytelling & deployment

Communicating insights, dashboards (Tableau/Power BI or Plotly Dash/Streamlit), basic model deployment (Flask/FastAPI), CI/CD concept, ethics in data science.

1) Build a simple dashboard (Streamlit/Dash) showing model predictions; 2) Create a Flask/FastAPI endpoint for a trained model (local).
Dashboard + API demo.
Week 12: Capstone & portfolio

End-to-end data science product: problem definition → data → EDA → modeling → deployment/dashboard → presentation; Final Test (MCQ + practical capstone review).

1) Polish capstone notebook + dashboard + report; 2) Prepare a 5–7 slide presentation + GitHub repo.
Project 3: “Data Science Capstone” – A complete end-to-end project such as a “Sales forecasting dashboard”, a “Customer churn prediction system with dashboard”, or a “Fraud detection system with API”. Includes data cleaning, EDA, ML model, evaluation, dashboard or API, Git repo, README, and a 2–3 page report.