Data Science Curriculum
12-Week Course Curriculum
Here’s a 3-month Data Science curriculum that progresses from basics to advanced, with weekly topics, monthly tests, assignments, and projects, aligned to current industry practice (Python-centric, with SQL and an introduction to Spark).
Assumptions:
4–5 hours per day, 5-day weeks → 12 weeks ≈ 3 months.
Platform:
Python with Jupyter/Colab.
Tools:
Python, pandas, NumPy, Matplotlib/Seaborn, Jupyter/Colab, SQLite, scikit-learn, PySpark (concept), Streamlit/Dash, Flask/FastAPI, GitHub/GitLab.
Outcome:
Junior Data Scientist / Data Analyst ready.
Overall Assessment Plan
Weekly assignments: 2–3 hands-on tasks (e.g., clean a messy CSV, write SQL queries, build a classifier).
Monthly tests: 1-hour MCQ + practical (Test 1: Python + SQL + stats/EDA; Test 2: ML algorithms + evaluation; Final Test: MCQ + practical capstone review).
Monthly projects: end-to-end data science deliverables (EDA & Insights Notebook, Predictive Modeling Project, Data Science Capstone).

Month 1 – Foundations: Python, SQL, Statistics & EDA (Weeks 1–4)
| Weeks & Topics | Daily Tasks (4–5 hrs) | Assignments | Milestones |
|---|---|---|---|
| Week 1: Intro to data science & tools | What is data science, roles (analyst, scientist, engineer), data pipeline (collect → clean → analyze → model → deploy), Python & SQL setup, Jupyter/Colab. | 1) Install Python, pandas, NumPy, Matplotlib, Jupyter; 2) Write a “Hello Data” notebook that loads a CSV and prints basic info. | Environment setup + first notebook. |
| Week 2: Python for data | Variables, functions, lists/dicts, list comprehensions, lambda, file I/O, working with CSV/JSON, pandas Series/DataFrame basics. | 1) Clean a messy CSV (handle missing values, types); 2) Compute basic stats (mean, median, std) manually and with pandas. | Cleaned dataset + stats notebook. |
| Week 3: SQL basics for data | Relational databases, SELECT, WHERE, GROUP BY, JOINs, subqueries, aggregations, basic window functions. | 1) Install SQLite or use an online DB; 2) Write 10+ queries on a sample DB (sales, students, orders). | SQL query set + results. |
| Week 4: Statistics & EDA I | Descriptive stats (mean, median, variance, distribution), correlation, histograms, boxplots, outlier detection, Monthly Test 1 (Python + SQL + stats/EDA). | 1) Perform EDA on a real dataset (e.g., Titanic, housing, sales); 2) Create 5–7 visualizations with Matplotlib/Seaborn. | Project 1: “EDA & Insights Notebook” – A comprehensive notebook: data loading → cleaning → EDA → visualizations → key insights + a short written report (1–2 pages). |
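
The Week 2 assignment asks for basic statistics computed by hand on a cleaned dataset. As a minimal sketch, assuming a made-up inline CSV (the column names `item` and `price` and all values are illustrative, not from a course dataset), the cleaning and the manual formulas could look like this using only the standard library:

```python
import csv
import io
import math

# Hypothetical messy data: one missing price and one non-numeric entry.
RAW_CSV = """item,price
pen,10
book,
bag,30
lamp,n/a
desk,20
"""

def load_prices(text):
    """Parse the CSV and keep only rows whose price is a valid number."""
    prices = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            prices.append(float(row["price"]))
        except (TypeError, ValueError):
            continue  # skip missing or malformed values
    return prices

def mean(values):
    return sum(values) / len(values)

def median(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:  # odd count: take the middle element
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle elements

def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

prices = load_prices(RAW_CSV)
print(prices, mean(prices), median(prices), sample_std(prices))
```

When cross-checking in pandas, note that `Series.std()` also defaults to the sample version (`ddof=1`), so the two results should agree.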

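For the Week 3 query set, Python's built-in `sqlite3` module is one way to practise without installing a database server. The sketch below assumes a hypothetical two-table schema (`customers`, `orders`) with invented rows, and shows a WHERE filter, a JOIN with GROUP BY, and a subquery:

```python
import sqlite3

# In-memory database; the schema and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Asha', 'Pune'), (2, 'Ravi', 'Delhi'), (3, 'Meena', 'Pune');
INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 100.0), (3, 2, 400.0);
""")

# WHERE with a bound parameter (avoids building SQL by string concatenation).
pune_customers = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("Pune",)
).fetchall()

# JOIN + GROUP BY: total spend per customer.
# LEFT JOIN keeps customers who have no orders; COALESCE turns their NULL sum into 0.
totals = conn.execute("""
SELECT c.name, COALESCE(SUM(o.amount), 0) AS total
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.id, c.name
ORDER BY total DESC, c.name
""").fetchall()

# Subquery: orders larger than the average order amount.
big_orders = conn.execute("""
SELECT id, amount
FROM orders
WHERE amount > (SELECT AVG(amount) FROM orders)
ORDER BY id
""").fetchall()

print(pune_customers)
print(totals)
print(big_orders)
```
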
Month 2 – Machine Learning Fundamentals & Modeling (Weeks 5–8)
| Weeks & Topics | Daily Tasks (4–5 hrs) | Assignments | Milestones |
|---|---|---|---|
| Week 5: ML basics & supervised learning | ML types (supervised/unsupervised), train/test split, overfitting/underfitting, bias-variance tradeoff, linear regression, logistic regression. | 1) Implement linear/logistic regression from scratch (concept) and with scikit-learn; 2) Compare models on a small dataset. | Regression/classification notebook. |
| Week 6: Classification & evaluation | k-NN, decision trees, random forests, SVM (concept), metrics: accuracy, precision, recall, F1, ROC-AUC, confusion matrix. | 1) Build a classifier on an imbalanced dataset; 2) Tune the decision threshold to optimize precision/recall. | Classification report + plots. |
| Week 7: Unsupervised learning & feature engineering | Clustering (k-means, hierarchical), dimensionality reduction (PCA), feature scaling, encoding, handling missing data, feature selection. | 1) Cluster customers into segments; 2) Apply PCA and visualize the reduced dimensions. | Clustering + PCA notebook. |
| Week 8: Model tuning & pipelines | Hyperparameter tuning (grid search/random search), cross-validation, scikit-learn pipelines, model persistence (pickle/joblib), Monthly Test 2 (ML algorithms + evaluation). | 1) Tune a random forest / XGBoost model; 2) Build a pipeline (preprocess → model → evaluate). | Project 2: “Predictive Modeling Project” – A complete supervised ML project (e.g., churn prediction, house price prediction, fraud detection) with: data cleaning, EDA, model building, evaluation, hyperparameter tuning, and a short business-focused report. |

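The Week 6 assignment on threshold tuning is easier to reason about once the metrics are written out by hand. The following is a sketch under stated assumptions: the labels and scores are invented, the helper names are hypothetical, and F1 is used as the selection criterion only as one possible choice. In the course notebook the same numbers would normally come from scikit-learn's metrics functions.

```python
def confusion_counts(y_true, scores, threshold):
    """Return (tp, fp, fn, tn), predicting positive when score >= threshold."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall_f1(y_true, scores, threshold):
    tp, fp, fn, _ = confusion_counts(y_true, scores, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def best_threshold(y_true, scores, candidates):
    """Pick the candidate threshold with the highest F1 score."""
    return max(candidates, key=lambda t: precision_recall_f1(y_true, scores, t)[2])

# Hypothetical imbalanced data: 3 positives out of 10 samples.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.60, 0.40, 0.70, 0.90]

for t in (0.3, 0.5, 0.7):
    print(t, precision_recall_f1(y_true, scores, t))
print("best by F1:", best_threshold(y_true, scores, [0.3, 0.5, 0.7]))
```
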
Month 3 – Advanced Analytics, Big Data Basics & Capstone (Weeks 9–12)
| Weeks & Topics | Daily Tasks (4–5 hrs) | Assignments | Milestones |
|---|---|---|---|
| Week 9: Time series & forecasting | Time series components (trend, seasonality), moving average, ARIMA (concept), simple forecasting with scikit-learn/Prophet. | 1) Load a time series dataset (sales, stock, weather); 2) Build a simple forecasting model and plot the forecast. | Time series notebook + forecast plot. |
| Week 10: Intro to big data & Spark (concept) | Big data concepts (volume, velocity, variety), Hadoop/Spark basics, PySpark DataFrame API (concept + small demos), the idea of distributed computing. | 1) Run a small PySpark job on a sample dataset; 2) Compare performance with pandas on larger data (if possible). | PySpark demo + notes. |
| Week 11: Data storytelling & deployment | Communicating insights, dashboards (Tableau/Power BI or Plotly Dash/Streamlit), basic model deployment (Flask/FastAPI), CI/CD concept, ethics in data science. | 1) Build a simple dashboard (Streamlit/Dash) showing model predictions; 2) Create a Flask/FastAPI endpoint for a trained model (local). | Dashboard + API demo. |
| Week 12: Capstone & portfolio | End-to-end data science product: problem definition → data → EDA → modeling → deployment/dashboard → presentation; Final Test (MCQ + practical capstone review). | 1) Polish the capstone notebook + dashboard + report; 2) Prepare a 5–7 slide presentation + GitHub repo. | Project 3: “Data Science Capstone” – A complete end-to-end project such as “Sales forecasting dashboard”, “Customer churn prediction system with dashboard”, or “Fraud detection system with API”. Includes data cleaning, EDA, ML model, evaluation, dashboard or API, Git repo, README, and a 2–3 page report. |

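As a starting point for the Week 9 forecasting assignment, a moving-average baseline can be written in a few lines before moving on to ARIMA or Prophet. This is a minimal sketch with an invented `sales` series and hypothetical helper names; it is a simple baseline, not a recommended production forecaster.

```python
def moving_average(series, window):
    """Trailing moving average; returns len(series) - window + 1 points."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

def forecast(series, window, steps):
    """Forecast `steps` future points.

    Each forecast is the mean of the last `window` values, and earlier
    forecasts are fed back in as history for later ones.
    """
    history = list(series)
    predictions = []
    for _ in range(steps):
        next_value = sum(history[-window:]) / window
        predictions.append(next_value)
        history.append(next_value)
    return predictions

# Hypothetical monthly sales with a steady upward trend.
sales = [10, 12, 14, 16, 18, 20]

print(moving_average(sales, 3))
print(forecast(sales, 3, 2))
```

On a trending series like this one, the moving-average forecast lags behind the trend (the first forecast is below the last observed value), which is a useful observation to include in the notebook when comparing against trend-aware models.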
