Learn Data Science in 90 Days: A Complete Roadmap

Jun 15, 2026

Data science is one of the most transformative and in-demand disciplines of the 21st century. By combining statistics, programming, and domain expertise, data scientists extract meaningful insights from data to drive informed decision-making across industries, from healthcare to finance, from e-commerce to public policy.

This 90-day roadmap is designed for dedicated learners willing to commit 3–4 hours daily (or 20–25 hours per week) to go from absolute beginner to job-ready data scientist. The plan is structured into three 30-day phases, each building on the previous one, ensuring a progressive, scaffolded learning experience.

Who is this for? Career switchers, self-taught programmers, recent graduates, or anyone who wants a structured, time-bound path into data science.

The core areas you'll master include:

Python programming (required by 86% of data science job postings in one 2025 analysis)
SQL for data querying and manipulation
Statistics & probability — the mathematical backbone of all modeling
Machine learning — from linear regression to ensemble methods
Data visualization & storytelling — turning analysis into impact
Portfolio building & career preparation

Below is a high-level visual of your 90-day journey:

Coursera – Data Science Learning Roadmap for Beginners - Comprehensive beginner to expert roadmap covering essential topics and courses.
Dawn Choo / LinkedIn – Most In-Demand Data Science Skills in 2025 - Analysis of 101 data science job postings showing Python at 86%, ML at 65%, and SQL demand trends.

Become a Data Scientist in 90 Days — Step-by-Step Roadmap

90-Day Data Science Learning Lifecycle

Python Fundamentals

Days 1–10

Set up your environment, learn Python syntax, data types, control flow, and functions. Install Jupyter Notebook and run your first scripts.

Data Manipulation with Pandas & NumPy

Days 11–20

Master the core data science libraries: NumPy for numerical operations and Pandas for data wrangling, cleaning, and transformation.

SQL & Statistics Foundations

Days 21–30

Learn SQL queries (SELECT, JOIN, GROUP BY, window functions) and core statistics: descriptive stats, probability distributions, and hypothesis testing.

Exploratory Data Analysis & Visualization

Days 31–45

Perform EDA on real datasets with Matplotlib, Seaborn, and Plotly. Learn to tell stories with data and build your first dashboards.

Machine Learning Fundamentals

Days 46–60

Implement supervised and unsupervised learning algorithms with scikit-learn: linear/logistic regression, decision trees, clustering, and model evaluation.

Advanced ML & Deployment

Days 61–75

Explore ensemble methods, feature engineering, model tuning with cross-validation, and deploy models using Flask, FastAPI, or Streamlit.

Portfolio, Networking & Job Prep

Days 76–90

Complete 3 capstone projects, publish to GitHub, build your personal brand on LinkedIn, craft your resume, and practice interview questions.

[Chart: Weekly Time Allocation Across 90 Days (recommended hours per week for each skill area)]

Common Pitfall: Tutorial Hell

Watching tutorials endlessly without practicing is one of the most common reasons learners stall. For every hour of video you watch, spend at least 2 hours coding on your own. Use platforms like Kaggle, LeetCode (SQL), and HackerRank to practice actively.

Phase 1: Foundations (Days 1–30)

The first 30 days are about building a rock-solid foundation. You cannot build a house on sand — and you cannot do data science without fluency in programming, data manipulation, and mathematical reasoning.

1.1 Python Programming (Days 1–10)

Python is the lingua franca of data science. In one recent analysis of 101 data science job postings, 86% required Python. Start with the fundamentals before touching any data science library.

Core topics to cover:

Topic	Key Concepts	Estimated Hours
Python Basics	Variables, data types, operators, strings	6
Control Flow	if/elif/else, for loops, while loops	4
Data Structures	Lists, tuples, dictionaries, sets	6
Functions	def, lambda, args/kwargs, scope	4
OOP Basics	Classes, objects, methods, inheritance	4
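As a rough illustration of how the topics in the table fit together, here is a minimal sketch that touches control flow, the built-in data structures, functions (including a keyword-only argument and a lambda), and a small class. The `Dataset` class, the `word_lengths` helper, and the sample rows are all made up for this example.

```python
from collections import Counter


def word_lengths(words, *, min_length=1):
    """Return a dict mapping each word to its length, skipping short words."""
    return {w: len(w) for w in words if len(w) >= min_length}


class Dataset:
    """A tiny illustrative class (hypothetical, not from any library)."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = list(rows)

    def column(self, key):
        """Collect one field from every row."""
        return [row[key] for row in self.rows]


rows = [
    {"city": "Lagos", "sales": 120},
    {"city": "Lima", "sales": 80},
    {"city": "Lagos", "sales": 50},
]
data = Dataset("demo", rows)

# Control flow + dictionary: accumulate sales per city with a for loop
totals = {}
for row in data.rows:
    totals[row["city"]] = totals.get(row["city"], 0) + row["sales"]

# Data structures: a set of unique cities and a Counter of occurrences
unique_cities = set(data.column("city"))
city_counts = Counter(data.column("city"))

# Functions: a lambda used as a sort key
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

print(totals)
print(ranked[0])
print(word_lengths(unique_cities, min_length=5))
```

Rewriting a small script like this from memory, then changing one piece at a time, is a reasonable way to check that each topic has actually stuck.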

Essential libraries to learn early:

pandas — the workhorse of data manipulation
numpy — fast numerical computation
matplotlib — basic plotting

Use Jupyter Notebook as your primary development environment.

1.2 Data Manipulation (Days 11–20)

Once you know Python basics, shift to data wrangling: this is what you'll spend 60–80% of your time doing in real data science roles.

Pandas mastery checklist:

Importing data: pd.read_csv(), pd.read_excel(), pd.read_sql()
Inspection: .head(), .info(), .describe(), .shape
Selection & filtering: .loc[], .iloc[], Boolean indexing
Cleaning: handling missing values (.dropna(), .fillna()), duplicates (.drop_duplicates())
Transformation: .groupby(), .agg(), .pivot_table(), .merge(), .join()
Apply functions: .apply(), .map(), vectorized operations
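The checklist above can be exercised end to end on a few rows of data. The sketch below assumes two small, invented tables (`orders` and `customers`) built in memory so that it runs without any file; in practice the first step would usually be `pd.read_csv()` on a real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical in-memory data standing in for pd.read_csv("orders.csv")
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4, 5],
    "customer_id": [10, 11, 11, 10, 12, 11],
    "category": ["books", "toys", "toys", "books", None, "toys"],
    "amount": [20.0, 35.0, 35.0, np.nan, 15.0, 50.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "region": ["north", "south", "north"],
})

# Inspection: size of the table and missing values per column
print(orders.shape)
print(orders.isna().sum())

# Cleaning: drop exact duplicate rows, then fill the gaps
clean = orders.drop_duplicates().copy()
clean["category"] = clean["category"].fillna("unknown")
clean["amount"] = clean["amount"].fillna(clean["amount"].median())

# Selection: Boolean indexing through .loc
large = clean.loc[clean["amount"] >= 30, ["order_id", "amount"]]

# Transformation: merge in customer attributes, then group and aggregate
merged = clean.merge(customers, on="customer_id", how="left")
summary = (
    merged.groupby("region")["amount"]
    .agg(total="sum", orders="count")
    .reset_index()
)
print(summary)
```

Filling the missing amount with the median is only one possible choice; whether to fill, drop, or flag missing values depends on the dataset and the question being asked.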

1.3 SQL (Days 21–25)

SQL remains one of the most critical skills: approximately 70–80% of data science roles list it as a requirement. You'll use SQL to extract, filter, and aggregate data from relational databases.

SQL learning path:

$$\text{SQL Proficiency} = \underbrace{\text{Basic Queries}}_{\text{SELECT, WHERE, ORDER BY}} + \underbrace{\text{Aggregation}}_{\text{GROUP BY, HAVING}} + \underbrace{\text{Joins}}_{\text{INNER, LEFT, FULL}} + \underbrace{\text{Advanced}}_{\text{Window Functions, CTEs}}$$
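One way to practice each stage of that path without installing a database server is Python's built-in `sqlite3` module. The sketch below uses two invented tables (`customers` and `orders`) in an in-memory database; the window-function query assumes the bundled SQLite is version 3.25 or newer, which is the case for most current Python installations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
INSERT INTO orders VALUES (1, 1, 40.0), (2, 1, 60.0), (3, 2, 25.0);
""")

# Basic query: SELECT, WHERE, ORDER BY
big_orders = conn.execute(
    "SELECT id, amount FROM orders WHERE amount > 30 ORDER BY amount DESC"
).fetchall()

# Join + aggregation: LEFT JOIN keeps customers who have no orders
totals = conn.execute("""
    SELECT c.name, COUNT(o.id) AS n_orders, COALESCE(SUM(o.amount), 0) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY total DESC
""").fetchall()

# Advanced: a CTE with HAVING, ranked by a window function
ranked = conn.execute("""
    WITH customer_totals AS (
        SELECT customer_id, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id
        HAVING SUM(amount) > 20
    )
    SELECT customer_id, total, RANK() OVER (ORDER BY total DESC) AS rnk
    FROM customer_totals
    ORDER BY rnk
""").fetchall()

print(big_orders)
print(totals)
print(ranked)
conn.close()
```

The same queries should carry over to PostgreSQL or MySQL with little or no change, although each engine has its own dialect details.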

1.4 Statistics & Probability (Days 26–30)

Statistics is the intellectual foundation of data science. Without it, you're just guessing. Key areas:

Descriptive statistics: mean ($\mu$), median, mode, variance ($\sigma^2$), standard deviation
Probability distributions: Normal, Binomial, Poisson
Central Limit Theorem: understanding why the distribution of sample means approaches a normal distribution centered on $\mu$ as the sample size grows
Hypothesis testing: null/alternative hypotheses, p-values, confidence intervals
Correlation vs. causation: Pearson's $r$, Spearman's $\rho$
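Several of these ideas can be checked numerically with nothing but the standard library. The sketch below computes descriptive statistics, simulates sample means from a skewed distribution to show them clustering around $\mu$ with spread close to $\sigma/\sqrt{n}$, computes Pearson's $r$ by hand, and runs a simple permutation test. The data values and the helper names `pearson_r` and `permutation_p_value` are invented for illustration, and a permutation test is only one of several ways to obtain a p-value.

```python
import math
import random
import statistics

random.seed(0)

# Descriptive statistics on a small sample
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)
pop_std = statistics.pstdev(sample)  # population standard deviation

# Sample means of n draws from an exponential distribution (mu = 1, sigma = 1)
n, trials = 50, 5000
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(trials)
]
clt_center = statistics.fmean(sample_means)   # should be close to mu = 1
clt_spread = statistics.stdev(sample_means)   # should be close to sigma / sqrt(n)
expected_spread = 1.0 / math.sqrt(n)


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(a, b, n_perm=2000):
    """Two-sided permutation test for a difference in group means."""
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        random.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:len(a)]) - statistics.fmean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return hits / n_perm


r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
group_a = [12.1, 11.8, 12.5, 12.0, 12.3, 11.9]
group_b = [13.0, 13.4, 12.9, 13.2, 13.5, 13.1]
p_value = permutation_p_value(group_a, group_b)

print(mean, median, mode, pop_std)
print(round(clt_center, 3), round(clt_spread, 3), round(expected_spread, 3))
print(round(r, 3), p_value)
```

A small p-value here says only that the observed gap between the groups would be unusual if the group labels were interchangeable; it says nothing about why the groups differ.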

Databricks – Uncovering Data Science: Skills, Careers, and Education - Research on data science career pathways, required skills, and hiring signals.

Phase 1 Daily Study Protocol (Days 1–30)

Step 1: Watch tutorials or read documentation on the day's core topic. Take handwritten notes for retention. Focus on understanding, not memorizing.
Step 2: Open Jupyter Notebook and implement every concept yourself. Modify examples, break things, fix errors. Use LeetCode (Python) or HackerRank for structured drills.
Step 3: Apply what you learned to a small, self-contained task. For example, after learning Pandas groupby, analyze a Kaggle dataset and find top categories.
Step 4: Revisit your notes, summarize the day's learning in 3–5 bullet points, and preview tomorrow's topics. Spaced repetition dramatically improves retention.

Phase 2: Core Skills (Days 31–60)

With foundations in place, you now shift to the heart of data science: analysis, visualization, and machine learning.

2.1 Exploratory Data Analysis (Days 31–40)

Exploratory Data Analysis (EDA) is the disciplined practice of understanding your data before modeling. A thorough EDA typically reveals:

Data distributions and outliers
Missing data patterns
Relationships between variables
Potential feature engineering opportunities

EDA workflow tools: Use ydata-profiling (formerly pandas-profiling) for automated reports, then manually explore with seaborn (pairplots, heatmaps, boxplots, violin plots).
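Before reaching for an automated report, it can help to run the basic checks by hand. The sketch below uses a synthetic table (the column names `age`, `income`, and `spend` are invented) with a few outliers and missing values injected on purpose, then looks at distributions, missing-data share, outliers under the common 1.5 × IQR rule, and a correlation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 200

# Synthetic stand-in for a real dataset (all column names are hypothetical)
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n).astype(float),
    "income": rng.normal(50_000, 10_000, size=n),
})
df["spend"] = 0.1 * df["income"] + rng.normal(0, 500, size=n)
df.loc[:4, "income"] = 250_000   # inject 5 obvious outliers
df.loc[10:19, "age"] = np.nan    # inject 10 missing values

# 1. Distributions: summary statistics per column
print(df.describe().T[["mean", "std", "min", "max"]])

# 2. Missing data patterns: share of missing values per column
missing_share = df.isna().mean()

# 3. Outliers via the 1.5 * IQR rule
q1, q3 = df["income"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["income"] < q1 - 1.5 * iqr) | (df["income"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

# 4. Relationships between variables, with the flagged rows set aside
corr = df[~is_outlier][["income", "spend"]].corr()

print(missing_share)
print(len(outliers))
print(corr)
```

The IQR rule is a screening heuristic, not a verdict: flagged rows deserve a look, and whether they are errors or genuine extremes is a judgment about the data source.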

2.2 Data Visualization & Storytelling (Days 41–45)

Data without visualization is just noise. The most effective data scientists are not just analysts — they are storytellers.

Visualization stack:

Library	Best For	Difficulty
Matplotlib	Custom, publication-quality plots	Medium
Seaborn	Statistical plots with minimal code	Low
Plotly	Interactive, web-based visualizations	Medium
Tableau / Power BI	Business dashboards, stakeholder reporting	Low

Key principles:

Choose the right chart type — bar charts for comparison, line charts for trends, scatter plots for relationships
Minimize chart junk — remove unnecessary grid lines, borders, 3D effects
Use color intentionally — highlight the key insight, not decoration
Always label axes and provide context
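The principles above translate fairly directly into Matplotlib calls. The sketch below uses invented monthly revenue figures and is one possible styling, not a house standard: a bar chart for comparison, a single accent color for the bar that carries the message, top and right borders removed, and labeled axes with a title that states the takeaway.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical monthly revenue figures, in thousands of dollars
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [42, 45, 44, 61, 47, 49]
positions = list(range(len(months)))

# Use color intentionally: grey for context, one accent for the key insight
best = revenue.index(max(revenue))
colors = ["#b0b0b0"] * len(months)
colors[best] = "#d62728"

fig, ax = plt.subplots(figsize=(6, 3.5))
ax.bar(positions, revenue, color=colors)  # bar chart: comparison across categories
ax.set_xticks(positions)
ax.set_xticklabels(months)

# Minimize chart junk: no top/right borders, no grid
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.grid(False)

# Always label axes and provide context
ax.set_xlabel("Month")
ax.set_ylabel("Revenue (thousand USD)")
ax.set_title("April revenue stood out in the first half of the year")
ax.annotate(f"{revenue[best]}k", xy=(best, revenue[best]),
            xytext=(0, 4), textcoords="offset points", ha="center")

fig.tight_layout()
```

Calling `fig.savefig("revenue.png", dpi=150)` afterwards would write the chart to disk; in a Jupyter notebook the figure is displayed inline instead.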

2.3 Machine Learning Fundamentals (Days 46–60)

Machine learning is where many learners get most excited — but remember, without solid EDA and clean data, models are useless ("garbage in, garbage out").

Supervised Learning algorithms to master:

$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n$$

The equation above represents linear regression, the simplest and most interpretable model. Start here, then progress to:

Algorithm	Type	Use Case	scikit-learn Class
Linear Regression	Supervised	Predicting continuous values	`LinearRegression`
Logistic Regression	Supervised	Binary classification	`LogisticRegression`
Decision Tree	Supervised	Interpretable classification/regression	`DecisionTreeClassifier`
Random Forest	Supervised	Ensemble, robust	`RandomForestClassifier`
K-Means Clustering	Unsupervised	Customer segmentation	`KMeans`
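To connect the linear regression equation to the scikit-learn class in the table, the sketch below generates synthetic data from known coefficients (the values 3.0, 2.0, and -1.5 are arbitrary choices for this example) and checks that `LinearRegression` recovers them, and that a prediction is just the equation evaluated at a new point.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic data from assumed coefficients:
# y = 3.0 + 2.0 * x1 - 1.5 * x2 + small noise
X = rng.normal(size=(500, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

model = LinearRegression().fit(X, y)

print("beta_0 (intercept):", round(float(model.intercept_), 2))
print("beta_1, beta_2:", np.round(model.coef_, 2))

# A prediction is the fitted equation evaluated at a new point
x_new = np.array([[1.0, 2.0]])
manual = model.intercept_ + x_new @ model.coef_
predicted = model.predict(x_new)
print(manual, predicted)
```

On real data the true coefficients are unknown, so the fitted values are estimates whose reliability depends on sample size, noise, and how well the linear form matches reality.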

Model evaluation metrics you must know:

Regression: $R^2$, MAE, RMSE
Classification: Accuracy, Precision, Recall, F1-Score, ROC-AUC
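Computing these metrics once by hand makes the library output easier to read later. The sketch below uses made-up labels and predictions and plain Python; `sklearn.metrics` provides the same quantities (and ROC-AUC, which needs predicted scores rather than hard labels and is left out here).

```python
import math

# Classification: hypothetical labels and predictions (1 = positive class)
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)   # of the predicted positives, how many were right
recall = tp / (tp + fn)      # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

# Regression: hypothetical actual and predicted values
actual = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]
errors = [a - p for a, p in zip(actual, predicted)]

mae = sum(abs(e) for e in errors) / len(errors)
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))
mean_actual = sum(actual) / len(actual)
ss_res = sum(e ** 2 for e in errors)
ss_tot = sum((a - mean_actual) ** 2 for a in actual)
r2 = 1 - ss_res / ss_tot

print(accuracy, round(precision, 3), recall, round(f1, 3))
print(mae, round(rmse, 3), round(r2, 3))
```

Note how precision and recall diverge here even though accuracy looks acceptable, which is why a single metric is rarely enough for classification.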

The 80/20 Rule of Machine Learning

In practice, roughly 80% of your time is spent on data cleaning and feature engineering, and only about 20% on model building. Don't skip EDA to reach modeling faster: the quality of your data determines the quality of your model. A simple linear regression on clean, well-engineered features will often outperform a complex neural network trained on noisy, unprocessed data.

Phase 3: Specialization & Portfolio (Days 61–90)

The final phase transitions you from learner to practitioner. Your goal is to build a portfolio of 3 substantial projects that demonstrate real-world data science skills, then prepare strategically for the job market.

3.1 Advanced ML & Feature Engineering (Days 61–72)

Go beyond the basics:

Ensemble methods: Bagging (Random Forests), Boosting (XGBoost, LightGBM, AdaBoost)
Cross-validation: k-fold CV, stratified k-fold for reliable model assessment
Hyperparameter tuning: GridSearchCV, RandomizedSearchCV, Bayesian optimization
Feature engineering: creating interaction terms, binning, encoding cyclical features, target encoding
Handling imbalanced data: SMOTE, class weights, under-sampling
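Several of these techniques are usually combined in one step. The sketch below, on synthetic imbalanced data from `make_classification`, tunes a random forest with `GridSearchCV` over stratified 5-fold splits, uses `class_weight="balanced"` as one way to handle the imbalance, and scores with F1. The grid values are arbitrary and kept small so that the example runs quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic imbalanced data: roughly 90% negatives, 10% positives
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=42,
)

# Stratified folds keep the class ratio roughly constant in every split
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# A deliberately tiny grid: 2 x 2 = 4 candidate settings
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
}

search = GridSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=42),
    param_grid=param_grid,
    scoring="f1",  # accuracy can look good on imbalanced data while missing positives
    cv=cv,
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

`RandomizedSearchCV` takes nearly the same arguments and tends to be the more practical choice once the grid grows beyond a handful of combinations.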

Modern additions (2024–2025): Many data science roles now require familiarity with Generative AI and LLM tools. Consider adding:

Prompt engineering basics
Using OpenAI/HuggingFace APIs for text analysis tasks
RAG (Retrieval-Augmented Generation) fundamentals
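The retrieval half of RAG can be sketched without any API key. The example below is a simplification under stated assumptions: it uses TF-IDF vectors from scikit-learn in place of the learned embeddings that production systems typically use, the three documents and the question are invented, and the final prompt is only assembled as a string rather than sent to a model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A tiny hypothetical knowledge base
documents = [
    "Use fillna or dropna to handle missing values in a pandas DataFrame.",
    "GROUP BY aggregates rows that share a value in SQL.",
    "Random forests average the predictions of many decision trees.",
]
question = "How do I handle missing values in pandas?"

# Retrieval: embed documents and question in the same vector space
vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform([question])

# Rank documents by cosine similarity to the question
scores = cosine_similarity(query_vector, doc_vectors)[0]
best = int(scores.argmax())

# Augmentation: place the retrieved text in the prompt for the generator
prompt = (
    "Answer the question using only the context below.\n"
    f"Context: {documents[best]}\n"
    f"Question: {question}"
)
print(prompt)
```

In a full pipeline the `prompt` string would be passed to an LLM, and the retrieved context is what lets the model answer from your documents rather than from its training data alone.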

3.2 Model Deployment (Days 73–80)

A model on your laptop is not a product. Deploying your work demonstrates engineering maturity:

Streamlit — fastest way to build ML web apps (under 1 hour)
Flask / FastAPI — lightweight REST API frameworks
Docker — containerize your application for reproducibility
Cloud platforms: AWS (SageMaker), GCP (Vertex AI), or Heroku for hosting
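Whichever framework is chosen, the core of a deployed model is usually the same: save the trained artifact, load it once at startup, and turn each request into a prediction. The sketch below shows that core with `joblib` and plain JSON strings. The `predict_from_json` function and the request format are hypothetical; in a Flask or FastAPI app a route handler would wrap logic like this.

```python
import json
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train-time script: fit the model and save the artifact to disk
iris = load_iris()
model = LogisticRegression(max_iter=1000).fit(iris.data, iris.target)
model_path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(model, model_path)

# Serve-time code: load once at startup, then reuse for every request
loaded = joblib.load(model_path)


def predict_from_json(payload: str) -> str:
    """Hypothetical handler body: parse a JSON request, return a JSON response."""
    features = json.loads(payload)["features"]
    if len(features) != 4:
        return json.dumps({"error": "expected 4 features"})
    label = int(loaded.predict([features])[0])
    return json.dumps({"species": str(iris.target_names[label])})


print(predict_from_json('{"features": [5.1, 3.5, 1.4, 0.2]}'))
print(predict_from_json('{"features": [1.0]}'))
```

Keeping the prediction logic in a plain function like this makes it easy to test on its own before adding the web layer, a Dockerfile, or cloud hosting.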

3.3 Portfolio Projects (Days 81–88)

Your portfolio is your resume in data science. Build 3 projects covering different skill areas:

End-to-End EDA Project — analyze a public dataset (e.g., NYC taxi, WHO health data), clean it, visualize insights, write a blog post
Predictive Modeling Project — pick a Kaggle competition (e.g., House Prices, Titanic), build and tune a model, document your process
Applied ML Dashboard — wrap a trained model in a Streamlit app and deploy it

Every project must include:

A README.md with problem statement, methodology, and results
Clean, commented code in Jupyter notebooks
Visualizations with clear takeaways
A deployed link or demo

3.4 Career Preparation (Days 89–90)

Write a targeted resume highlighting projects and technical skills
Optimize your LinkedIn profile with data science keywords
Practice common interview questions: SQL queries, probability brainteasers, ML theory, and take-home case studies
Network: attend local meetups, join data science Discord communities, contribute to open-source projects

Towards Data Science – The 5 Data Science Skills You Can't Ignore in 2024 - Evolving skill requirements including deep learning, GenAI, and cloud/ML engineering overlap.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Load and inspect data ('dataset.csv' and the 'target' column are placeholders)
df = pd.read_csv('dataset.csv')
df.info()  # prints its own summary and returns None, so no print() wrapper
print(df.describe())

# Separate features from the label, then hold out 20% for testing
# (stratify=y keeps the class balance the same in both splits)
X = df.drop('target', axis=1)
y = df['target']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train model (assumes every feature column is already numeric)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))

End-to-End Data Science Project Workflow

Step 1: Start with a clear, measurable question. For example: 'Can we predict customer churn with >80% accuracy using transaction history?' A well-defined problem drives every downstream decision.
Step 2: Gather data from databases (SQL), APIs, CSVs, or web scraping. Assess data quality: check for missing values, duplicates, inconsistent formatting. Document every assumption.
Step 3: Compute summary statistics. Plot distributions, correlations, and outliers. Ask: What patterns exist? What anomalies need investigation? EDA is where real insight lives.
Step 4: Transform raw columns into model-ready features: encode categoricals, scale numeric variables, create interaction terms, handle datetime fields, and address skewness with log transforms.
Step 5: Start with a simple baseline (e.g., Logistic Regression for classification). Then iterate: try Random Forests or gradient boosting (e.g., XGBoost or LightGBM). Use k-fold cross-validation and compare metrics systematically.
Step 6: Build a Streamlit or Flask app to make predictions available. Write a clear report explaining methodology, limitations, and business impact. Communication separates good data scientists from great ones.
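The feature engineering and baseline modeling steps of this workflow are often written as a single scikit-learn pipeline, so that every transformation is fitted only on the training folds. The sketch below assumes a synthetic churn-style table whose column names (`plan`, `monthly_spend`, `tenure_months`, `churned`) are invented for the example; it one-hot encodes the categorical column, log-transforms and scales the skewed one, and cross-validates a logistic regression baseline.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import FunctionTransformer, OneHotEncoder, StandardScaler

rng = np.random.default_rng(7)
n = 400

# Synthetic churn-style data; every column name here is hypothetical
df = pd.DataFrame({
    "plan": rng.choice(["basic", "plus", "pro"], size=n),
    "monthly_spend": rng.lognormal(mean=3.0, sigma=0.8, size=n),  # right-skewed
    "tenure_months": rng.integers(1, 60, size=n),
})
# Short-tenure customers are made more likely to churn so there is signal to find
p_churn = 1 / (1 + np.exp(0.1 * (df["tenure_months"] - 24)))
df["churned"] = (rng.random(n) < p_churn).astype(int)

X, y = df.drop(columns="churned"), df["churned"]

# Feature engineering: encode categoricals, tame skew with log1p, scale numerics
preprocess = ColumnTransformer([
    ("plan", OneHotEncoder(handle_unknown="ignore"), ["plan"]),
    ("spend", make_pipeline(FunctionTransformer(np.log1p), StandardScaler()),
     ["monthly_spend"]),
    ("tenure", StandardScaler(), ["tenure_months"]),
])

# Baseline model: preprocessing and classifier travel together
baseline = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression(max_iter=1000)),
])

# 5-fold cross-validation; the pipeline is refitted inside each fold
scores = cross_val_score(baseline, X, y, cv=5, scoring="accuracy")
print(round(float(scores.mean()), 3))
```

Swapping the `"model"` step for a random forest or a gradient boosting classifier and re-running the same cross-validation gives a like-for-like comparison against this baseline.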

[Chart: Data Science Job Skill Requirements (2025), based on analysis of 101 data science job postings]

Avoid These Common Mistakes

Skipping SQL: roughly 70–80% of data science roles list SQL as a requirement. Don't assume Python replaces it.
Ignoring statistics: ML algorithms are statistics dressed in code. Without statistical literacy, you cannot interpret results correctly.
Too many tools, too little depth: master Python + SQL + one BI tool deeply before adding more.
No projects: certificates without projects rarely lead to interviews. Hiring managers want to see what you've built.
Applying too late: start networking and submitting applications in Phase 3, not after.

Recommended Resources & Weekly Schedule

Below is a suggested weekly structure to keep you on track across all 90 days:

Day	Morning (60–90 min)	Afternoon/Evening (60–90 min)
Mon	New concept learning	Coding practice
Tue	Coding practice	Project work
Wed	New concept learning	Coding practice
Thu	Tutorial / documentation	Project work
Fri	Weekly review & quizzes	Blog / note-taking
Sat	Deep-dive project session (3–4 hrs)	—
Sun	Rest or light review	—

Top free resources:

Python: Python.org tutorial, Codecademy, Automate the Boring Stuff
SQL: W3Schools SQL, StrataScratch, SQLZoo
Statistics: Khan Academy Statistics, StatQuest (YouTube)
ML: Andrew Ng's Coursera course, scikit-learn documentation
Projects: Kaggle datasets, UCI ML Repository, data.gov

The key metric for success isn't hours studied — it's projects completed and concepts understood deeply enough to explain to others.
