Data Structures and Algorithms (DSA) — Comprehensive Course

Data Structures and Algorithms (DSA) — Comprehensive Course

Verified Sources
Jun 25, 2026

Data Structures and Algorithms (DSA) form the backbone of computer science and software engineering. A Data Structure defines how data is arranged in memory, while an Algorithm defines the steps to process that data. Together, they determine how efficiently software operates — from the speed of a Google search to the responsiveness of a video game.

Understanding DSA enables you to write code that is not just correct, but efficient — handling large inputs gracefully, minimizing memory usage, and scaling gracefully under load. According to industry consensus, DSA fluency is the single most important skill evaluated in technical interviews at major technology companies 2.

"Algorithms + Data Structures = Programs." — Niklaus Wirth, Turing Award recipient

Footnotes

  1. DSA Tutorial - GeeksforGeeks - Comprehensive DSA learning guide with topic hierarchy.

  2. DSA Introduction - W3Schools - Beginner-friendly DSA overview with terminology and examples.

Data Structures Easy to Advanced — Full Course by a Google Engineer

Why DSA Matters

Every software system — from operating systems to web browsers to databases — relies on DSA. A poorly chosen data structure can make a program 1000× slower; a well-chosen algorithm can make it 1000× faster .

DomainDSA Application
DatabasesB-Trees for indexing, Hashing for lookups
Operating SystemsQueues for process scheduling, Stacks for interrupts
Web SearchGraphs (PageRank), Tries for autocomplete
NavigationDijkstra's / A* for shortest paths
Social NetworksGraphs for connections, Heaps for feed ranking
CompilersStacks for parsing, Trees for ASTs

Footnotes

  1. DSA Tutorial - GeeksforGeeks - Comprehensive DSA learning guide with topic hierarchy.

The #1 Interview Skill

According to Tech Interview Handbook and multiple FAANG engineers, Arrays, HashMaps, Trees, and Graphs are the four most tested data structures. If you're short on time, master these first along with BFS, DFS, Binary Search, and Dynamic Programming .

Footnotes

  1. 12 Must-Know Data Structures for Coding Interviews - LinkedIn - Ranked list of essential data structures for interviews.

Algorithm Analysis: Big O Notation

Big O Notation describes how an algorithm's performance scales with input size nn. It provides an asymptotic upper bound — meaning the algorithm will never perform worse than the stated complexity for sufficiently large inputs .

Formal Definition

Given two functions f(n)f(n) and g(n)g(n), we say f(n)=O(g(n))f(n) = O(g(n)) if there exist constants c>0c > 0 and n00n_0 \geq 0 such that:

f(n)cg(n)  nn0f(n) \leq c \cdot g(n) \quad \forall \; n \geq n_0

Big O Simplification Rules

  1. Drop constants: 5n2+3nO(n2)5n^2 + 3n \to O(n^2)
  2. Drop lower-order terms: n3+n2+nO(n3)n^3 + n^2 + n \to O(n^3)
  3. Combine same-input variables: 5m+6nO(m+n)5m + 6n \to O(m + n)
  4. Always assume worst case unless otherwise noted

Footnotes

  1. Mastering Big O Notation - Medium - Detailed explanation of time/space complexity and Big O rules. 2

Common Time Complexities — Relative Growth Rate

Comparing operations for increasing input sizes

Big O Complexity Classes — Detailed Reference

How to Analyze Time Complexity of Any Code

  1. 1
    Step 1

    Determine what nn represents — usually the length of an array, number of nodes, or size of input data.

  2. 2
    Step 2

    Count comparisons, assignments, arithmetic operations, and loop iterations as functions of nn.

  3. 3
    Step 3

    A single loop over nn elements → O(n)O(n). A nested loop → O(n2)O(n^2). A loop halving the range each step → O(logn)O(\log n).

  4. 4
    Step 4

    Use the Master Theorem for recurrences of the form T(n)=aT(n/b)+f(n)T(n) = aT(n/b) + f(n). For Merge Sort: T(n)=2T(n/2)+O(n)T(n) = 2T(n/2) + O(n) gives O(nlogn)O(n \log n).

  5. 5
    Step 5

    Drop constants and lower-order terms. 3n2+7n+100O(n2)3n^2 + 7n + 100 \to O(n^2). Always express in simplest Big O form.

  6. 6
    Step 6

    Count all extra memory allocated. Auxiliary arrays of size nnO(n)O(n). Recursive call stack of depth logn\log nO(logn)O(\log n). In-place operations → O(1)O(1).

Common Pitfall: Confusing Best Case with Average Case

Quick Sort has O(nlogn)O(n \log n) average time but O(n2)O(n^2) worst case. Bubble Sort can achieve O(n)O(n) best case on a nearly-sorted array, but its average and worst case are O(n2)O(n^2). Always clarify which case you're analyzing, and in interviews, be prepared to discuss all three .

Footnotes

  1. Time Complexity of Merge Sort - Codecademy - Sorting algorithm comparison with time and space complexity table.

Core Data Structures

1. Arrays

An Array is the most fundamental data structure. Elements occupy consecutive memory locations, enabling O(1)O(1) random access by index. However, insertion and deletion in the middle require shifting elements, costing O(n)O(n) time .

OperationTime Complexity
Access by indexO(1)O(1)
Search (unsorted)O(n)O(n)
Search (sorted)O(logn)O(\log n)
Insert at end (dynamic)O(1)O(1) amortized
Insert at middleO(n)O(n)
Delete from middleO(n)O(n)

Dynamic arrays (like Python list, Java ArrayList, C++ vector) automatically resize when full — typically doubling capacity, which gives O(1)O(1) amortized append performance.

2. Linked Lists

A Linked List stores elements in non-contiguous memory connected by pointers. This eliminates the need for shifting during insertion/deletion, but sacrifices random access .

TypeAccessSearchInsert (head)Insert (tail)Delete
SinglyO(n)O(n)O(n)O(n)O(1)O(1)O(n)O(n) or O(1)O(1)*O(n)O(n)
DoublyO(n)O(n)O(n)O(n)O(1)O(1)O(1)O(1)O(1)O(1)

*With tail pointer †When node reference is given

3. Stacks

A Stack operates on the Last-In, First-Out principle. Think of a stack of plates — you can only add or remove from the top.

  • Push (insert): O(1)O(1)
  • Pop (remove): O(1)O(1)
  • Peek/Top: O(1)O(1)
  • Search: O(n)O(n)

Applications: Function call management (call stack), undo operations, expression evaluation, DFS implementation, bracket matching.

4. Queues

A Queue operates on the First-In, First-Out principle — like a line at a store.

  • Enqueue (insert at rear): O(1)O(1)
  • Dequeue (remove from front): O(1)O(1)
  • Peek/Front: O(1)O(1)
  • Search: O(n)O(n)

Applications: BFS traversal, task scheduling, print job management, message queues.

Footnotes

  1. Time Complexities of Different Data Structures - GeeksforGeeks - Comprehensive complexity table for all major data structures across operations. 2

Evolution of Data Structures

Arrays & Early Storage

1940s

Von Neumann architecture introduces contiguous memory access. Arrays become the first formalized data structure."

Linked Lists & Stacks

1950s

Allen Newell and Herbert Simon introduce linked lists for AI research. Stacks formalized for compiler implementations."

Trees & Graphs

1960s

Binary search trees and graph algorithms formalized. Dijkstra's shortest path (1959), BFS/DFS established."

Balanced Trees & Hashing

1970s

AVL trees refined. Hash tables gain widespread use. Donald Knuth publishes 'The Art of Computer Programming.'"

Standard Template Library

1990s

C++ STL formalizes containers (vector, list, map, set). Java Collections Framework follows."

Specialized Structures

2000s+

Tries for autocomplete, Bloom filters for big data, skip lists, and concurrent data structures for multi-core systems."

5. Hash Tables

A Hash Table provides near-constant-time operations by using a hash function to compute an index into an array of buckets. It is arguably the most commonly used data structure in algorithm problems .

OperationAverageWorst Case
SearchO(1)O(1)O(n)O(n)
InsertO(1)O(1)O(n)O(n)
DeleteO(1)O(1)O(n)O(n)

Worst case occurs when all keys collide into the same bucket. This is mitigated by:

  • Chaining: Each bucket holds a linked list of colliding entries
  • Open Addressing: Probe for the next available slot (linear probing, quadratic probing, double hashing)

Applications: Frequency counting, memoization, two-sum problems, caching (LRU cache with doubly-linked list + hashmap).

6. Trees

A Tree models hierarchical relationships. The most important variants:

Binary Search Tree (BST): Enables O(logn)O(\log n) search/insert/delete when balanced. Warning: An unbalanced BST degrades to O(n)O(n) — this is why self-balancing trees like AVL Trees and Red-Black Trees exist .

OperationBST (avg)BST (worst)AVL Tree
SearchO(logn)O(\log n)O(n)O(n)O(logn)O(\log n)
InsertO(logn)O(\log n)O(n)O(n)O(logn)O(\log n)
DeleteO(logn)O(\log n)O(n)O(n)O(logn)O(\log n)

7. Heaps (Priority Queues)

A Heap provides efficient access to the minimum or maximum element. Used extensively in priority-based systems.

OperationTime Complexity
Get Min/MaxO(1)O(1)
InsertO(logn)O(\log n)
Delete Min/MaxO(logn)O(\log n)
Heapify (build from array)O(n)O(n)

Applications: Top-K elements, Dijkstra's algorithm, Huffman encoding, task scheduling by priority, median finding.

8. Graphs

A Graph generalizes trees by allowing cycles and arbitrary connections.

Representations:

  • Adjacency Matrix: O(V2)O(V^2) space, O(1)O(1) edge lookup — good for dense graphs
  • Adjacency List: O(V+E)O(V + E) space — good for sparse graphs (most practical)
OperationAdj. MatrixAdj. List
Add EdgeO(1)O(1)O(1)O(1)
Remove EdgeO(1)O(1)O(E)O(E)
Check EdgeO(1)O(1)O(degree)O(degree)
SpaceO(V2)O(V^2)O(V+E)O(V + E)

Footnotes

  1. 12 Must-Know Data Structures for Coding Interviews - LinkedIn - Ranked list of essential data structures for interviews.

  2. Time Complexities of Different Data Structures - GeeksforGeeks - Comprehensive complexity table for all major data structures across operations.

Data Structures Comparison — Operation Efficiency

Relative efficiency across key operations (5=best, 1=worst)

Core Algorithms

Searching Algorithms

AlgorithmTime (Best)Time (Avg)Time (Worst)SpaceCondition
Linear SearchO(1)O(1)O(n)O(n)O(n)O(n)O(1)O(1)Unsorted
Binary SearchO(1)O(1)O(logn)O(\log n)O(logn)O(\log n)O(1)O(1)Sorted

Binary Search is one of the most important patterns. It works on any monotonic search space — not just sorted arrays, but also search spaces where you can determine if the answer lies left or right .

Sorting Algorithms

AlgorithmBestAverageWorstSpaceStable
Merge SortO(nlogn)O(n \log n)O(nlogn)O(n \log n)O(nlogn)O(n \log n)O(n)O(n)
Quick SortO(nlogn)O(n \log n)O(nlogn)O(n \log n)O(n2)O(n^2)O(logn)O(\log n)
Heap SortO(nlogn)O(n \log n)O(nlogn)O(n \log n)O(nlogn)O(n \log n)O(1)O(1)
Bubble SortO(n)O(n)O(n2)O(n^2)O(n2)O(n^2)O(1)O(1)
Insertion SortO(n)O(n)O(n2)O(n^2)O(n2)O(n^2)O(1)O(1)

Key Insight: Merge Sort guarantees O(nlogn)O(n \log n) in all cases but requires O(n)O(n) extra space. Quick Sort is often faster in practice due to cache locality and in-place partitioning, but degrades to O(n2)O(n^2) on already-sorted input if pivot selection is naive .

Graph Traversal Algorithms

AlgorithmData StructureTimeSpaceUse Case
BFSQueueO(V+E)O(V + E)O(V)O(V)Shortest path (unweighted), level-order
DFSStack / RecursionO(V+E)O(V + E)O(V)O(V)Topological sort, cycle detection, connected components

Dynamic Programming

Dynamic Programming solves complex problems by breaking them into overlapping subproblems and caching results. Two approaches:

  1. Top-Down (Memoization): Start with the original problem, recurse, and cache results
  2. Bottom-Up (Tabulation): Solve smallest subproblems first, build up to the original

Classic DP Problems: Fibonacci, Knapsack, Longest Common Subsequence, Edit Distance, Coin Change, Longest Increasing Subsequence.

Footnotes

  1. DSA Introduction - W3Schools - Beginner-friendly DSA overview with terminology and examples.

  2. Time Complexity of Merge Sort - Codecademy - Sorting algorithm comparison with time and space complexity table.

1# Binary Search 2def binary_search(arr, target): 3 left, right = 0, len(arr) - 1 4 while left <= right: 5 mid = left + (right - left) // 2 6 if arr[mid] == target: 7 return mid 8 elif arr[mid] < target: 9 left = mid + 1 10 else: 11 right = mid - 1 12 return -1 13 14# Merge Sort 15def merge_sort(arr): 16 if len(arr) <= 1: 17 return arr 18 mid = len(arr) // 2 19 left = merge_sort(arr[:mid]) 20 right = merge_sort(arr[mid:]) 21 return merge(left, right) 22 23def merge(left, right): 24 result = [] 25 i = j = 0 26 while i < len(left) and j < len(right): 27 if left[i] <= right[j]: 28 result.append(left[i]) 29 i += 1 30 else: 31 result.append(right[j]) 32 j += 1 33 result.extend(left[i:]) 34 result.extend(right[j:]) 35 return result 36 37# BFS on a Graph (Adjacency List) 38from collections import deque 39 40def bfs(graph, start): 41 visited = set() 42 queue = deque([start]) 43 visited.add(start) 44 while queue: 45 node = queue.popleft() 46 for neighbor in graph[node]: 47 if neighbor not in visited: 48 visited.add(neighbor) 49 queue.append(neighbor) 50 return list(visited)

DSA Core Concepts

1 / 9
11%
Question · Term

What does Big O notation describe?

Click to reveal
Answer · Definition

The upper bound of an algorithm's time or space complexity — how performance scales as input size nn grows toward infinity.

Choosing the Right Data Structure

The art of DSA lies in choosing the right tool for each problem. Here is a decision framework:

Problem PatternRecommended Data StructureKey Algorithm
"Find if element exists"Hash Set / Hash MapHashing
"Kth largest/smallest"Heap (Priority Queue)Heapify
"Sorted data, fast lookup"Balanced BST (TreeMap)Binary Search
"Undo/Redo operations"StackLIFO operations
"Task processing in order"QueueFIFO operations
"Shortest path"Graph + Priority QueueDijkstra's
"Autocomplete/prefix matching"TrieTrie traversal
"Range queries/min-max"Segment Tree / BITO(logn)O(\log n) queries
"Connected components"Graph + DFS/BFSUnion-Find
"Two-Sum / frequency count"Hash MapOne-pass hashing

Input Size ↔ Algorithm Feasibility

A critical intuition for interviews :

Max Input Size nnRequired ComplexityAlgorithm Ideas
n10n \leq 10O(n!)O(n!)Permutations, brute force
n20n \leq 20O(2n)O(2^n)Subset generation, DP with bitmask
n500n \leq 500O(n3)O(n^3)Floyd-Warshall, 3D DP
n104n \leq 10^4O(n2)O(n^2)Nested loops, 2D DP
n105n \leq 10^5O(nlogn)O(n \log n)Sort + Binary Search, Merge Sort
n106n \leq 10^6O(n)O(n)Linear scan, two-pointer, sliding window
n1018n \leq 10^{18}O(logn)O(\\log n)Binary search on answer, matrix exponentiation

Footnotes

  1. DSA Introduction - W3Schools - Beginner-friendly DSA overview with terminology and examples.

Problem-Solving Framework for DSA Interviews

  1. 1
    Step 1

    Read carefully. Clarify constraints, input/output format, edge cases. Ask: Can the array be empty? Are values positive only? What's the max input size?

  2. 2
    Step 2

    Is it a search problem? Sorting? Counting? Graph traversal? Pattern recognition narrows data structure and algorithm choices dramatically.

  3. 3
    Step 3

    Always start with the simplest correct solution, even if it's O(n2)O(n^2) or O(2n)O(2^n). This ensures understanding before optimizing.

  4. 4
    Step 4

    Ask: Can a Hash Map replace a nested loop? Can a Heap avoid full sorting? Can Binary Search replace linear search? This often reduces complexity by a factor of nn.

  5. 5
    Step 5

    State both time and space complexity. Verify against the input size constraint. If n=105n = 10^5, an O(n2)O(n^2) solution will time out.

  6. 6
    Step 6

    Write modular code with descriptive variable names. Handle edge cases. Prefer pure functions. Test with small inputs, boundary cases, and a large random case.

Advanced DSA Topics for Further Study

Interview Red Flags

Writing O(n2)O(n^2) code when n=105n = 10^5 (will TLE). Not clarifying if input is sorted before choosing an algorithm. Forgetting to handle empty input or single-element arrays. Modifying shared references unintentionally. Not stating time/space complexity when asked.

Typical DSA Interview Topic Distribution

Approximate frequency across FAANG interviews

Knowledge Check

Question 1 of 5
Q1Single choice

What is the worst-case time complexity of searching in a balanced Binary Search Tree?

Explore Related Topics

1

The Comprehensive Data Scientist Roadmap: From Foundations to Specialization

Data science sits at the intersection of mathematics, computer science, and domain expertise. A modern Data Scientist must navigate a complex ecosystem of tools, algorithms, and business strategies to transform raw data into actionable intelligence. This roadmap provides a structured, rigorous pathw

2

Master Class: System Design for Software Engineers

The master class teaches software engineers how to design scalable, reliable distributed systems, covering architecture, scaling, trade‑offs, and interview techniques.

  • Horizontal scaling introduces state, partition, and consistency challenges.
  • CAP vs PACELC guide consistency, availability, and latency choices.
  • Redundancy, load balancers, caches, and async messaging avoid SPOFs.
  • SQL provides ACID with vertical scaling; NoSQL offers BASE and horizontal scaling.
  • Interview steps: clarify requirements, sketch design, deep‑dive components, add resiliency, optimize P99 latency.
3

Distributed Systems: Architecture, Coordination, and Consensus

The course covers distributed system fundamentals, consistency‑availability trade‑offs, consensus via Raft, and data partitioning methods.

  • Key traits: concurrent components, no global clock, independent failures; network partitions reveal common fallacies.
  • CAP forces a consistency vs. availability choice during partitions; PACELC adds latency vs. consistency when no partition (e.g., Cassandra prefers latency).
  • Raft election: followers timeout, become candidates, request votes, and win leadership with a quorum of ⌊N/2⌋+1, avoiding split‑brain.
  • Consistent hashing minimizes reshuffling to ~K/n keys on node addition, while range sharding speeds range queries but can hotspot.