Data Engineer
Build and maintain data infrastructure — pipelines, warehouses, lakes, streaming systems, and the tooling that powers analytics and ML at scale.
Programming Foundations
Python and SQL are the two most important languages for a data engineer.
Databases & Storage
Choose the right storage for every problem — OLTP for transactions, OLAP for analytics, object stores for raw data.
Data Pipeline Orchestration
Schedule, monitor, and manage complex multi-step data pipelines with proper dependency management.
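As a rough illustration of what dependency management means here, the sketch below orders a small pipeline so that each task runs only after the tasks it depends on. The task names and the `execution_order` helper are hypothetical, and the example uses Python's standard-library `graphlib` rather than any particular orchestrator; real tools add scheduling, retries, and monitoring on top of this idea.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract_orders": set(),
    "extract_customers": set(),
    "join_orders_customers": {"extract_orders", "extract_customers"},
    "load_warehouse": {"join_orders_customers"},
}


def execution_order(deps):
    """Return a task order in which every task follows its dependencies.

    TopologicalSorter raises graphlib.CycleError if the graph has a cycle,
    which is how a circular dependency would surface in this sketch.
    """
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

The two extract tasks have no dependencies, so their relative order is not fixed; the only guarantee is that the join runs after both and the load runs last.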
Batch Processing — Apache Spark
Process terabytes of data efficiently with distributed computing.
Streaming & Real-Time Data
Process events as they happen — from user clicks to financial transactions.
Data Modelling
Design data models that are both query-efficient and easy to maintain over time.
Cloud Data Platforms
Build end-to-end data infrastructure on AWS, GCP, or Azure.