vinod sharma .in Solution Architect, Author & Educator
Courses, books, roadmaps, and tutorials to help developers build real-world skills.
© 2026 Vinod Sharma. All rights reserved.
DevOps Engineer
A structured roadmap to becoming a DevOps engineer. This path covers Linux administration, scripting, version control, CI/CD, containers, orchestration, infrastructure as code, cloud platforms, monitoring, security, site reliability, and modern platform engineering practices.
12 milestones in this roadmap

Step 1 (beginner, 6-8 weeks): Linux & Networking Fundamentals
Gain fluency with the Linux command line and TCP/IP networking fundamentals that underpin every server and container in production.
Curriculum
1. File systems, permissions (chmod/chown), processes, and systemd service management
2. Shell navigation, pipes, redirection, and text processing (grep, awk, sed)
3. TCP/IP stack, DNS resolution, HTTP/HTTPS, and TLS handshake
4. Firewalls (iptables/nftables), network namespaces, and port forwarding
5. SSH configuration, key management, and tunnelling
6. Troubleshooting with netstat, ss, tcpdump, dig, and traceroute
Tools & Platforms
Ubuntu / CentOS / Alpine Linux
curl / wget / httpie
Wireshark / tcpdump
systemd / journalctl
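
As a small illustration of the permissions topic in Step 1, the sketch below converts the octal modes passed to chmod into the string form that `ls -l` prints. It uses only Python's standard `stat` module; the helper name `mode_string` is hypothetical and assumes a regular file on a POSIX system.

```python
import stat


def mode_string(mode: int) -> str:
    """Render an octal permission mode the way `ls -l` shows it for a regular file."""
    # S_IFREG marks the entry as a regular file, which yields the leading '-'.
    return stat.filemode(stat.S_IFREG | mode)


if __name__ == "__main__":
    for octal_mode in (0o644, 0o640, 0o755):
        # Each octal digit covers owner, group, and others (read=4, write=2, execute=1).
        print(oct(octal_mode), mode_string(octal_mode))
```

Reading the digits left to right as owner, group, and others is usually enough to predict what a given `chmod` call will allow.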

Step 2 (beginner, 4-6 weeks): Scripting (Bash & Python)
Automate infrastructure and operational tasks with Bash scripting and Python for more complex tooling.
Curriculum
1. Bash scripting: variables, loops, conditionals, functions, and exit codes
2. Text processing pipelines with grep, awk, sed, and jq
3. Cron jobs, at scheduling, and systemd timers
4. Python for automation: os, subprocess, requests, paramiko
5. Writing idempotent scripts and configuration generators
6. Error handling, logging, and script testing practices
Tools & Platforms
Bash / Zsh
Python 3
jq for JSON processing
ShellCheck for linting
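
To make the idea of an idempotent script from Step 2 concrete, here is a minimal sketch of a configuration writer that changes the file only when its content differs, so running it repeatedly has the same effect as running it once. The function name `ensure_file` and the example content are hypothetical; the approach (compare, write to a temporary file, then rename) is one common pattern, not the only one.

```python
import os
import tempfile
from pathlib import Path


def ensure_file(path: Path, content: str) -> bool:
    """Make sure `path` holds `content`; return True only if a change was made."""
    if path.exists() and path.read_text() == content:
        return False  # Already in the desired state: do nothing.

    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file in the same directory, then swap it into place
    # so readers never observe a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    os.replace(tmp_name, path)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = Path(workdir) / "app" / "app.conf"
        print(ensure_file(target, "port=8080\n"))  # first run changes the file
        print(ensure_file(target, "port=8080\n"))  # second run is a no-op
```

Returning a "changed" flag mirrors how configuration tools report state, and it lets a caller decide whether a dependent service needs a restart.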

Step 3 (beginner, 3-4 weeks): Version Control (Advanced Git)
Master advanced Git workflows, branching strategies, and repository management for large engineering teams.
Curriculum
1. Branching strategies: Git Flow, trunk-based development, and release branches
2. Rebasing, interactive rebase, cherry-picking, and bisecting
3. Merge conflict resolution and rerere (reuse recorded resolution)
4. Git hooks (pre-commit, pre-push) and commit message conventions
5. Monorepo tooling: Nx, Turborepo, and sparse checkout
6. Git internals: object model, packfiles, and garbage collection
Tools & Platforms
Git CLI
GitHub / GitLab
pre-commit framework
Nx / Turborepo
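
The bisecting topic in Step 3 rests on a binary search over history. The sketch below shows that idea on a simplified, strictly linear list of commits; it is not how Git itself is implemented, and the names `first_bad` and `is_bad` are hypothetical. It assumes the first commit is known good and the last is known bad.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Find the first bad commit in an oldest-to-newest list.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        middle = (good + bad) // 2
        if is_bad(commits[middle]):
            bad = middle   # The regression is at or before the midpoint.
        else:
            good = middle  # The regression is after the midpoint.
    return commits[bad]


if __name__ == "__main__":
    history = [f"c{i}" for i in range(10)]
    # Hypothetical test: pretend everything from c6 onwards is broken.
    print(first_bad(history, lambda commit: int(commit[1:]) >= 6))
```

Because the search halves the range at each step, a history of a thousand commits needs roughly ten test runs rather than a thousand.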

Step 4 (intermediate, 6-8 weeks): CI/CD Pipelines (Jenkins & GitHub Actions)
Design and implement CI/CD pipelines that automate testing, security scanning, artifact building, and deployment with proper approval gates.
Curriculum
1. Pipeline architecture: stages, jobs, parallelism, and dependencies
2. Jenkins declarative pipelines and shared libraries
3. GitHub Actions: workflows, composite actions, matrix builds, and caching
4. Artifact management: container images, packages, and versioning
5. Security scanning integration: SAST, dependency checks, and image scanning
6. Deployment strategies: blue-green, canary, rolling, and feature flags
Tools & Platforms
Jenkins
GitHub Actions / GitLab CI
Artifactory / Nexus
SonarQube / Snyk

Step 5 (intermediate, 4-6 weeks): Containerisation (Docker Deep Dive)
Master Docker beyond basics with multi-stage builds, security hardening, networking, and production-grade container workflows.
Curriculum
1. Dockerfile best practices: multi-stage builds, layer caching, and .dockerignore
2. Container networking: bridge, host, overlay, and DNS resolution
3. Volume management, bind mounts, and tmpfs for stateful containers
4. Security hardening: non-root users, read-only filesystems, seccomp profiles
5. Docker Compose for multi-service local development and testing
6. Image scanning, signing (cosign), and supply chain security (SBOM)
Tools & Platforms
Docker Engine & Docker CLI
Docker Compose
Dive (image layer analysis)
Trivy / Grype

Step 6 (intermediate, 8-10 weeks): Container Orchestration (Kubernetes)
Learn Kubernetes architecture and deploy, scale, and manage containerised workloads on managed clusters.
Curriculum
1. Kubernetes architecture: control plane, kubelet, kube-proxy, and etcd
2. Core resources: Pods, Deployments, Services, Ingress, ConfigMaps, Secrets
3. Namespaces, RBAC, and network policies for multi-tenant clusters
4. Helm charts: templating, values, hooks, and chart repositories
5. StatefulSets, DaemonSets, Jobs, and CronJobs
6. Horizontal Pod Autoscaler, cluster autoscaling, and resource limits/requests
Tools & Platforms
kubectl
Helm
Minikube / kind / k3s
EKS / GKE / AKS

Step 7 (intermediate, 6-8 weeks): Infrastructure as Code (Terraform & Ansible)
Define and provision infrastructure declaratively with Terraform and manage configuration with Ansible for reproducible environments.
Curriculum
1. Terraform HCL: providers, resources, data sources, and variables
2. Terraform modules, workspaces, and state management (remote backends)
3. Terraform plan/apply lifecycle, drift detection, and import
4. Ansible playbooks, roles, inventories, and Jinja2 templating
5. Ansible Galaxy, vault for secrets, and idempotent task design
6. Pulumi and CDK as code-first IaC alternatives
Tools & Platforms
Terraform
Ansible
Terraform Cloud / Spacelift
Packer for image building

Step 8 (intermediate, 8-10 weeks): Cloud Platforms (AWS/Azure/GCP)
Develop deep expertise in at least one major cloud provider covering compute, storage, networking, and managed services.
Curriculum
1. AWS core: EC2, S3, RDS, Lambda, VPC, IAM, CloudFormation
2. Cloud networking: VPCs, subnets, security groups, load balancers, VPN
3. Managed services: databases, queues, caches, and serverless compute
4. Cost management: reserved instances, savings plans, right-sizing, tagging
5. Cloud certifications: AWS Solutions Architect Associate or equivalent
6. Multi-account strategy, Organizations, and landing zones
Tools & Platforms
AWS Console & CLI
Azure / GCP (secondary)
AWS Cost Explorer
CloudWatch / Cloud Monitoring

Step 9 (intermediate, 6-8 weeks): Monitoring & Observability (Prometheus/Grafana/ELK)
Implement the three pillars of observability (metrics, logs, traces) to diagnose production incidents and maintain system health.
Curriculum
1. Metrics collection with Prometheus: exporters, PromQL, recording rules
2. Grafana dashboards: panels, variables, annotations, and alerting
3. Centralised logging with ELK Stack (Elasticsearch, Logstash, Kibana) or Loki
4. Distributed tracing with Jaeger or Tempo and OpenTelemetry instrumentation
5. Alert design: signal vs noise, escalation policies, and runbooks
6. SLI/SLO-based monitoring and golden signals (latency, traffic, errors, saturation)
Tools & Platforms
Prometheus & Grafana
ELK Stack / Loki
Jaeger / Tempo
PagerDuty / Opsgenie

Step 10 (advanced, 4-6 weeks): Security & Compliance (DevSecOps)
Integrate security into every pipeline stage with automated scanning, secret management, and compliance-as-code.
Curriculum
1. SAST (static analysis) and DAST (dynamic analysis) in CI pipelines
2. Dependency scanning and software composition analysis (SCA)
3. Container image scanning and admission controllers
4. Secret management: HashiCorp Vault, AWS Secrets Manager, sealed secrets
5. Policy-as-code with OPA/Gatekeeper and Kyverno
6. Compliance frameworks: SOC 2, HIPAA, PCI-DSS, and audit automation
Tools & Platforms
SonarQube / Semgrep
Snyk / Trivy / Grype
HashiCorp Vault
OPA / Gatekeeper

Step 11 (advanced, 6-8 weeks): Site Reliability Engineering
Apply SRE principles including error budgets, SLOs, chaos engineering, and incident management to balance reliability with velocity.
Curriculum
1. SLIs, SLOs, and SLAs: defining and measuring service reliability
2. Error budgets and their role in balancing feature velocity and stability
3. Incident management: detection, response, mitigation, and blameless post-mortems
4. Chaos engineering: fault injection, game days, and resilience testing
5. Capacity planning, load testing, and performance benchmarking
6. On-call best practices, runbook automation, and toil reduction
Tools & Platforms
k6 / Locust / Gatling
Chaos Monkey / Litmus / Gremlin
Statuspage / incident.io
PagerDuty / Opsgenie
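
The error budget idea in Step 11 comes down to simple arithmetic: the budget is the share of a time window in which the service is allowed to miss its SLO. The sketch below works that out for an availability SLO measured in minutes; the function names and the 30-day window are assumptions chosen for illustration. For example, a 99.9% SLO over 30 days allows about 43.2 minutes of downtime.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO given as a fraction (e.g. 0.999)."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo) * window_minutes


def budget_remaining(slo: float, window_days: int, downtime_minutes: float) -> float:
    """Fraction of the error budget still unspent; negative means the budget is blown."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    print(round(error_budget_minutes(0.999, 30), 1))    # minutes allowed in 30 days
    print(round(budget_remaining(0.999, 30, 21.6), 2))  # share left after 21.6 minutes down
```

A team might slow feature releases as the remaining fraction approaches zero and resume normal velocity once the window rolls forward, which is the balance between stability and speed that the curriculum item describes.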

Step 12 (advanced, 6-8 weeks): GitOps & Platform Engineering
Adopt GitOps workflows and build internal developer platforms that abstract infrastructure complexity for engineering teams.
Curriculum
1. GitOps principles: declarative configuration, Git as single source of truth
2. ArgoCD: application definitions, sync policies, and multi-cluster management
3. Flux CD: GitRepository sources, Kustomizations, and HelmReleases
4. Internal developer platforms (IDPs): self-service portals and golden paths
5. Backstage: software catalog, templates, TechDocs, and plugin ecosystem
6. Platform engineering maturity model and measuring developer productivity
Tools & Platforms
ArgoCD / Flux CD
Backstage
Crossplane
Port / Humanitec