Agentic & Autonomous AI

A cutting-edge roadmap for building autonomous AI agents. This path covers the mathematical foundations, machine learning, deep learning, large language models, prompt engineering, agent frameworks, tool use, multi-agent systems, memory management, autonomous reasoning, safety, and production deployment of AI agents.

12 milestones in this roadmap

Step 1 (beginner, 6-8 weeks)

Python & Mathematics Foundation

Build a strong foundation in Python programming and the mathematics (linear algebra, calculus, probability) that underpins all of AI.

Curriculum

1. Python fundamentals: data structures, functions, OOP, and type hints
2. NumPy for vectorised computation and matrix operations
3. Linear algebra: vectors, matrices, eigenvalues, and singular value decomposition
4. Multivariable calculus: gradients, chain rule, and optimisation (gradient descent)
5. Probability distributions, Bayes' theorem, and maximum likelihood estimation
6. Pandas and Matplotlib for data manipulation and visualisation

Tools & Platforms

Python 3 / Jupyter Notebooks
NumPy / Pandas / Matplotlib
SymPy for symbolic math
Google Colab
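
As a small illustration of the optimisation topic in this step, the sketch below runs plain gradient descent in pure Python. The objective function, starting point, learning rate, and step count are all assumptions chosen for the example, not part of the roadmap.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Minimise a function by repeatedly stepping against its gradient."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        # Move each coordinate a small distance in the downhill direction.
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical objective: f(x, y) = (x - 3)^2 + (y + 1)^2, minimised at (3, -1).
def grad_f(point):
    x, y = point
    return [2 * (x - 3), 2 * (y + 1)]


minimum = gradient_descent(grad_f, [0.0, 0.0])
print(minimum)  # approaches [3, -1]
```

With this learning rate the distance to the minimum shrinks by a constant factor each step; a rate that is too large would make the iterates diverge instead.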

Step 2 (beginner, 6-8 weeks)

Machine Learning Fundamentals

Master classical machine learning algorithms, model evaluation, and the bias-variance tradeoff with hands-on Scikit-learn projects.

Curriculum

1. Supervised learning: linear regression, logistic regression, decision trees, random forests, SVMs
2. Unsupervised learning: k-means clustering, hierarchical clustering, PCA, and DBSCAN
3. Model evaluation: cross-validation, precision, recall, F1, ROC-AUC, and confusion matrices
4. Bias-variance tradeoff, overfitting, regularisation (L1/L2), and ensemble methods
5. Feature engineering: encoding, scaling, selection, and dimensionality reduction
6. Hyperparameter tuning: grid search, random search, and Bayesian optimisation

Tools & Platforms

Scikit-learn
XGBoost / LightGBM
Weights & Biases / MLflow
Kaggle datasets
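
To make the evaluation metrics in this step concrete, here is a minimal pure-Python sketch that derives precision, recall, and F1 from confusion-matrix counts. The function name and the toy labels are assumptions for illustration; in practice Scikit-learn provides equivalent metrics.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy labels (hypothetical): 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```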

Step 3 (intermediate, 8-10 weeks)

Deep Learning (PyTorch & TensorFlow)

Build and train neural networks with PyTorch covering CNNs, RNNs, transfer learning, and GPU-accelerated training.

Curriculum

1. Perceptrons, activation functions, and universal approximation theorem
2. Backpropagation, loss functions (cross-entropy, MSE), and gradient flow
3. Optimisers: SGD, Adam, AdaGrad, learning rate schedules and warm-up
4. CNNs for image classification: convolutions, pooling, ResNet, EfficientNet
5. RNNs, LSTMs, and GRUs for sequence modelling
6. Transfer learning, fine-tuning, regularisation (dropout, batch norm), and data augmentation

Tools & Platforms

PyTorch
TensorFlow / Keras
CUDA / cuDNN
Weights & Biases

Step 4 (intermediate, 6-8 weeks)

NLP & Large Language Models

Understand natural language processing from tokenisation through the Transformer architecture and how modern LLMs are trained.

Curriculum

1. Tokenisation (BPE, WordPiece, SentencePiece) and embedding representations
2. Word2Vec, GloVe, and contextual embeddings (ELMo)
3. Transformer architecture: self-attention, multi-head attention, positional encoding
4. Pre-training objectives: masked language modelling, causal language modelling
5. Fine-tuning techniques: LoRA, QLoRA, prefix tuning, and adapter layers
6. RLHF, DPO, and instruction tuning for alignment

Tools & Platforms

Hugging Face Transformers
OpenAI API / Anthropic API
Ollama / vLLM for local inference
Hugging Face Datasets & Tokenizers
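
The self-attention item in this step can be sketched in a few lines. The example below implements single-head scaled dot-product attention in pure Python over lists of floats; it omits learned projection matrices, masking, and batching, and the toy inputs are assumptions for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Toy input (hypothetical): identical keys give uniform attention weights,
# so the output is the plain average of the two value vectors.
attended = scaled_dot_product_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 1.0], [1.0, 1.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
print(attended)
```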

Step 5 (intermediate, 4-6 weeks)

Prompt Engineering & RAG

Master prompt engineering techniques and build RAG pipelines that ground LLM responses in factual, domain-specific knowledge.

Curriculum

1. Zero-shot, few-shot, chain-of-thought, and tree-of-thought prompting
2. System prompts, role prompting, and structured output (JSON mode)
3. Document chunking strategies: fixed-size, semantic, recursive, and parent-child
4. Embedding models, vector similarity search, and re-ranking
5. Vector databases: indexing (HNSW, IVF), filtering, and hybrid search
6. RAG evaluation: faithfulness, relevance, and answer correctness metrics

Tools & Platforms

Pinecone / Weaviate / Chroma / Qdrant
LangChain / LlamaIndex
OpenAI Embeddings / Cohere Embed
Ragas / DeepEval for evaluation
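
Of the chunking strategies listed in this step, fixed-size chunking is the simplest to show. The sketch below splits text into character windows with overlap; the function name, the character-based (rather than token-based) sizing, and the default values are assumptions for illustration.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


example_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
print(example_chunks)  # ['abcd', 'defg', 'ghij']
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.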

Step 6 (intermediate, 6-8 weeks)

AI Agent Frameworks (LangChain/CrewAI/AutoGen)

Build AI agents with popular frameworks using ReAct loops, structured outputs, and composable chain and graph workflows.

Curriculum

1. Agent loop patterns: ReAct (reason + act), Plan-and-Execute, and reflection
2. LangChain: chains, agents, output parsers, and callback handlers
3. LangGraph: state machines, conditional edges, and human-in-the-loop nodes
4. CrewAI: role-based agents, tasks, tools, and collaboration patterns
5. AutoGen: conversational agents, group chat, and function calling
6. Structured output parsing, retry logic, and fallback strategies

Tools & Platforms

LangChain / LangGraph
CrewAI
AutoGen / AG2
Claude Code / OpenAI Assistants API
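
The ReAct pattern named in this step can be sketched without any framework. In the example below, the model is a scripted stand-in (`scripted_model`) rather than a real LLM call, and the `calculator` tool and the dictionary format for model steps are hypothetical choices made for illustration; real frameworks define their own interfaces.

```python
def run_react_agent(model, tools, question, max_steps=5):
    """Alternate between model reasoning and tool calls until an answer appears."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = model("\n".join(transcript))
        transcript.append(f"Thought: {step['thought']}")
        if "final_answer" in step:
            return step["final_answer"], transcript

        tool = tools.get(step["action"])
        if tool is None:
            observation = f"Unknown tool: {step['action']}"
        else:
            observation = tool(step["action_input"])
        transcript.append(f"Action: {step['action']}[{step['action_input']}]")
        transcript.append(f"Observation: {observation}")
    return None, transcript  # step budget exhausted without an answer


def calculator(expression):
    """Hypothetical tool: multiplies two integers written as 'a * b'."""
    left, right = expression.split("*")
    return str(int(left) * int(right))


def scripted_model(prompt):
    """Stand-in for an LLM: acts once, then answers from the observation."""
    if "Observation:" not in prompt:
        return {"thought": "I should compute this with a tool.",
                "action": "calculator", "action_input": "6 * 7"}
    return {"thought": "The tool returned the result.",
            "final_answer": prompt.rsplit("Observation: ", 1)[1]}


answer, transcript = run_react_agent(
    scripted_model, {"calculator": calculator}, "What is 6 times 7?")
print(answer)
```

The `max_steps` cap is a simple safeguard against a model that never produces a final answer.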

Step 7 (intermediate, 4-6 weeks)

Tool Use & Function Calling

Enable agents to interact with external systems through function calling, API integration, code execution, and browser control.

Curriculum

1. Function calling: JSON schema definitions, parameter validation, and error handling
2. Web browsing agents: HTTP requests, HTML parsing, and headless browser control
3. Database query agents: natural language to SQL, result interpretation
4. Code execution sandboxing: Docker containers, E2B, and security boundaries
5. File system operations, document processing, and multi-modal inputs
6. Tool selection strategies: dynamic tool loading and tool description optimisation

Tools & Platforms

OpenAI Function Calling / Claude Tool Use
Playwright / Selenium for browser automation
E2B / Modal for sandboxed execution
Composio / Toolhouse
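
A rough sketch of the first curriculum item: a tool is described with a JSON-schema-like parameter spec, the model's call arrives as a JSON string, and the dispatcher validates it before running anything. The `get_weather` tool, the `TOOLS` registry layout, and the simplified type checks are assumptions for illustration; they do not reproduce any provider's actual API or a full JSON Schema validator.

```python
import json


def get_weather(city, unit="celsius"):
    """Hypothetical tool; a real one would call a weather service."""
    return f"Weather in {city} reported in {unit}"


TOOLS = {
    "get_weather": {
        "function": get_weather,
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"},
                           "unit": {"type": "string"}},
            "required": ["city"],
        },
    }
}

JSON_TYPES = {"string": str, "integer": int, "number": (int, float)}


def dispatch_tool_call(raw_call):
    """Validate a model-emitted call and run it; always return a JSON string."""
    try:
        call = json.loads(raw_call)
        name, args = call["name"], call["arguments"]
        spec = TOOLS[name]
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        return json.dumps({"error": f"malformed call: {exc!r}"})
    if not isinstance(args, dict):
        return json.dumps({"error": "arguments must be an object"})

    schema = spec["parameters"]
    missing = [key for key in schema["required"] if key not in args]
    if missing:
        return json.dumps({"error": f"missing parameters: {missing}"})
    for key, value in args.items():
        prop = schema["properties"].get(key)
        if prop is None:
            return json.dumps({"error": f"unknown parameter: {key}"})
        if not isinstance(value, JSON_TYPES[prop["type"]]):
            return json.dumps({"error": f"wrong type for parameter: {key}"})

    return json.dumps({"result": spec["function"](**args)})


ok = dispatch_tool_call('{"name": "get_weather", "arguments": {"city": "Oslo"}}')
bad = dispatch_tool_call('{"name": "get_weather", "arguments": {}}')
print(ok)
print(bad)
```

Returning errors as structured data, rather than raising, lets the agent feed the message back to the model so it can retry with corrected arguments.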

Step 8 (advanced, 4-6 weeks)

Multi-Agent Systems

Design systems where multiple specialised agents collaborate using orchestration patterns to solve complex problems.

Curriculum

1. Orchestration patterns: hierarchical supervisor, peer-to-peer, and debate
2. Agent handoffs, shared context, and message passing protocols
3. Specialised agent roles: researcher, coder, reviewer, planner
4. Conflict resolution and consensus mechanisms between agents
5. Communication protocols: synchronous, asynchronous, and event-driven
6. Evaluating multi-agent vs single-agent performance and cost trade-offs

Tools & Platforms

CrewAI (multi-agent)
LangGraph (multi-agent graphs)
AutoGen (group chat)
Swarm / OpenAI Agents SDK

Step 9 (advanced, 4-6 weeks)

Memory & State Management

Implement short-term, long-term, and episodic memory systems that make agents contextually aware across sessions.

Curriculum

1. Conversation history management: sliding window, summarisation, and compression
2. Long-term memory with vector stores and knowledge graph retrieval
3. Episodic memory: storing and retrieving past task outcomes and lessons
4. Working memory: scratchpads, chain-of-thought buffers, and context assembly
5. Memory indexing, relevance scoring, and forgetting strategies
6. Persistent state management across sessions and agent restarts

Tools & Platforms

Chroma / Weaviate / Pinecone
Neo4j / Memgraph for knowledge graphs
LangGraph checkpointing
Redis for session state
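
The sliding-window approach to conversation history can be sketched with a bounded deque. The class name, the message dictionary shape, and the window size below are assumptions for illustration; a production agent would usually budget by tokens rather than by message count.

```python
from collections import deque


class SlidingWindowMemory:
    """Keep only the most recent messages; older ones are dropped automatically."""

    def __init__(self, max_messages=4):
        # A deque with maxlen discards the oldest item when a new one is appended.
        self.messages = deque(maxlen=max_messages)

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})

    def context(self):
        """Return the retained messages, oldest first."""
        return list(self.messages)


memory = SlidingWindowMemory(max_messages=4)
for i in range(1, 7):
    memory.add("user" if i % 2 else "assistant", f"message {i}")

window = memory.context()
print([m["content"] for m in window])
```

Here only messages 3 to 6 survive; summarisation or compression, also listed in this step, would instead condense the dropped messages rather than discard them.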

Step 10 (advanced, 4-6 weeks)

Autonomous Decision Making

Build agents that plan, reason, and self-correct autonomously over extended tasks using goal decomposition and reflection.

Curriculum

1. Task decomposition: hierarchical planning and sub-goal generation
2. Self-reflection loops: critique, revision, and iterative improvement
3. Monte Carlo Tree Search and beam search for planning under uncertainty
4. Reward modelling, heuristic evaluation, and action quality assessment
5. Autonomous error recovery, replanning, and adaptive strategies
6. Human-in-the-loop escalation policies and confidence thresholds

Tools & Platforms

LangGraph (planning graphs)
AutoGPT / BabyAGI patterns
Tree of Thoughts implementations
Anthropic Claude (extended thinking)

Step 11 (advanced, 3-4 weeks)

Safety & Alignment

Understand AI safety risks and implement guardrails, content filtering, and alignment techniques for responsible agent deployment.

Curriculum

1. Prompt injection attacks: direct injection, indirect injection, and defences
2. Hallucination detection, grounding, and factuality verification
3. Output validation, content filtering, and toxicity detection
4. Guardrail frameworks: input/output validators, topic restrictions, PII detection
5. Red-teaming, adversarial testing, and jailbreak prevention
6. AI alignment principles: helpful, harmless, honest, and constitutional AI

Tools & Platforms

Guardrails AI / NeMo Guardrails
Anthropic Constitutional AI principles
Lakera Guard / Rebuff
OWASP LLM Top 10

Step 12 (advanced, 4-6 weeks)

Production Deployment of AI Agents

Deploy AI agents to production with proper serving infrastructure, observability, cost tracking, and evaluation pipelines.

Curriculum

1. API serving with FastAPI, streaming responses, and WebSocket support
2. Model hosting: vLLM, TGI, Ollama for self-hosted inference
3. Observability: tracing agent runs, token usage, latency, and error rates
4. Cost management: prompt caching, model routing, and token budgets
5. Evaluation pipelines: automated benchmarks, regression testing, and A/B testing
6. Semantic caching, rate limiting, and graceful degradation strategies

Tools & Platforms

FastAPI / LitServe
LangSmith / Langfuse / Arize Phoenix
vLLM / Ollama
Docker / Kubernetes for deployment