LangChain for AI Agents: Guide and Code Examples
LangChain is the most widely used agent framework. It provides abstractions for chains, agents, tools, memory, and retrieval — along with a large ecosystem of integrations. If a tool, API, or service has an AI integration, there’s probably a LangChain connector for it.
The framework has evolved considerably since its early versions. LangChain’s current architecture uses langchain-core for base types, with provider-specific packages like langchain-anthropic for model integrations. This modular structure reduces dependency bloat compared to the older monolithic package.
Install
pip install langchain langchain-anthropicFor web search in the example below, you’ll also need:
pip install langchain-communityAnd a Tavily API key (free tier available) for the search tool.
Basic Agent
from langchain_anthropic import ChatAnthropicfrom langchain.agents import AgentExecutor, create_tool_calling_agentfrom langchain_core.prompts import ChatPromptTemplatefrom langchain_community.tools.tavily_search import TavilySearchResults
llm = ChatAnthropic(model="claude-opus-4-6")tools = [TavilySearchResults(max_results=3)]
prompt = ChatPromptTemplate.from_messages([ ("system", "You are a helpful assistant."), ("human", "{input}"), ("placeholder", "{agent_scratchpad}"),])
agent = create_tool_calling_agent(llm, tools, prompt)executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
result = executor.invoke({"input": "What is the latest news about AI agents?"})print(result["output"])Let’s break down what each piece does. ChatAnthropic is the LangChain wrapper for the Claude API. It exposes Claude as a standard LangChain chat model, which means it works with any LangChain component that accepts a chat model.
TavilySearchResults is a pre-built tool that calls the Tavily search API. From LangChain’s perspective, a tool is any callable that the agent can invoke by name with a string input. The max_results=3 parameter limits how many search results come back in each call.
The ChatPromptTemplate defines the agent’s system prompt and conversation structure. The {agent_scratchpad} placeholder is where LangChain injects the agent’s intermediate reasoning and tool call results — it’s the equivalent of the agent’s working memory within a single turn.
create_tool_calling_agent wires the LLM, tools, and prompt together into an agent. This function produces an agent that uses the model’s native tool-calling capability (structured function calls), which is more reliable than older ReAct-style agents that tried to parse tool calls from plain text.
AgentExecutor is the runtime that drives the agent loop: call the agent, execute tool calls, feed results back, repeat until done or max iterations reached.
Key Concepts
Understanding these five concepts covers most of what LangChain does:
Chain: A sequence of operations. The simplest chain is: prompt → LLM → output parser. More complex chains string together multiple LLM calls, tools, or retrieval steps.
Agent: A special chain where the LLM decides which tool to call (and with what arguments) at each step. The agent loop runs until the model produces a final answer rather than a tool call.
Tool: Any function the agent can call. Tools have a name, description, and input schema. The description is crucial — it tells the LLM when and why to use the tool. A poorly written description leads to incorrect or missed tool usage.
Memory: Persists conversation state across calls. Without memory, each invocation starts from a blank slate. With memory, the agent can reference earlier exchanges in the same conversation.
Retriever: Fetches relevant documents from a vector store or other data source. Used in RAG (retrieval-augmented generation) pipelines where the agent needs access to a large corpus of documents that can’t all fit in the context window.
Building a Custom Tool
LangChain makes it straightforward to wrap any Python function as a tool:
from langchain.tools import tool
@tooldef calculate_compound_interest(principal: float, rate: float, years: int) -> str: """Calculate compound interest.
Args: principal: Initial investment amount in dollars rate: Annual interest rate as a decimal (e.g., 0.05 for 5%) years: Number of years to compound
Returns: Final amount after compounding """ amount = principal * (1 + rate) ** years return f"${amount:.2f}"The docstring serves as the tool description that LangChain sends to the model. A well-written docstring with clear argument descriptions leads to more accurate tool usage. The type annotations on the function arguments are used to generate the JSON schema that the model uses to construct tool calls.
Adding Memory
To give an agent conversation memory across multiple turns:
from langchain.memory import ConversationBufferMemoryfrom langchain.agents import AgentExecutor, create_tool_calling_agent
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
prompt = ChatPromptTemplate.from_messages([ ("system", "You are a helpful assistant."), ("placeholder", "{chat_history}"), ("human", "{input}"), ("placeholder", "{agent_scratchpad}"),])
agent = create_tool_calling_agent(llm, tools, prompt)executor = AgentExecutor(agent=agent, tools=tools, memory=memory, verbose=True)
# These two calls share conversation contextresponse1 = executor.invoke({"input": "My name is Alex."})response2 = executor.invoke({"input": "What's my name?"}) # Agent will say "Alex"ConversationBufferMemory stores the full conversation history. For long conversations, ConversationSummaryMemory instead summarizes older messages to keep the context manageable.
RAG Pipeline with LangChain
LangChain is particularly strong for retrieval-augmented generation:
from langchain_anthropic import ChatAnthropicfrom langchain_community.vectorstores import Chromafrom langchain_anthropic import AnthropicEmbeddingsfrom langchain.chains import RetrievalQAfrom langchain.text_splitter import RecursiveCharacterTextSplitter
# Load and split documentswith open("my_docs.txt") as f: text = f.read()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)chunks = splitter.create_documents([text])
# Create a vector storevectorstore = Chroma.from_documents(chunks, AnthropicEmbeddings())
# Build a RAG chainchain = RetrievalQA.from_chain_type( llm=ChatAnthropic(model="claude-opus-4-6"), retriever=vectorstore.as_retriever(search_kwargs={"k": 4}),)
answer = chain.invoke("What does the document say about token limits?")The text splitter breaks documents into chunks with some overlap (to avoid cutting off context mid-sentence). Each chunk is embedded as a vector and stored in Chroma. When a question comes in, the retriever finds the 4 most semantically similar chunks, which are injected into the prompt before the model generates its answer.
Pros and Cons
Pros:
- Massive ecosystem with integrations for hundreds of services
- Well-documented with a large community
- Battle-tested patterns for common tasks (RAG, tool calling, conversation memory)
- Strong support for structured output and output parsing
- LCEL (LangChain Expression Language) makes it easy to compose chains
Cons:
- Abstractions can obscure what’s actually happening, making debugging harder
- Breaking API changes between versions are common
- The framework does a lot, which means it has a lot of surface area to understand
- For simple agents, LangChain adds complexity without clear benefit
- Dependency chain is heavy
When to Use LangChain
Reach for LangChain when:
- You need many integrations quickly: LangChain has pre-built connectors for databases, vector stores, APIs, and document types that would take time to build from scratch.
- You’re building a RAG pipeline: LangChain’s document loaders, text splitters, and retrievers are well-designed for RAG use cases.
- You’re prototyping: LangChain lets you assemble a working agent from pre-built pieces quickly. For prototypes, speed of iteration matters more than abstraction cost.
- You need conversation memory across many turns: LangChain’s memory integrations cover most common patterns.
Skip LangChain when:
- You have a simple, predictable workflow: A direct API call plus a for loop is often cleaner and more maintainable than the LangChain equivalent.
- You need precise control over every API call: LangChain’s abstractions make it hard to customize things like retry logic, streaming behavior, or error handling.
- You’re building something for production where debuggability matters: The abstraction stack makes it harder to trace exactly what prompt was sent and what response came back.
See Also
- Framework Comparison — LangChain vs CrewAI vs AutoGen at a glance
- Code Examples — Direct API examples without a framework