LangChain Guide: Mastering Language Models in Python
LangChain is a framework for developing applications powered by language models. In this tutorial, you'll learn how to build an application that uses Large Language Models (LLMs) and explore the core features of the framework.
Key Resources:
- GitHub Repository
- Official Documentation

Table of Contents:
1. Installation
2. LLMs
3. Prompt Templates
4. Chains
5. Agents and Tools
6. Memory
7. Document Loaders
8. Indexes
# Installation
To get started, install the LangChain package:
pip install langchain
# LLMs
Large Language Models (LLMs) are deep learning models for natural language processing (NLP) that generate human-like text. A well-known example is the family of models behind ChatGPT, developed by OpenAI, which are trained on extensive text datasets to recognize patterns and respond to queries.
LangChain is a Python framework that provides a common interface to many such models, including LLMs suited to handling unstructured text data and responding to user inquiries.
To utilize an LLM, first, you need to install the required package:
pip install openai
Then, set up your environment:
import os
from langchain.llms import OpenAI

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_TOKEN"
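Hard-coding a token in the script is convenient for a quick experiment, but it is easy to leak that way. One alternative, sketched here under the assumption that the variable is exported in your shell beforehand, is to read it from the environment and fail early with a clear message. The helper name `require_env` and the `DEMO_TOKEN` variable are made up for this illustration and are not part of LangChain.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running this script.")
    return value


# DEMO_TOKEN is a hypothetical variable, set here only so the sketch runs on its own.
os.environ.setdefault("DEMO_TOKEN", "demo-value")
token = require_env("DEMO_TOKEN")
print("Token loaded, length:", len(token))
```

In a real script you would call `require_env("OPENAI_API_KEY")` instead and drop the `setdefault` line.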
Now, you can instantiate the LLM and generate text:
llm = OpenAI(temperature=0.9)  # optionally pass model_name="text-davinci-003"
text = "give me 5 python project ideas"
print(llm(text))
For Hugging Face models, install the necessary package:
pip install huggingface_hub
Then configure the environment:
import os
from langchain import HuggingFaceHub

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "YOUR_HF_TOKEN"

llm = HuggingFaceHub(repo_id="google/flan-t5-xl", model_kwargs={"temperature": 0, "max_length": 64})
llm("Who won the FIFA World Cup in the year 1994?")
# Prompt Templates
Prompt templates structure your inputs, guiding the model to produce more consistent and relevant outputs. They are particularly useful when you need more control or complexity than sending raw text to the model directly.
Instead of posing a question directly:
llm("Can Joe Biden have a conversation with George Washington?")
You can utilize a PromptTemplate to create a more organized input:
from langchain import PromptTemplate

template = """Question: {question}
Let's think step by step.
Answer: """

prompt = PromptTemplate(template=template, input_variables=["question"])
prompt.format(question="Can Joe Biden have a conversation with George Washington?")
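To make the idea concrete, here is a minimal plain-Python sketch of what a prompt template does: it declares its variables, checks them against the placeholders, and fills them in. The class `SimplePromptTemplate` is a toy written for this article, not LangChain's implementation, which has more features.

```python
from string import Formatter


class SimplePromptTemplate:
    """Toy stand-in for a prompt template (not LangChain's implementation)."""

    def __init__(self, template, input_variables):
        # Collect the {placeholders} that actually appear in the template.
        found = {name for _, name, _, _ in Formatter().parse(template) if name}
        if found != set(input_variables):
            raise ValueError(
                f"Template uses {sorted(found)}, but {sorted(input_variables)} were declared"
            )
        self.template = template
        self.input_variables = list(input_variables)

    def format(self, **kwargs):
        # Refuse to build a prompt with unfilled placeholders.
        missing = set(self.input_variables) - set(kwargs)
        if missing:
            raise KeyError(f"Missing variables: {sorted(missing)}")
        return self.template.format(**kwargs)


demo_template = SimplePromptTemplate(
    template="Question: {question}\nLet's think step by step.\nAnswer: ",
    input_variables=["question"],
)
demo_prompt = demo_template.format(question="What is 2 + 2?")
print(demo_prompt)
```

Validating the variables up front means a typo in a placeholder name surfaces when the template is created rather than when the model receives a half-filled prompt.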
To send the formatted prompt to the LLM, you combine the two in a chain.
# Chains
Chains allow you to combine different components into a cohesive application. For instance, you can create a chain that takes user input, processes it with a PromptTemplate, and then sends the structured output to an LLM. More complex chains can be formed by connecting various chains and other components.
from langchain import LLMChain

llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "What are the steps to start a successful online business?"
print(llm_chain.run(question))
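Conceptually, a chain is function composition: fill the template, call the model, and optionally feed the result to the next step. The sketch below shows that idea with plain Python. `fake_llm`, `SimpleChain`, and `compose` are hypothetical names invented here, and the "model" is a stub so the example runs without an API key.

```python
def fake_llm(prompt):
    """Hypothetical stand-in for a model call; it only reports the prompt length."""
    return f"(model reply to {len(prompt)} characters of prompt)"


class SimpleChain:
    """Toy chain: fill a template, then pass the result to a model callable."""

    def __init__(self, template, llm):
        self.template = template
        self.llm = llm

    def run(self, question):
        prompt = self.template.format(question=question)
        return self.llm(prompt)


def compose(*steps):
    """Connect steps so that each one receives the previous step's output."""

    def pipeline(value):
        for step in steps:
            value = step(value)
        return value

    return pipeline


qa_chain = SimpleChain("Question: {question}\nAnswer: ", fake_llm)

# A longer chain: answer the question, then post-process the answer.
shouting_chain = compose(qa_chain.run, str.upper)
print(shouting_chain("What is a chain?"))
```

Because every step takes one value and returns one value, chains built this way can themselves be used as steps in larger chains.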
# Agents and Tools
Agents decide which actions to execute and the order in which to perform them. When used effectively, agents can be remarkably powerful. To harness agents, it's essential to understand the following concepts:
- Tool: A function that performs a specific task, such as executing a Google Search or utilizing another chain.
- LLM: The language model that powers the agent.
- Agent: The agent type, which determines how the LLM chooses and sequences tools (the example below uses "zero-shot-react-description").
from langchain.agents import load_tools
from langchain.agents import initialize_agent
Install the Wikipedia package to enhance your tools:
pip install wikipedia
Then, set up your LLM and agent:
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
# Tool names are lowercase identifiers.
tools = load_tools(["wikipedia", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("Can you explain the concept of blockchain technology?")
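The relationship between tools, the LLM, and the agent can be sketched as a small loop: the agent asks a decision function which tool to call, runs it, records the observation, and repeats until the decision is to finish. In the toy version below, `fake_policy` is a hard-coded stand-in for the LLM's decision step, and the two tools are hypothetical. A real agent would prompt the model to make this choice.

```python
import re


def calculator(expression):
    """Tool: evaluate a simple 'a op b' arithmetic expression."""
    pattern = r"\s*(-?\d+(?:\.\d+)?)\s*([-+*/])\s*(-?\d+(?:\.\d+)?)\s*"
    match = re.fullmatch(pattern, expression)
    if not match:
        return "error: unsupported expression"
    left, op, right = float(match.group(1)), match.group(2), float(match.group(3))
    if op == "+":
        return str(left + right)
    if op == "-":
        return str(left - right)
    if op == "*":
        return str(left * right)
    if right == 0:
        return "error: division by zero"
    return str(left / right)


def glossary(term):
    """Tool: look up a term in a tiny hard-coded glossary."""
    entries = {"blockchain": "A shared, append-only ledger of transactions."}
    return entries.get(term.strip().lower(), "no entry found")


TOOLS = {"calculator": calculator, "glossary": glossary}


def fake_policy(question, observations):
    """Hypothetical stand-in for the LLM: pick the next action and its input."""
    if observations:
        return "finish", observations[-1]
    if re.search(r"\d", question):
        return "calculator", question
    return "glossary", question


def run_agent(question, max_steps=3):
    """Agent loop: decide, act, observe, and stop when the policy says finish."""
    observations = []
    for _ in range(max_steps):
        action, action_input = fake_policy(question, observations)
        if action == "finish":
            return action_input
        observations.append(TOOLS[action](action_input))
    return "stopped: step limit reached"


print(run_agent("6 * 7"))
print(run_agent("blockchain"))
```

The step limit is a deliberate design choice: without it, a policy that never decides to finish would loop forever.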
# Memory
Memory represents the idea of maintaining a persistent state between calls to a chain or agent. LangChain offers a standardized interface for memory, along with various memory implementations and examples of chains and agents that utilize memory.
Memory enables the model to retain conversational context. Without it, each user prompt would be treated in isolation.
from langchain import OpenAI, ConversationChain

llm = OpenAI(temperature=0)
conversation = ConversationChain(llm=llm, verbose=True)
conversation.predict(input="Hi there!")
conversation.predict(input="Can we talk about Blockchain?")
conversation.predict(input="I'm interested in Solana.")
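One simple way to picture conversational memory is a buffer of past turns that gets prepended to every new prompt. The sketch below assumes that approach. `BufferMemory`, `echo_llm`, and `predict` are illustrative names, and the stub model merely counts how many earlier user messages it can see in the prompt.

```python
class BufferMemory:
    """Toy memory: keeps every turn and replays it in front of the next input."""

    def __init__(self):
        self.turns = []

    def add(self, speaker, text):
        self.turns.append((speaker, text))

    def as_prompt(self, new_input):
        history = "".join(f"{speaker}: {text}\n" for speaker, text in self.turns)
        return f"{history}Human: {new_input}\nAI:"


def echo_llm(prompt):
    """Hypothetical model stub: reports how many earlier human turns are visible."""
    earlier = prompt.count("Human:") - 1
    return f"I can see {earlier} earlier message(s) from you."


def predict(memory, user_input):
    """Build the prompt from memory, call the model, then store both turns."""
    reply = echo_llm(memory.as_prompt(user_input))
    memory.add("Human", user_input)
    memory.add("AI", reply)
    return reply


memory = BufferMemory()
first_reply = predict(memory, "Hi there!")
second_reply = predict(memory, "Can we talk about Blockchain?")
print(first_reply)
print(second_reply)
```

A plain buffer grows with every turn, so longer conversations eventually need trimming or summarizing to stay within the model's context limit.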
# Document Loaders
Integrating language models with your text data is a powerful way to enhance their functionality. The initial step is to load your data into "Documents," which simply refers to pieces of text. The document loader simplifies this process.
from langchain.document_loaders import NotionDirectoryLoader

loader = NotionDirectoryLoader("Notion_DB")
docs = loader.load()
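As a rough mental model, a directory loader walks a folder, reads each file, and wraps the text in a document object together with some metadata. The following sketch does that with the standard library only. The `Document` dataclass and `load_text_directory` function are simplified stand-ins written for this article, and the sample files are created in a temporary folder so the example is self-contained.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Document:
    """Simplified document: a piece of text plus metadata about its origin."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def load_text_directory(directory, pattern="*.md"):
    """Read every matching file under a directory into a Document."""
    documents = []
    for path in sorted(Path(directory).rglob(pattern)):
        text = path.read_text(encoding="utf-8")
        documents.append(Document(page_content=text, metadata={"source": str(path)}))
    return documents


# Create two sample files so the sketch runs without any real export.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.md").write_text("# Notes\nFirst page.", encoding="utf-8")
    Path(tmp, "b.md").write_text("Second page.", encoding="utf-8")
    demo_docs = load_text_directory(tmp)

print(len(demo_docs), "documents loaded")
print(demo_docs[0].page_content)
```

Keeping the source path in the metadata makes it possible to show users where an answer came from later on.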
# Indexes
Indexes provide methods to structure documents for optimal interaction with LLMs. This module includes utility functions for managing documents, various types of indexes, and examples of using those indexes within chains.
- Embeddings: These represent text strings as vectors of floating-point numbers, so that the distance between two vectors reflects how related the texts are.
- Text Splitters: Essential for breaking down long texts into manageable chunks.
- Vector databases: Store embedding vectors, which encode the meaning and context of text, and search them by similarity to return the most relevant results.
import requests

url = "https://raw.githubusercontent.com/hwchase17/langchain/master/docs/modules/state_of_the_union.txt"
res = requests.get(url)
res.raise_for_status()  # stop early if the download failed
with open("state_of_the_union.txt", "w") as f:
    f.write(res.text)
Document Loader:
from langchain.document_loaders import TextLoader

loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()
Text Splitter:
from langchain.text_splitter import CharacterTextSplitter

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
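To show what `chunk_size` and `chunk_overlap` mean, here is a simplified splitter that cuts text into fixed-size character windows. This is only a sketch of the idea: the function `split_text` is hypothetical, and LangChain's splitter, as far as I know, also takes separators into account rather than cutting at arbitrary positions.

```python
def split_text(text, chunk_size=1000, chunk_overlap=0):
    """Split text into character windows of chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample_chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(sample_chunks)
```

Overlap repeats the tail of one chunk at the head of the next, which helps when a sentence would otherwise be cut in half at a chunk boundary.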
HuggingFaceEmbeddings relies on the sentence_transformers package, so install it first:
pip install sentence_transformers
Embeddings:
from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings()
Install the FAISS library:
pip install faiss-cpu
Vector Store:
from langchain.vectorstores import FAISS

db = FAISS.from_documents(docs, embeddings)
query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)
print(docs[0].page_content)
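Under the hood, a similarity search embeds the query and ranks the stored vectors by how close they are to it. The toy version below uses word counts as "embeddings" and cosine similarity as the distance measure. Both are assumptions chosen to keep the sketch dependency-free: real embedding models produce dense float vectors, and FAISS uses optimized index structures rather than a linear scan. The names `embed`, `cosine_similarity`, and `TinyVectorStore` are invented for this example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts instead of a learned dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Toy vector store: embeds texts once, then ranks them against each query."""

    def __init__(self, texts):
        self.entries = [(text, embed(text)) for text in texts]

    def similarity_search(self, query, k=1):
        query_vector = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vector, entry[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = TinyVectorStore([
    "The president nominated a judge to the Supreme Court.",
    "Inflation and the economy were discussed at length.",
    "New funding was announced for cancer research.",
])
top_match = store.similarity_search("Who was nominated to the court?")[0]
print(top_match)
```

Word counts only match exact tokens, whereas learned embeddings can also match paraphrases, which is the main reason to use a real embedding model in practice.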
Save and Load:
db.save_local("faiss_index")
new_db = FAISS.load_local("faiss_index", embeddings)
docs = new_db.similarity_search(query)
print(docs[0].page_content)
# Conclusion
LangChain provides a set of building blocks for applications that use generative models and LLMs: model wrappers, prompt templates, chains, agents, memory, document loaders, and indexes. By combining these components, developers can move from a single prompt to applications that work with their own data and tools.
As the framework evolves, more advanced features, such as chat interfaces, are being integrated into agents, broadening the range of applications they can support.
Whether you're creating chatbots, sentiment analysis tools, or any NLP application, LangChain is your ideal partner for unlocking the full potential of your data. With ongoing advancements in Natural Language Processing (NLP) technology, platforms like LangChain will become increasingly vital.
All code examples are available in the GitHub repository.
References:
- LangChain Documentation
- LangChain GitHub
- https://www.allabtai.com/chatgpt-gpt4-system-prompt-engineering-ultimate-guide/