
Memory management in LLMs using LangMem

Rishab Bahal
#Memory#LangMem#langchain#python#LLMs#agents

LangMem Memory Management

GitHub Repository

Overview

This blog walks through a simple implementation of memory management in LLMs using LangMem. The goal is to demonstrate how to use LangMem to manage memory in LLMs and improve their responses. There are two implementations:

  1. Memory Management per user: This demonstrates how to use LangMem to store and retrieve information from memory. Threads that share a user_id access the same memory, while different user_ids each maintain their own. A dynamic `{user_id}` placeholder in the namespace keeps memories separate per user. This implementation can be found in the chat_memory_users.py file.
    from langgraph.prebuilt import create_react_agent
    from langgraph.checkpoint.memory import InMemorySaver
    from langgraph.store.memory import InMemoryStore
    from langmem import create_manage_memory_tool, create_search_memory_tool
    
    checkpointer = InMemorySaver()
    store = InMemoryStore(
        index={
            "dims": 1536,
            "embed": "openai:text-embedding-3-small"
        }
    )
    
    namespace = ("agent_memories", "{user_id}")
    memory_tools = [
        create_manage_memory_tool(namespace),
        create_search_memory_tool(namespace)
    ]
    
    agentObj = create_react_agent("openai:gpt-4o", tools=memory_tools, store=store,
                                  checkpointer=checkpointer)
    
    
    def chat(agent, txt, thread_id, user_id):
        result_state = agent.invoke({"messages": [{"role": "user", "content": txt}]},
                                    config={"configurable": {"thread_id": thread_id, "user_id": user_id}})
        return result_state["messages"][-1].content
  2. Optimized memory: In the previous implementation, the agent decides for every user input whether or not to search memory. Here, the user's last message is used to search memory up front, and the results are passed to the LLM as context via a system message. This makes the agent faster and more efficient when responding to user queries.
    from langgraph.prebuilt import create_react_agent
    from langgraph.checkpoint.memory import InMemorySaver
    from langgraph.store.memory import InMemoryStore
    from langmem import create_manage_memory_tool, create_search_memory_tool
    from langgraph.config import get_store
    
    checkpointer = InMemorySaver()
    store = InMemoryStore(
        index={
            "dims": 1536,
            "embed": "openai:text-embedding-3-small"
        }
    )
    
    namespace = ("agent_memories",)
    memory_tools = [
        create_manage_memory_tool(namespace),
        create_search_memory_tool(namespace)
    ]
    def prompt(state):
        # Search over memories based on the messages
        store_obj = get_store()
        items = store_obj.search(namespace, query=state["messages"][-1].content)
        print("Items")
        print(items)
        memories = "\n\n".join(str(item) for item in items)
        system_msg = {"role": "system", "content": f"## Memories:\n\n{memories}"}
        return [system_msg] + state["messages"]
    
    agentObj = create_react_agent("openai:gpt-4o", prompt=prompt, tools=memory_tools, store=store,
                                  checkpointer=checkpointer)
    
    
    def chat(agent, txt, thread_id):
        result_state = agent.invoke({"messages": [{"role": "user", "content": txt}]},
                                    config={"configurable": {"thread_id": thread_id}})
        return result_state["messages"][-1].content
    
    

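The per-user isolation in the first implementation comes entirely from the `{user_id}` placeholder in the namespace tuple. As a library-free sketch of that idea (plain dicts and naive keyword matching standing in for LangMem's store and semantic search; all names here are illustrative, not part of the actual repo):

```python
class SimpleMemoryStore:
    """Toy stand-in for a namespaced memory store: memories are keyed by
    user_id, so different thread_ids share them but different users do not."""

    def __init__(self):
        self._memories = {}  # user_id -> list of memory strings

    def manage(self, user_id, text):
        # Store a memory under the user's namespace.
        self._memories.setdefault(user_id, []).append(text)

    def search(self, user_id, query):
        # Naive keyword search restricted to the user's namespace.
        words = query.lower().split()
        return [m for m in self._memories.get(user_id, [])
                if any(w in m.lower() for w in words)]


store = SimpleMemoryStore()
store.manage("rishab-bahal", "My name is Rishab; based in Ottawa, Canada")
store.manage("adam-gilchrist", "My name is Adam; from Australia")

print(store.search("rishab-bahal", "name"))     # finds only Rishab's memory
print(store.search("adam-gilchrist", "Ottawa")) # empty: Adam has no such memory
```

LangMem does the same scoping with vector search instead of keyword matching, and fills the `{user_id}` slot from the `configurable` values passed at invoke time.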
How to run the code

  1. Clone the repository
  2. Install the required packages
    pip install -r requirements.txt
  3. Add environment variables with the help of the .env.example file
    cp .env.example .env
  4. Run the code
     python main.py
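Since both implementations use OpenAI models for chat and embeddings, your .env will need an OpenAI key at minimum. A hypothetical fragment (the exact variable names live in .env.example, so treat this as an assumption):

```shell
# Hypothetical .env contents -- check .env.example for the real variable names.
OPENAI_API_KEY=sk-...
```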

Response

Memory Management per user

Choose one of the options:
1. Chat with user-specific memory
2. Chat with optimized memory

Answer: 1
user_id: rishab-bahal
thread_id: 1
You: My name is Rishab. I am born and raised in New Delhi. I am currently based in Ottawa, Canada.
Agent: Great! I have noted that your name is Rishab, you were born and raised in New Delhi, and you are currently based in Ottawa, Canada. If there's anything else you'd like to share or need assistance with, feel free to let me know!
user_id: rishab-bahal
thread_id: 1
You: Also, I am currently working with LLM memories. I don't like Pizza's but I love sushi and chole Bhature.
Agent: I have noted your preferences: you're currently working with LLM memories, you don't like pizza, and you love sushi and chole bhature. Let me know if there's anything else you'd like to add or discuss!
user_id: rishab-bahal
thread_id: 1 #note: same thread_id (Remembers)
You: Do you remember my name?
Agent: Yes, I remember your name is Rishab. If there's anything else you'd like to know or update, just let me know!
user_id: rishab-bahal
thread_id: 2 #note: different thread_id (Remembers)
You: Do you remember my food choices?
Agent: Yes, I remember some of your food choices. You love sushi and chole bhature but do not like pizza. Considering this, sushi might be the better choice for lunch today. Enjoy your meal!

user_id: adam-gilchrist #note: different user_id
thread_id: 3 # make sure to use different thread_id from last user
You: Remember my name?
Agent: I don't have your name saved in my memory. Could you please tell me your name so I can remember it for you?
user_id: adam-gilchrist
thread_id: 3
You: My name is Adam. I am from Australia.
Agent: Got it, Adam! I've noted that you're from Australia. How can I assist you today? # Stores new memory for new user Adam
user_id: adam-gilchrist
thread_id: 4
You: Remember me?
Agent: Yes, I remember you! Your name is Adam. How can I assist you today? # Remembers Adam
user_id: adam-gilchrist
thread_id: 4
You: What is the capital of my country?
Agent: The capital of your country, Australia, is Canberra.
user_id: rishab-bahal
thread_id: 2
You: Do you remember my country?
Agent: Yes, I remember that you were born and raised in New Delhi, and you are currently based in Ottawa, Canada. # Also remembers the last user's data