
Written by asad
Last updated 2 months ago
See It In Action
Here’s what Chat History Memory does for your conversations (screenshots comparing the same prompt with and without MemoryPlugin):

How It Works Behind the Scenes
When chat history context is requested, MemoryPlugin performs intelligent retrieval and summarization:
- Query Understanding - An AI analyzes your query to generate multiple search variations and extract temporal filters (like “last month”)
- Hybrid Search - Runs semantic (meaning-based) and keyword searches in parallel across your chat history
- Reranking - Results are reranked by relevance to your specific query
- Context Expansion - For each match, surrounding messages are fetched for complete context
- Intelligent Summarization - Expanded context is summarized to fit your token budget (e.g., 50,000 tokens of raw conversation might be summarized into 2,000 tokens)
Each inject call typically takes 3-4 seconds to complete all these steps.
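The five steps above can be sketched in miniature. This is an illustrative toy, not MemoryPlugin’s actual code: the function names are invented, keyword matching stands in for real semantic search, and truncation stands in for LLM summarization.

```python
# Toy sketch of the retrieval pipeline described above.
# All names and scoring logic are illustrative assumptions.

def understand_query(query):
    """Step 1: generate search variations (here, trivial rephrasings)."""
    return [query, query.lower(), " ".join(sorted(query.split()))]

def hybrid_search(variations, history):
    """Step 2: keyword matching stands in for semantic + keyword search."""
    hits = []
    for i, msg in enumerate(history):
        score = sum(w in msg.lower() for q in variations for w in q.lower().split())
        if score:
            hits.append((score, i))
    return hits

def rerank(hits):
    """Step 3: order candidate matches by score, best first."""
    return sorted(set(hits), reverse=True)

def expand(hits, history, window=1):
    """Step 4: fetch surrounding messages for each match."""
    return [history[max(0, i - window): i + window + 1] for _, i in hits]

def summarize(chunks, max_chars=200):
    """Step 5: fit a budget (a real system would summarize with an LLM)."""
    return " | ".join(" ".join(c) for c in chunks)[:max_chars]

def inject_context(query, history):
    variations = understand_query(query)
    hits = hybrid_search(variations, history)
    return summarize(expand(rerank(hits), history))

history = [
    "Let's plan the React migration.",
    "We chose React over Vue for the dashboard.",
    "Lunch was great today.",
]
print(inject_context("React dashboard", history))
```

Note how context expansion pulls in neighboring messages even when they don’t match the query themselves; that is why the summarization step is needed to fit everything back into a token budget.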
Browser Extension
The Browser Extension works across all major AI platforms.

Supported Platforms: ChatGPT • Claude • Gemini • Google AI Studio • Grok • Poe • DeepSeek • Qwen • And more…

How it works:
- Performs one round of context retrieval per message
- Can be configured for automatic injection (adds context to every message) or manual (only when you click the MemoryPlugin button)
- Uses default token limits configured in extension settings
Chat History Memory is completely separate from Regular Memory in the browser
extension. They use different systems and buttons.
Remote MCP Server (More Powerful)
The Remote MCP Server offers the most powerful and flexible way to use Chat History Memory.

Supported Platforms: Claude Desktop • Claude Web • Claude Mobile • Mistral AI • Cursor • Continue • Other MCP-compatible clients

Why it’s more powerful:
- Multiple rounds of retrieval - Can fetch context multiple times in a single conversation
- Parallel queries - Send an array of queries that all run simultaneously (e.g., 12 different searches about a topic)
- User control - You tell the AI exactly when and how to fetch context through natural language
- Token control - Specify how many tokens of context to fetch (e.g., “fetch 2000 tokens”)
- Fetch large amounts - Can retrieve 20,000-30,000 tokens of context when needed
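To make the parallel-query and token-control features above concrete, here is what a single MCP tool call might carry. The tool name “inject” comes from this document, but the argument names (`queries`, `max_tokens`) are assumptions, not a documented schema:

```python
import json

# Hypothetical payload for one MCP "inject" tool call.
# Argument names are illustrative, not MemoryPlugin's documented schema.
tool_call = {
    "name": "inject",
    "arguments": {
        # Several angles on one topic, all searched simultaneously
        "queries": [
            "React project architecture decisions",
            "React performance problems we hit",
            "user feedback on the React dashboard",
        ],
        # Token budget for the summarized context ("fetch 2000 tokens")
        "max_tokens": 2000,
    },
}

print(json.dumps(tool_call, indent=2))
```

In practice you would not write this JSON yourself: you tell the AI in natural language what to fetch, and it constructs the tool call.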
Best platforms: Claude (Desktop, Web, or Mobile) and Mistral AI with the
Remote MCP Server provide the best experience. These models excel at agentic
tool use and naturally incorporating retrieved context into responses.
Controlling What Gets Added
Use exclusions to prevent specific chats from being used as context:
- Sensitive conversations you want to keep private
- Off-topic chats that aren’t useful for context
- Outdated discussions no longer relevant
Searching and Analyzing Your History
Beyond adding context to conversations, you can actively search and analyze your chat history:

Search Your History
Perform powerful semantic searches across all your conversations. Find specific discussions by topic, keyword, or timeframe. Go to Dashboard → Search
Ask Questions
Get AI-powered answers synthesized from your entire chat history. Ask natural language questions and receive comprehensive responses with citations. Go to Dashboard → Ask
Tips for Best Results
Be specific in your instructions
When using the MCP server, tell the AI exactly what context you want. Instead of just activating MemoryPlugin, say “fetch context about my React projects” or “get information about my cooking preferences from past conversations.”
Use parallel queries for complex topics
For comprehensive coverage, request multiple parallel queries: “Make 10 inject
queries to fetch context about this topic from different angles.” The AI can
search for technical details, design decisions, user feedback, and more
simultaneously.
Adjust based on your needs
Sometimes you want extensive context (20+ queries, a high token count); sometimes you want none. This flexibility is useful for all kinds of tasks, even creative writing or documentation, where past context helps maintain consistency.
Control timing
You can tell the AI when to fetch context: at the start of every conversation, only when you mention it, for specific types of questions, etc. It’s entirely up to you.
Limitations
Doesn't track evolution of facts
Chat History currently doesn’t understand how facts have changed over time. If you struggled with understanding photosynthesis in March but mastered it by June, the AI might not know which is current. This is something we’re working to improve.
Performance varies by model
Results depend on how well your AI model uses the retrieved context. Some models are better at incorporating external context than others.
Troubleshooting
Context Not Appearing
Solutions:
- Ensure Browser Extension or Remote MCP Server is installed and signed in
- For browser extension: Check that Chat History is enabled in settings
- For MCP: Make sure you’re explicitly telling the AI to use the inject tool
- Verify you have uploaded and processed chat history
Irrelevant Context
Solutions:
- Be more specific in your prompts or instructions to the AI
- Exclude irrelevant chats from the Chats tab
- For MCP: Tell the AI to use fewer tokens or be more specific in its queries
Next Steps
View Dashboard
Search your history, ask questions, and manage your conversations
Browser Extension Setup
Install and configure the Browser Extension
Remote MCP Server Setup
Set up the Remote MCP Server for Claude Desktop and other clients
Back to Introduction
Review what Chat History Memory is and why it matters