Recall Chat History
Search and synthesize context from the user’s past AI conversations. This is the primary recall endpoint — it performs semantic search, then uses AI to synthesize matching results into a coherent summary with source citations. This is the same function powering the recall_chat_history tool in the MCP server.
Supports parallel queries for complex topics that benefit from multiple search angles (e.g., timeline, decisions, people).
Overview
This is the primary endpoint for retrieving context from a user’s past AI conversations. It searches across all imported chat history, then uses AI to synthesize the results into a coherent summary with source citations. This is the same function that powers the recall_chat_history tool in the MCP server.
How it works
- Your query is embedded and searched against all conversation chunks (vector + keyword hybrid search)
- Results are filtered and ranked by relevance
- An AI model synthesizes the top results into a coherent narrative
- Source citations are attached so you can trace back to specific conversations
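The four stages above can be pictured with a toy sketch. This is not the service's implementation: bag-of-words counts stand in for real embeddings, string concatenation stands in for the AI synthesis step, and every name here (`recall`, `conversation_id`, `min_score`, the 50/50 score weighting) is a hypothetical chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (embedding stand-in)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def keyword_overlap(query_tokens, chunk_tokens):
    """Fraction of distinct query tokens that appear in the chunk."""
    q = set(query_tokens)
    return len(q & set(chunk_tokens)) / len(q) if q else 0.0


def recall(query, chunks, top_k=2, min_score=0.1):
    q_tokens = tokenize(query)
    q_vec = Counter(q_tokens)
    scored = []
    for chunk in chunks:
        c_tokens = tokenize(chunk["text"])
        # Step 1: hybrid score, combining vector similarity and keyword match.
        score = 0.5 * cosine(q_vec, Counter(c_tokens)) + 0.5 * keyword_overlap(q_tokens, c_tokens)
        # Step 2: filter out weak matches.
        if score >= min_score:
            scored.append((score, chunk))
    # Step 2 (continued): rank by relevance and keep the top results.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = [chunk for _, chunk in scored[:top_k]]
    # Step 3: placeholder for AI synthesis (here, plain concatenation).
    content = " ".join(chunk["text"] for chunk in top)
    # Step 4: attach source citations.
    return {"content": content, "sources": [chunk["conversation_id"] for chunk in top]}


chunks = [
    {"conversation_id": "conv-1", "text": "We decided on a normalized database schema with UUID keys."},
    {"conversation_id": "conv-2", "text": "Planning a vacation to Lisbon in May."},
]
print(recall("database schema decisions", chunks)["sources"])  # prints ['conv-1']
```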
When to use this vs. Search
| Endpoint | Returns | Best for |
|---|---|---|
| Recall (/inject) | AI-synthesized summary + sources | Building context for AI conversations, getting a coherent answer |
| Search (/search) | Raw matched chunks with scores | Building custom UIs, debugging, fine-grained control |
Parallel queries
For complex topics, use the queries array to search from multiple angles simultaneously. Each query is searched and synthesized independently.
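As a rough illustration, the two request shapes might be built as follows. Only the `queries` array name comes from this page. The single-query field name `query` and the assumption that each entry in `queries` is a plain string are guesses made for this sketch, so check them against the actual request schema.

```python
import json


def single_query_body(query):
    """Body for one search angle. The "query" key is an assumed field name."""
    return {"query": query}


def parallel_query_body(queries):
    """Body for several search angles, using the documented "queries" array.

    Entries are assumed to be plain strings.
    """
    return {"queries": list(queries)}


single = single_query_body("What decisions did I make about the database schema?")

# One query per angle: timeline, decisions, people.
parallel = parallel_query_body([
    "Timeline of the database schema work",
    "Decisions made about the database schema",
    "People involved in the database schema discussions",
])

print(json.dumps(single, indent=2))
print(json.dumps(parallel, indent=2))
```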
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
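A minimal sketch of attaching the bearer token with Python's standard library. The base URL `https://api.example.com`, the POST method, the JSON content type, and the `query` field are all assumptions made for illustration. Only the `/inject` path and the `Bearer <token>` header form come from this page. The request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, replace with the real one


def build_recall_request(token, body):
    """Build (but do not send) a request carrying the Bearer authorization header."""
    return urllib.request.Request(
        url=f"{BASE_URL}/inject",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed: JSON request body
        },
        method="POST",  # assumed HTTP method
    )


request = build_recall_request("YOUR_AUTH_TOKEN", {"query": "database schema decisions"})
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Passing the returned object to `urllib.request.urlopen` would send it. That step is left out here because the host is a placeholder.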
Body
Natural-language description of what you're looking for in the user's chat history.
Example: "What decisions did I make about the database schema?"
Maximum tokens to allocate for the synthesized context (at most 2000). More tokens give richer detail but a larger response.
Optional hint about the downstream chat platform to influence formatting. Allowed values: claude, chatgpt, typingmind.
Short plaintext summary of the current conversation to ground retrieval.
Recent dialogue turns to help ground the search.
Array of parallel queries for complex topics (max 15). Each query is searched and synthesized independently. Use when a topic benefits from multiple search angles.
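The documented limits can be checked on the client before sending a request. The sketch below enforces only the three constraints stated on this page: a token budget of at most 2000, the three platform values, and at most 15 parallel queries. The field names `max_tokens` and `platform` are hypothetical, and `queries` is the only name taken from this page.

```python
MAX_TOKENS_LIMIT = 2000
MAX_PARALLEL_QUERIES = 15
ALLOWED_PLATFORMS = {"claude", "chatgpt", "typingmind"}


def validate_body(body):
    """Return a list of problems found in a request body (empty if none).

    "max_tokens" and "platform" are assumed field names.
    """
    problems = []

    max_tokens = body.get("max_tokens")
    if max_tokens is not None and max_tokens > MAX_TOKENS_LIMIT:
        problems.append(f"max_tokens must be <= {MAX_TOKENS_LIMIT}")

    platform = body.get("platform")
    if platform is not None and platform not in ALLOWED_PLATFORMS:
        problems.append(f"platform must be one of {sorted(ALLOWED_PLATFORMS)}")

    queries = body.get("queries")
    if queries is not None and len(queries) > MAX_PARALLEL_QUERIES:
        problems.append(f"queries accepts at most {MAX_PARALLEL_QUERIES} entries")

    return problems


print(validate_body({"queries": ["q"] * 16, "max_tokens": 2500}))
```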
Response
Synthesized context from chat history
AI-synthesized summary of relevant chat history. Ready to inject into a conversation as context.
Source conversations that contributed to the synthesized content.
Actual token count of the synthesized content.
Total processing time (e.g., '342ms').
AI model used for synthesis.
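A hedged sketch of reading these five response fields into a typed structure. The JSON keys used here (`content`, `sources`, `token_count`, `processing_time`, `model`) are hypothetical, since the field names are not given on this page. The only detail taken from the page is the `'342ms'` format of the processing time, and the sample model name is a placeholder.

```python
import re
from dataclasses import dataclass


@dataclass
class RecallResult:
    content: str        # AI-synthesized summary, ready to inject as context
    sources: list       # source conversations behind the summary
    token_count: int    # actual token count of the synthesized content
    processing_ms: int  # total processing time in milliseconds
    model: str          # AI model used for synthesis


def parse_processing_time(value):
    """Convert a string such as '342ms' to an integer number of milliseconds."""
    match = re.fullmatch(r"(\d+)ms", value.strip())
    if match is None:
        raise ValueError(f"unexpected processing time format: {value!r}")
    return int(match.group(1))


def parse_recall_response(payload):
    """Map a decoded JSON response onto RecallResult (key names are assumed)."""
    return RecallResult(
        content=payload["content"],
        sources=list(payload.get("sources", [])),
        token_count=int(payload["token_count"]),
        processing_ms=parse_processing_time(payload["processing_time"]),
        model=payload["model"],
    )


sample = {
    "content": "You chose a normalized schema with UUID keys.",
    "sources": [{"conversation_id": "conv-1"}],
    "token_count": 12,
    "processing_time": "342ms",
    "model": "example-model",
}
print(parse_recall_response(sample).processing_ms)  # prints 342
```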