Building AI Agents with Volary
Volary is a system of persistent memory that can integrate with any AI agent. Route your LLM requests through Volary and it automatically learns from past conversations, injecting relevant knowledge into future sessions.
How it works
- 1Proxy requests through Volary– point your agent at a Volary agent endpoint instead of directly at OpenAI or Anthropic. Volary forwards the request to your configured backend.
- 2Reflections are extracted– after successful conversations, Volary analyses the exchange and extracts factual knowledge (where to find things) and procedural knowledge (how to avoid mistakes).
- 3Knowledge is injected– on future requests, Volary retrieves relevant reflections via vector similarity search and injects them as context before forwarding to your LLM provider.
Setup
To start using Volary with your agent, sign up in the Volary dashboard and follow the setup steps to create an “API Provider” agent.
Once you're done, you can configure your agent with its URL and access token to communicate through Volary as you would with your normal LLM backend.
Integration examples
These agents show common agent SDKs connected to Volary, with the Volary agent backend acting as a proxy, and the MCP tools connected to it to give the agent full access to its memories as it works through its tasks.
import asyncio
from agents import Agent, AsyncOpenAI, OpenAIChatCompletionsModel, Runner, set_tracing_disabled
from agents.mcp import MCPServerStreamableHttp
set_tracing_disabled(True) # don't upload traces to the backend
# Route the model through your Volary agent.
client = AsyncOpenAI(base_url="https://dev-api.volary.ai/v0/orgs/{ORG_ID}/agents/{AGENT_ID}/v1", api_key="{YOUR_TOKEN}")
model = OpenAIChatCompletionsModel(model="gpt-4o", openai_client=client)
async def main():
async with MCPServerStreamableHttp(
name="volary",
params={
"url": "https://dev-api.volary.ai/v0/orgs/{ORG_ID}/agents/{AGENT_ID}/v0/mcp",
"headers": {"Authorization": "Bearer {YOUR_TOKEN}"},
},
) as volary:
agent = Agent(
name="Assistant",
instructions="Use your Volary memory tools as the task evolves.",
model=model,
mcp_servers=[volary],
)
result = await Runner.run(agent, "Hello!")
print(result.final_output)
asyncio.run(main())Replace {ORG_ID}, {AGENT_ID}, and {YOUR_TOKEN} with the values from your dashboard. Keep the MCP server named volary as shown – once Volary detects the memory tools, it injects your agent's root memory index on the first turn and reminds the model to consult its reflections as the task evolves.
Key endpoints
Each agent exposes API-compatible endpoints that you can use as drop-in replacements:
| Endpoint | Description |
|---|---|
POST /v0/orgs/{org}/agents/{agent}/v1/chat/completions | OpenAI-compatible chat completions with reflection injection |
POST /v0/orgs/{org}/agents/{agent}/v1/messages | Anthropic-compatible messages with reflection injection |
POST /v0/orgs/{org}/agents/{agent}/v1/responses | OpenAI Responses API with stateful conversation chains |
GET /v0/orgs/{org}/agents/{agent}/v0/mcp | Act as a Model Context Protocol server for your agent. |
See the API Reference for complete endpoint documentation.
