A few months ago I wrote about InvestPal — an AI-powered investment advisor built on top of the Model Context Protocol. The core idea was simple: wrap financial data sources in MCP tools and let an LLM agent use them to answer investment questions with real market data.
Since then the system has grown in ways that are worth writing about separately. This post covers three significant engineering additions:
- Brokerage integrations via two new MCP servers (Alpaca Markets and Coinbase)
- A dual-agent architecture where a second agent manages long-term user memory in the background
- An agent-driven reminder system with proactive Telegram delivery
Updated Architecture
The original system had three services: MarketDataMcpServer (Go), InvestPal (Python/FastAPI), and InvestPalTelegramBot. The ecosystem now has five:
```
┌──────────────┐          ┌──────────────────┐
│ Telegram Bot │          │ REST API clients │
└──────┬───────┘          └────────┬─────────┘
       │ REST API                  │ REST API
┌──────▼───────────────────────────▼─────────────────────────────┐
│                    InvestPal Core (FastAPI)                    │
│    Session Manager | Chat Router | User Context & Reminders    │
│    REST API (port 8000)                 MCP App (port 9000)    │
└──────┬──────────────────┬────────────────┬─────────────────────┘
       │ MCP              │ MCP            │ MCP
┌──────▼──────┐  ┌────────▼────────┐  ┌────▼───────────┐
│ MarketData  │  │ AlpacaMcp       │  │ CoinbaseMcp    │
│ MCP Server  │  │ Server          │  │ Server         │
│ (Go)        │  │ (Python)        │  │ (Python)       │
│ port 8080   │  │ port 9091       │  │ port 9090      │
└──────▲──────┘  └────────▲────────┘  └────▲───────────┘
       │ MCP              │ MCP            │ MCP
       ├──────────────────┼────────────────┤
       │         ┌────────▼────────┐       │
       └─────────┤ Claude Desktop  ├───────┘
                 │ (MCP Client)    │
                 └─────────────────┘
```
Claude Desktop connects directly to all four MCP servers (InvestPal MCP, MarketData, Alpaca, Coinbase) as a native MCP client — it does not go through the REST API. The Telegram Bot and REST API clients connect through InvestPal Core, which orchestrates the MCP servers on their behalf.
The InvestPal core service also exposes its own internal MCP app on port 9000, which provides tools for user context management and reminders — tools only the agents themselves use.
Brokerage MCP Servers
Why MCP for brokerages?
The same reason it works well for market data: the agent gets a clean, tool-based interface with no brokerage-specific logic leaking into the orchestration layer. Adding Alpaca support meant writing a standalone MCP server; the core agent service picked it up automatically when configured.
AlpacaMcpServer
Built with FastMCP (Python), the Alpaca server exposes six tools:
- getAlpacaAssets — search tradeable assets by name or symbol
- getAlpacaAccountInformation — portfolio value, cash, account status
- getAlpacaOpenPositions — current holdings with P&L
- getAlpacaOrders / getAlpacaOrderById — order history
- createAlpacaOrder — place a market order (disabled in read-only mode)
The Coinbase server (CoinbaseMcpServer) follows the same pattern for crypto portfolios and spot orders.
Stateless credential design
This is the most interesting engineering decision in both servers: they never store credentials. Instead, credentials are extracted from the HTTP request headers on every tool call:
```python
# AlpacaMcpServer/mcp_app/dependencies.py
from fastmcp.dependencies import CurrentHeaders

def get_alpaca_client(headers: dict = CurrentHeaders()) -> AlpacaClient:
    api_key = headers.get(settings.alpaca_api_key_header.lower())
    api_secret = headers.get(settings.alpaca_api_secret_header.lower())
    if not api_key or not api_secret:
        raise ValueError("Missing Alpaca API key or secret headers")
    return AlpacaRestClient(
        base_url=settings.alpaca_api_base_url,
        api_key=api_key,
        api_secret=api_secret,
    )
```
The client (InvestPalTelegramBot) forwards the user’s brokerage credentials as headers on every request to InvestPal, which in turn passes them through when calling the brokerage MCP servers. No credentials ever land in a database. The servers become naturally multi-tenant: the same running instance can serve multiple users, since all state is in the request.
Portfolio-aware advice
With live brokerage data available as tools, the agent’s system prompt now instructs it to cross-reference actual positions before giving advice:
When the user asks about a stock or asset, always cross-reference their actual positions first. For example — if they ask “should I buy more NVDA?”, check whether they already hold it, what their current allocation looks like, and how adding more would affect concentration and risk.
The agent decides at runtime whether Alpaca or Coinbase tools are available — both are optional. If neither MCP server is configured, the agent proceeds without them.
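A sketch of how that optionality can be implemented, assuming the environment variable names from the deployment config; the helper itself is illustrative, not InvestPal's actual code:

```python
import os

# Each brokerage MCP server is optional: only servers whose URL is set
# get wired into the agent's tool set.
OPTIONAL_MCP_SERVERS = {
    "alpaca": "ALPACA_MCP_SERVER_URL",
    "coinbase": "COINBASE_MCP_SERVER_URL",
}

def configured_mcp_servers(env=os.environ) -> dict[str, str]:
    """Return name -> URL for every optional MCP server that is configured."""
    return {
        name: env[var]
        for name, var in OPTIONAL_MCP_SERVERS.items()
        if env.get(var)
    }
```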
A Dual-Agent Architecture for Memory
The problem
LLM context windows are finite. Stuffing every prior conversation into the prompt doesn’t scale, and it degrades response quality. The solution in InvestPal 2.0 is a dedicated memory agent that runs alongside the main investment advisor.
Two agents, two responsibilities
InvestmentManagerAgent focuses entirely on answering the user. It has access to market data tools, brokerage tools, and reminder tools.
UserContextMemoryManagerAgent is a separate agent with a completely different system prompt and a different (smaller) tool set — only context and notes tools. Its job is to read the conversation that just happened and decide what’s worth persisting:
- Stable profile facts (risk tolerance, goals, age) → updateUserContext
- Session-specific notes (topics discussed, decisions taken, follow-ups) → updateUserConversationNotes
The memory manager’s prompt is explicit about when not to write:
Only update when there is genuinely useful new information — information that the investment manager would find valuable to provide personalized answers and recommendations. Do not update if the conversation contains nothing new.
Background execution
The key implementation detail is that the memory manager runs as a background task — it does not block the response to the user:
```python
# InvestPal/services/agent_service.py
async def generate_agent_text_response(self, user_id: str, conversation: list[Message]) -> str:
    user_context = await self._user_context_service.get_user_context(user_id)

    # Investment manager responds synchronously
    agent_response = await self._investment_manager_agent.generate_response(
        conversation=conversation,
        runtime_context=InvestmentManagerRuntimeContext(...),
        system_prompt_placeholder_values=InvestmentManagerPromptVars(
            client_profile=user_context.model_dump(),
        ),
    )

    # Memory manager runs in the background — does not block the response
    asyncio.create_task(
        self._update_context_memory_safely(user_id=user_id, conversation=conversation)
    )

    return agent_response.response
```
Exceptions in the background task are caught and logged — a background failure never surfaces as a user-facing error.
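The post does not show `_update_context_memory_safely` itself, but the contract it describes (catch everything, log, never propagate) can be sketched like this; the inner agent call is a hypothetical stand-in:

```python
import asyncio
import logging

logger = logging.getLogger("investpal")

async def run_memory_manager(user_id: str, conversation: list) -> None:
    # Hypothetical stand-in for the memory manager agent; it always fails
    # here so the wrapper's guarantee is visible.
    raise RuntimeError("simulated LLM failure")

async def update_context_memory_safely(user_id: str, conversation: list) -> None:
    """Background wrapper: failures are logged, never raised to the caller."""
    try:
        await run_memory_manager(user_id, conversation)
    except Exception:
        logger.exception("context memory update failed for user %s", user_id)

# Even though the inner call raises, awaiting the wrapper raises nothing,
# so a task created with asyncio.create_task(...) can never surface a
# user-facing error.
asyncio.run(update_context_memory_safely("user-1", []))
```

One subtlety worth knowing: a bare `asyncio.create_task` without this wrapper would let the exception sit on the task object and be reported only at garbage collection, which is why the catch-all lives inside the coroutine itself.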
Session initialization
At the start of every session, the investment manager calls three tools in parallel before responding to the user:
- getUserContext — load the client profile
- getUserConversationNotes — recall key insights from prior sessions
- getAgentReminders — surface any pending reminders
Notes are stored by date (keyed YYYY-MM-DD in MongoDB), so the agent retrieves only recent, relevant context rather than the full history. The user never sees any of this loading — the prompt explicitly forbids the agent from mentioning it.
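Because notes are keyed by day, "recent context" reduces to generating the last few date keys and fetching only those documents. A sketch of the idea, where the window size and helper name are illustrative assumptions:

```python
from datetime import date, timedelta

def recent_note_keys(today: date, days: int = 7) -> list[str]:
    """YYYY-MM-DD keys covering the last `days` days, newest first.

    These keys can be used directly in a MongoDB query against the
    date-keyed notes collection, skipping anything older.
    """
    return [(today - timedelta(days=i)).isoformat() for i in range(days)]
```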
Agent Reminders
The idea
A good financial advisor doesn’t just answer questions in isolation — they track follow-ups. If you mention that you want to revisit your bond allocation next week, they note it. InvestPal 2.0 adds a tool-driven reminder system that works the same way.
The investment manager creates reminders proactively whenever the user mentions anything time-sensitive. It does not ask for permission when the intent is clear.
Data model
Each reminder is a simple document in MongoDB:
```python
class AgentReminderMongoDoc(BaseModel):
    user_id: str
    reminder_id: str            # UUID
    reminder_description: str
    created_at: str             # ISO datetime
    due_date: str | None        # YYYY-MM-DD, optional
```
The agent has full CRUD: createAgentReminder, getAgentReminders, updateAgentReminder, deleteAgentReminder.
Proactive delivery via Telegram
Reminders are delivered via a standalone workflow script that can be scheduled independently (e.g., via cron):
```python
# InvestPalTelegramBot/workflows/agent_reminders.py
async def main():
    bot_service = BotService()
    await bot_service.send_reminders_if_due()
```
The send_reminders_if_due method checks whether any reminders exist for the user. If they do, it triggers a regular conversation turn — "Hello, please give me an update on my reminders" — which causes the investment manager to load and reason about the reminders naturally, then send the response as an unsolicited Telegram message.
This keeps the delivery mechanism simple and reuses the existing agent response pipeline rather than building a separate notification formatter.
Deploying on Railway
The full stack runs on Railway as a single project containing five services plus a managed MongoDB instance. A few engineering decisions shaped how the deployment is structured.
Private networking is mandatory, not optional
Because brokerage credentials travel as plain HTTP headers between services, none of the inter-service traffic should touch the public internet. Railway’s private network (*.railway.internal) handles this: services within the same project reach each other by hostname without ever leaving Railway’s infrastructure.
# InvestPal environment variables (private URLs)
MARKET_DATA_MCP_SERVER_URL=http://market-data-mcp-server.railway.internal:8080
ALPACA_MCP_SERVER_URL=http://alpaca-mcp-server.railway.internal:9091
COINBASE_MCP_SERVER_URL=http://coinbase-mcp-server.railway.internal:9090
# InvestPalTelegramBot reaches InvestPal the same way
INVESTPAL_BACKEND_URL=http://investpal.railway.internal:<PORT>
Service names in Railway are not just labels — they become the private hostname. market-data-mcp-server is reachable at market-data-mcp-server.railway.internal. This means the names assigned at deploy time are load-bearing configuration.
InvestPal has no authentication
InvestPal’s REST API has no auth layer. It is designed to be an internal service — its threat model assumes it is only reachable by trusted clients on the same private network. The only service that should have a public domain is InvestPalTelegramBot, because Telegram needs a reachable HTTPS URL to deliver webhook events.
If you expose InvestPal’s REST API publicly (e.g., to connect a web frontend), you need to add authentication at that point. For personal use where the Telegram bot is the only client, keeping InvestPal internal is the right default.
Two different models for two different agents
The deployment config reflects a deliberate cost/capability tradeoff. The investment manager — the user-facing agent — runs on a capable model. The memory manager runs on a cheaper, faster one:
INVESTMENT_MANAGER_LLM_MODEL=claude-sonnet-4-6
USER_CONTEXT_MEMORY_MANAGER_LLM_MODEL=claude-haiku-4-5
The memory manager’s job (read conversation, extract facts, write structured updates) is straightforward enough that a smaller model handles it well. Running it on the same model as the investment manager would roughly double the LLM cost per conversation turn for no meaningful gain.
Ephemeral cache
MarketDataMcpServer uses an in-memory Badger cache. It does not persist across restarts or redeploys — cache misses hit Alpha Vantage and CoinGecko, which are rate-limited APIs. This is worth knowing before triggering a redeploy during market hours if your API quota is tight.
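The behavior is easy to picture with a toy TTL cache: like the Badger store, it lives in process memory, so a restart empties it and the next lookups fall through to the rate-limited upstream APIs. This is an illustrative Python sketch, not the server's Go implementation:

```python
import time

class TTLCache:
    """Toy in-memory cache with per-entry expiry (process-local only)."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None  # miss: the caller must hit the upstream API
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # expired entries also count as misses
            return None
        return value

    def set(self, key: str, value: object) -> None:
        self._store[key] = (time.monotonic() + self._ttl, value)
```

A redeploy is equivalent to constructing a fresh `TTLCache`: every key misses until the upstream APIs have been queried again, which is exactly why quota matters right after a restart.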
What’s Next
The brokerage integrations currently support market orders. Limit orders and more sophisticated order types are a natural next step. On the memory side, the date-keyed notes model works well for recent context, but retrieving relevant notes from months ago would benefit from a semantic search layer rather than date filtering.
Repository Links
Orestis Stefanou
Machine Learning Engineer, currently working at Plum Fintech