Harper's AI Stack: Models API, Agent Loop, and Built-in Agent

Harper 5.1 ships three related AI capabilities that are worth describing together because they build on each other: a provider-agnostic Models API, a built-in agent loop on scope.models.generate(), and an opt-in Harper Agent component. Each can be used independently, but the full picture is an application platform where AI is a first-class runtime concern rather than an external service you bolt on.

The Models API

scope.models is a provider-agnostic LLM interface available in Harper resource handlers and component code. You configure providers in YAML and call them by logical name:

models:
  main:
    provider: anthropic
    model: claude-sonnet-4-6
  embedder:
    provider: openai
    model: text-embedding-3-small
  local:
    provider: ollama
    model: llama3

In a resource handler:

export class ArticleSummary extends Resource {
  async get(query) {
    const article = await Article.get(query.id);
    const result = await scope.models.generate({
      model: 'main',
      messages: [{ role: 'user', content: `Summarize: ${article.body}` }],
      maxTokens: 256
    });
    return { summary: result.content };
  }
}

Supported providers in 5.1: OpenAI, Anthropic, AWS Bedrock (via the AWS SDK, an optional peer dependency), and Ollama. All four share the same generate / generateStream / embed interface, so switching providers is a config change, not a code change.
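
To make the config-not-code point concrete, here is a minimal sketch of a helper that depends only on a logical model name. The models argument stands in for scope.models; fakeModels is a hypothetical stub so the sketch runs outside Harper, and the { content } result shape simply mirrors the handler example above.

```javascript
// Hypothetical stand-in for scope.models so this sketch runs outside Harper.
// Only the generate({ model, messages, maxTokens }) call shape and the
// result.content field are taken from the handler example above.
const fakeModels = {
  async generate({ model, messages }) {
    const last = messages[messages.length - 1];
    return { content: `[${model}] ${last.content.slice(0, 40)}` };
  },
};

// The helper knows only the logical name ('main', 'local', ...). Which
// provider backs that name is decided in the YAML config, not here.
async function summarize(models, modelName, text) {
  const result = await models.generate({
    model: modelName,
    messages: [{ role: 'user', content: `Summarize: ${text}` }],
    maxTokens: 256,
  });
  return result.content;
}

// Same code path, different logical model.
summarize(fakeModels, 'main', 'Harper 5.1 release notes').then(console.log);
summarize(fakeModels, 'local', 'Harper 5.1 release notes').then(console.log);
```

Inside a real handler you would pass scope.models instead of the stub; nothing else in summarize would need to change when main is pointed at a different provider.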

Model usage is recorded in Harper's analytics system, so embed and generate call counts and token volumes show up in the standard analytics tables alongside your other operational metrics.

toolMode: 'auto' — the agent loop

Previously scope.models.generate() was single-shot. In 5.1, passing toolMode: 'auto' enables a built-in agentic loop: the model receives tools, the loop dispatches tool calls, appends results to the conversation, and continues until the model stops requesting tools or a budget is exhausted.

const result = await scope.models.generate({
  model: 'main',
  messages,
  toolMode: 'auto',
  tools: myTools,
  maxToolIterations: 10,
  maxCostUsd: 0.50,
  includeToolTrace: true
});

if (result.trace) {
  // Array of { tool, input, output } for each call in the loop
  console.log(result.trace);
}

Budget parameters act as hard stops:

maxToolIterations — maximum number of tool-call rounds
maxToolTokens — cumulative token budget across the loop (distinct from per-call maxTokens)
maxCostUsd — estimated cost ceiling

When a budget is exceeded, generate throws a BudgetExceededError that includes the partial tool trace — useful for logging what the model was doing when it ran out of budget.
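
The control flow described in this section can be sketched in plain JavaScript. This is not Harper's implementation, only an illustration under stated assumptions: callModel, the toolCalls field, the tools map, and the local BudgetExceededError class with its trace property are all hypothetical names chosen for this example.

```javascript
// Local stand-in for illustration; the `trace` property name is an assumption.
class BudgetExceededError extends Error {
  constructor(message, trace) {
    super(message);
    this.name = 'BudgetExceededError';
    this.trace = trace;
  }
}

// callModel(conversation) is a hypothetical function that resolves to
// { content, toolCalls: [{ tool, input }] }. tools maps names to async functions.
async function runToolLoop({ callModel, tools, messages, maxToolIterations }) {
  const trace = [];
  const conversation = [...messages];
  for (let round = 0; ; round++) {
    const reply = await callModel(conversation);
    const calls = reply.toolCalls ?? [];
    // The model stopped requesting tools: the loop is finished.
    if (calls.length === 0) return { content: reply.content, trace };
    // Hard stop: refuse to start another tool round once the budget is spent.
    if (round >= maxToolIterations) {
      throw new BudgetExceededError('maxToolIterations exceeded', trace);
    }
    conversation.push({ role: 'assistant', content: reply.content ?? '' });
    for (const { tool, input } of calls) {
      const output = await tools[tool](input);
      trace.push({ tool, input, output });
      // Append the result so the next model call can see it.
      conversation.push({ role: 'tool', content: JSON.stringify({ tool, output }) });
    }
  }
}

// Stub model: requests one lookup, then answers once a tool result is present.
const stubModel = async (conversation) => {
  const toolMsg = conversation.find((m) => m.role === 'tool');
  return toolMsg
    ? { content: `Done: ${toolMsg.content}`, toolCalls: [] }
    : { content: '', toolCalls: [{ tool: 'lookup', input: { id: 7 } }] };
};
const stubTools = { lookup: async ({ id }) => ({ id, title: 'Example' }) };

runToolLoop({
  callModel: stubModel,
  tools: stubTools,
  messages: [{ role: 'user', content: 'Find item 7' }],
  maxToolIterations: 10,
})
  .then((result) => console.log(result.content, result.trace))
  .catch((err) => {
    // On a budget stop, the partial trace shows what the model was doing.
    if (err instanceof BudgetExceededError) console.error(err.message, err.trace);
    else console.error(err);
  });
```

The same catch pattern is what you would likely wrap around a real generate call: log the partial trace on a budget stop and let other errors propagate as usual.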

toolParallelism: 'parallel' lets the loop dispatch independent tool calls concurrently within each iteration. For tool-heavy prompts where calls don't depend on each other, an iteration then takes roughly as long as its slowest call rather than the sum of all of them, which reduces end-to-end latency.
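
As a rough illustration of why concurrent dispatch helps, the sketch below runs one iteration's tool calls either one at a time or all at once with Promise.all. The dispatchToolCalls function, the 'sequential' mode name, and the demo tools are hypothetical; only the 'parallel' value comes from the option described above.

```javascript
// Hypothetical dispatcher for a single loop iteration. Results keep the
// order of the requested calls in both modes.
async function dispatchToolCalls(calls, tools, parallelism = 'sequential') {
  const run = async ({ tool, input }) => ({
    tool,
    input,
    output: await tools[tool](input),
  });
  if (parallelism === 'parallel') {
    // Start every call immediately, then wait for all of them.
    return Promise.all(calls.map(run));
  }
  const results = [];
  for (const call of calls) results.push(await run(call));
  return results;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Two independent, I/O-like tools that each take about 50 ms.
const demoTools = {
  fetchOrders: async ({ userId }) => { await sleep(50); return { userId, orders: 3 }; },
  fetchProfile: async ({ userId }) => { await sleep(50); return { userId, name: 'Ada' }; },
};
const demoCalls = [
  { tool: 'fetchOrders', input: { userId: 1 } },
  { tool: 'fetchProfile', input: { userId: 1 } },
];

// Measures wall-clock time for one iteration in the given mode.
async function timeIt(mode) {
  const start = Date.now();
  await dispatchToolCalls(demoCalls, demoTools, mode);
  return Date.now() - start;
}
```

With these two 50 ms tools, timeIt('sequential') should come out near the sum of the delays and timeIt('parallel') near a single delay. The gain disappears when one call needs another's output, which is why the option only makes sense for independent calls.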

The built-in Harper Agent

For cases where you want a general-purpose agent without writing your own loop or tool dispatch, 5.1 ships a built-in harper-agent component. It's disabled by default:

agent:
  enabled: true
  autoApprove: false # require explicit approval for destructive tools

Once enabled, six operations are available:

agent_prompt — submit a prompt and receive a session ID
get_agent_session — fetch the full transcript for a session
list_agent_sessions — list recent sessions
cancel_agent_run — stop a running session
approve_agent_action — approve or deny a pending destructive tool call
set_agent_config — update runtime config
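
A small sketch of how a client might build requests for these operations, assuming they are exposed through Harper's operations API in its usual shape of a single JSON object with an operation field. The agentOperation helper and the parameter names (prompt, session_id) are hypothetical and not confirmed by this post; check the 5.1 reference for the real fields.

```javascript
// The six operation names listed above.
const AGENT_OPERATIONS = new Set([
  'agent_prompt',
  'get_agent_session',
  'list_agent_sessions',
  'cancel_agent_run',
  'approve_agent_action',
  'set_agent_config',
]);

// Builds a request body: { operation, ...params }.
// Parameter names passed in by callers here are illustrative assumptions.
function agentOperation(operation, params = {}) {
  if (!AGENT_OPERATIONS.has(operation)) {
    throw new Error(`Unknown agent operation: ${operation}`);
  }
  // Spread params first so a stray `operation` key cannot override the name.
  return { ...params, operation };
}

// Hypothetical lifecycle: submit a prompt, then fetch its transcript.
const submit = agentOperation('agent_prompt', { prompt: 'Tail the error log' });
const fetchSession = agentOperation('get_agent_session', { session_id: 'example-id' });

console.log(JSON.stringify(submit));
console.log(JSON.stringify(fetchSession));
```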

Transcripts are persisted in system.hdb_agent_session, so sessions survive restarts and are queryable like any other Harper table.

The built-in tools available to the agent include filesystem access (scoped to componentsRoot, logDir, and configDir), schedule_followup, and http_fetch. Destructive filesystem operations require explicit approval via approve_agent_action when autoApprove: false, which is the default. When a destructive tool call is pending, the agent loop halts and waits.
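
The halt-and-wait behavior can be pictured as a promise that stays unresolved until an operator decides. The sketch below is an illustration, not Harper's code: ApprovalGate, runTool, the destructive flag, and the delete_file tool name are all hypothetical.

```javascript
// Hypothetical approval gate: destructive calls park on a promise until an
// operator approves or denies them; everything else proceeds immediately.
class ApprovalGate {
  constructor({ autoApprove = false } = {}) {
    this.autoApprove = autoApprove;
    this.pending = new Map(); // actionId -> resolver function
    this.nextId = 1;
  }

  // Returns { actionId, decision }; decision resolves to true or false.
  request(call) {
    if (!call.destructive || this.autoApprove) {
      return { actionId: null, decision: Promise.resolve(true) };
    }
    const actionId = `action-${this.nextId++}`;
    const decision = new Promise((resolve) => this.pending.set(actionId, resolve));
    return { actionId, decision };
  }

  // Plays the role of approve_agent_action in this sketch.
  resolve(actionId, approved) {
    const resolver = this.pending.get(actionId);
    if (!resolver) throw new Error(`No pending action ${actionId}`);
    this.pending.delete(actionId);
    resolver(approved);
  }
}

// Runs one tool call, waiting at the gate first if approval is required.
async function runTool(gate, call, execute) {
  const { decision } = gate.request(call);
  if (!(await decision)) return { tool: call.tool, skipped: 'denied' };
  return { tool: call.tool, output: await execute(call.input) };
}

const gate = new ApprovalGate({ autoApprove: false });
const pendingRun = runTool(
  gate,
  { tool: 'delete_file', destructive: true, input: 'old.log' },
  async (path) => `deleted ${path}`,
);

// Nothing executes until someone approves the parked action.
const [actionId] = gate.pending.keys();
gate.resolve(actionId, true);
pendingRun.then(console.log);
```

Setting autoApprove to true in this sketch skips the parking step entirely, which is the trade-off the config flag expresses: fewer interruptions in exchange for no human check on destructive calls.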

What this is and isn't for

The built-in agent is primarily useful as an operator tool: deploying components, reading logs, diagnosing issues, adjusting configuration. It runs with operator-level access, so it should not be exposed directly to end users without careful consideration of the trust boundary.

For application-facing AI features — summarization, classification, RAG — the scope.models API and toolMode: 'auto' are the right primitives. You write the resource handler and control exactly what tools are available.

The two paths are intended to converge: the agent component will eventually use the same MCP tool registry as the server-side MCP implementation, so any tools you expose over MCP will also be available to the built-in agent. That integration is ongoing work.
