Harper 5.1 ships three related AI capabilities that are worth describing together because they build on each other: a provider-agnostic Models API, a built-in agent loop on scope.models.generate(), and an opt-in Harper Agent component. Each can be used independently, but the full picture is an application platform where AI is a first-class runtime concern rather than an external service you bolt on.
The Models API
scope.models is a provider-agnostic LLM interface available in Harper resource handlers and component code. You configure providers in YAML and call them by logical name:
models:
  main:
    provider: anthropic
    model: claude-sonnet-4-6
  embedder:
    provider: openai
    model: text-embedding-3-small
  local:
    provider: ollama
    model: llama3
In a resource handler:
export class ArticleSummary extends Resource {
  async get(query) {
    const article = await Article.get(query.id);
    const result = await scope.models.generate({
      model: 'main',
      messages: [{ role: 'user', content: `Summarize: ${article.body}` }],
      maxTokens: 256
    });
    return { summary: result.content };
  }
}
Supported providers in 5.1: OpenAI, Anthropic, AWS Bedrock (via AWS SDK as an optional peer dependency), and Ollama. All four share the same generate / generateStream / embed interface — switching providers is a config change, not a code change.
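The idea that a provider switch stays in config can be pictured with a small stand-in. This is a sketch only: `fakeProviders`, `createModels`, and `summarize` are hypothetical names invented for the illustration, and nothing here reflects how Harper resolves providers internally. The point is that application code mentions only the logical name 'main'.

```javascript
// Illustrative only: stand-in provider functions, not real provider clients.
const fakeProviders = {
  anthropic: async ({ model, messages }) => ({
    content: `anthropic:${model} saw ${messages.length} message(s)`,
  }),
  ollama: async ({ model, messages }) => ({
    content: `ollama:${model} saw ${messages.length} message(s)`,
  }),
};

// Hypothetical factory: maps a logical name to a provider and model from config.
function createModels(config) {
  return {
    async generate({ model: logicalName, ...request }) {
      const entry = config[logicalName];
      if (!entry) throw new Error(`unknown model "${logicalName}"`);
      return fakeProviders[entry.provider]({ ...request, model: entry.model });
    },
  };
}

// Application code: refers to 'main' and never to a provider.
async function summarize(models, text) {
  const result = await models.generate({
    model: 'main',
    messages: [{ role: 'user', content: `Summarize: ${text}` }],
  });
  return result.content;
}

// Same call, two configs. Only the config object differs.
const hosted = createModels({ main: { provider: 'anthropic', model: 'claude-sonnet-4-6' } });
const local = createModels({ main: { provider: 'ollama', model: 'llama3' } });
summarize(hosted, 'some text').then(console.log);
summarize(local, 'some text').then(console.log);
```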
Model usage is recorded in Harper's analytics system, so embed and generate call counts and token volumes show up in the standard analytics tables alongside your other operational metrics.
toolMode: 'auto' — the agent loop
Previously scope.models.generate() was single-shot. In 5.1, passing toolMode: 'auto' enables a built-in agentic loop: the model receives tools, the loop dispatches tool calls, appends results to the conversation, and continues until the model stops requesting tools or a budget is exhausted.
const result = await scope.models.generate({
  model: 'main',
  messages,
  toolMode: 'auto',
  tools: myTools,
  maxToolIterations: 10,
  maxCostUsd: 0.50,
  includeToolTrace: true
});
if (result.trace) {
  // Array of { tool, input, output } for each call in the loop
  console.log(result.trace);
}
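The trace shape noted in the comment above is enough for simple diagnostics. A minimal sketch, assuming each entry really is `{ tool, input, output }`; `summarizeTrace` is a hypothetical helper and the sample entries are made up:

```javascript
// Hypothetical helper: counts how many times each tool appears in a trace.
// Assumes entries shaped like { tool, input, output }.
function summarizeTrace(trace) {
  const counts = {};
  for (const { tool } of trace) {
    counts[tool] = (counts[tool] ?? 0) + 1;
  }
  return counts;
}

// Made-up trace entries for illustration only.
const sampleTrace = [
  { tool: 'search', input: { q: 'harper' }, output: ['doc-1'] },
  { tool: 'fetch', input: { id: 'doc-1' }, output: 'text' },
  { tool: 'search', input: { q: 'agents' }, output: [] },
];

console.log(summarizeTrace(sampleTrace)); // { search: 2, fetch: 1 }
```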
Budget parameters act as hard stops:
- maxToolIterations: maximum number of tool-call rounds
- maxToolTokens: cumulative token budget across the loop (distinct from per-call maxTokens)
- maxCostUsd: estimated cost ceiling
When a budget is exceeded, generate throws a BudgetExceededError that includes the partial tool trace — useful for logging what the model was doing when it ran out of budget.
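The loop and its hard stop can be pictured with a toy implementation. This is a sketch under stated assumptions, not Harper's code: `runToolLoop`, the `callModel` function, the message shapes, and this local `BudgetExceededError` class are all invented for the illustration, and only the iteration budget is modeled (token and cost budgets are left out).

```javascript
// Illustrative only: a local stand-in error that carries the partial trace.
class BudgetExceededError extends Error {
  constructor(message, trace) {
    super(message);
    this.name = 'BudgetExceededError';
    this.trace = trace; // tool calls completed before the budget ran out
  }
}

// Toy agent loop: call the model, dispatch any requested tools, append the
// results to the conversation, and repeat until the model stops asking.
async function runToolLoop({ callModel, tools, messages, maxToolIterations }) {
  const conversation = [...messages];
  const trace = [];
  for (let round = 0; ; round++) {
    const reply = await callModel(conversation);
    if (!reply.toolCalls || reply.toolCalls.length === 0) {
      return { content: reply.content, trace }; // model is done
    }
    if (round >= maxToolIterations) {
      throw new BudgetExceededError(`exceeded ${maxToolIterations} tool rounds`, trace);
    }
    conversation.push({ role: 'assistant', toolCalls: reply.toolCalls });
    for (const call of reply.toolCalls) {
      const output = await tools[call.tool](call.input);
      trace.push({ tool: call.tool, input: call.input, output });
      conversation.push({ role: 'tool', content: JSON.stringify({ tool: call.tool, output }) });
    }
  }
}

// Scripted fake model: requests one tool call, then answers from its result.
const fakeModel = async (conversation) => {
  const toolMessage = conversation.find((m) => m.role === 'tool');
  if (!toolMessage) return { toolCalls: [{ tool: 'add', input: { a: 2, b: 3 } }] };
  return { content: `The sum is ${JSON.parse(toolMessage.content).output}` };
};
const toyTools = { add: async ({ a, b }) => a + b };

runToolLoop({
  callModel: fakeModel,
  tools: toyTools,
  messages: [{ role: 'user', content: 'What is 2 + 3?' }],
  maxToolIterations: 10,
}).then((result) => console.log(result.content, result.trace));
```

In this sketch the budget check happens before dispatch, so a model that never stops requesting tools completes exactly maxToolIterations rounds and the thrown error carries the trace for those rounds.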
toolParallelism: 'parallel' lets the loop dispatch independent tool calls concurrently within each iteration. For tool-heavy prompts where calls don't depend on each other, an iteration then takes roughly as long as its slowest call rather than the sum of all calls, which can noticeably reduce end-to-end latency.
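The difference between sequential and concurrent dispatch can be sketched in plain JavaScript. Everything below is illustrative: `makeTrackedTool`, `dispatchSequential`, and `dispatchParallel` are hypothetical helpers, and the sketch says nothing about how Harper schedules tool calls internally. The tracked tool records its peak number of in-flight calls so the two strategies can be compared without relying on wall-clock timing.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical tool that records how many calls are in flight at once.
function makeTrackedTool(delayMs) {
  const stats = { inFlight: 0, peak: 0 };
  const tool = async (input) => {
    stats.inFlight++;
    stats.peak = Math.max(stats.peak, stats.inFlight);
    await sleep(delayMs);
    stats.inFlight--;
    return `result:${input}`;
  };
  return { tool, stats };
}

// One call at a time: total time is roughly the sum of all delays.
async function dispatchSequential(tool, inputs) {
  const outputs = [];
  for (const input of inputs) outputs.push(await tool(input));
  return outputs;
}

// All calls started together: total time is roughly the slowest call.
// Promise.all returns results in input order, regardless of finish order.
async function dispatchParallel(tool, inputs) {
  return Promise.all(inputs.map((input) => tool(input)));
}

async function demo() {
  const inputs = ['a', 'b', 'c'];
  const seq = makeTrackedTool(20);
  const par = makeTrackedTool(20);
  const seqOut = await dispatchSequential(seq.tool, inputs);
  const parOut = await dispatchParallel(par.tool, inputs);
  return { seqOut, parOut, seqPeak: seq.stats.peak, parPeak: par.stats.peak };
}

demo().then((report) => console.log(report));
```

Note that Promise.all rejects as soon as any call fails, so a real dispatcher would need its own policy for partial failures. Concurrent dispatch is also only safe when the calls do not depend on each other's results.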
The built-in Harper Agent
For cases where you want a general-purpose agent without writing your own loop or tool dispatch, 5.1 ships a built-in harper-agent component. It's disabled by default:
agent:
  enabled: true
  autoApprove: false # require explicit approval for destructive tools
Once enabled, six operations are available:
- agent_prompt: submit a prompt and receive a session ID
- get_agent_session: fetch the full transcript for a session
- list_agent_sessions: list recent sessions
- cancel_agent_run: stop a running session
- approve_agent_action: approve or deny a pending destructive tool call
- set_agent_config: update runtime config
Transcripts are persisted in system.hdb_agent_session, so sessions survive restarts and are queryable like any other Harper table.
The built-in tools available to the agent include filesystem access (scoped to componentsRoot, logDir, and configDir), schedule_followup, and http_fetch. Destructive filesystem operations require explicit approval via approve_agent_action when autoApprove: false, which is the default. When a destructive tool call is pending, the agent loop halts and waits.
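The halt-and-wait behavior can be modeled as a promise that stays unresolved until an operator decides. This is a minimal sketch of the pattern only: `ApprovalGate`, `runDestructiveTool`, and the action ID are hypothetical names, and the sketch is not a description of how Harper implements approve_agent_action.

```javascript
// Illustrative only: a gate that parks a destructive call until someone decides.
class ApprovalGate {
  constructor({ autoApprove = false } = {}) {
    this.autoApprove = autoApprove;
    this.pending = new Map(); // actionId -> resolver for the waiting call
  }

  // Called before a destructive tool runs; resolves to true (approved) or false (denied).
  waitForDecision(actionId) {
    if (this.autoApprove) return Promise.resolve(true);
    return new Promise((resolve) => this.pending.set(actionId, resolve));
  }

  // Called when the operator approves or denies; returns false if nothing was pending.
  decide(actionId, approved) {
    const resolve = this.pending.get(actionId);
    if (!resolve) return false;
    this.pending.delete(actionId);
    resolve(approved);
    return true;
  }
}

// Runs the tool only after approval; a denial skips it entirely.
async function runDestructiveTool(gate, actionId, tool) {
  const approved = await gate.waitForDecision(actionId);
  return approved ? { status: 'done', output: await tool() } : { status: 'denied' };
}

const gate = new ApprovalGate({ autoApprove: false });
const run = runDestructiveTool(gate, 'action-1', async () => 'file deleted');
// The run is parked here; the tool has not executed yet.
gate.decide('action-1', true);
run.then((result) => console.log(result)); // { status: 'done', output: 'file deleted' }
```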
What this is and isn't for
The built-in agent is primarily useful as an operator tool: deploying components, reading logs, diagnosing issues, adjusting configuration. It runs with operator-level access, so it should not be exposed directly to end users without careful consideration of the trust boundary.
For application-facing AI features — summarization, classification, RAG — the scope.models API and toolMode: 'auto' are the right primitives. You write the resource handler and control exactly what tools are available.
The two paths are intended to converge over time: the agent component will eventually use the same MCP tool registry as the server-side MCP implementation, so any tools you expose over MCP would also be available to the built-in agent. That integration is ongoing work.









