Hermit Kernel Spec v0.1
Status: Draft
Last updated: 2026-03-15
This document defines the target kernel architecture for the next major iteration of Hermit.
It is a forward-looking specification, not a full description of the current repository state.
Read this document alongside:
- architecture.md for the current implementation
- roadmap.md for current maturity and convergence status
Safe interpretation:
- the current repository already contains real kernel objects and control paths
- this spec defines the stronger target architecture those implementation paths are converging toward
- this document should not be read as a claim that every runtime surface already fully matches the spec
Hermit vNext is not a chat shell with tools. It is a local-first governed agent kernel for durable, governed, evidence-bound work.
1. Design Position
Hermit Kernel v0.1 is defined as:
A local-first, event-backed agent kernel where durable tasks advance through recoverable step attempts, compile artifact-native context, maintain bounded working state plus evidence-backed beliefs and durable memory, gate side effects through policy and approval, execute with least-privilege capability grants, and close every important action with a structured receipt.
Hermit’s target competitive scope is narrow and intentional:
- local-first
- long-running
- trust-heavy
- developer-grade
- auditable
- explainable after the fact
Hermit does not need to be the best agent for every scenario. It needs to be unusually strong for offline-capable, stateful, high-trust work where a user may later ask:
- What exactly happened?
- Why did it happen?
- What evidence was used?
- What authority allowed it?
- What changed?
- Can it be replayed, verified, or rolled back?
2. Architectural Thesis
Hermit Kernel v0.1 is built around five architectural theses:
1. Tasks are the durable unit of work. Nothing meaningful begins outside a task.
2. Events are the durable unit of truth. Durable state is not mutated directly; it is derived from an append-only event log.
3. Artifacts are the default unit of context and evidence. Message history is a projection, not the primary substrate.
4. Models propose; the kernel authorizes and executes. The model has reasoning authority, not execution authority.
5. Important actions are only complete when they are receipted. A log line is not proof. A side effect without a receipt is not durably complete.
3. Goals
Hermit Kernel v0.1 has ten primary goals:
- Make every meaningful unit of work start from a Task.
- Make every durable state change flow through immutable Events.
- Make every recoverable execution boundary explicit as a StepAttempt.
- Make Artifact the default unit of context assembly, lineage, and evidence binding.
- Separate bounded WorkingState, revisable Belief, and durable MemoryRecord.
- Remove direct model-to-tool execution from the kernel path.
- Route consequential actions through Decision, Policy, and Approval when required.
- Execute with explicit, scoped CapabilityGrants instead of ambient authority.
- Emit a durable Receipt for every important action.
- Ensure every task is either replayable, observable, or explainable after the fact.
4. Non-Goals
This version of the spec does not require:
- a distributed cluster architecture
- multi-machine consensus
- CRDT-first collaboration
- a stable public API surface
- a fixed storage backend choice
- byte-identical replay for every step
- perfect rollback coverage for all external effects
- multi-tenant ACL completeness
- final cross-device synchronization semantics
The spec prioritizes kernel semantics over deployment scale.
5. Normative Language
The keywords MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY are to be interpreted as described in RFC 2119.
6. Kernel Invariants
The following are hard constraints for v0.1.
6.1 Durable State
1. No direct durable mutation. Every durable state change MUST be recorded as one or more events before projections are updated.
2. No orphan durable objects. Every Step, StepAttempt, Decision, Approval, CapabilityGrant, Receipt, Belief, and MemoryRecord MUST belong to a task and MUST be attributable to a principal.
3. No silent overwrite of revisable knowledge. Belief and MemoryRecord changes MUST be versioned or superseding; they MUST NOT be silently overwritten in place.
6.2 Execution Authority
1. No direct model-to-tool execution. The model MUST NOT invoke executors directly. It may emit proposals, plans, assertions, and action requests.
2. No ambient authority. Executors MUST NOT rely on broad process-level authority alone. Effectful execution MUST be bound to a scoped CapabilityGrant or an equivalent constrained execution record.
3. No irreversible action without a decision. Destructive, external, credentialed, publish, payment, push-like, or policy-override actions MUST have a Decision record before execution.
4. No high-risk action without policy. Policy evaluation is part of the kernel path, not an optional UI feature.
5. No delayed high-risk action without witness revalidation. If approval arrives after a pause for a high-risk write-like action, the kernel MUST revalidate a StateWitness or equivalent execution preconditions before running the executor.
6.3 Knowledge and Context
1. No memory write without evidence. Durable memory promotion MUST cite evidence references. Unsupported writes MUST degrade to working state or scratchpad.
2. No unbounded working state growth. Working execution state MUST be schema-governed and size-bounded. Transcript append alone is not a conformant state strategy.
3. No secret material in model context by default. Secret values MUST NOT enter model-visible context unless an explicit policy profile allows it.
6.4 Audit and Recovery
1. Every important action produces a receipt. If no receipt exists, the action is not considered durably complete.
2. No silent replay of uncertain side effects. If an effect may have run but the outcome is unknown, the kernel MUST enter observation or resolution semantics before re-executing.
3. Every task is replayable or explainable. The kernel MUST retain enough information to replay inputs, observe outputs, or reconstruct the decision and evidence chain.
7. Important Actions
The following action classes are considered important by default and MUST produce receipts:
- local write
- local delete
- command execution
- VCS mutation
- network write
- credentialed API call
- publication
- payment or spending
- durable memory promotion
- policy override
- rollback execution
- approval resolution for consequential actions
A policy profile MAY add additional action classes to this set.
8. First-Class Objects
Hermit Kernel v0.1 is centered on twelve first-class objects.
8.1 Task
A Task is the durable entrypoint for work.
A task represents:
- an explicit goal
- a lifecycle state
- a priority
- an owner
- a policy profile
- a task contract boundary
- a step graph boundary
A task MUST exist before execution begins. All ingress channels such as CLI, scheduler, webhook, Feishu, remote panel, or future adapters MUST create or resume a task instead of invoking the runtime directly.
Minimum fields:
- task_id
- title
- goal
- status
- priority
- owner_principal_id
- policy_profile_ref
- task_contract_ref
- created_at
- updated_at
Recommended fields:
- parent_task_id
- depends_on
- labels
- deadline_at
- workspace_hint
- requested_by
- source_channel
- parallelism_mode
Clarification:
- parent_task_id is for decomposition or derived-task lineage
- parent_task_id MUST NOT be used as the conversational carry-forward mechanism for follow-up questions
8.1.2 IngressRecord
An IngressRecord is the durable record for each inbound free-form message before it changes task state.
Minimum fields:
- ingress_id
- conversation_id
- source_channel
- raw_text
- normalized_text
- status
- resolution
- created_at
- updated_at
Recommended fields:
- actor
- prompt_ref
- reply_to_ref
- quoted_message_ref
- explicit_task_ref
- referenced_artifact_refs
- chosen_task_id
- parent_task_id
- confidence
- margin
- rationale_ref or embedded rationale payload
Semantics:
- every free-form ingress MUST be durably recorded before it mutates task, step, or attempt state
- ingress binding MUST be explainable from candidates and rationale
- unresolved ambiguity is a legal kernel state; pending_disambiguation is not an error
- adapters and product surfaces MAY choose whether to auto-bind, ask, or defer, but the core kernel MUST NOT pretend a unique binding exists when it does not
8.1.3 Ingress Binding Semantics
Ingress binding is not a binary continue vs new heuristic. The kernel SHOULD support at least these outcomes:
- control
- approval
- append_note
- fork_child
- start_new_root
- chat_only
- pending_disambiguation
Recommended binding priority:
- explicit task, approval, receipt, or command target
- adapter reply target or quoted-message target
- pending approval correlation
- current focus task
- ranked candidate open tasks
- fork_child
- start_new_root
- pending_disambiguation
Related-task semantics:
- append_note mutates the current task input
- fork_child creates a new child task related to the bound task without inheriting the full working state
- start_new_root creates a new unrelated root task in the same conversation container
8.1.1 Continuation Anchors
Kernel ingress MUST distinguish between active-task continuation and terminal-outcome continuation.
Rules:
- a terminal task is any task in completed, failed, or cancelled
- a terminal task MUST NOT be implicitly reopened by a follow-up message
- when a follow-up refers to a terminal outcome, the kernel MUST create a new task
- that new task MUST carry a structured continuation_anchor rather than a raw transcript pointer
Minimum continuation anchor fields:
- anchor_task_id
- anchor_kind
- selection_reason
- outcome_status
- outcome_summary
- source_artifact_refs
For the v0.1 continuation flow, anchor_kind is completed_outcome.
Durability requirement:
- the continuation anchor SHOULD be written into ingress metadata and into the event-backed task creation payload so projections and context packs can be rebuilt without transcript replay
8.2 Step
A Step is the smallest logical recoverable unit within a task.
Examples:
- plan
- search
- inspect
- edit
- run tests
- prepare patch
- await approval
- publish result
- observe remote state
- rollback
A step is a logical work unit, not a concrete run attempt. A step MUST define a contract boundary.
Minimum fields:
- step_id
- task_id
- kind
- title
- contract_ref
- status
- depends_on
- max_attempts
- created_at
- updated_at
A step contract SHOULD define:
- objective
- expected inputs
- expected outputs
- success criteria
- allowed action classes
- rollback hint when applicable
8.3 StepAttempt
A StepAttempt is the concrete execution instance of a step.
This is the primary recovery boundary for durable execution.
Minimum fields:
- attempt_id
- task_id
- step_id
- attempt_no
- status
- context_pack_ref
- working_state_ref
- workspace_lease_ref
- idempotency_key
- started_at
- finished_at
Recommended fields:
- resume_from_ref
- executor_mode
- policy_version
- state_witness_ref
- environment_ref
Semantics:
- A step may have multiple attempts over time.
- Only one mutable active attempt per step is allowed unless a future version explicitly supports concurrent attempts.
- Approval pauses and resumptions attach to a specific step attempt.
- A retry MUST create a new attempt number.
- An attempt is the unit that receives policy results, approvals, grants, and receipts.
- An attempt MAY carry execution-phase metadata such as planning, policy_pending, awaiting_approval, authorized_pre_exec, executing, observing, or settling.
- Newly bound task input MUST first become task input delta and MUST NOT hot-patch an executor already in flight.
- If input, approval packet, policy assumptions, or witness assumptions drift past the current checkpoint, the kernel MUST re-enter policy or supersede the attempt instead of silently mutating execution state.
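The attempt semantics above can be sketched in a few lines of Python. This is a minimal illustration, not Hermit's implementation: the class layout, the TERMINAL set, and the method names new_attempt and supersede_active are assumptions made for the example. It shows two rules only: a retry allocates a fresh attempt_no, and a step holds at most one active attempt.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed terminal attempt states, taken from the attempt state list in 12.3.
TERMINAL = {"succeeded", "failed", "cancelled", "superseded"}


@dataclass
class StepAttempt:
    step_id: str
    attempt_no: int
    status: str = "created"


@dataclass
class Step:
    step_id: str
    max_attempts: int
    attempts: List[StepAttempt] = field(default_factory=list)

    def active_attempt(self) -> Optional[StepAttempt]:
        """Return the single non-terminal attempt, if any."""
        for attempt in self.attempts:
            if attempt.status not in TERMINAL:
                return attempt
        return None

    def new_attempt(self) -> StepAttempt:
        """Allocate the next attempt; refuses a second concurrent attempt."""
        if self.active_attempt() is not None:
            raise RuntimeError("step already has an active attempt")
        if len(self.attempts) >= self.max_attempts:
            raise RuntimeError("max_attempts exhausted")
        attempt = StepAttempt(self.step_id, attempt_no=len(self.attempts) + 1)
        self.attempts.append(attempt)
        return attempt

    def supersede_active(self) -> StepAttempt:
        """Mark the active attempt superseded and start a new one (drift case)."""
        active = self.active_attempt()
        if active is None:
            raise RuntimeError("no active attempt to supersede")
        active.status = "superseded"
        return self.new_attempt()
```

A real kernel would record each of these transitions as events rather than mutating objects in memory; the in-memory mutation here only keeps the sketch short.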
8.4 Event
An Event is the kernel source of truth.
Events are immutable records describing state transitions or externally relevant observations. Projected views such as task summaries, approval queues, artifact listings, belief views, memory views, and session projections MUST derive from events.
Minimum fields:
- event_id
- schema_version
- event_type
- entity_type
- entity_id
- task_id
- step_id when applicable
- attempt_id when applicable
- task_seq
- occurred_at
- actor_type
- actor_id
- payload
- causation_id
- correlation_id
Recommended fields:
- idempotency_key
- prev_event_hash
- workspace_id
- principal_scope
- policy_profile_version
8.5 Artifact
An Artifact is the canonical container for work products, observations, and evidence.
Artifacts are not limited to files on disk. An artifact may refer to a file, blob, JSON document, remote snapshot, content-addressed bundle, or sealed proof packet.
Minimum fields:
- artifact_id
- class
- kind
- uri
- content_hash
- media_type
- byte_size
- created_at
- producer
- retention_class
- trust_tier
- sensitivity_class
Recommended fields:
- sealed_at
- expires_at
- lineage_ref
- task_local_alias
Lifecycle events:
- created
- referenced
- sealed
- promoted
- compacted
- expired
8.6 Belief
A Belief captures what the system currently treats as true enough to reason with inside or near a task boundary.
Beliefs are provisional, revisable, and evidence-bound. Beliefs are not durable memory by default.
Minimum fields:
- belief_id
- task_id
- claim
- scope
- evidence_refs
- confidence
- trust_tier
- status
- created_at
Recommended fields:
- step_id
- attempt_id
- supersedes
- contradicts
- expires_at
- structured_assertion
Belief statuses:
- active
- superseded
- contradicted
- revoked
- expired
8.7 MemoryRecord
A MemoryRecord is durable knowledge expected to survive task boundaries.
Minimum fields:
- memory_id
- claim
- scope
- evidence_refs
- trust_tier
- promotion_reason
- status
- created_at
Recommended fields:
- promoted_from_belief_id
- retention_class
- invalidated_at
- supersedes
- structured_assertion
Memory statuses:
- active
- invalidated
- revoked
- expired
A memory record MUST NOT be hard-deleted as the default correction path. Invalidation or supersession is the default correction path.
8.8 Decision
A Decision records a consequential judgment made by the system or a human.
Examples:
- choose plan B
- ignore stale memory
- proceed with destructive cleanup
- request approval before push
- downgrade execution to readonly
- resolve unknown outcome via observation
- promote belief to durable memory
Minimum fields:
- decision_id
- task_id
- step_id
- attempt_id when applicable
- decision_type
- summary
- rationale
- evidence_refs
- risk_level
- decided_by
- reversible
- created_at
Recommended fields:
- alternatives_considered
- policy_override_ref
- effective_until
- constraints_ref
8.9 Approval
An Approval is a first-class execution object, not a UI-only affordance.
Approvals represent pauses in the task graph where progress is blocked pending a human or policy-authorized resolution.
Minimum fields:
- approval_id
- task_id
- step_id
- attempt_id
- status
- approval_type
- requested_action_ref
- approval_packet_ref
- requested_at
- resolved_at
- resolved_by
Recommended fields:
- expires_at
- constraints_ref
- state_witness_ref
- policy_result_ref
Statuses:
- pending
- granted
- denied
- expired
- cancelled
- invalidated
8.10 Receipt
A Receipt is the durable proof record for an important action.
A receipt is not a raw log line. It is a structured proof object with pointers to inputs, authority, environment, outputs, and observed results.
Minimum fields:
- receipt_id
- task_id
- step_id
- attempt_id
- receipt_class
- action_request_ref
- input_refs
- environment_ref
- policy_result_ref
- approval_ref
- capability_grant_ref
- output_refs
- result_code
- result_summary
- created_at
Recommended fields:
- decision_ref
- rollback_ref
- replay_class
- verifiability
- signer_ref
- signature
- receipt_bundle_ref
8.11 Principal
A Principal represents an attributable actor.
Principal types include:
- user
- supervisor
- agent
- service
- scheduler
- webhook
- policy_engine
- executor
- system
Minimum fields:
- principal_id
- principal_type
- display_name
- created_at
Recommended fields:
- authn_context_ref
- labels
- external_identity_ref
8.12 CapabilityGrant
A CapabilityGrant is a scoped authority record that authorizes a specific execution envelope.
Minimum fields:
- grant_id
- task_id
- step_id
- attempt_id
- issued_to_principal_id
- issued_by_principal_id
- action_class
- resource_scope
- issued_at
- expires_at
- max_uses
Recommended fields:
- constraints_ref
- approval_ref
- policy_result_ref
- revoked_at
- consumed_at
Semantics:
- Grants are least-privilege by default.
- A grant is bound to a task and attempt.
- Executors MUST refuse operations outside grant scope.
- Expired or revoked grants are invalid even if an earlier approval existed.
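As an illustration of these grant semantics, the sketch below shows how an executor might check an operation against a grant before running it. It is a hedged example, not the kernel's API: the authorize function, the in-memory uses counter, and the choice to model resource_scope as a POSIX path prefix are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from pathlib import PurePosixPath
from typing import Optional


@dataclass
class CapabilityGrant:
    grant_id: str
    task_id: str
    attempt_id: str
    action_class: str
    resource_scope: str  # assumption: scope is a POSIX path prefix
    expires_at: datetime
    max_uses: int
    uses: int = 0  # hypothetical counter; the spec only names max_uses
    revoked_at: Optional[datetime] = None


def authorize(grant: CapabilityGrant, *, task_id: str, attempt_id: str,
              action_class: str, resource: str, now: datetime) -> bool:
    """Return True and consume one use only if the operation is inside grant scope."""
    if grant.revoked_at is not None or now >= grant.expires_at:
        return False  # expired or revoked grants are invalid
    if grant.uses >= grant.max_uses:
        return False
    if (grant.task_id, grant.attempt_id, grant.action_class) != (
            task_id, attempt_id, action_class):
        return False  # grant is bound to one task, attempt, and action class
    scope = PurePosixPath(grant.resource_scope)
    target = PurePosixPath(resource)
    if ".." in target.parts:
        return False  # refuse unnormalized paths instead of guessing
    if target != scope and scope not in target.parents:
        return False
    grant.uses += 1
    return True
```

Denying by default on every mismatch keeps the executor from falling back to ambient process authority.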
9. Object Relationships
The kernel object graph follows these rules:
- A Task owns many Steps.
- A Step owns many StepAttempts across retries or resumptions.
- Events describe lifecycle transitions of all durable objects.
- Artifacts are produced and consumed by attempts.
- Beliefs and MemoryRecords cite evidence from artifacts, not just raw text.
- Decisions, Approvals, CapabilityGrants, and Receipts attach to a task and, when applicable, to a concrete attempt.
- Session or chat history is a projection derived from tasks, events, and selected artifacts; it is not the source of truth.
10. Layered Architecture
Hermit Kernel v0.1 is split into six layers.
10.1 Control Plane
Responsibilities:
- accept ingress from CLI, scheduler, webhook, Feishu, remote panel, or future adapters
- create, resume, cancel, pause, and reprioritize tasks
- publish events
- expose supervision and inspection interfaces
The control plane MUST NOT contain model reasoning logic.
10.2 Task and Step Orchestrator
Responsibilities:
- create steps from task contracts
- maintain dependency ordering
- select ready steps
- allocate attempts
- enforce single-writer task semantics by default
This layer coordinates work selection, not tool execution.
10.3 Durable Execution Engine
Responsibilities:
- acquire workspace leases
- compile context packs
- invoke models in propose-only or constrained reasoning modes
- normalize action requests
- checkpoint after each durable boundary
- recover from interruption using the event log
Recovery happens at StepAttempt granularity.
10.4 Policy, Approval, and Capability Layer
Required chain:
Model output -> ActionRequest -> PolicyEngine -> ApprovalEngine -> CapabilityGrant -> Executor
Responsibilities:
- classify requested actions
- evaluate policy profile and risk
- return allow, require_approval, deny, or downgrade
- mint scoped capability grants
- block and resume the same attempt
- revalidate witnesses for delayed actions
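The policy step of this chain can be sketched as a function from an action request to one of the four verdicts. The example below is a simplified assumption, not the kernel's policy engine: the REPO_MUTATION_GATED rule table, the snake_case action class names, and the next_stage labels are hypothetical, and real risk evaluation would consider far more than the action class.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Verdict(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"
    DOWNGRADE = "downgrade"


@dataclass(frozen=True)
class ActionRequest:
    action_class: str
    target: str


# Hypothetical rule table for a profile such as repo_mutation_gated.
REPO_MUTATION_GATED: Dict[str, Verdict] = {
    "read_local": Verdict.ALLOW,
    "vcs_mutation": Verdict.REQUIRE_APPROVAL,
    "delete_local": Verdict.DENY,
    "network_write": Verdict.DOWNGRADE,
}


def evaluate(profile: Dict[str, Verdict], request: ActionRequest) -> Verdict:
    """Return exactly one verdict; unknown action classes are denied."""
    return profile.get(request.action_class, Verdict.DENY)


def next_stage(verdict: Verdict) -> str:
    """Map a verdict to the next kernel stage (stage names are illustrative)."""
    return {
        Verdict.ALLOW: "mint_grant",
        Verdict.REQUIRE_APPROVAL: "await_approval",
        Verdict.DENY: "terminate",
        Verdict.DOWNGRADE: "rewrite_request",
    }[verdict]
```

Note that the executor never appears in this code path: only a minted grant, produced after an allow or an approval, leads to execution.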
10.5 Artifact, Knowledge, and Context Layer
Subcomponents:
- Artifact Store
- Working State Store
- Belief Store
- Memory Store
- Decision Ledger
- Context Compiler
Responsibilities:
- store work products
- maintain evidence lineage
- separate working state, belief, and durable memory
- compile minimal context packs
- compile structured carry-forward from continuation anchors when present
- compact and seal evidence-bearing outputs
10.6 Supervision and Proof Surface
Responsibilities:
- inspect tasks, steps, attempts, approvals, grants, receipts, beliefs, memory, and decisions
- expose artifact lineage and context manifests
- export proof bundles
- allow approve, deny, cancel, retry, rollback, and revoke-grant operations where supported
- explain what changed, why, and with which authority
This layer is part of the trust model, not just observability.
11. Execution Lifecycle
11.1 Standard Lifecycle
A conformant effectful execution path SHOULD proceed as follows:
- An ingress channel creates or resumes a Task.
- The orchestrator selects a ready Step.
- The engine acquires a WorkspaceLease and creates a StepAttempt.
- The context compiler emits a context.pack artifact and checkpoints it.
- The model is invoked with the compiled context.
- The model may emit zero or more of:
  - plan updates
  - belief assertions
  - draft deliverables
  - action requests
- Each action request is normalized into an ActionRequest.
- Policy evaluates the request.
- Policy returns one of:
  - allow
  - require_approval
  - deny
  - downgrade
- If approval is required, the attempt is paused without executing the side effect.
- On grant, the kernel revalidates witness state when required.
- If authorized, a scoped CapabilityGrant is issued.
- The executor runs inside the lease and grant scope.
- Outputs and observations are captured as artifacts.
- A receipt is issued.
- Step and task projections are updated from events.
11.2 Model Authority Boundary
The model MAY reason, propose, or revise beliefs. The model MUST NOT directly execute tools, shell commands, filesystem writes, or network writes.
The kernel interprets model output through typed boundaries.
11.3 ActionRequest
ActionRequest is a typed execution proposal generated from model output or a non-model source.
Suggested structure:
ActionRequest {
  action_request_id
  task_id
  step_id
  attempt_id
  action_class
  target_resources[]
  params_ref
  expected_effects[]
  reversibility_hint
  reason
  proposed_by
  proposed_at
}
An action request SHOULD be stored as an artifact of kind action.request.
11.4 Downgrade Semantics
If policy returns downgrade, the result MUST include either:
- a rewritten lower-risk action request, or
- an explicit restricted execution mode
Examples:
- mutation request downgraded to readonly inspection
- network write downgraded to network read
- publish downgraded to draft artifact generation
11.5 Block and Resume
Approval or external wait states MUST NOT force a full task restart. The same StepAttempt MUST resume whenever its checkpointed inputs remain valid.
If witness drift, policy drift, or input invalidation occurs, the system MUST either:
- re-enter policy evaluation, or
- supersede the attempt with a new attempt
Clarifications:
- a newly bound ingress first becomes task input delta; it is not direct executor mutation
- the kernel MUST absorb new input only at durable boundaries such as pre-policy, post-policy, pre-exec, post-observation, or pre-next-step compilation
- input drift, approval-packet drift, and witness drift MAY produce different rationale, but they share the same recovery shape: recompile, re-enter policy, or supersede
12. State Machines
12.1 Task State Machine
Minimum task states:
- created
- ready
- running
- blocked
- paused
- completed
- failed
- cancelled
- rolled_back
Allowed transitions:
- created -> ready
- ready -> running
- running -> blocked
- blocked -> running
- running -> paused
- paused -> ready
- running -> completed
- running -> failed
- ready|running|blocked|paused -> cancelled
- completed|failed -> rolled_back when supported
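The allowed transitions listed above translate directly into a lookup table. The following is a minimal sketch; the ALLOWED table mirrors the list, while the transition function name and the choice to raise ValueError are assumptions of the example.

```python
from typing import Dict, Set

# Transition table transcribed from the allowed transitions listed above.
ALLOWED: Dict[str, Set[str]] = {
    "created": {"ready"},
    "ready": {"running", "cancelled"},
    "running": {"blocked", "paused", "completed", "failed", "cancelled"},
    "blocked": {"running", "cancelled"},
    "paused": {"ready", "cancelled"},
    "completed": {"rolled_back"},  # only when rollback is supported
    "failed": {"rolled_back"},     # only when rollback is supported
    "cancelled": set(),
    "rolled_back": set(),
}


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not in the table."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal task transition: {current} -> {target}")
    return target
```

In an event-backed kernel this check would run before the transition event is appended, so that projections never observe an illegal state.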
12.2 Step State Machine
Minimum step states:
- planned
- ready
- running
- blocked
- succeeded
- failed
- cancelled
- superseded
A step’s effective state is derived from its latest relevant attempt plus orchestration state.
12.3 StepAttempt State Machine
Minimum attempt states:
- created
- leased
- compiling_context
- reasoning
- awaiting_policy
- awaiting_approval
- authorized
- executing
- observing
- receipt_pending
- succeeded
- failed
- cancelled
- superseded
Rules:
- Each new attempt increments attempt_no.
- Each attempt state transition MUST emit events.
- Approval pauses apply at attempt granularity.
- Observation after uncertain execution is a first-class attempt phase.
12.4 Approval Blocking Semantics
When policy returns require_approval:
- the engine emits approval.requested
- the current attempt transitions to awaiting_approval
- execution state is checkpointed
- no effectful executor runs yet
- on grant or deny, the same attempt resumes or terminates unless invalidated
For delayed high-risk actions, approval resumption MUST include witness validation before authorization.
13. Event Model
13.1 Principles
The event log is append-only. Events are immutable after commit.
Events MUST be:
- attributable
- causally linked
- schema-versioned
- projection-friendly
Events SHOULD be:
- totally ordered within a task
- hash-linked within a task
- deduplicable where side effects are involved
13.2 Ordering
The kernel MUST provide a monotonically increasing task_seq per task.
A global total order across all tasks is not required for v0.1.
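To make per-task ordering concrete, here is a small in-memory sketch of an append-only log that assigns task_seq independently for each task and a projection that rebuilds task status from events alone. The EventLog class, the task.status_changed event type, and the payload shape are assumptions for illustration; the spec does not fix a storage backend or event naming.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    task_id: str
    task_seq: int
    event_type: str
    payload: dict


class EventLog:
    """Append-only log; task_seq increases monotonically within each task."""

    def __init__(self) -> None:
        self._events: List[Event] = []
        self._next_seq: Dict[str, int] = defaultdict(lambda: 1)

    def append(self, task_id: str, event_type: str, payload: dict) -> Event:
        seq = self._next_seq[task_id]
        self._next_seq[task_id] = seq + 1
        event = Event(task_id, seq, event_type, dict(payload))
        self._events.append(event)  # never updated or removed afterwards
        return event

    def for_task(self, task_id: str) -> List[Event]:
        return [e for e in self._events if e.task_id == task_id]


def project_status(events: List[Event]) -> str:
    """Rebuild the task status by folding over events in task_seq order."""
    status = "created"
    for event in sorted(events, key=lambda e: e.task_seq):
        if event.event_type == "task.status_changed":
            status = event.payload["status"]
    return status
```

Because the projection is a pure fold over the log, it can be discarded and rebuilt at any time, which is the property section 13.6 asks for.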
13.3 Idempotency
Events related to side effects, grant issuance, approval resolution, executor dispatch, and receipt issuance SHOULD carry an idempotency_key.
If duplicate submission is detected, the kernel MUST either:
- reuse the original durable result, or
- emit a dedupe event instead of replaying the side effect
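A minimal sketch of duplicate handling, assuming an in-memory result table keyed by idempotency_key. The Dispatcher class and the executor.deduped event name are hypothetical. For illustration this version does both things the rule permits: it reuses the original result and records a dedupe event, and in neither case does the side effect run twice.

```python
from typing import Any, Callable, Dict, List


class Dispatcher:
    """Runs an effect at most once per idempotency key."""

    def __init__(self, effect: Callable[[dict], Any]) -> None:
        self._effect = effect
        self._results: Dict[str, Any] = {}
        self.dedupe_events: List[dict] = []

    def dispatch(self, idempotency_key: str, request: dict) -> Any:
        if idempotency_key in self._results:
            # Duplicate submission: record it and reuse the stored result.
            self.dedupe_events.append(
                {"event_type": "executor.deduped",
                 "idempotency_key": idempotency_key}
            )
            return self._results[idempotency_key]
        result = self._effect(request)
        self._results[idempotency_key] = result
        return result
```

A durable implementation would persist the result table through events, so that deduplication also survives a process restart.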
13.4 Hash Linking
A task event stream SHOULD include prev_event_hash to support tamper-evident sequencing.
A conformant implementation that omits hash linking MUST still preserve append-only semantics and durable ordering within the task.
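One way to realize prev_event_hash is a simple hash chain over canonically serialized events. The sketch below is an assumption, not the spec's wire format: it uses SHA-256 over sorted-key JSON, an all-zero genesis value, and plain dicts for events.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # assumed placeholder hash for the first event in a stream


def event_hash(event: dict) -> str:
    """Hash a canonical JSON encoding of the event, including its prev link."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_linked(stream: List[dict], event: dict) -> dict:
    """Append a copy of the event that carries the hash of its predecessor."""
    prev = event_hash(stream[-1]) if stream else GENESIS
    linked = {**event, "prev_event_hash": prev}
    stream.append(linked)
    return linked


def verify(stream: List[dict]) -> bool:
    """Check that every event points at the hash of the event before it."""
    prev = GENESIS
    for event in stream:
        if event.get("prev_event_hash") != prev:
            return False
        prev = event_hash(event)
    return True
```

A chain like this detects edits to any event that has a successor. Tampering with the most recent event is only detectable if its hash is also recorded elsewhere, for example in a receipt bundle.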
13.5 Event Categories
Suggested categories:
- task events
- step events
- attempt events
- artifact events
- working state events
- belief events
- memory events
- decision events
- approval events
- capability events
- receipt events
- workspace events
- policy events
- supervision events
13.6 Projections
The kernel MUST support projections.
Minimum projections:
- task summary view
- step queue view
- approval inbox
- active grant view
- artifact catalog
- working state view
- belief view
- memory view
- decision timeline
- receipt ledger
- session/chat projection
- conversation focus view
Projection rebuild from the event log SHOULD be possible without bespoke repair logic.
Conversation focus view rules:
- one conversation MAY have many open tasks
- one conversation MUST have at most one implicit focus task
- background progress MUST NOT automatically steal focus
- focus changes SHOULD be projection-rebuildable from ingress binding, explicit task switch, task lifecycle events, and adapter reply targeting
14. Artifact and Evidence Model
14.1 Artifact Classes
Suggested classes:
- source
- working
- derived
- evidence
- deliverable
- audit
Suggested kinds:
- context.pack
- action.request
- policy.result
- approval.packet
- state.witness
- environment.snapshot
- workspace.snapshot
- search.bundle
- web.snapshot
- file.snapshot
- patch
- diff
- command.transcript
- test.report
- image
- binary.attachment
- belief.extract
- memory.promotion
- receipt.bundle
14.2 Addressing
Artifacts SHOULD support stable addressing by:
- content hash
- logical URI
- task-local alias when needed
Path identity alone is insufficient for trust-sensitive workflows.
14.3 Immutability and Sealing
Artifact content is immutable once created. Metadata evolution MUST occur through events.
Artifacts cited by a decision, approval, or receipt SHOULD be sealed or hash-locked.
14.4 Lineage
Artifacts MUST support lineage sufficient to answer:
- which attempt produced this artifact
- which inputs contributed to it
- which later decisions cited it
- which beliefs referenced it
- whether it was promoted into durable memory
- whether it was included in a receipt bundle
14.5 EvidenceRef
EvidenceRef is the typed pointer used by beliefs, memory, decisions, approvals, and receipts.
Suggested structure:
EvidenceRef {
  artifact_id
  selector
  excerpt_hash
  capture_method
  confidence
  trust_tier
}
A selector MAY be:
- byte range
- line range
- JSON path
- DOM selector
- timestamp span
- structured record identifier
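As a concrete case, the sketch below builds an EvidenceRef with a line-range selector and later checks that the cited excerpt is unchanged. It covers only three of the suggested fields, and the helper names (excerpt, make_ref, still_valid), the 1-based inclusive range convention, and the use of SHA-256 for excerpt_hash are assumptions of the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EvidenceRef:
    artifact_id: str
    selector: Tuple[int, int]  # assumption: 1-based inclusive line range
    excerpt_hash: str


def excerpt(text: str, selector: Tuple[int, int]) -> str:
    """Return the selected lines of the artifact content."""
    start, end = selector
    return "\n".join(text.splitlines()[start - 1:end])


def _digest(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def make_ref(artifact_id: str, text: str, selector: Tuple[int, int]) -> EvidenceRef:
    """Capture a reference together with a hash of the cited excerpt."""
    return EvidenceRef(artifact_id, selector, _digest(excerpt(text, selector)))


def still_valid(ref: EvidenceRef, text: str) -> bool:
    """True if the selected lines still hash to the recorded excerpt_hash."""
    return _digest(excerpt(text, ref.selector)) == ref.excerpt_hash
```

Since artifact content is immutable once created, a check like this mainly guards against a reference being resolved against the wrong artifact version or a path that was reused.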
15. Working State, Belief, Memory, and Context
15.1 Knowledge Layers
Hermit v0.1 distinguishes four layers:
1. Scratchpad: ephemeral, non-durable, easy to discard.
2. WorkingState: durable but task-local execution state. Bounded, schema-governed, and cheap to revise.
3. Belief: evidence-backed working truth used for reasoning. Revisable and versioned.
4. MemoryRecord: durable cross-task knowledge promoted with evidence and trust metadata.
15.2 WorkingState
WorkingState is not the same as transcript history.
WorkingState MUST be:
- task-local
- schema-governed
- size-bounded
- event-backed
- compactable
WorkingState SHOULD include only items such as:
- active objective decomposition
- current constraints
- selected plan pointer
- pending questions
- resource handles
- expected outputs
- execution-local caches
- unresolved uncertainties
WorkingState MUST NOT become an unbounded append-only dump of prior conversation.
15.3 Belief Rules
Beliefs MUST support:
- confidence updates
- supersession
- contradiction marking
- revocation
- scope-aware coexistence
Contradiction does not imply deletion. Two beliefs may coexist if scope differs or uncertainty remains unresolved.
15.4 Trust Tiers
Suggested trust tiers:
- untrusted
- observed
- verified
- user_asserted
- policy_asserted
Trust tier affects:
- whether a claim may influence planning
- whether a claim may trigger autonomous action
- whether a claim may be promoted to durable memory
- whether contradiction requires human review
15.5 Memory Promotion Rules
A durable memory write MUST include:
- a claim
- a scope
- evidence references
- trust tier
- promotion reason
Memory promotion SHOULD also include:
- source belief or artifact
- invalidation policy
- retention class
Cross-task memory promotion is an important action and MUST emit a receipt.
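The promotion rules, together with the invariant that unsupported writes degrade to working state, can be sketched as a small gate. This is a hedged illustration: the Claim class, the promote function, the list-based stores, and the receipt dict shape are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Claim:
    claim: str
    scope: str
    trust_tier: str
    promotion_reason: str
    evidence_refs: List[str] = field(default_factory=list)


def promote(candidate: Claim, memory: List[Claim], working_state: List[Claim],
            receipts: List[dict]) -> str:
    """Promote only evidence-backed claims; otherwise degrade to working state."""
    if not candidate.evidence_refs:
        working_state.append(candidate)  # unsupported write: no durable memory
        return "working_state"
    memory.append(candidate)
    # Promotion is an important action, so it closes with a receipt.
    receipts.append({
        "receipt_class": "memory.promotion",
        "claim": candidate.claim,
        "evidence_refs": list(candidate.evidence_refs),
    })
    return "memory"
```

Returning the destination makes the degradation visible to the caller instead of silently dropping the write.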
15.6 Context Compiler
The context compiler produces the minimal execution pack for an attempt.
Inputs may include:
- task contract
- step contract
- active policy profile
- working state
- selected beliefs
- eligible durable memory
- relevant artifacts
- active decisions
- workspace snapshot refs
- prior receipts when relevant
- bound ingress deltas since the last durable boundary
- focus task summary when conversation routing depends on implicit focus
Outputs MUST be:
- compact enough to keep model context focused
- traceable back to artifact and belief identifiers
- deterministic enough to explain later
The output SHOULD be stored as a context.pack artifact.
15.7 Context Precedence
A recommended precedence order is:
- task contract
- step contract
- active policy constraints
- active decisions
- working state
- selected beliefs
- durable memory
- relevant artifacts
- bound ingress deltas and focus summary
- session/chat projection only when necessary
15.8 Session History
Message history may exist, but it is a projection and convenience layer. It is not the primary state substrate.
16. Decisions, Policy, Approval, and Capability
16.1 Decision Classes
Suggested classes:
- planning
- execution
- safety
- memory
- publishing
- rollback
- uncertainty_resolution
16.2 Risk Bands
Suggested risk bands:
- low
- moderate
- high
- critical
Risk classification SHOULD consider:
- action class
- resource sensitivity
- credential usage
- reversibility
- blast radius
- uncertainty of target state
- provenance quality of supporting evidence
16.3 Policy Profiles
A policy profile defines execution constraints for a task.
Examples:
- readonly_analysis
- local_edit_allowed
- network_read_allowed
- repo_mutation_gated
- external_publish_gated
- destructive_denied
Policy profiles MUST be visible on the task object.
16.4 Action Classes
Suggested action classes:
- read local
- write local
- delete local
- execute command
- network read
- network write
- credentialed API call
- VCS mutation
- publication
- payment or spending
- durable memory write
- rollback
16.5 Policy Results
Policy evaluation returns exactly one of:
- allow
- require_approval
- deny
- downgrade
A policy result SHOULD include:
- action summary
- risk summary
- relevant policy profile version
- affected resources
- reasoning summary
- required witness or revalidation conditions
16.6 Capability Grants¶
A capability grant is minted only after:
-
direct allow, or
-
approval grant plus any required witness validation
A capability grant MUST be scoped by at least:
-
task
-
attempt
-
action class
-
resource scope
-
expiry
-
usage limit
A grant SHOULD also encode:
-
network egress allowlist
-
filesystem path constraints
-
repo/ref constraints
-
secret handles allowed at execution time
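The mandatory scoping fields can be read as a conjunction of checks that must all pass before a grant is spent. The sketch below assumes a hypothetical `CapabilityGrant` class and simplifies the resource scope to a single directory prefix; it is an illustration of the idea, not the kernel's grant format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from pathlib import PurePosixPath


@dataclass
class CapabilityGrant:
    task_id: str
    attempt_id: str
    action_class: str
    path_scope: str       # resource scope, simplified to one directory
    expires_at: datetime
    uses_left: int        # usage limit

    def consume(self, task_id: str, attempt_id: str, action_class: str,
                path: str, now: datetime) -> bool:
        """Return True and spend one use only if every scope check passes."""
        ok = (
            task_id == self.task_id
            and attempt_id == self.attempt_id
            and action_class == self.action_class
            and PurePosixPath(path).is_relative_to(self.path_scope)
            and now < self.expires_at
            and self.uses_left > 0
        )
        if ok:
            self.uses_left -= 1
        return ok


now = datetime(2026, 3, 15, tzinfo=timezone.utc)
grant = CapabilityGrant("task-1", "attempt-1", "write_local", "/repo/docs",
                        expires_at=now + timedelta(minutes=5), uses_left=1)
print(grant.consume("task-1", "attempt-1", "write_local", "/repo/docs/a.md", now))  # True
print(grant.consume("task-1", "attempt-1", "write_local", "/repo/docs/a.md", now))  # False
```

The path check here is purely lexical. A real implementation would need to normalize `..` segments and resolve symlinks before comparing, otherwise a path could escape the scope.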
16.7 Approval Packets¶
Approval requests SHOULD contain:
-
action summary
-
risk summary
-
relevant decision rationale
-
evidence refs
-
expected effect
-
impacted resources
-
rollback availability
-
expiry rules
-
state witness when required
Approval packets are artifacts and should be inspectable later.
If a newly bound ingress changes the action summary, risk, target resources, or other policy-relevant inputs, the prior approval packet MUST NOT be reused silently. The kernel MUST re-enter policy or create a superseding attempt.
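One way to detect that an approval packet no longer matches the action is to fingerprint the policy-relevant inputs at approval time and compare on resume. The sketch below assumes a hypothetical `policy_fingerprint` helper and a three-field input set; the real set of policy-relevant inputs is whatever the policy engine consumes.

```python
import hashlib
import json


def policy_fingerprint(action_summary: str, risk_band: str,
                       target_resources: list[str]) -> str:
    """Hash the policy-relevant inputs in a canonical JSON form."""
    canonical = json.dumps(
        {
            "action_summary": action_summary,
            "risk_band": risk_band,
            "target_resources": sorted(target_resources),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def approval_still_applies(approved_fp: str, current_fp: str) -> bool:
    """False means: re-enter policy or create a superseding attempt."""
    return approved_fp == current_fp


approved = policy_fingerprint("push branch", "high", ["repo:main"])
after_ingress = policy_fingerprint("push branch", "high",
                                   ["repo:main", "repo:release"])
print(approval_still_applies(approved, after_ingress))  # False
```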
16.8 State Witness¶
A state.witness artifact captures execution-time preconditions for delayed or high-risk actions.
It SHOULD include:
-
target resource fingerprints
-
relevant versions or hashes
-
observed preconditions
-
observation time
-
witness expiry
-
observing principal
For the following action classes, witness validation MUST run on delayed execution unless policy explicitly waives it:
-
local write
-
local delete
-
VCS mutation
-
network write
-
credentialed API call
-
publication
-
rollback
If witness validation fails, the previous authorization MUST NOT be treated as sufficient. The kernel MUST re-enter policy or create a superseding attempt.
Approval-packet drift, witness drift, and newly bound input are distinct causes, but all three MUST resolve through durable re-entry or supersession rather than in-memory mutation of a running executor.
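Witness validation can be sketched as an expiry check followed by a comparison of recorded fingerprints against the current state of each target resource. The `StateWitness` class and `witness_holds` function below are hypothetical names, and SHA-256 over raw content is an assumed fingerprint scheme.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


def fingerprint(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


@dataclass
class StateWitness:
    resource_fingerprints: dict[str, str]  # resource id -> SHA-256 hex digest
    observed_at: datetime
    expires_at: datetime
    observing_principal: str


def witness_holds(witness: StateWitness, current: dict[str, bytes],
                  now: datetime) -> bool:
    """True only if the witness is unexpired and every resource is unchanged."""
    if now >= witness.expires_at:
        return False
    return all(
        rid in current and fingerprint(current[rid]) == expected
        for rid, expected in witness.resource_fingerprints.items()
    )


t0 = datetime(2026, 3, 15, 12, 0, tzinfo=timezone.utc)
witness = StateWitness({"README.md": fingerprint(b"v1")}, t0,
                       t0 + timedelta(hours=1), "principal:kernel")
later = t0 + timedelta(minutes=10)
print(witness_holds(witness, {"README.md": b"v1"}, later))  # True
print(witness_holds(witness, {"README.md": b"v2"}, later))  # False
```

A `False` result corresponds to the failure case above: the earlier authorization is not sufficient, and the kernel re-enters policy or creates a superseding attempt.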
16.9 Policy Override¶
Policy override is a consequential action.
A policy override MUST have:
-
an elevated principal
-
an explicit decision
-
a receipt
-
a clear scope and expiry
17. Workspace, Environment, and Secrets¶
17.1 Workspace Lease¶
Execution occurs under a workspace lease.
A lease SHOULD define:
-
lease_id
-
task_id
-
attempt_id
-
workspace_id
-
root_path
-
holder_principal_id
-
acquired_at
-
expires_at
-
mode
-
resource_scope
Suggested lease modes:
-
readonly
-
mutable
-
isolated
-
external_effects_disabled
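The lease fields and modes map naturally onto a small record type. In the sketch below the field names follow the list above, while the class names, the `permits_write` helper, and the rule that every mode except `readonly` allows workspace writes are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class LeaseMode(Enum):
    READONLY = "readonly"
    MUTABLE = "mutable"
    ISOLATED = "isolated"
    EXTERNAL_EFFECTS_DISABLED = "external_effects_disabled"


@dataclass(frozen=True)
class WorkspaceLease:
    lease_id: str
    task_id: str
    attempt_id: str
    workspace_id: str
    root_path: str
    holder_principal_id: str
    acquired_at: datetime
    expires_at: datetime
    mode: LeaseMode
    resource_scope: tuple[str, ...]

    def is_held_by(self, principal_id: str, now: datetime) -> bool:
        return (principal_id == self.holder_principal_id
                and self.acquired_at <= now < self.expires_at)

    def permits_write(self, principal_id: str, now: datetime) -> bool:
        # Assumption: every mode except readonly allows workspace writes.
        return (self.is_held_by(principal_id, now)
                and self.mode is not LeaseMode.READONLY)


t0 = datetime(2026, 3, 15, 9, 0, tzinfo=timezone.utc)
lease = WorkspaceLease("lease-1", "task-1", "attempt-1", "ws-1", "/work/ws-1",
                       "principal:runner", t0, t0 + timedelta(minutes=30),
                       LeaseMode.MUTABLE, ("/work/ws-1",))
print(lease.permits_write("principal:runner", t0 + timedelta(minutes=5)))   # True
print(lease.permits_write("principal:runner", t0 + timedelta(minutes=45)))  # False
```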
17.2 Environment Capture¶
Important attempts SHOULD capture environment facts needed for explainability:
-
OS
-
shell
-
cwd
-
relevant environment variables from an allowlist
-
network mode
-
tool versions when material
-
repo HEAD when material
-
interpreter/runtime version
These may be referenced via an environment.snapshot artifact.
17.3 Secret Handling¶
Secrets MUST be handled by reference where possible.
Rules:
-
raw secret values MUST NOT be inserted into context packs by default
-
executors SHOULD resolve secret handles at execution time
-
receipts and artifacts MUST redact secret material
-
policy MAY allow controlled secret exposure only under explicit profile constraints
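The first three rules can be illustrated together: context carries only a handle, the executor resolves it at the last moment, and the receipt stores a redacted transcript. Everything named here (`SECRET_STORE`, the `secret://` handle scheme, `resolve_handle`, `redact`, `run_step`) is hypothetical and stands in for whatever secret backend an implementation uses.

```python
SECRET_STORE = {"secret://ci/token": "s3cr3t-value"}  # stand-in for a real vault


def resolve_handle(handle: str) -> str:
    """Resolve a handle to its raw value at execution time only."""
    return SECRET_STORE[handle]


def redact(text: str, secret_values: list[str]) -> str:
    """Replace raw secret material before it reaches a receipt or artifact."""
    # Longest first, so a secret containing a shorter one is fully masked.
    for value in sorted(secret_values, key=len, reverse=True):
        if value:
            text = text.replace(value, "[REDACTED]")
    return text


def run_step(handle: str) -> dict:
    token = resolve_handle(handle)  # the raw value exists only in this scope
    transcript = f"called API with token={token}"
    return {"secret_ref": handle, "transcript": redact(transcript, [token])}


receipt = run_step("secret://ci/token")
print(receipt["transcript"])  # called API with token=[REDACTED]
```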
18. Receipts, Replay, and Rollback¶
18.1 Receipt Classes¶
Suggested receipt classes:
-
tool_execution
-
command_execution
-
publish
-
memory_promotion
-
approval_resolution
-
rollback
-
observation_resolution
18.2 Receipt Requirements¶
Each receipt MUST answer:
-
what was intended
-
what was authorized
-
what actually ran
-
in which environment it ran
-
what changed
-
what outputs were produced
-
what was observed afterward
-
whether rollback is supported
18.3 Verifiability Levels¶
Suggested verifiability levels:
-
hash_only
-
hash_chained
-
signed
-
signed_with_inclusion_proof
A base v0.1 implementation MUST support at least hash_only. A stronger implementation SHOULD support signed receipts and hash-linked task streams.
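The difference between the first two levels can be shown in a few lines: `hash_only` digests each receipt body on its own, while `hash_chained` also folds in the previous digest so that altering an earlier receipt changes every later hash. This sketch assumes SHA-256, canonical JSON serialization, and an all-zero genesis value; the spec does not fix any of these choices.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed anchor for the first receipt in a stream


def canonical(body: dict) -> bytes:
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def hash_only(body: dict) -> str:
    return hashlib.sha256(canonical(body)).hexdigest()


def hash_chained(body: dict, prev_hash: str) -> str:
    return hashlib.sha256(bytes.fromhex(prev_hash) + canonical(body)).hexdigest()


def chain(bodies: list[dict]) -> list[str]:
    """Return the chained digest for each receipt body, in order."""
    hashes, prev = [], GENESIS
    for body in bodies:
        prev = hash_chained(body, prev)
        hashes.append(prev)
    return hashes


receipts = [{"seq": 1, "action": "write_local"},
            {"seq": 2, "action": "vcs_mutation"}]
sealed = chain(receipts)
tampered = [{"seq": 1, "action": "read_local"}, receipts[1]]
print(hash_only(tampered[1]) == hash_only(receipts[1]))  # True
print(chain(tampered)[1] == sealed[1])                   # False
```

The second receipt is untouched, so its standalone hash still matches, but its chained hash does not. That is the property a hash-linked task stream adds over `hash_only`.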
18.4 Receipt Bundles¶
A receipt.bundle artifact SHOULD include:
-
canonical receipt body
-
referenced input and output hashes
-
environment summary
-
policy result hash
-
approval packet hash
-
capability grant hash
-
decision ref
-
replay metadata
-
rollback metadata when applicable
18.5 Replay Classes¶
A receipt SHOULD declare one of:
-
deterministic_replay
-
idempotent_replay
-
observe_only
-
explain_only
Replayability does not require byte-identical re-execution for every step. It requires either replayable inputs or an explainability bundle that reconstructs the causal chain.
18.6 Rollback¶
Rollback support may be partial.
If an action is not rollbackable, the receipt MUST say so. If an action is rollbackable, the receipt SHOULD include:
-
rollback method
-
rollback prerequisites
-
rollback artifact refs
-
rollback result when executed
Rollback itself is an important action and requires its own receipt.
19. Failure, Recovery, and Idempotency¶
19.1 Crash Before Durable Commit¶
If the process crashes before an event commit, no durable state change is assumed.
19.2 Crash After Commit but Before Dispatch¶
If authorization has been durably recorded but the executor has not run, the attempt MUST resume from the last checkpoint and MUST NOT emit a duplicate grant unnecessarily.
19.3 Crash During or After Dispatch¶
If the executor may have run but outcome persistence is incomplete, the kernel MUST enter observing semantics.
The kernel MUST prefer:
-
idempotent re-query of executor state
-
target-system observation
-
durable executor transcript reconciliation
-
human resolution when needed
19.4 Unknown Outcome¶
If the system cannot determine whether an important side effect occurred, it MUST issue a receipt with an uncertainty-bearing result_code, such as unknown_outcome, and block unsafe automatic replay.
Unknown outcome is not a silent failure mode. It is a first-class state that requires resolution.
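The observing preference order and the unknown-outcome fallback can be sketched as a chain of probes tried in turn. The `Probe` type, the `resolve_outcome` function, and the `succeeded` and `not_executed` result codes are assumptions; only `unknown_outcome` comes from the text above.

```python
from typing import Callable, Optional

# A probe returns True (effect happened), False (it did not), or None (cannot tell).
Probe = Callable[[], Optional[bool]]


def resolve_outcome(probes: list[Probe]) -> dict:
    """Try probes in preference order; fall back to an explicit unknown outcome."""
    for probe in probes:
        observed = probe()
        if observed is True:
            return {"result_code": "succeeded", "auto_replay_allowed": False,
                    "needs_human_resolution": False}
        if observed is False:
            return {"result_code": "not_executed", "auto_replay_allowed": True,
                    "needs_human_resolution": False}
    return {"result_code": "unknown_outcome", "auto_replay_allowed": False,
            "needs_human_resolution": True}


# Preference order: executor re-query, target observation, transcript reconciliation.
probes = [lambda: None, lambda: None, lambda: None]
print(resolve_outcome(probes)["result_code"])  # unknown_outcome
```

Automatic replay is only permitted when a probe positively establishes that the side effect did not occur; an inconclusive result blocks it.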
19.5 Retry Rules¶
A retry MUST respect action class:
-
readonly and pure computations MAY replay more freely
-
effectful or destructive actions MUST rely on idempotency, observation, or renewed decision authority
-
stale approvals MUST NOT silently carry over without revalidation when witness or policy drift occurred
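Read as a decision procedure, these rules might look like the following. The function name, the return labels, and the `pure_computation` class are hypothetical; the snake_case action classes are an assumed encoding of the action classes in section 16.4.

```python
REPLAY_FREELY = {"read_local", "network_read", "pure_computation"}


def retry_action(action_class: str, *, idempotent: bool,
                 observed_not_applied: bool, drift_detected: bool) -> str:
    """Pick a retry path that respects the action class."""
    if action_class in REPLAY_FREELY:
        return "replay"
    if drift_detected:
        return "revalidate"        # a stale approval must not carry over
    if idempotent or observed_not_applied:
        return "replay"            # safe through idempotency or observation
    return "request_decision"      # renewed decision authority


print(retry_action("vcs_mutation", idempotent=False,
                   observed_not_applied=False, drift_detected=False))
# request_decision
```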
19.6 Expiry and Drift¶
Approvals, grants, and witnesses may expire independently.
On resume, the kernel MUST check:
-
approval expiry
-
grant expiry
-
witness validity
-
policy profile version compatibility
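The four resume checks are independent, so a resume routine can collect every blocker rather than stopping at the first. In this sketch the function name and blocker labels are hypothetical, and "version compatibility" is simplified to string equality, which a real implementation may relax.

```python
from datetime import datetime, timedelta, timezone


def resume_blockers(*, approval_expires_at: datetime, grant_expires_at: datetime,
                    witness_valid: bool, approved_profile_version: str,
                    current_profile_version: str, now: datetime) -> list[str]:
    """Return every reason the attempt may not resume; empty means clear."""
    blockers = []
    if now >= approval_expires_at:
        blockers.append("approval_expired")
    if now >= grant_expires_at:
        blockers.append("grant_expired")
    if not witness_valid:
        blockers.append("witness_invalid")
    if approved_profile_version != current_profile_version:
        blockers.append("policy_profile_changed")
    return blockers


now = datetime(2026, 3, 15, 12, 0, tzinfo=timezone.utc)
print(resume_blockers(
    approval_expires_at=now + timedelta(hours=1),
    grant_expires_at=now - timedelta(minutes=1),
    witness_valid=True,
    approved_profile_version="repo_mutation_gated@1",
    current_profile_version="repo_mutation_gated@1",
    now=now,
))  # ['grant_expired']
```

Because approvals and grants expire independently, the example shows a valid approval paired with an expired grant: the attempt needs a fresh grant, not a fresh approval.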
20. Concurrency and Consistency¶
20.1 Default Consistency Model¶
Hermit v0.1 uses single-writer per task semantics by default.
This means:
-
only one orchestration path may durably mutate a task at a time
-
projections may lag
-
authorization decisions MUST rely on the event log, not on stale projections
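One common way to approximate single-writer semantics over an append-only log is optimistic concurrency: each append states the stream version it expects, and a writer holding a stale view is rejected. The spec does not prescribe this mechanism. The sketch below is an in-memory, non-thread-safe illustration with hypothetical class and event type names.

```python
class StaleWriterError(Exception):
    """Raised when a writer's view of the task stream is out of date."""


class TaskEventLog:
    def __init__(self) -> None:
        self._streams: dict[str, list[dict]] = {}

    def version(self, task_id: str) -> int:
        return len(self._streams.get(task_id, []))

    def append(self, task_id: str, expected_version: int, event: dict) -> int:
        """Append only if the caller has seen the latest event; return new version."""
        stream = self._streams.setdefault(task_id, [])
        if len(stream) != expected_version:
            raise StaleWriterError(
                f"{task_id}: expected version {expected_version}, log is at {len(stream)}")
        stream.append(event)
        return len(stream)


log = TaskEventLog()
v = log.append("task-1", 0, {"type": "task.created"})
log.append("task-1", v, {"type": "step.scheduled"})
try:
    log.append("task-1", v, {"type": "step.scheduled"})  # second writer, stale view
except StaleWriterError as exc:
    print("rejected:", exc)
```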
20.2 Parallel Steps¶
Parallel execution MAY be supported only when all of the following hold:
-
the task is marked as parallelizable
-
steps are dependency-independent
-
resource scopes are disjoint or explicitly lockable
-
policy permits concurrent execution
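The four conditions combine into a single predicate over a pair of steps. The `Step` record and `may_run_in_parallel` function below are hypothetical, and dependency independence is simplified to direct dependencies only; a full check would walk the transitive dependency graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    step_id: str
    depends_on: frozenset[str] = frozenset()
    resource_scope: frozenset[str] = frozenset()


def may_run_in_parallel(a: Step, b: Step, *, task_parallelizable: bool,
                        policy_permits: bool,
                        lockable: frozenset[str] = frozenset()) -> bool:
    """All four conditions must hold for the pair to run concurrently."""
    independent = a.step_id not in b.depends_on and b.step_id not in a.depends_on
    overlap = a.resource_scope & b.resource_scope
    scopes_ok = overlap <= lockable  # disjoint, or every shared resource is lockable
    return task_parallelizable and independent and scopes_ok and policy_permits


lint = Step("lint", resource_scope=frozenset({"/repo/src"}))
docs = Step("docs", resource_scope=frozenset({"/repo/docs"}))
fmt = Step("format", resource_scope=frozenset({"/repo/src"}))
print(may_run_in_parallel(lint, docs, task_parallelizable=True, policy_permits=True))  # True
print(may_run_in_parallel(lint, fmt, task_parallelizable=True, policy_permits=True))   # False
```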
20.3 Leases and Locks¶
Mutable attempts SHOULD use leases with expiry. A lost lease MUST prevent silent continuation.
20.4 Projection Lag¶
Projection lag is acceptable for read views. Projection lag MUST NOT authorize side effects.
21. Supervision and Trust Surface¶
The supervision surface should support:
-
viewing live task state
-
viewing ready, running, and blocked steps
-
inspecting attempts and checkpoints
-
inspecting active approvals
-
inspecting active grants
-
inspecting artifact lineage
-
inspecting context pack manifests
-
inspecting belief revisions
-
inspecting memory promotions and invalidations
-
inspecting decision chains
-
inspecting receipts and proof bundles
-
triggering supported resume, cancel, retry, approve, deny, revoke-grant, and rollback operations
The supervision surface should answer these questions quickly:
-
What is Hermit doing right now?
-
Why is it doing that?
-
What authority allows it?
-
What evidence is it using?
-
What exactly was sent to the model?
-
What is blocked?
-
What changed?
-
Can this be undone?
-
Can this be verified independently?
22. Compatibility with Current Hermit¶
This section maps current modules to their target role in the new kernel.
-
runner Target role: ingress adapter boundary and attempt execution facade, not primary state authority.
-
runtime Target role: model interaction loop inside the durable execution engine.
-
scheduler Target role: task creation, wake-up, and step selection source, not a direct agent invoker.
-
session Target role: projection for chat UX and compatibility, not source of truth.
-
plugin and tool registry Target role: action request normalizer plus executor registry behind policy and grants.
22.1 Required Structural Shift¶
Current Hermit centers execution around:
-
session lookup
-
prompt assembly
-
message history
-
runtime loop
-
direct tool dispatch
Kernel v0.1 centers execution around:
-
task creation or resume
-
step scheduling
-
step attempt creation
-
context compilation from artifacts, working state, beliefs, and memory
-
action request normalization
-
policy and approval gates
-
capability grant issuance
-
receipt issuance
-
projection rebuild from events
23. Suggested Module Layout¶
The target module map is:
src/hermit/control/
hermit/tasks/
hermit/steps/
hermit/execution/
hermit/events/
hermit/artifacts/
hermit/evidence/
hermit/working_state/
hermit/beliefs/
hermit/memory/
hermit/decisions/
hermit/policy/
hermit/approvals/
hermit/capabilities/
hermit/receipts/
hermit/proofs/
hermit/workspaces/
hermit/context/
hermit/supervision/
hermit/projections/
hermit/identity/
This layout is conceptual for v0.1. Exact file placement may change, but module boundaries should preserve the same semantics.
24. Suggested Persistent Records¶
v0.1 should at minimum support durable records for:
-
principals
-
tasks
-
steps
-
step attempts
-
events
-
artifacts
-
working state snapshots or patches
-
beliefs
-
memory records
-
decisions
-
approvals
-
capability grants
-
receipts
-
workspace leases
-
task projections
-
step projections
The event log is primary. Other tables or collections may be projections, indexes, or specialized stores.
25. Conformance Profiles¶
To keep v0.1 implementable, the spec defines three conformance profiles.
25.1 Core Profile¶
A Core implementation must provide:
-
task-first ingress
-
step and step attempt semantics
-
append-only event-backed durable state
-
artifact-native context compilation
-
working state, belief, and memory separation
-
no direct model-to-tool execution
-
receipts for important actions
-
session as projection
25.2 Governed Profile¶
A Governed implementation adds:
-
policy profiles
-
action classification
-
approval blocking and resume
-
capability grants
-
witness revalidation for delayed high-risk actions
-
no ambient authority for effectful execution
25.3 Verifiable Profile¶
A Verifiable implementation adds:
-
hash-linked task event streams
-
sealed receipt bundles
-
verifiability metadata on receipts
-
optional signatures or inclusion proofs
-
exportable proof artifacts
A system may claim Hermit Kernel v0.1 Core, Core + Governed, or Core + Governed + Verifiable.
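The three permitted claims are cumulative, so a claim is valid only when it forms a prefix of the profile ladder. A minimal check, assuming lowercase profile identifiers and a hypothetical `is_valid_claim` helper:

```python
LADDER = ("core", "governed", "verifiable")


def is_valid_claim(profiles: set[str]) -> bool:
    """A claim must be a non-empty prefix of the ladder, in order."""
    n = len(profiles)
    return 0 < n <= len(LADDER) and profiles == set(LADDER[:n])


print(is_valid_claim({"core", "governed"}))    # True
print(is_valid_claim({"core", "verifiable"}))  # False
```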
26. Security and Trust Posture¶
The kernel is designed around zero-trust memory and governed execution.
Security-relevant principles:
-
memory is not trusted merely because it exists
-
evidence and trust tier matter
-
models do not hold execution authority
-
policy runs before side effects
-
approvals are part of execution semantics
-
capability grants bound authority to scope and time
-
delayed actions must revalidate world state when required
-
receipts exist for auditability and post-hoc proof
-
rollback metadata is explicit, not implied
This is a trust model, not a UI enhancement.
27. Exit Criteria for v0.1¶
Hermit Kernel Spec v0.1 should be considered materially implemented only when all of the following are true:
-
Every ingress path creates or resumes a task.
-
Durable state changes are event-backed.
-
Steps and step attempts are recoverable without rerunning a whole conversation by default.
-
Direct model-to-tool execution is removed from the kernel path.
-
Context packs are compiled from artifacts, working state, beliefs, and memory rather than raw transcript alone.
-
Policy gates high-risk actions before execution.
-
Approvals can pause and resume the same attempt.
-
High-risk delayed actions revalidate witness state before execution.
-
Effectful execution runs through scoped authority rather than ambient authority alone.
-
Durable memory writes require evidence and trust metadata.
-
Important actions emit receipts.
-
Unknown side-effect outcomes are surfaced explicitly and not silently replayed.
-
Session history is demoted to a projection.
-
The supervision surface can explain what happened, why, with what evidence, and under what authority.
-
Every free-form ingress is durably recorded before task mutation.
-
Conversation focus and ingress binding are projection-rebuildable from durable ingress records plus events.
28. Summary¶
Hermit Kernel v0.1 is defined as:
A local-first agent kernel where durable tasks advance through recoverable step attempts over an event log, compile artifact-native context from bounded working state plus evidence-backed beliefs and memory, treat model output as proposals rather than execution authority, gate risky actions through policy, approval, witness revalidation, and scoped capability grants, and close the trust loop with receipts, replay metadata, and rollback semantics.
That definition is the architectural contract for the next kernel.