Changelog
All notable changes to cubby.network are documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
Pre-1.0: breaking changes may land in minor-version bumps (0.MAJOR.MINOR). At 1.0 the API surface freezes and subsequent breaks require a major bump.
[Unreleased]
No unreleased changes.
[0.2.0-beta.1] — 2026-04-23
First external beta. Three adversarial audit cycles closed (28 findings across two LLM audits and one human audit). CI is now strict and blocking. Operator onboarding and rollback runbooks shipped.
Security
- Plan-hash binding. `canonical_plan_payload` now mixes in `intent_type` + `workflow_pack_version`, so captured approvals can't be replayed onto a semantically different change.
- Rejector veto removed. A single rejector can no longer block a legitimate change; rejections are group-scoped with a quorum symmetric to approvals.
- Rewrite policy escalation. Injection-sanitise path now enforces `rewritten_keys ⊆ original_keys`. Keeps malicious rewrite policies from smuggling new privileged fields.
- Unicode injection bypass. NFKD-normalise + `Cf`-to-space replacement catches zero-width-space and fullwidth-glyph evasions.
- Snapshot credential leak. Secret scanner extended with 9 HIGH-severity network-device patterns (SNMP community, BGP password, enable secret, TACACS/RADIUS key, IKE PSK, JunOS auth-key, Cisco type-7, key-string).
- OIDC algorithm confusion. Validator enforces an explicit asymmetric-only allowlist (RS256/RS384/RS512, ES256/ES384, PS256). `alg=none` and HS* rejected outright on the OIDC path.
- OIDC JWKS spoofing. 15-minute TTL on cached JWKS, HTTPS-only scheme, falls back to cache on transient fetch failure.
- Workflow race. Per-intent `asyncio.Lock` + new `get_or_create` helper prevent concurrent callers from driving the same workflow twice.
- Signer key_id enumeration. User-facing "failed signer verification" message is now generic; detail stays in the log.
- Metadata DoS. 64 KB body cap on both declared `Content-Length` and streamed requests (chunked transfer can't bypass).
- Runtime rewrite enforcement. `_execute_tool_call` now honours the `SafetyGate.review()` verdict and executes rewritten args. Previously `authorize()` silently discarded them, making the injection-sanitise path dead code.
- Anthropic system-prompt boundary. `agent.system_prompt` goes through the real `system` parameter; tool output returns as `tool_result` content blocks, not stuffed inside a user-role JSON blob.
- Production fails fast on simulated adapters. `build_demo_harness` derives `allow_simulated` from `NETOPS_ENV`; a production boot refuses to register any simulated plugin.
- Web-research injection scan. Every hit's title + snippet is run through the same scanner that guards persistent memory before persisting to the wiki store. Poisoned hits are dropped and surfaced as `refused_hits`.
- Route RBAC. `/runbooks/evaluate` and `/events/webhook` require the `network-operator` role. `make_role_dependency` factory added for future gates.
- Probe info disclosure. `/livez` stays unauth and minimal; `/health` and `/readyz?detail=1` expose plugin / signer / backend state only to authenticated callers.
- Shared-secret CAB. Boot warning is ERROR-logged + printed to stderr; production boot raises `RuntimeError` unless `NETOPS_CAB_ACKNOWLEDGE_SHARED_SECRET=1` is set.
- Autonomy privilege laundering fixed. `/autonomy/incident-loop` now propagates the caller's identity and roles into the cascaded drift remediation (previously forged `system:autonomy` / `team-lead`). Cascade path escalates risk to HIGH to force CAB sign-off.
- Policy fail-closed. Triage, drift, and capacity workflows check `policy_decision.allowed` and return `FAILED` before any side-effectful work (ticketing, evidence writes, remediation cascade).
- Verify contract unified across SDK, router, simulator, real adapter, and engine. Previously dropped credentials; verification was silently unauthenticated.
- Evidence scan-before-write. Secret scanner runs against the in-memory payload first; on a HIGH finding, nothing touches disk.
- OAuth refresh HTTPS-only. `NETOPS_CODEX_TOKEN_URL` must be `https://`; rejected at construction.
- Artifact path traversal. `LocalFsArtifactStore._path` resolves the candidate and uses `Path.relative_to` for containment. Catches symlink-escape that textual `..` checks miss.
- `.env.example` ships empty `NETOPS_EVIDENCE_LEGACY_KEY_IDS` + `NETOPS_EVIDENCE_CHAIN_RESET_BUNDLE_IDS` defaults instead of non-empty values that normalised bypass.
- CI security job is blocking. `pip-audit . --strict` + `bandit -ll` against the real dep graph, with no `|| true` swallows.
- Docker compose dev-only posture. Every service binds to `127.0.0.1`; stock credentials replaced with `${*:-CHANGE_ME_*}` placeholders. README warns that compose is not a production baseline.
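For readers unfamiliar with the "Unicode injection bypass" entry above, the following is a minimal sketch of NFKD normalisation followed by `Cf`-to-space replacement. The function name `normalise_for_scan` is hypothetical and is not taken from the codebase; the real scanner may differ in detail.

```python
import unicodedata


def normalise_for_scan(text: str) -> str:
    """Fold compatibility glyphs and neutralise invisible format characters.

    Hypothetical helper for illustration only.
    """
    # NFKD maps compatibility forms (e.g. fullwidth Latin letters) to their
    # plain equivalents.
    decomposed = unicodedata.normalize("NFKD", text)
    # Characters in general category Cf (format), such as U+200B ZERO WIDTH
    # SPACE, are invisible; replace each one with an ordinary space.
    return "".join(
        " " if unicodedata.category(ch) == "Cf" else ch for ch in decomposed
    )
```

Running the scanner on the normalised string rather than the raw input means a fullwidth or zero-width variant of a blocked phrase is inspected in its visible, plain form.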
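The `rewritten_keys ⊆ original_keys` rule from the "Rewrite policy escalation" entry can be sketched as a set difference. This is an illustration under the assumption that tool arguments are flat dictionaries; `assert_no_new_keys` is a hypothetical name, not the project's API.

```python
from typing import Any, Mapping


def assert_no_new_keys(
    original: Mapping[str, Any], rewritten: Mapping[str, Any]
) -> None:
    """Raise if a rewrite adds keys that the original arguments did not have.

    Hypothetical helper: a rewrite may change or drop values, but it must not
    introduce fields the caller never supplied.
    """
    extra = set(rewritten) - set(original)
    if extra:
        raise ValueError(f"rewrite introduced new keys: {sorted(extra)}")
```

A rewrite that only sanitises existing values passes; one that adds, say, a privileged flag is rejected before execution.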
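The "Metadata DoS" entry describes a cap that applies to both a declared `Content-Length` and a streamed body. A minimal, framework-agnostic sketch of that idea follows; `read_capped`, `BodyTooLarge`, and the chunk-iterator interface are assumptions made for illustration, and only the 64 KB default comes from this changelog.

```python
from typing import Iterable, Optional

MAX_BODY_BYTES = 64 * 1024  # 64 KB, matching the documented default


class BodyTooLarge(Exception):
    """Raised when a request body exceeds the configured cap (hypothetical)."""


def read_capped(
    chunks: Iterable[bytes],
    declared_length: Optional[int] = None,
    limit: int = MAX_BODY_BYTES,
) -> bytes:
    # Reject early when the client declares an oversized body.
    if declared_length is not None and declared_length > limit:
        raise BodyTooLarge(f"declared length {declared_length} exceeds {limit}")
    total = 0
    parts = []
    for chunk in chunks:
        # Count bytes as they arrive so chunked transfer cannot bypass the cap.
        total += len(chunk)
        if total > limit:
            raise BodyTooLarge(f"streamed body exceeds {limit} bytes")
        parts.append(chunk)
    return b"".join(parts)
```

Checking the running total inside the loop is what closes the chunked-transfer gap: a request with no `Content-Length` header is still cut off once it crosses the limit.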
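The "Artifact path traversal" entry relies on resolving the candidate path and then testing containment with `Path.relative_to`. The sketch below shows that general technique with the standard library; `contained_path` is a hypothetical stand-in and does not reproduce `LocalFsArtifactStore._path`.

```python
from pathlib import Path


def contained_path(root: Path, name: str) -> Path:
    """Return the resolved path for `name`, or raise if it escapes `root`.

    Hypothetical helper for illustration only.
    """
    root_resolved = root.resolve()
    # resolve() collapses ".." segments and follows any symlinks that exist,
    # so the containment check sees the real destination.
    candidate = (root_resolved / name).resolve()
    try:
        # relative_to raises ValueError when candidate is not under the root.
        candidate.relative_to(root_resolved)
    except ValueError:
        raise ValueError(f"artifact path escapes store root: {name!r}") from None
    return candidate
```

Because the check runs on the resolved path, a symlink inside the store that points elsewhere fails containment even though its textual path contains no `..`.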
Added
- `docs/OPERATOR_GUIDE.md`: env-var matrix, demo-vs-production posture, real-adapter wiring, per-approver Ed25519 upgrade path, auth upgrade, secrets custody, "ready for a second human" checklist.
- `docs/ROLLBACK.md`: how to recover when a change leaves the network in a bad state. Covers self-rollback, stuck-workflow recovery, false-success triage, evidence-chain recovery.
- `QUICKSTART.md`: 30-minute path from clone to first signed change against a real Nokia SR Linux lab.
- `cubby smoke` now reports selected runtime, simulated-vs-real device mix, signer state, and a readiness verdict.
- `cubby config`: renders the resolved `RuntimeConfig` with per-field set-vs-default state; redacts sensitive values.
- `api_max_body_bytes` runtime config knob (default 64 KB).
- `cab_acknowledge_shared_secret` runtime config knob for production deployments that accept the shared-secret limitation.
- `ClaudeAgentRuntime` has 7 unit tests pinning the message shape (system channel, `tool_result` blocks, `is_error` flagging, convergence bound) plus a live-API contract test gated on `NETOPS_LLM=1`.
- Live Nokia SR Linux devicelab validated end-to-end (snapshot, CAB-signed change, rollback-on-verification-failure, evidence chain verify).
Changed
- Agent runtime resolution order: `ANTHROPIC_API_KEY` > `OPENAI_API_KEY` > Codex CLI OAuth > mock. Claude Opus 4.7 is the default model.
- Evidence scanner now catches network-device credentials (SNMP community, BGP MD5, enable secret, TACACS/RADIUS key, IKE PSK, JunOS authentication-key, Cisco type-7, key-string).
- Ruff ignore list curated for a Python 3.9+ codebase; format-string rules are deliberately off.
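The agent runtime resolution order listed above can be read as a first-match chain. The sketch below illustrates that precedence only; `resolve_runtime`, the returned labels, and the `codex_oauth_available` flag are hypothetical and do not describe how the project actually detects a Codex CLI OAuth session.

```python
import os
from typing import Mapping, Optional


def resolve_runtime(
    env: Optional[Mapping[str, str]] = None,
    codex_oauth_available: bool = False,
) -> str:
    """Pick the first available runtime in the documented precedence order."""
    env = os.environ if env is None else env
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    # Assumption: the caller has already checked for a Codex CLI OAuth session.
    if codex_oauth_available:
        return "codex-cli"
    return "mock"
```

With both API keys set, the Anthropic runtime wins; with neither key and no OAuth session, the mock runtime is selected.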
Fixed
- Three latent bugs: `ValidationFinding` TypeError (wrong kwarg), CAB duplicate-approver counting (same signer counted twice toward quorum), undefined `p` in `cdn_placement.total_offload_rps`.
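The duplicate-approver bug class mentioned above is typically avoided by counting distinct signers rather than raw approvals. A minimal sketch, assuming each approval is a `(signer_id, signature_valid)` pair; `quorum_met` and that tuple shape are hypothetical and not the project's data model.

```python
from typing import Iterable, Tuple


def quorum_met(approvals: Iterable[Tuple[str, bool]], required: int) -> bool:
    """Return True when enough distinct signers have valid approvals."""
    # A set collapses repeated approvals from the same signer into one vote.
    distinct_signers = {signer_id for signer_id, valid in approvals if valid}
    return len(distinct_signers) >= required
```

Two approvals from the same signer count once, so a quorum of two needs two different signers.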
Prior history
Pre-0.2 releases were not tagged. Major milestones in main history, in order:
| Commit | Summary |
|---|---|
| 9eefe8d | 9 findings from the first adversarial audit: safety, wiring, observability. |
| e0edc53 | 6 autonomous work items closed; full CAB-signed change validated on live Nokia SR Linux. |
| 0266ff5 | Devicelab validated end-to-end against real Nokia SR Linux. |
| 53a4273 | tests/devicelab: lab-agnostic smoke suite + Containerlab topology. |
| 733f29b | Lifted patterns from NetClaw, Hermes, obsidian-wiki, GIRA paper, Angler. |
| 9463bbe | Enforce signed CAB approvals, migrate knowledge to wiki, expand agent layer. |
| 8ce28cd | Vendor doc wiki: 35 docs, 6 vendors. |
| bd53718 | Generic LLM-driven change workflow + operation card wiki. |
| a74ae3b | P1/P2 security hardening pass. |
| 87ed550 | MVP packaging, transports, vendor doc vault, event ingestors. |
| 1067668 | Initial commit. |
[Unreleased]: https://github.com/aethon-network/platform/compare/v0.2.0-beta.1...HEAD
[0.2.0-beta.1]: https://github.com/aethon-network/platform/releases/tag/v0.2.0-beta.1