Pattern-Learning-Bridge: How SI-Core Actually Learns From Its Own Failures
Draft v0.1 — Non-normative supplement to SI-Core / SI-NOS / Goal-Native Algorithms / Failure & Rollback Patterns
This document is non-normative. It describes one way to implement structural learning on top of SI-Core: not by updating model weights, but by improving policies, compensators, SIL code, and goal structures using the evidence already collected by the runtime.
1. What “learning” means in SI-Core
In most of today’s stacks, “learning” means:
- collect logs
- pick some loss
- fine-tune a model
- re-deploy
It’s powerful, but:
- the reason for the change is often opaque,
- the change is not structurally localized (“some weights moved somewhere”),
- governance has to work backwards from metrics and incidents.
In SI-Core, the runtime already gives you structured evidence:
- jump logs with [OBS][ID][ETH][EVAL][MEM]
- Effect Ledgers with RML-2/3 compensators
- EthicsTrace streams with policy versions and viewpoints
- Goal-Native GCS vectors for each decision
- Failure Traces for rollbacks and incidents
So “learning” at the SI-Core level is:
Upgrading the structures that shape behavior, using the runtime’s own evidence.
Concretely, the things that change are:
- Ethics policies (rules, thresholds, viewpoint priorities)
- Saga / compensator logic for effectful operations (RML-2/3 plans)
- Goal models & GCS estimators (what we measure, how we approximate it)
- Semantic compression policies (ε-budgets and thresholds)
- SIL code in DET/CON/GOAL layers (core decision logic)
The question is: how do we update these safely, traceably, and continuously?
That is the job of the Pattern-Learning-Bridge (PLB).
2. Where the Pattern-Learning-Bridge lives
At a high level, PLB sits between runtime evidence and governed code/policy:
┌───────────────┐
│ SI-Core │
│ (live ops) │
└─────┬─────────┘
│
│ [Structured logs & metrics]
│ - Jump logs ([OBS][ID][ETH][EVAL][MEM])
│ - Effect Ledger / Failure Traces
│ - GCS / goal metrics
│ - Telemetry (CAS, SCI, SCover, EAI, EOA, EOH, RBL, RIR, ACR)
▼
┌──────────────────────┐
│ Pattern-Learning- │
│ Bridge (PLB) │
│ - mines patterns │
│ - proposes patches │
│ - validates in │
│ sandbox │
└───────┬──────────────┘
│
│ candidate changes
│ - ethics policy deltas
│ - compensator upgrades
│ - SIL patches
│ - goal model configs
▼
┌───────────────────────┐
│ Human + Conformance │
│ Kit + CI/CD │
│ - review/approve │
│ - run golden-diff │
│ - deploy new │
│ versions │
└───────────────────────┘
Key properties:
PLB never edits live systems directly.
Instead, it proposes versioned deltas:
- ETH-FLOOD-003@v2.1 → v2.2
- payment_saga@v1.0 → v1.1
- city_gate.sil patch v3.4 → v3.5
Every proposal is:
- backed by concrete evidence (incidents, metric drift, pattern clusters),
- validated in simulation / sandbox,
- passed through the conformance kit (golden-diff, SCover, CAS checks),
- and then routed through the same CI/CD pipeline as human-written changes.
PLB is not a magical “self-improvement daemon”. It’s a governed proposal engine that is allowed to look at structured logs and suggest specific structural upgrades.
3. What the PLB reads: evidence sources
A minimal PLB instance normally consumes:
Jump Logs
- [OBS]: observation snapshots / semantic units used
- [ID]: actor, role, origin (self vs external)
- [ETH]: policy, decision, viewpoint_base, constraints
- [EVAL]: risk profile, sandbox usage
- [MEM]: audit chain references
Effect Ledger & Failure Traces
- which external effects were executed
- which compensators were invoked
- RML-level results (success/partial/failure)
- Failure taxonomy labels (transient vs persistent, logic vs system, etc.)
Goal Metrics & GCS vectors
- per-jump GCS values across goals
- floors and ceilings
- patterns of repeated under-performance
Semantic Compression Telemetry
- R_s (semantic compression ratio) per stream
- ε estimates per goal/stream
- GCS_full vs GCS_sem discrepancies
Core Metrics (SI Evaluation Pack)
- CAS: hash stability across DET runs
- SCI: contradictions per 1e6 events
- SCover: traced share of SIR/sirrev blocks
- EAI: ethics alignment index (pass ratio on effectful ops)
- EOA: ethics overlay availability (was ETH reachable/available when effectful ops needed it?)
- EOH: evaluation overhead (latency / share spent in ETH+EVAL gates; guardrail cost)
- RBL: p95 revert latency
- RIR: rollback integrity success rate
- ACR: audit completeness ratio
From these, PLB can detect where structure is misaligned:
- “This compensator fails too often in scenario X.”
- “This ethics policy yields systematic bias against group Y.”
- “This GCS estimator underestimates risk in region Z.”
- “This SIL function behaves non-deterministically under certain inputs.”
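As a non-normative illustration, the first of these detections could be expressed as a simple rule over Effect Ledger records. The record fields and thresholds below are hypothetical:

from collections import defaultdict

def flag_failing_compensators(ledger_records, min_invocations=20, max_failure_rate=0.05):
    """Flag compensators whose failure rate in a given scenario context is too high.

    `ledger_records` is assumed to be an iterable of dicts with
    'compensator_id', 'scenario_tag', and 'result' ('success' | 'partial' | 'failure').
    """
    stats = defaultdict(lambda: {"total": 0, "failed": 0})
    for rec in ledger_records:
        key = (rec["compensator_id"], rec["scenario_tag"])
        stats[key]["total"] += 1
        if rec["result"] != "success":
            stats[key]["failed"] += 1
    flags = []
    for (compensator_id, scenario_tag), s in stats.items():
        if s["total"] < min_invocations:
            continue  # not enough evidence; avoid chasing noise
        rate = s["failed"] / s["total"]
        if rate > max_failure_rate:
            flags.append({
                "compensator_id": compensator_id,
                "scenario_tag": scenario_tag,
                "failure_rate": rate,
                "evidence_count": s["total"],
            })
    return flags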
4. The Pattern-Learning loop
Conceptually, PLB runs a multi-stage loop:
Collect candidates
- Incidents with rollbacks
- Metric regressions (EAI drop, RIR below SLO, CAS drift)
- Clusters of similar failures or near-misses
- Recurrent human overrides / escalations
Mine patterns
- “Every time flood risk is high and traffic load is high, this compensator struggles.”
- “Property-damage goal GCS is consistently negative when this policy fires.”
- “Semantic compression ε for stream canal_sensors spikes before failures.”
Generate hypotheses & patch proposals
- “Increase weight of property damage in ETH-FLOOD-003 for sector 8.”
- “Add an extra rollback step before DB commit in payment_saga.”
- “Refactor the decide_gate_offset() SIL function to cap offset when level ≥ 1.8m.”
- “Tighten ε_max for semantic compression on specific high-risk channels.”
Validate in sandbox
Re-run historical scenarios with proposed changes:
- Do we fix the failures?
- Do we avoid new regressions?
- How do GCS vectors change?
Run golden-diff and SCover checks:
- Are we touching only the expected frames and IR nodes?
- Is structural coverage still acceptable?
Propose to human + conformance kit
Emit a Change Proposal Object:
{ "target": "ETH-FLOOD-003", "from_version": "v2.1", "to_version": "v2.2-candidate", "evidence": [...], "expected_effects": { "GCS.city.property_damage_minimization": "+0.15 (median)", "EAI": "-0.01 (more rejections; within floor)", "EOA": "unchanged", "EOH": "unchanged", "RIR": "unchanged" }, "risk_assessment": "LOW_MEDIUM", "test_summary": {...} }Human reviewers and the conformance kit decide:
- Accept / modify / reject
- Schedule rollout and monitoring plan
Roll-out & monitor
After deployment:
- track metrics specifically around the changed structures,
- feed new evidence back into PLB,
- close the loop.
From SI-Core’s point of view, this is just another governed change pipeline—but one that is continuously fed by runtime evidence.
5. Example 1 — Ethics policy self-improvement
Take the city flood-gate scenario seen in other docs.
5.1 The problem
Observed pattern over 3 months:
- Several incidents where localized property damage occurred
- Flood risk for hospitals was correctly minimized
- Ethics policy ETH-FLOOD-003@v2.1 focuses mainly on city.flood_risk_minimization and city.hospital_access
- Goal city.property_damage_minimization exists but is underweighted
Metrics show:
- GCS_flood is consistently positive (+0.6 ~ +0.8)
- GCS_property is often negative (-0.2 ~ -0.3)
- EAI remains high (no explicit ethics rule broken)
So nothing is “formally illegal,” but policy is mis-tuned.
5.2 What PLB sees
PLB mines incidents where:
- flood risk ≤ threshold (no hospital impact), but
- property damage in certain sectors is high, and
- decisions passed ETH-FLOOD-003 cleanly.
It clusters them by:
- sector(s) affected
- water level / trend profiles
- traffic patterns
- existing EthicsTrace entries
It discovers:
“When risk is moderate and localized to sectors 7–9, current ETH policy does not sufficiently penalize property damage.”
5.3 Proposed patch
PLB synthesizes a policy delta, conceptually:
- add a sector-specific override
- increase weight of city.property_damage_minimization in those sectors
- add logging hooks for evaluation
In pseudo-YAML:
policy: ETH-FLOOD-003
from_version: v2.1
to_version: v2.2-candidate
changes:
- add_constraint:
id: ETH-FLOOD-PROP-SECTOR7-9
applies_when:
sector in [7, 8, 9]
AND flood_risk in ["LOW", "MEDIUM"]
rule:
# Reject actions whose GCS on property_damage is too low
require GCS.city.property_damage_minimization >= -0.05
- adjust_weights:
goals:
city.property_damage_minimization:
multiplier: 1.4 # relative to v2.1
PLB also attaches:
- historical scenarios where this new constraint would have rejected the problematic actions
- sandbox simulation results on those scenarios
- a quick check that EAI stays above floor and that CAS/RIR do not regress (sketched below)
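A minimal sketch of that quick check, assuming hypothetical metric dictionaries and an illustrative EAI floor:

def check_guardrails(candidate_metrics, baseline_metrics, eai_floor=0.97):
    """Return guardrail violations for a candidate patch.

    `candidate_metrics` / `baseline_metrics` are dicts such as
    {"EAI": 0.981, "CAS": 0.999, "RIR": 0.995} taken from sandbox replay;
    the floor value here is illustrative, not normative.
    """
    violations = []
    if candidate_metrics["EAI"] < eai_floor:
        violations.append(f"EAI {candidate_metrics['EAI']:.3f} below floor {eai_floor}")
    for metric in ("CAS", "RIR"):
        if candidate_metrics[metric] < baseline_metrics[metric]:
            violations.append(f"{metric} regressed vs. baseline")
    return violations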
5.4 Governance
Human + conformance kit:
- review the change and evidence
- possibly adjust thresholds / multipliers
- run broader sandbox tests
- deploy with canary + extra monitoring on:
  - property damage stats
  - EAI, CAS, SCover
  - number of ETH rejections in those sectors
Over time, PLB keeps monitoring:
- Did the incidents stop?
- Did we unintentionally bias against certain neighborhoods?
- Do we need another iteration?
6. Example 2 — Improving a Saga compensator (RML-2)
Consider a payment system running under SI-Core with RML-2:
a multi-step saga:
- reserve funds
- record invoice
- notify downstream systems
There is a compensator set for each step.
6.1 The problem
Effect Ledger + Failure Traces show:
- occasional partial rollback:
  - funds successfully released
  - invoice record remains
  - downstream notification inconsistent
- RIR (Rollback Integrity Rate) dips to 0.93 for this saga
- Pattern: failures occur when a network blip hits between steps 2 and 3
6.2 What PLB sees
PLB clusters all failures for payment_saga:
- identifies that most are:
  - transient network issues, not logic bugs
  - during step 3 (“notify downstream”)
- checks the compensator coverage:
  - the compensator for step 3 is “replay notification”
  - there is no compensator for the invoice record if step 2 commits but step 3 fails
6.3 Proposed patch
PLB suggests:
- add an invoice “pending” state plus a compensator for step 2
- enforce idempotency keys across notifications
- ensure “pending” entries can be re-swept and finalized
In conceptual patch form:
target: "[email protected]"
to_version: "v1.1-candidate"
changes:
- modify_step:
step_id: 2
new_behavior:
- write invoice with status="PENDING"
- register compensator that marks PENDING invoices as "CANCELED" (tombstone) rather than deleting
- modify_step:
step_id: 3
new_behavior:
- send notification with idempotency_key = saga_id
- on success: update invoice status="CONFIRMED"
- add_rollback_rule:
on_failure:
- always run compensator for step 3 if partially applied
- if step 3 cannot be confirmed within T, run step 2 compensator
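To make the “pending” state and idempotency mechanics concrete, here is a minimal application-level sketch; all helper names (write_invoice, update_invoice_status, notifier.send) are hypothetical:

def execute_step_2(saga_id, invoice_data, db):
    """Write the invoice in PENDING state and return its compensator."""
    db.write_invoice(saga_id, invoice_data, status="PENDING")
    # Compensator: tombstone as CANCELED rather than delete, so the audit trail survives.
    return lambda: db.update_invoice_status(saga_id, status="CANCELED")

def execute_step_3(saga_id, notifier, db, timeout_s=60):
    """Notify downstream with an idempotency key; confirm the invoice on success."""
    # Re-sending with the same key is safe: downstream deduplicates on saga_id.
    notifier.send(payload={"saga_id": saga_id}, idempotency_key=saga_id, timeout_s=timeout_s)
    db.update_invoice_status(saga_id, status="CONFIRMED")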
Again, PLB backs it with:
- simulated sagas under injected network faults
- predicted RIR improvement, RBL impact, CAS and SCI effects
Human + conformance kit review, run golden-diff for the saga logic, and deploy if satisfied.
7. Example 3 — SIL code refactoring
In a SIL-enabled system, core logic lives in SIL programs. PLB can propose SIL patch candidates.
7.1 The problem
SIL function decide_gate_offset(level_mm, trend_mm_per_h):
- occasionally proposes large offsets when level is already near critical
- downstream constraints (CON layer) often reject and fall back to 0
- pattern: wasted decision cycles + noisy GCS behavior
7.2 What PLB sees
From SIR / .sirrev:
- identifies the IR nodes corresponding to the logic that produces offsets
- sees a repeated pattern where:
  - level_mm >= 1800
  - trend_mm_per_h > 20
  - the function still returns an offset > 40 cm
- checks that the CON layer then rejects as unsafe, causing repeated retries
7.3 Proposed SIL patch
PLB synthesizes a small SIL refactor:
@layer(DET)
fn validate_offset_cm(offset_cm:i32, level_mm:f32): bool {
// DET-side pre-check (policy-versioned snapshot) to reduce obvious CON rejects.
if level_mm >= 1800.0 && offset_cm > 30 {
return false;
}
if offset_cm < -50 || offset_cm > 50 {
return false;
}
true
}
@layer(DET)
fn decide_gate_safe(level_mm:f32, trend_mm_per_h:f32): i32 {
let offset_cm = estimate_gate_offset(level_mm, trend_mm_per_h);
if !validate_offset_cm(offset_cm, level_mm) {
// Fallback to conservative option
return 0;
}
return offset_cm;
}
PLB attaches:
- SIR-level diff (si-golden-diff output)
- SCover comparison before/after
- metrics showing fewer CON rejections, smoother GCS patterns
Human devs can:
- adjust the thresholds
- integrate with existing SIL project
- run full conformance kit before deployment
8. Safety boundaries: what PLB is not allowed to do
PLB is powerful and must be fenced.
Typical boundaries:
PLB cannot directly modify live systems.
- Only emits versioned change proposals.
- All actual changes go through normal CI/CD + conformance.
PLB cannot lower safety invariants without explicit approval.
- Any proposal that would:
  - lower ethics floors,
  - weaken RML level,
  - reduce audit completeness (ACR),
  - or change the definition/collection of the evaluation evidence (e.g., [OBS]/[EVAL] streams or metric semantics)
  must be marked as HIGH-RISK and require governance approval (a risk-classification sketch follows this list).
PLB cannot bypass SI-Core’s [OBS][ETH][MEM][ID][EVAL].
- It can only use what is audited.
- If something is not in audit, PLB does not get to invent it.
SIL / policy patches must preserve structural guarantees.
- golden-diff shows affected frames / nodes
- SCover not allowed to drop below declared thresholds
- CAS/EAI/RIR regressions must be explicitly called out
Human-in-the-loop remains mandatory for critical domains.
- Healthcare, finance, safety-critical infrastructure: PLB can suggest, not auto-merge.
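As a non-normative sketch, the HIGH-RISK rule above could be implemented as a simple classifier over Change Proposal Objects; the invariant tags and field names are hypothetical:

HIGH_RISK_INVARIANTS = {
    "ethics_floor", "rml_level", "acr_definition",
    "obs_stream_definition", "eval_stream_definition", "metric_semantics",
}

def classify_proposal_risk(proposal):
    """Escalate any proposal that touches a protected invariant.

    `proposal["touched_invariants"]` is assumed to be a set of tags produced by
    golden-diff / structural analysis; everything here is illustrative.
    """
    touched = set(proposal.get("touched_invariants", []))
    if touched & HIGH_RISK_INVARIANTS:
        return "HIGH_RISK"  # requires explicit governance approval
    return proposal.get("risk_assessment", "LOW_MEDIUM")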
In short: PLB is a tool for governed evolution, not a free-running self-modification daemon.
8.1 When PLB proposals fail
PLB itself can be wrong. It can overfit, chase spurious correlations, or simply propose unhelpful changes. We treat PLB as a fallible subsystem with its own monitoring.
Typical failure modes
False-positive patterns
PLB identifies correlations that do not generalize.
Mitigations:
- Require a minimum incident count per pattern
- Use basic statistical sanity checks
- Make human review mandatory
- Track long-term proposal acceptance rate
Sandbox passes, production fails
Proposed patch works in replay, but fails under live conditions.
Mitigations:
- Canary deployments (5–10% of traffic)
- Enhanced monitoring around the change
- Automatic rollback triggers
- Post-deployment validation runs
Competing proposals
Multiple patches target the same policy/Saga/SIL.
Mitigations:
- Conflict detection in PLB (overlapping targets)
- Human arbitration of which to try first
- Serial deployment with observation windows
Cascading changes
One patch implies further necessary changes.
Mitigations:
- Change impact analysis / dependency graphs
- Staged rollout plans
- Explicit observation periods between stages
PLB bias or drift
PLB starts to optimize the wrong objectives (e.g., overemphasis on short-term metrics).
Mitigations:
- Regular PLB audits and retrospectives
- Alignment checks against declared goals
- Trend analysis of proposal types and impacts
- Ability to temporarily revert to “manual only” mode
Monitoring PLB health
Example metrics:
- Proposal acceptance rate
- Sandbox–production agreement rate
- Time-to-incident-mitigation for PLB-driven changes
- False-positive rate (proposals later reverted)
- Revert frequency
Red-flag ranges (non-normative):
- Proposal acceptance rate < 30% over a quarter
- Revert rate > 20% of deployed proposals
- Sandbox agreement < 80% on validation scenarios
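A small sketch of how these red flags might be checked over a rolling window, using the thresholds above (counter names are hypothetical):

def check_plb_health(stats):
    """Evaluate PLB health counters against the red-flag ranges above.

    `stats` is assumed to hold rolling-window counters, e.g.:
    {"proposed": 40, "accepted": 10, "deployed": 25, "reverted": 6,
     "sandbox_agreed": 30, "sandbox_total": 40}
    """
    flags = []
    if stats["proposed"] and stats["accepted"] / stats["proposed"] < 0.30:
        flags.append("proposal acceptance rate < 30%")
    if stats["deployed"] and stats["reverted"] / stats["deployed"] > 0.20:
        flags.append("revert rate > 20% of deployed proposals")
    if stats["sandbox_total"] and stats["sandbox_agreed"] / stats["sandbox_total"] < 0.80:
        flags.append("sandbox agreement < 80%")
    return flags  # any non-empty result should trigger the response playbook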
Response playbook:
- Pause new PLB-generated proposals for the affected area
- Audit pattern-mining logic and thresholds
- Tighten criteria for candidate patterns
- Resume with stricter governance; keep humans in the loop
PLB is powerful, but never above scrutiny. SI-Core should treat it as another component that can be audited, tuned, and even turned off temporarily.
9. How LLMs and RL fit without taking over
PLB is agnostic to the internal machinery used to:
- mine patterns,
- generate patch candidates,
- propose new SIL fragments or policies.
In practice:
An LLM can be used as:
- a pattern summarizer (“describe common structure of these failures”),
- a code/policy generator under strict schemas,
- a text-to-SIL assistant for developers.
RL-style methods can:
- optimize GCS estimators and goal weights offline,
- propose parameter updates (thresholds, weights) under SI-Core constraints.
Crucially:
All such proposals must still flow through PLB’s structural pipeline:
- backed by evidence,
- sandbox-validated,
- checked by the conformance kit,
- approved by humans where required.
LLMs and RL thus live inside PLB as sub-modules, not as opaque gods.
9.1 LLM and RL integration patterns (non-normative)
LLMs and RL algorithms can live inside PLB as pattern-mining and proposal-generation tools. They do not bypass governance.
Use case 1: Pattern summarization
Input to LLM:
- A batch of similar incidents (JSON)
- Their Jump logs, EthicsTrace, GCS vectors, Failure Traces
Prompt sketch:
"Analyze these incidents and identify recurring structural patterns. Focus on: [OBS] structure, [ETH] decisions, GCS vectors, failure modes. Output: a short hypothesis list and candidate root-cause patterns."
Output:
- Natural-language summary
- 2–5 hypotheses that PLB can turn into structured tests
Human review is required before any patch generation.
Use case 2: Policy patch drafting
Input:
- Pattern summary
- Current policy (YAML)
- A JSON Schema describing valid patches
Prompt sketch:
"Given this policy and this pattern summary, propose a patch. Output strictly as JSON conforming to this schema: …"
PLB then:
- Validates the output against the patch schema (sketched below)
- Runs golden-diff / structural checks
- Runs sandbox validation
- Sends the patch as a proposal to human + conformance kit
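A minimal version of the schema-validation step, using the jsonschema library; the schema shape here is illustrative, not the normative patch schema:

import jsonschema

PATCH_SCHEMA = {
    "type": "object",
    "required": ["target", "from_version", "to_version", "changes"],
    "properties": {
        "target": {"type": "string"},
        "from_version": {"type": "string"},
        "to_version": {"type": "string"},
        "changes": {"type": "array", "minItems": 1},
    },
    "additionalProperties": False,
}

def validate_llm_patch(patch_json):
    """Reject any LLM-drafted patch that does not conform to the schema."""
    try:
        jsonschema.validate(instance=patch_json, schema=PATCH_SCHEMA)
        return True, None
    except jsonschema.ValidationError as exc:
        return False, str(exc)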
Use case 3: SIL code suggestions
Input:
- Failure pattern in SIR / .sirrev
- Current SIL function(s) and type signatures
LLM produces a SIL patch candidate which is then:
- compiled,
- golden-diff’d,
- property-tested,
- and reviewed by humans before any deployment.
RL integration: tuning GCS estimators
RL does not directly act on the live system. Instead, it can optimize parameters of GCS estimators offline.
- State: semantic features + context
- Action: small parameter adjustments (weights, thresholds)
- Reward: GCS estimation accuracy on a validation set
Hard constraints:
- Parameter bounds and monotonicity constraints
- No direct relaxation of ethics floors or safety invariants
- All proposals still flow through PLB’s validation + human review
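In the simplest case this does not need full RL machinery; a bounded offline search already fits the constraints. A sketch with hypothetical parameter names and bounds:

import random

# Illustrative bounds: each estimator parameter stays inside a governed range.
PARAM_BOUNDS = {"risk_weight": (0.5, 2.0), "damage_threshold_mm": (1200.0, 2000.0)}

def tune_estimator_params(evaluate_accuracy, current_params, n_trials=200, step=0.05, seed=0):
    """Offline hill-climbing over bounded GCS-estimator parameters.

    `evaluate_accuracy(params)` scores estimation accuracy on a held-out
    validation set. The result is only a proposal; it still flows through
    PLB validation and human review before any deployment.
    """
    rng = random.Random(seed)
    best_params, best_score = dict(current_params), evaluate_accuracy(current_params)
    for _ in range(n_trials):
        candidate = dict(best_params)
        name = rng.choice(list(PARAM_BOUNDS))
        lo, hi = PARAM_BOUNDS[name]
        candidate[name] += rng.uniform(-step, step) * (hi - lo)
        candidate[name] = min(max(candidate[name], lo), hi)  # enforce hard bounds
        score = evaluate_accuracy(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score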
Key principle:
LLMs and RL are tools inside PLB,
subject to the same governance and conformance checks as human-written proposals.
10. Minimal PLB: implementation sketch
Here is a toy, non-normative sketch of a minimal PLB loop focused on ethics policies.
class PatternLearningBridge:
def __init__(self, log_reader, sandbox, diff_tool, policy_repo):
self.log_reader = log_reader
self.sandbox = sandbox
self.diff_tool = diff_tool
self.policy_repo = policy_repo
def run_once(self):
# 1) Collect candidates: recent incidents + metric regressions
incidents = self.log_reader.fetch_incidents(window="7d")
metric_drift = self.log_reader.fetch_metric_anomalies(
metrics=["EAI", "RIR", "GCS.city.property_damage_minimization"]
)
candidates = self._cluster_candidates(incidents, metric_drift)
proposals = []
for cluster in candidates:
# 2) Mine patterns (could call out to LLM or custom logic)
pattern = self._summarize_pattern(cluster)
# 3) Generate policy patch proposal
patch = self._propose_ethics_patch(pattern)
if not patch:
continue
# 4) Validate in sandbox on historical scenarios
sim_report = self._validate_patch_in_sandbox(patch, cluster)
# 5) Run structural checks (golden-diff, SCover)
diff_report = self.diff_tool.check_patch(patch)
if not diff_report.pass_safety:
continue
# 6) Emit proposal object
proposal = {
"patch": patch,
"pattern_summary": pattern.summary,
"sim_report": sim_report,
"diff_report": diff_report,
}
proposals.append(proposal)
# 7) Write proposals to a queue for human / governance review
self._emit_proposals(proposals)
# ... helper methods elided ...
You can imagine similar loops:
- one specialized for Saga/compensators,
- one for SIL functions,
- one for semantic compression policies,
all sharing common utilities for:
- log retrieval,
- scenario replay,
- conformance checks.
10.1 PLB computational requirements (non-normative)
The Pattern-Learning-Bridge is offline / nearline by design. It does not sit on the hot path of jumps, but it still needs compute for mining and validation.
The numbers below are order-of-magnitude examples for a mid-scale L2 deployment (≈ 1M jumps/day). Real systems may differ by 10× in either direction.
Pattern mining (daily)
Typical batch job:
- Log processing: 10–30 min
- Clustering: 5–15 min
- Pattern extraction: 5–20 min
End-to-end: ~20–65 min / day on a modest analytics cluster.
Sandbox validation (per proposal)
For each candidate patch (policy/Saga/SIL):
- Historical scenario replay: 5–30 min
- Golden-diff / structural checks: 1–5 min
- Conformance & consistency checks: 2–10 min
End-to-end: ~8–45 min / proposal.
Typical PLB usage:
- 5–20 proposals / week
- → ~1–2 hours / week of sandbox time, amortized.
Resource profile (example)
- Pattern mining: 4–8 vCPUs, 16–64 GB RAM
- Sandbox: isolated cluster using 10–20% of production capacity
- Storage: 30–90 days of structured logs
(~100 GB – 1 TB, depending on jump volume and SCover)
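As a rough sanity check: at 1M jumps/day and ~1–10 KB of structured evidence per jump, 30–90 days of retention works out to roughly 30 GB – 900 GB, consistent with the range above (illustrative arithmetic only).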
Cost optimization strategies
- Run mining jobs in off-peak windows
- Share sandbox infrastructure across teams
- Use incremental pattern updates rather than full re-mining
- Cache validation results for repeated scenarios
Scaling patterns
- L1/L2 cores: a single PLB instance is usually enough
- L3 / multi-region: federated PLBs per region, sharing a global pattern library
- Cross-region patterns (e.g., ethics drift) are learned at the “global PLB” layer
These are not requirements, just a way to scope the ballpark cost of taking PLB seriously.
11. Roadmap: how to adopt PLB incrementally
You do not need a full SIL stack to start using a PLB-like pattern.
A reasonable path:
Start with “human PLB”
- Just unify your SI-Core logs, Effect Ledger, and metrics into a single analytics and incident review workflow.
- Use them to inform manual policy/code changes.
Automate detection before automating changes
Build automated jobs to:
- detect repeated failure clusters,
- flag metric regressions,
- summarize EthicsTrace anomalies.
Deliver this as reports / dashboards to humans.
Introduce structured change proposals
Capture policy/code changes as explicit Change Proposal Objects with:
- references to evidence,
- expected effects,
- validation artifacts.
Add a lightweight PLB
- Let a service generate draft proposals (especially for simple, parameter-only changes).
- Keep humans firmly in the approval loop.
Deepen PLB scope as confidence grows
Gradually allow PLB to propose:
- small Saga refinements,
- minor SIL refactors,
- compression policy adjustments, with strict boundaries and conformance tests.
Full PLB for mature SI-Core deployments
For L2/L3 systems with rich telemetry and SIL coverage:
- PLB becomes the main “learning engine”,
- LLMs/RL act as pattern miners inside it,
- governance decides how fast the system is allowed to self-improve.
12. Why this matters
If SI-Core were only a static constraint system, you’d eventually hit:
- mis-tuned ethics policies,
- brittle compensators,
- misaligned goals,
- aging SIL logic.
You would have to manually chase these problems forever.
Pattern-Learning-Bridge changes the posture:
- the runtime knows where it hurts,
- evidence is structurally organized,
- proposals are structurally localized and auditable,
- humans and tools co-govern how structure evolves.
Put differently:
- SI-Core + PLB is not just “an AI that obeys rules”,
- it is an intelligence runtime that can improve its own rules, without giving up traceability or control.
That’s what “learning” means in Structured Intelligence Computing.