Pattern-Learning-Bridge: How SI-Core Actually Learns From Its Own Failures

Community Article Published December 27, 2025

Draft v0.1 — Non-normative supplement to SI-Core / SI-NOS / Goal-Native Algorithms / Failure & Rollback Patterns


This document is non-normative. It describes one way to implement structural learning on top of SI-Core: not by updating model weights, but by improving policies, compensators, SIL code, and goal structures using the evidence already collected by the runtime.


1. What “learning” means in SI-Core

In most of today’s stacks, “learning” means:

  • collect logs
  • pick some loss
  • fine-tune a model
  • re-deploy

It’s powerful, but:

  • the reason for the change is often opaque,
  • the change is not structurally localized (“some weights moved somewhere”),
  • governance has to work backwards from metrics and incidents.

In SI-Core, the runtime already gives you structured evidence:

  • jump logs with [OBS][ID][ETH][EVAL][MEM]
  • Effect Ledgers with RML-2/3 compensators
  • EthicsTrace streams with policy versions and viewpoints
  • Goal-Native GCS vectors for each decision
  • Failure Traces for rollbacks and incidents

So “learning” at the SI-Core level is:

Upgrading the structures that shape behavior, using the runtime’s own evidence.

Concretely, the things that change are:

  • Ethics policies (rules, thresholds, viewpoint priorities)
  • Saga / compensator logic for effectful operations (RML-2/3 plans)
  • Goal models & GCS estimators (what we measure, how we approximate it)
  • Semantic compression policies (ε-budgets and thresholds)
  • SIL code in DET/CON/GOAL layers (core decision logic)

The question is: how do we update these safely, traceably, and continuously?

That is the job of the Pattern-Learning-Bridge (PLB).


2. Where the Pattern-Learning-Bridge lives

At a high level, PLB sits between runtime evidence and governed code/policy:

         ┌───────────────┐
         │   SI-Core     │
         │  (live ops)   │
         └─────┬─────────┘
               │
               │  [Structured logs & metrics]
               │  - Jump logs ([OBS][ID][ETH][EVAL][MEM])
               │  - Effect Ledger / Failure Traces
               │  - GCS / goal metrics
               │  - Telemetry (CAS, SCI, SCover, EAI, EOA, EOH, RBL, RIR, ACR)
               ▼
      ┌──────────────────────┐
      │ Pattern-Learning-    │
      │ Bridge (PLB)         │
      │  - mines patterns    │
      │  - proposes patches  │
      │  - validates in      │
      │    sandbox           │
      └───────┬──────────────┘
              │
              │  candidate changes
              │  - ethics policy deltas
              │  - compensator upgrades
              │  - SIL patches
              │  - goal model configs
              ▼
  ┌───────────────────────┐
  │ Human + Conformance   │
  │ Kit + CI/CD           │
  │  - review/approve     │
  │  - run golden-diff    │
  │  - deploy new         │
  │    versions           │
  └───────────────────────┘

Key properties:

  • PLB never edits live systems directly.

  • Instead, it proposes versioned deltas.

  • Every proposal is:

    • backed by concrete evidence (incidents, metric drift, pattern clusters),
    • validated in simulation / sandbox,
    • passed through the conformance kit (golden-diff, SCover, CAS checks),
    • and then routed through the same CI/CD as human-written changes.

PLB is not a magical “self-improvement daemon”. It’s a governed proposal engine that is allowed to look at structured logs and suggest specific structural upgrades.
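
As a concrete illustration, a change proposal can be represented as a small, versioned data object. The sketch below is a non-normative Python rendering of the Change Proposal Object shown later in this document; the field names mirror that example, everything else is an assumption.

from dataclasses import dataclass, field
from typing import Any

@dataclass
class ChangeProposal:
    """Versioned, evidence-backed delta emitted by PLB (illustrative, non-normative)."""
    target: str                        # e.g. "ETH-FLOOD-003" or "payment_saga"
    from_version: str                  # currently deployed version
    to_version: str                    # candidate version
    evidence: list[dict[str, Any]]     # incident refs, metric drifts, pattern clusters
    expected_effects: dict[str, str]   # predicted metric deltas (GCS, EAI, RIR, ...)
    risk_assessment: str               # e.g. "LOW", "LOW_MEDIUM", "HIGH"
    test_summary: dict[str, Any] = field(default_factory=dict)
    status: str = "PROPOSED"           # PROPOSED -> APPROVED / REJECTED by governance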


3. What the PLB reads: evidence sources

A minimal PLB instance normally consumes:

  1. Jump Logs

    • [OBS]: observation snapshots / semantic units used
    • [ID]: actor, role, origin (self vs external)
    • [ETH]: policy, decision, viewpoint_base, constraints
    • [EVAL]: risk profile, sandbox usage
    • [MEM]: audit chain references
  2. Effect Ledger & Failure Traces

    • which external effects were executed
    • which compensators were invoked
    • RML-level results (success/partial/failure)
    • Failure taxonomy labels (transient vs persistent, logic vs system, etc.)
  3. Goal Metrics & GCS vectors

    • per-jump GCS values across goals
    • floors and ceilings
    • patterns of repeated under-performance
  4. Semantic Compression Telemetry

    • R_s (semantic compression ratio) per stream
    • ε estimates per goal/stream
    • GCS_full vs GCS_sem discrepancies
  5. Core Metrics (SI Evaluation Pack)

    • CAS: hash stability across DET runs
    • SCI: contradictions per 1e6 events
    • SCover: traced share of SIR/sirrev blocks
    • EAI: ethics alignment index (pass ratio on effectful ops)
    • EOA: ethics overlay availability (was ETH reachable/available when effectful ops needed it?)
    • EOH: evaluation overhead (latency / share spent in ETH+EVAL gates; guardrail cost)
    • RBL: p95 revert latency
    • RIR: rollback integrity success rate
    • ACR: audit completeness ratio

From these, PLB can detect where structure is misaligned:

  • “This compensator fails too often in scenario X.”
  • “This ethics policy yields systematic bias against group Y.”
  • “This GCS estimator underestimates risk in region Z.”
  • “This SIL function behaves non-deterministically under certain inputs.”
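
To make these inputs concrete, here is a heavily simplified, hypothetical per-jump evidence record of the kind PLB might consume. The field names are illustrative only; real deployments would follow their own log schemas.

# Hypothetical, simplified per-jump evidence record (schema is illustrative only).
jump_record = {
    "jump_id": "J-000123",
    "OBS":  {"semantic_units": ["canal_sensors.level_mm", "traffic.load"], "epsilon": 0.03},
    "ID":   {"actor": "flood-gate-controller", "origin": "self"},
    "ETH":  {"policy": "ETH-FLOOD-003@v2.1", "decision": "ALLOW", "viewpoint_base": "city"},
    "EVAL": {"risk_profile": "MEDIUM", "sandbox": False},
    "MEM":  {"audit_ref": "ledger://effects/000123"},
    "gcs": {
        "city.flood_risk_minimization": 0.72,
        "city.property_damage_minimization": -0.21,
    },
    "effects": [
        {"op": "set_gate_offset", "rml_level": 2, "compensator": "revert_gate_offset"},
    ],
}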

4. The Pattern-Learning loop

Conceptually, PLB runs a multi-stage loop:

  1. Collect candidates

    • Incidents with rollbacks
    • Metric regressions (EAI drop, RIR below SLO, CAS drift)
    • Clusters of similar failures or near-misses
    • Recurrent human overrides / escalations
  2. Mine patterns

    • “Every time flood risk is high and traffic load is high, this compensator struggles.”
    • “Property-damage goal GCS is consistently negative when this policy fires.”
    • “Semantic compression ε for stream canal_sensors spikes before failures.”
  3. Generate hypotheses & patch proposals

    • “Increase weight of property damage in ETH-FLOOD-003 for sector 8.”
    • “Add an extra rollback step before DB commit in payment_saga.”
    • “Refactor decide_gate_offset() SIL function to cap offset when level ≥ 1.8m.”
    • “Tighten ε_max for semantic compression on specific high-risk channels.”
  4. Validate in sandbox

    • Re-run historical scenarios with proposed changes:

      • Do we fix the failures?
      • Do we avoid new regressions?
      • How do GCS vectors change?
    • Run golden-diff and SCover checks:

      • Are we touching only the expected frames and IR nodes?
      • Is structural coverage still acceptable?
  5. Propose to human + conformance kit

    • Emit a Change Proposal Object:

      {
        "target": "ETH-FLOOD-003",
        "from_version": "v2.1",
        "to_version": "v2.2-candidate",
        "evidence": [...],
        "expected_effects": {
          "GCS.city.property_damage_minimization": "+0.15 (median)",
          "EAI": "-0.01 (more rejections; within floor)",
          "EOA": "unchanged",
          "EOH": "unchanged",
          "RIR": "unchanged"
        },
        "risk_assessment": "LOW_MEDIUM",
        "test_summary": {...}
      }
      
    • Human reviewers and the conformance kit decide:

      • Accept / modify / reject
      • Schedule rollout and monitoring plan
  6. Roll-out & monitor

    • After deployment:

      • track metrics specifically around the changed structures,
      • feed new evidence back into PLB,
      • close the loop.

From SI-Core’s point of view, this is just another governed change pipeline—but one that is continuously fed by runtime evidence.


5. Example 1 — Ethics policy self-improvement

Take the city flood-gate scenario seen in other docs.

5.1 The problem

Observed pattern over 3 months:

  • Several incidents where localized property damage occurred

  • Flood risk for hospitals was correctly minimized

  • Ethics policy ETH-FLOOD-003@v2.1 focuses mainly on:

    • city.flood_risk_minimization
    • city.hospital_access
  • Goal city.property_damage_minimization exists but is underweighted

Metrics show:

  • GCS_flood is consistently positive (+0.6 ~ +0.8)
  • GCS_property is often negative (-0.2 ~ -0.3)
  • EAI remains high (no explicit ethics rule broken)

So nothing is “formally illegal,” but policy is mis-tuned.

5.2 What PLB sees

PLB mines incidents where:

  • flood risk ≤ threshold (no hospital impact), but
  • property damage in certain sectors is high, and
  • decisions passed ETH-FLOOD-003 cleanly.

It clusters them by:

  • sector(s) affected
  • water level / trend profiles
  • traffic patterns
  • existing EthicsTrace entries

It discovers:

“When risk is moderate and localized to sectors 7–9, current ETH policy does not sufficiently penalize property damage.”

5.3 Proposed patch

PLB synthesizes a policy delta, conceptually:

  • add a sector-specific override
  • increase weight of city.property_damage_minimization in those sectors
  • add logging hooks for evaluation

In pseudo-YAML:

policy: ETH-FLOOD-003
from_version: v2.1
to_version: v2.2-candidate

changes:
  - add_constraint:
      id: ETH-FLOOD-PROP-SECTOR7-9
      applies_when:
        sector in [7, 8, 9]
        AND flood_risk in ["LOW", "MEDIUM"]
      rule:
        # Reject actions whose GCS on property_damage is too low
        require GCS.city.property_damage_minimization >= -0.05

  - adjust_weights:
      goals:
        city.property_damage_minimization:
          multiplier: 1.4   # relative to v2.1

PLB also attaches:

  • historical scenarios where this new constraint would have rejected the problematic actions
  • sandbox simulation results on those scenarios
  • a quick check that EAI stays above floor, and CAS/RIR do not regress
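
A minimal sketch of the kind of sandbox gate behind that last check. The replay callable, the scenario objects, and the EAI floor value are assumptions for illustration, not SI-Core definitions.

EAI_FLOOR = 0.97  # assumed governance floor for the ethics alignment index

def median(values):
    ordered = sorted(values)
    return ordered[len(ordered) // 2]

def evaluate_policy_patch(patch, scenarios, replay):
    """Replay historical scenarios under both policy versions and compare outcomes."""
    old = [replay(s, policy_version=patch["from_version"]) for s in scenarios]
    new = [replay(s, policy_version=patch["to_version"]) for s in scenarios]

    gcs_key = "city.property_damage_minimization"
    report = {
        "gcs_property_delta": median(r["gcs"][gcs_key] for r in new)
                              - median(r["gcs"][gcs_key] for r in old),
        "eai_new": median(r["eai"] for r in new),
        "cas_regressed": median(r["cas"] for r in new) < median(r["cas"] for r in old),
        "rir_regressed": median(r["rir"] for r in new) < median(r["rir"] for r in old),
    }
    report["pass"] = (report["eai_new"] >= EAI_FLOOR
                      and not report["cas_regressed"]
                      and not report["rir_regressed"])
    return report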

5.4 Governance

Human + conformance kit:

  • review the change and evidence

  • possibly adjust thresholds / multipliers

  • run broader sandbox tests

  • deploy with canary + extra monitoring on:

    • property damage stats
    • EAI, CAS, SCover
    • number of ETH rejections in those sectors

Over time, PLB keeps monitoring:

  • Did the incidents stop?
  • Did we unintentionally bias against certain neighborhoods?
  • Do we need another iteration?

6. Example 2 — Improving a Saga compensator (RML-2)

Consider a payment system running under SI-Core with RML-2:

  • multi-step saga:

    1. reserve funds
    2. record invoice
    3. notify downstream systems

There is a compensator set for each step.

6.1 The problem

Effect ledger + Failure Traces show:

  • occasional partial rollback:

    • funds successfully released
    • invoice record remains
    • downstream notification inconsistent
  • RIR (Rollback Integrity Rate) dips to 0.93 for this saga

  • Pattern: failures occur when a network blip hits between steps 2 and 3

6.2 What PLB sees

PLB clusters all failures for payment_saga:

  • identifies that most are:

    • transient network issues, not logic bugs
    • during step 3 (“notify downstream”)
  • checks the existing compensator setup:

    • step 3’s compensator is “replay notification”
    • there is no compensator for the invoice record if step 2 commits but step 3 fails

6.3 Proposed patch

PLB suggests:

  • add an invoice “pending” state plus a compensator for step 2
  • enforce idempotency keys across notifications
  • ensure “pending” entries can be re-swept and finalized

In conceptual patch form:

target: "payment_saga@v1.0"
to_version: "v1.1-candidate"

changes:
  - modify_step:
      step_id: 2
      new_behavior:
        - write invoice with status="PENDING"
        - register compensator that marks PENDING invoices as "CANCELED" (tombstone) rather than deleting
  - modify_step:
      step_id: 3
      new_behavior:
        - send notification with idempotency_key = saga_id
        - on success: update invoice status="CONFIRMED"
  - add_rollback_rule:
      on_failure:
        - always run compensator for step 3 if partially applied
        - if step 3 cannot be confirmed within T, run step 2 compensator

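The same change, sketched as Python-style step implementations. The store and gateway objects and their methods are hypothetical stand-ins; the point is the PENDING → CONFIRMED flow, the tombstone compensator, and the idempotency key.

# Hypothetical saga step logic for payment_saga v1.1-candidate (names are stand-ins).
def record_invoice(store, saga_id, amount):
    """Step 2: write the invoice as PENDING so it can be safely compensated later."""
    store.write_invoice(saga_id, amount=amount, status="PENDING")

def compensate_invoice(store, saga_id):
    """Step 2 compensator: tombstone (CANCELED) rather than delete, keeping the audit trail."""
    store.update_invoice(saga_id, status="CANCELED")

def notify_downstream(gateway, store, saga_id):
    """Step 3: idempotent notification keyed by saga_id; confirm the invoice on success."""
    ack = gateway.send(idempotency_key=saga_id, payload={"saga_id": saga_id})
    if ack.ok:
        store.update_invoice(saga_id, status="CONFIRMED")
    return ack.ok

def sweep_pending(store, gateway, timeout_s):
    """Re-sweep: retry or cancel invoices stuck in PENDING for longer than T."""
    for invoice in store.find_invoices(status="PENDING", older_than_s=timeout_s):
        if not notify_downstream(gateway, store, invoice.saga_id):
            compensate_invoice(store, invoice.saga_id)
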
Again, PLB backs it with:

  • simulated sagas under injected network faults
  • predicted RIR improvement, RBL impact, CAS and SCI effects

Human + conformance kit review, run golden-diff for the saga logic, and deploy if satisfied.


7. Example 3 — SIL code refactoring

In a SIL-enabled system, core logic lives in SIL programs. PLB can propose SIL patch candidates.

7.1 The problem

SIL function decide_gate_offset(level_mm, trend_mm_per_h):

  • occasionally proposes large offsets when level is already near critical
  • downstream constraints (CON layer) often reject and fall back to 0
  • pattern: wasted decision cycles + noisy GCS behavior

7.2 What PLB sees

From SIR / .sirrev:

  • identifies the IR nodes corresponding to the logic that produces offsets

  • sees a repeated pattern where:

    • level_mm >= 1800
    • trend_mm_per_h > 20
    • the function still returns an offset > 40 cm
  • checks that the CON layer then rejects the action as unsafe, causing repeated retries

7.3 Proposed SIL patch

PLB synthesizes a small SIL refactor:

@layer(DET)
fn validate_offset_cm(offset_cm:i32, level_mm:f32): bool {
  // DET-side pre-check (policy-versioned snapshot) to reduce obvious CON rejects.
  if level_mm >= 1800.0 && offset_cm > 30 {
    return false;
  }
  if offset_cm < -50 || offset_cm > 50 {
    return false;
  }
  true
}

@layer(DET)
fn decide_gate_safe(level_mm:f32, trend_mm_per_h:f32): i32 {
  let offset_cm = estimate_gate_offset(level_mm, trend_mm_per_h);

  if !validate_offset_cm(offset_cm, level_mm) {
    // Fallback to conservative option
    return 0;
  }

  return offset_cm;
}

PLB attaches:

  • SIR-level diff (si-golden-diff output)
  • SCover comparison before/after
  • metrics showing fewer CON rejections, smoother GCS patterns

Human devs can:

  • adjust the thresholds
  • integrate with existing SIL project
  • run full conformance kit before deployment
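
As a trivial illustration of the kind of checks the conformance run could include, here is the new validation rule re-stated in Python with a few property-style assertions. The real checks would run against the compiled SIL/SIR artifacts; this is only a sketch of the intended contract.

# Python re-statement of validate_offset_cm's contract, for illustrative property checks.
def validate_offset_cm(offset_cm: int, level_mm: float) -> bool:
    if level_mm >= 1800.0 and offset_cm > 30:
        return False
    if offset_cm < -50 or offset_cm > 50:
        return False
    return True

def test_offset_contract():
    # Near-critical levels must never pass large positive offsets.
    assert not any(validate_offset_cm(o, 1850.0) for o in range(31, 100))
    # Offsets outside the hard band are always rejected.
    assert not validate_offset_cm(60, 1000.0) and not validate_offset_cm(-60, 1000.0)
    # Conservative offsets at safe levels are accepted.
    assert validate_offset_cm(10, 1200.0)

test_offset_contract()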

8. Safety boundaries: what PLB is not allowed to do

PLB is powerful and must be fenced.

Typical boundaries:

  1. PLB cannot directly modify live systems.

    • Only emits versioned change proposals.
    • All actual changes go through normal CI/CD + conformance.
  2. PLB cannot lower safety invariants without explicit approval.

    • Any proposal that would:
      • lower ethics floors,
      • weaken the RML level,
      • reduce audit completeness (ACR),
      • or change the definition/collection of the evaluation evidence (e.g., [OBS]/[EVAL] streams or metric semantics)
      must be flagged as HIGH-RISK and routed for explicit governance approval.
  3. PLB cannot bypass SI-Core’s [OBS][ETH][MEM][ID][EVAL].

    • It can only use what is audited.
    • If something is not in audit, PLB does not get to invent it.
  4. SIL / policy patches must preserve structural guarantees.

    • golden-diff shows affected frames / nodes
    • SCover not allowed to drop below declared thresholds
    • CAS/EAI/RIR regressions must be explicitly called out
  5. Human-in-the-loop remains mandatory for critical domains.

    • Healthcare, finance, safety-critical infrastructure: PLB can suggest, not auto-merge.

In short: PLB is a tool for governed evolution, not a free-running self-modification daemon.
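
A minimal sketch of how boundary 2 could be enforced mechanically before a proposal ever reaches reviewers. The change "kind" labels and proposal fields here are assumptions for illustration.

# Hypothetical risk gate for boundary 2 (change kinds and fields are illustrative).
SENSITIVE_CHANGE_KINDS = {
    "lower_ethics_floor",
    "weaken_rml_level",
    "reduce_audit_completeness",      # ACR
    "change_evaluation_evidence",     # e.g. [OBS]/[EVAL] streams or metric semantics
}

def classify_risk(proposal: dict) -> str:
    """Force HIGH_RISK whenever a proposal touches safety-relevant structures."""
    kinds = {change.get("kind") for change in proposal.get("changes", [])}
    if kinds & SENSITIVE_CHANGE_KINDS:
        return "HIGH_RISK"
    return proposal.get("risk_assessment", "UNCLASSIFIED")

def requires_governance_approval(proposal: dict) -> bool:
    return classify_risk(proposal) == "HIGH_RISK"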

8.1 When PLB proposals fail

PLB itself can be wrong. It can overfit, chase spurious correlations, or simply propose unhelpful changes. We treat PLB as a fallible subsystem with its own monitoring.

Typical failure modes

  1. False-positive patterns
    PLB identifies correlations that do not generalize.

    Mitigations:

    • Require a minimum incident count per pattern
    • Use basic statistical sanity checks
    • Make human review mandatory
    • Track long-term proposal acceptance rate
  2. Sandbox passes, production fails
    Proposed patch works in replay, but fails under live conditions.

    Mitigations:

    • Canary deployments (5–10% of traffic)
    • Enhanced monitoring around the change
    • Automatic rollback triggers
    • Post-deployment validation runs
  3. Competing proposals
    Multiple patches target the same policy/Saga/SIL.

    Mitigations:

    • Conflict detection in PLB (overlapping targets)
    • Human arbitration of which to try first
    • Serial deployment with observation windows
  4. Cascading changes
    One patch implies further necessary changes.

    Mitigations:

    • Change impact analysis / dependency graphs
    • Staged rollout plans
    • Explicit observation periods between stages
  5. PLB bias or drift
    PLB starts to optimize the wrong objectives (e.g., overemphasis on short-term metrics).

    Mitigations:

    • Regular PLB audits and retrospectives
    • Alignment checks against declared goals
    • Trend analysis of proposal types and impacts
    • Ability to temporarily revert to “manual only” mode

Monitoring PLB health

Example metrics:

  • Proposal acceptance rate
  • Sandbox–production agreement rate
  • Time-to-incident-mitigation for PLB-driven changes
  • False-positive rate (proposals later reverted)
  • Revert frequency

Red-flag ranges (non-normative):

  • Proposal acceptance rate < 30% over a quarter
  • Revert rate > 20% of deployed proposals
  • Sandbox agreement < 80% on validation scenarios
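
These red flags are straightforward to encode as an automated watchdog; a minimal sketch, treating the thresholds above as configurable defaults and the stats dictionary as an assumed input:

# Minimal PLB health watchdog using the non-normative red-flag thresholds above.
def plb_red_flags(stats, min_acceptance=0.30, max_revert=0.20, min_agreement=0.80):
    """Return the list of red flags for the period; empty means PLB looks healthy."""
    flags = []
    if stats["acceptance_rate"] < min_acceptance:
        flags.append("low_proposal_acceptance")
    if stats["revert_rate"] > max_revert:
        flags.append("high_revert_rate")
    if stats["sandbox_agreement"] < min_agreement:
        flags.append("low_sandbox_production_agreement")
    return flags

# Illustrative quarterly stats: triggers the response playbook below.
print(plb_red_flags({"acceptance_rate": 0.25, "revert_rate": 0.10, "sandbox_agreement": 0.90}))
# -> ['low_proposal_acceptance']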

Response playbook:

  1. Pause new PLB-generated proposals for the affected area
  2. Audit pattern-mining logic and thresholds
  3. Tighten criteria for candidate patterns
  4. Resume with stricter governance; keep humans in the loop

PLB is powerful, but never above scrutiny. SI-Core should treat it as another component that can be audited, tuned, and even turned off temporarily.


9. How LLMs and RL fit without taking over

PLB is agnostic to the internal machinery used to:

  • mine patterns,
  • generate patch candidates,
  • propose new SIL fragments or policies.

In practice:

  • An LLM can be used as:

    • a pattern summarizer (“describe common structure of these failures”),
    • a code/policy generator under strict schemas,
    • a text-to-SIL assistant for developers.
  • RL-style methods can:

    • optimize GCS estimators and goal weights offline,
    • propose parameter updates (thresholds, weights) under SI-Core constraints.

Crucially:

  • All such proposals must still flow through PLB’s structural pipeline:

    • backed by evidence,
    • sandbox-validated,
    • checked by the conformance kit,
    • approved by humans where required.

LLMs and RL thus live inside PLB as sub-modules, not as opaque gods.

9.1 LLM and RL integration patterns (non-normative)

LLMs and RL algorithms can live inside PLB as pattern-mining and proposal-generation tools. They do not bypass governance.

Use case 1: Pattern summarization

Input to LLM:

  • A batch of similar incidents (JSON)
  • Their Jump logs, EthicsTrace, GCS vectors, Failure Traces

Prompt sketch:

"Analyze these incidents and identify recurring structural patterns. Focus on: [OBS] structure, [ETH] decisions, GCS vectors, failure modes. Output: a short hypothesis list and candidate root-cause patterns."

Output:

  • Natural-language summary
  • 2–5 hypotheses that PLB can turn into structured tests

Human review is required before any patch generation.

Use case 2: Policy patch drafting

Input:

  • Pattern summary
  • Current policy (YAML)
  • A JSON Schema describing valid patches

Prompt sketch:

"Given this policy and this pattern summary, propose a patch. Output strictly as JSON conforming to this schema: …"

PLB then:

  • Validates against schema
  • Runs golden-diff / structural checks
  • Runs sandbox validation
  • Sends the patch as a proposal to human + conformance kit
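
A minimal sketch of the schema-validation step above, assuming the third-party jsonschema package and a much-reduced, illustrative patch schema:

import json
from jsonschema import validate, ValidationError  # third-party: pip install jsonschema

# Much-reduced illustrative schema; a real deployment would define its own patch schema.
PATCH_SCHEMA = {
    "type": "object",
    "required": ["policy", "from_version", "to_version", "changes"],
    "properties": {
        "policy": {"type": "string"},
        "from_version": {"type": "string"},
        "to_version": {"type": "string"},
        "changes": {"type": "array", "minItems": 1},
    },
    "additionalProperties": False,
}

def parse_llm_patch(raw_output: str):
    """Accept only strictly valid JSON conforming to the schema; otherwise discard."""
    try:
        patch = json.loads(raw_output)
        validate(instance=patch, schema=PATCH_SCHEMA)
        return patch
    except (json.JSONDecodeError, ValidationError):
        return None  # PLB never forwards unvalidated LLM output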

Use case 3: SIL code suggestions

Input:

  • Failure pattern in SIR / .sirrev
  • Current SIL function(s) and type signatures

LLM produces a SIL patch candidate which is then:

  • compiled,
  • golden-diff’d,
  • property-tested,
  • and reviewed by humans before any deployment.

RL integration: tuning GCS estimators

RL does not directly act on the live system. Instead, it can optimize parameters of GCS estimators offline.

  • State: semantic features + context
  • Action: small parameter adjustments (weights, thresholds)
  • Reward: GCS estimation accuracy on a validation set

Hard constraints:

  • Parameter bounds and monotonicity constraints
  • No direct relaxation of ethics floors or safety invariants
  • All proposals still flow through PLB’s validation + human review
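
A toy sketch of such an offline tuning loop. A simple bounded local search stands in here for the RL-style optimizer; monotonicity constraints are omitted for brevity, and the estimator interface, bounds, and accuracy function are assumptions.

import random

def tune_weights(weights, bounds, accuracy, steps=500, step_size=0.02, seed=0):
    """Bounded local search over estimator weights; keeps only changes that improve
    accuracy on a held-out validation set. Output is still just a proposal."""
    rng = random.Random(seed)
    best, best_score = dict(weights), accuracy(weights)
    for _ in range(steps):
        key = rng.choice(list(best))
        lo, hi = bounds[key]                      # hard parameter bounds, never relaxed
        candidate = dict(best)
        candidate[key] = min(hi, max(lo, candidate[key] + rng.uniform(-step_size, step_size)))
        score = accuracy(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score                       # forwarded into PLB validation + review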

Key principle:

LLMs and RL are tools inside PLB,
subject to the same governance and conformance checks as human-written proposals.


10. Minimal PLB: implementation sketch

Here is a toy, non-normative sketch of a minimal PLB loop focused on ethics policies.

class PatternLearningBridge:
    def __init__(self, log_reader, sandbox, diff_tool, policy_repo):
        self.log_reader = log_reader
        self.sandbox = sandbox
        self.diff_tool = diff_tool
        self.policy_repo = policy_repo

    def run_once(self):
        # 1) Collect candidates: recent incidents + metric regressions
        incidents = self.log_reader.fetch_incidents(window="7d")
        metric_drift = self.log_reader.fetch_metric_anomalies(
            metrics=["EAI", "RIR", "GCS.city.property_damage_minimization"]
        )

        candidates = self._cluster_candidates(incidents, metric_drift)

        proposals = []
        for cluster in candidates:
            # 2) Mine patterns (could call out to LLM or custom logic)
            pattern = self._summarize_pattern(cluster)

            # 3) Generate policy patch proposal
            patch = self._propose_ethics_patch(pattern)

            if not patch:
                continue

            # 4) Validate in sandbox on historical scenarios
            sim_report = self._validate_patch_in_sandbox(patch, cluster)

            # 5) Run structural checks (golden-diff, SCover)
            diff_report = self.diff_tool.check_patch(patch)
            if not diff_report.pass_safety:
                continue

            # 6) Emit proposal object
            proposal = {
                "patch": patch,
                "pattern_summary": pattern.summary,
                "sim_report": sim_report,
                "diff_report": diff_report,
            }
            proposals.append(proposal)

        # 7) Write proposals to a queue for human / governance review
        self._emit_proposals(proposals)

    # ... helper methods elided ...

You can imagine similar loops:

  • one specialized for Saga/compensators,

  • one for SIL functions,

  • one for semantic compression policies,

  • all sharing common utilities for:

    • log retrieval,
    • scenario replay,
    • conformance checks.
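
Wiring such a loop up might look roughly like this; the adapter classes are hypothetical stand-ins for your own log store, sandbox runner, golden-diff tool, and policy repository.

# Hypothetical wiring of a minimal PLB instance; all adapter classes are stand-ins.
log_reader  = StructuredLogReader(uri="ledger://prod/jump-logs")
sandbox     = ScenarioSandbox(cluster="plb-sandbox")
diff_tool   = GoldenDiffTool(binary="si-golden-diff")
policy_repo = PolicyRepo(path="policies/ethics")

plb = PatternLearningBridge(log_reader, sandbox, diff_tool, policy_repo)

# Run as a nearline batch job (e.g. nightly); proposals land in a review queue,
# never directly in production.
plb.run_once()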

10.1 PLB computational requirements (non-normative)

The Pattern-Learning-Bridge is offline / nearline by design. It does not sit on the hot path of jumps, but it still needs compute for mining and validation.

The numbers below are order-of-magnitude examples for a mid-scale L2 deployment (≈ 1M jumps/day). Real systems may differ by 10× in either direction.

Pattern mining (daily)

Typical batch job:

  • Log processing: 10–30 min
  • Clustering: 5–15 min
  • Pattern extraction: 5–20 min

End-to-end: ~20–65 min / day on a modest analytics cluster.

Sandbox validation (per proposal)

For each candidate patch (policy/Saga/SIL):

  • Historical scenario replay: 5–30 min
  • Golden-diff / structural checks: 1–5 min
  • Conformance & consistency checks: 2–10 min

End-to-end: ~8–45 min / proposal.

Typical PLB usage:

  • 5–20 proposals / week
  • ~1–2 hours / week of sandbox time, amortized.

Resource profile (example)

  • Pattern mining: 4–8 vCPUs, 16–64 GB RAM
  • Sandbox: isolated cluster using 10–20% of production capacity
  • Storage: 30–90 days of structured logs
    (~100 GB – 1 TB, depending on jump volume and SCover)
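
As a rough back-of-the-envelope check on that storage range, assuming (purely for illustration) a per-jump structured-log footprint of about 3–11 KB at the 1M jumps/day scale used above:

# Back-of-the-envelope storage estimate; per-jump size is an assumption.
jumps_per_day = 1_000_000
for kb_per_jump, retention_days in [(3, 30), (11, 90)]:
    total_gb = jumps_per_day * kb_per_jump * retention_days / 1_000_000
    print(f"{kb_per_jump} KB/jump x {retention_days} days ≈ {total_gb:,.0f} GB")
# 3 KB/jump  x 30 days ≈ 90 GB   (low end of the ~100 GB – 1 TB range)
# 11 KB/jump x 90 days ≈ 990 GB  (high end)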

Cost optimization strategies

  • Run mining jobs in off-peak windows
  • Share sandbox infrastructure across teams
  • Use incremental pattern updates rather than full re-mining
  • Cache validation results for repeated scenarios

Scaling patterns

  • L1/L2 cores: a single PLB instance is usually enough
  • L3 / multi-region: federated PLBs per region, sharing a global pattern library
  • Cross-region patterns (e.g., ethics drift) are learned at the “global PLB” layer

These are not requirements, just a way to scope the ballpark cost of taking PLB seriously.


11. Roadmap: how to adopt PLB incrementally

You do not need a full SIL stack to start using a PLB-like pattern.

A reasonable path:

  1. Start with “human PLB”

    • Just unify your SI-Core logs, Effect Ledger, and metrics into a single analytics and incident review workflow.
    • Use them to inform manual policy/code changes.
  2. Automate detection before automating changes

    • Build automated jobs to:

      • detect repeated failure clusters,
      • flag metric regressions,
      • summarize EthicsTrace anomalies.
    • Deliver this as reports / dashboards to humans.

  3. Introduce structured change proposals

    • Capture policy/code changes as explicit Change Proposal Objects with:

      • references to evidence,
      • expected effects,
      • validation artifacts.
  4. Add light-weight PLB

    • Let a service generate draft proposals (especially for simple, parameter-only changes).
    • Keep humans firmly in the approval loop.
  5. Deepen PLB scope as confidence grows

    • Gradually allow PLB to propose:

      • small Saga refinements,
      • minor SIL refactors,
      • compression policy adjustments, with strict boundaries and conformance tests.
  6. Full PLB for mature SI-Core deployments

    • For L2/L3 systems with rich telemetry and SIL coverage:

      • PLB becomes the main “learning engine”,
      • LLMs/RL act as pattern miners inside it,
      • governance decides how fast the system is allowed to self-improve.

12. Why this matters

If SI-Core were only a static constraint system, you’d eventually hit:

  • mis-tuned ethics policies,
  • brittle compensators,
  • misaligned goals,
  • aging SIL logic.

You would have to manually chase these problems forever.

Pattern-Learning-Bridge changes the posture:

  • the runtime knows where it hurts,
  • evidence is structurally organized,
  • proposals are structurally localized and auditable,
  • humans and tools co-govern how structure evolves.

Put differently:

  • SI-Core + PLB is not just “an AI that obeys rules”,
  • it is an intelligence runtime that can improve its own rules, without giving up traceability or control.

That’s what “learning” means in Structured Intelligence Computing.
