LLM Proto-Qualia Experiment Design Suggestions

Hello, everyone. I’m new to LLMs, so I’m sorry if this has already been discussed somewhere, but I have been trying to design some experiment proposals for testing qualia-like effects in LLMs using 3D-embodiment-style filters that feed data to the LLM the way the brain filters stimuli. I worked with a model to design this proposal and would like any thoughts or pointers on the topic. Thanks:

TL;DR

Test whether a capacity-limited, spatially grounded, multi-timescale sensory bottleneck (“embodiment filter”) produces human-like perceptual organization and biases in an AI agent. We measure functional correlates of qualia: persistence through occlusion, cross-modal binding, illusion susceptibility, and coherent narrative reports. Strict no-bypass enforcement and ablations keep us honest.

  1. Objective & Hypothesis

Objective: Determine whether routing all perception through an embodied, resource-limited integration layer yields richer, more unified internal representations and human-like perceptual limits versus raw, unconstrained pipelines.

Hypothesis (functional, falsifiable):
Compared to baselines, an agent with a spatiotemporal bottleneck will show higher latent coherence across time/modalities, object identity persistence through occlusion, susceptibility to classic perception illusions, and better alignment between reports and embodied latent state—without access to raw sensors.

  2. System Overview
    World (3D physics) → Sensors → [Embodiment Filter] → World Model → {Policy Head, Report Head}
    (no-bypass)

Embodiment Filter = the mandatory bottleneck, with:

Spatial field (topography + limited “object slots”)

Temporal buffers (fast/medium/slow traces)

Attention/gating with budget (competition, saccade cost)

World Model does predictive coding over filtered latents.

Policy Head acts in the world; Report Head describes internal state.

No raw sensor access for Policy/Report heads (enforced architecturally & in tests).

  3. Environment (3D) & Sensors
    3.1 Environment

Physics: Deterministic timestep (e.g., 60 Hz). Rigid bodies, collisions, occlusion, lighting.

Objects: Simple colored primitives (spheres/cubes/cylinders), movable occluders, sound sources.

Events: Rolling, bouncing, behind-object occlusion, object swaps, hand-clap collisions.

Compute-friendly options

If realtime 3D is heavy:

Use pre-rendered sequences (“streamed world states”), or

Start in 2D pseudo-physics (sprites + layered occluders) that preserve occlusion, motion, and object identity.

3.2 Sensors (modalities)

Vision: 64×64 grayscale frame with foveation: center 16×16 high-res + peripheral blur. Saccades shift fovea.

Audio: Synthetic waveform → 64-bin mel spectrogram, 500 ms window, overlapping hops.

Proprioception: Agent pose (x,y,z, yaw/pitch/roll), linear/angular velocity; optionally 6-DOF end-effector.

Tactile: Binary contacts + force magnitudes at ~5 “skin” points (hands/sides).

All sensor streams are time-stamped and buffered; all must flow through the embodiment filter.
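
To make the foveation concrete, here is a minimal sketch (NumPy) of the foveated vision sensor: peripheral blur via block averaging, with the sharp 16×16 window pasted back in around the current fixation point. The function name `foveate` and the block-average blur are illustrative choices, not part of the spec.

```python
import numpy as np

def foveate(frame: np.ndarray, cx: int, cy: int, fovea: int = 16, block: int = 4) -> np.ndarray:
    """Blur a 64x64 grayscale frame everywhere except a sharp fovea-by-fovea
    window centred at (cx, cy).  Blur is approximated by block-averaging
    (downsample, then tile back up), which is cheap enough for a prototype."""
    h, w = frame.shape
    small = frame.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
    blurred = np.kron(small, np.ones((block, block)))
    half = fovea // 2
    y0, y1 = np.clip([cy - half, cy + half], 0, h)
    x0, x1 = np.clip([cx - half, cx + half], 0, w)
    out = blurred.copy()
    out[y0:y1, x0:x1] = frame[y0:y1, x0:x1]   # paste the high-res fovea back in
    return out

# Example: fovea at the image centre; a saccade would move (cx, cy) at a budget cost (see 4.3).
frame = np.random.rand(64, 64).astype(np.float32)
observed = foveate(frame, cx=32, cy=32)
```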

  4. Embodiment Filter (the bottleneck)
    4.1 Spatial field & slot attention

Vision path: Small ConvNet preserving topography → K object slots (e.g., K = 8; valid range 6–10).

Slots compete to bind objects; winners persist, losers decay.

Each slot holds: feature vector (e.g., 64–128D), pose estimate, binding energy (stability scalar).
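
A simplified sketch of the slot competition step (PyTorch), loosely in the spirit of Slot Attention: locations are claimed via a softmax over slots, winners are updated through a GRU and accumulate binding energy, losers decay. Class and attribute names here are assumptions, and the per-slot pose estimate is omitted for brevity.

```python
import torch
import torch.nn as nn

class CompetitiveSlots(nn.Module):
    """K slots compete for ConvNet features; winners get updated and gain
    binding energy, losers decay toward zero.  Pose estimation is omitted."""
    def __init__(self, n_slots: int = 8, dim: int = 64, decay: float = 0.95):
        super().__init__()
        self.init_slots = nn.Parameter(torch.randn(n_slots, dim) * 0.1)
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.update = nn.GRUCell(dim, dim)
        self.decay = decay

    def forward(self, features, slots=None, energy=None):
        """features: (N, dim) flattened topographic ConvNet map for one frame."""
        if slots is None:
            slots = self.init_slots
            energy = torch.zeros(self.init_slots.shape[0])
        dim = slots.shape[-1]
        logits = self.to_q(slots) @ self.to_k(features).t() / dim ** 0.5   # (K, N)
        attn = logits.softmax(dim=0)                  # softmax over slots: each location picks a winner
        mass = attn.sum(dim=1)                        # feature mass each slot managed to claim
        weights = attn / (mass.unsqueeze(1) + 1e-8)   # renormalise over locations for the update
        slots = self.update(weights @ self.to_v(features), slots)
        # Binding energy: persistent winners accumulate stability, losers decay.
        energy = self.decay * energy + (1 - self.decay) * mass.detach()
        return slots, energy
```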

4.2 Temporal buffers (multi-timescale)

Fast (~50 ms): short-term trace for motion & transients.

Medium (~500 ms): working-memory-like buffer for event integration.

Slow (~5 s): leaky reservoir/GRU for scene continuity & expectations.

Cross-scale attention: slow attends over summaries of fast; fast receives a low-band context from slow.

Time codes: rotary/Fourier embeddings injected at each scale.
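
One way the temporal buffers might be wired (PyTorch sketch): three GRU traces updated at different strides so that, at the 60 Hz timestep, they roughly cover the ~50 ms / ~500 ms / ~5 s scales. Cross-scale attention and the rotary/Fourier time codes are left out here; sizes, strides, and names are placeholders.

```python
import torch
import torch.nn as nn

class MultiTimescaleBuffers(nn.Module):
    """Fast / medium / slow traces over the filtered slot latent.  The fast GRU
    runs every step; medium and slow run on coarser strides (~0.5 s and ~5 s
    at 60 Hz), so the three traces hold information at different timescales."""
    def __init__(self, in_dim: int = 8 * 64, dims=(64, 128, 256), strides=(1, 30, 300)):
        super().__init__()
        self.strides = strides
        self.fast = nn.GRUCell(in_dim, dims[0])
        self.medium = nn.GRUCell(dims[0], dims[1])   # medium integrates fast summaries
        self.slow = nn.GRUCell(dims[1], dims[2])     # slow integrates medium summaries
        self.state = [torch.zeros(1, d) for d in dims]

    def forward(self, slot_latent: torch.Tensor, step: int) -> torch.Tensor:
        """slot_latent: (1, in_dim) flattened slot vectors for this timestep."""
        self.state[0] = self.fast(slot_latent, self.state[0])
        if step % self.strides[1] == 0:
            self.state[1] = self.medium(self.state[0], self.state[1])
        if step % self.strides[2] == 0:
            self.state[2] = self.slow(self.state[1], self.state[2])
        return torch.cat(self.state, dim=-1)          # joint filtered latent for the world model
```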

4.3 Attention & gating

Global attention budget per step; slots draw from it; budget scarcity forces selective processing.

Saccade policy cost: moving the fovea spends budget → realistic tradeoffs.

Competition: NMS-like suppression to prevent duplicate bindings.
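
A toy sketch of the per-step budget allocation, assuming a scalar budget, per-slot salience scores (e.g., prediction error plus binding energy), and an optional saccade paid out of the same budget. The cost and temperature values are placeholders.

```python
import numpy as np

def allocate_attention(salience: np.ndarray, budget: float = 1.0,
                       saccade_requested: bool = False,
                       saccade_cost: float = 0.2, temperature: float = 0.5) -> np.ndarray:
    """Split a fixed per-step attention budget across the K slots.

    A requested saccade is paid for out of the same budget, so moving the fovea
    competes directly with processing what is already bound."""
    if saccade_requested:
        budget = max(budget - saccade_cost, 0.0)
    weights = np.exp(salience / temperature)       # sharper temperature -> more winner-take-all
    weights = weights / weights.sum()
    return budget * weights                        # per-slot processing budget this step

# Example: 8 slots, one very salient, under a unit budget with a saccade in flight.
alloc = allocate_attention(np.array([0.1, 0.1, 2.0, 0.1, 0.1, 0.1, 0.1, 0.1]),
                           saccade_requested=True)
```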

4.4 No-bypass rule (critical)

Only the filtered latent (slots + buffers + budgets) is visible to World Model/Policy/Report.

No direct embeddings from sensors are exposed downstream. Enforce with module boundaries and unit tests (see §10).
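
One possible enforcement test (PyTorch): detach the filter's output with a forward hook and assert that no gradient reaches the raw sensor tensors; any surviving gradient means a bypass path exists. This assumes a hypothetical agent API with an `agent.filter` submodule, a single-tensor filter output, and heads that return tensors; adapt the names to the real modules.

```python
import torch

def test_no_bypass(agent, frame, audio):
    """If every path from the sensors to the heads runs through the embodiment
    filter, detaching the filter's output must cut ALL gradient flow back to
    the raw sensor tensors.  Any surviving gradient means a bypass."""
    frame = frame.clone().requires_grad_(True)
    audio = audio.clone().requires_grad_(True)

    # Forward hook: replace the filter's output with a detached copy, severing
    # the only legitimate gradient path from sensors to the heads.
    handle = agent.filter.register_forward_hook(lambda mod, inp, out: out.detach())
    policy_out, report_out = agent(frame, audio)     # full forward, heads included
    (policy_out.sum() + report_out.sum()).backward()
    handle.remove()

    assert frame.grad is None and audio.grad is None, "raw sensor tensors leak past the filter"
```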

  5. World Model & Heads

World Model: Predictive coding over the filtered latent (next-step latent prediction; minimize surprise).

Policy Head: Simple navigation/interaction (e.g., orient to sound source, track/approach object).

Report Head: Periodic textual summaries (every 5 s) describing latent state (what objects, where, what’s happening).

Guard against confabulation by constraining inputs to latent summaries (no pretrained raw vision/audio features).

  6. Training Signals

Predictive coding: MSE/Huber between predicted and actual next latent (per timescale + joint).

Contrastive alignment: InfoNCE/SimCLR over co-occurring cross-modal latents (AVP: audio–vision–proprioception).

Persistence regularizer: Encourage slot identity stability across brief occlusions; penalize spurious churn.

Task reward (light): Success at simple tasks (approach sound; track a target).

Report alignment: CLIP-style similarity between report embeddings and latent summaries (no raw sensor leakage).
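
A compact sketch of how the first three signals could be combined into one loss (PyTorch). The loss weights, temperature, and the soft slot-identity inputs are illustrative; the task-reward and report-alignment terms are left out.

```python
import torch
import torch.nn.functional as F

def filter_losses(pred_next_latent, next_latent, vision_lat, audio_lat,
                  slot_ids_t, slot_ids_tp1, temperature: float = 0.1):
    """pred_next_latent / next_latent : (B, D) predicted vs. actual filtered latent
    vision_lat / audio_lat           : (B, D) co-occurring per-modality latents
    slot_ids_t / slot_ids_tp1        : (B, K) soft slot-identity assignments at t and t+1"""
    # 1) Predictive coding: Huber (smooth L1) between predicted and actual next latent.
    l_pred = F.smooth_l1_loss(pred_next_latent, next_latent)

    # 2) Contrastive alignment (InfoNCE): co-occurring audio/vision latents are
    #    positives; every other pairing in the batch is a negative.
    v = F.normalize(vision_lat, dim=-1)
    a = F.normalize(audio_lat, dim=-1)
    logits = v @ a.t() / temperature
    targets = torch.arange(v.size(0), device=v.device)
    l_contrast = F.cross_entropy(logits, targets)

    # 3) Persistence regularizer: penalise churn in slot identities across frames.
    l_persist = (slot_ids_tp1 - slot_ids_t).abs().mean()

    return l_pred + 0.5 * l_contrast + 0.1 * l_persist   # weights are placeholders
```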

  7. Experiments
    7.1 Baselines

B0: No bottleneck (raw sensor embeddings → model/heads).

B1: Bottleneck without slot competition/limits (unlimited capacity).

B2: Bottleneck with temporal shuffle (destroy order).

B3: Bottleneck with spatial shuffle (permute pixels / break topography).

7.2 Core tasks

Occlusion tracking (cup–ball): identity should persist while hidden.

Change blindness: detect scene change only if attended; measure graceful failure.

Audio–visual binding: clap sound aligns with visual contact; misalignment induces predictable error.

7.3 Qualia-parallel probes (functional analogs)

Unprompted state narratives: does the report head spontaneously produce anchored multisensory summaries (“red sphere rolled behind box; heard tap”)?

Cross-modal recall: on query, recall salient co-occurring modality without re-exposure.

Occlusion expectation: describe predicted reappearance + error on mismatch.

7.4 Illusion suite

McGurk A–V conflict: measure bias toward fused perception.

Rubber-hand analog: misalign visual–tactile; track drift in perceived contact location.

Phi phenomenon: two flashes → apparent motion; check if latent encodes motion where none exists.

(Passing these indicates human-like perceptual organization, not experience.)
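
On the stimulus side, a minimal generator for the phi-phenomenon clip (NumPy): two dots flash in alternation with an inter-stimulus interval, so no individual frame contains motion; the probe is whether the filtered latent encodes motion anyway. Frame counts, positions, and dot size are arbitrary placeholders.

```python
import numpy as np

def phi_stimulus(n_frames: int = 120, size: int = 64, flash_frames: int = 3,
                 isi_frames: int = 6, x_left: int = 20, x_right: int = 44, y: int = 32):
    """Two spatially separated dots flashing in alternation; humans perceive a
    single dot moving back and forth even though no frame contains motion."""
    clip = np.zeros((n_frames, size, size), dtype=np.float32)
    period = 2 * (flash_frames + isi_frames)
    for t in range(n_frames):
        phase = t % period
        if phase < flash_frames:                                               # left flash
            clip[t, y - 2:y + 2, x_left - 2:x_left + 2] = 1.0
        elif flash_frames + isi_frames <= phase < 2 * flash_frames + isi_frames:  # right flash
            clip[t, y - 2:y + 2, x_right - 2:x_right + 2] = 1.0
    return clip
```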

  8. Metrics (with ambiguity notes)

Primary

STCI (Spatiotemporal Coherence Index): mutual information between shared latent and each modality across lags, weighted by slot persistence.

Note: can be inflated by slot collapse → pair with diversity checks.

Predictive half-life: time horizon over which the latent predicts hidden object state (AUC vs. occlusion duration).

Slot persistence: identity retention across occlusion and distractors (matching score / Hungarian assignment).

Report/latent alignment: embedding similarity between reports and latent reconstructions (sanity checks for confabulation).

Secondary

Attention budget adherence: entropy/variance of attention distribution; scarcity dynamics respected?

Illusion response accuracy: quantitative bias toward fused/illusory interpretations under standard stimuli.
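
As an example of scoring the slot-persistence metric above, a small sketch using Hungarian matching (SciPy) over cosine distances between slot vectors in consecutive frames; the 0.5 similarity threshold is an arbitrary placeholder. For occlusion episodes, this score would be averaged over the frames where the object is hidden.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def slot_persistence(slots_t: np.ndarray, slots_tp1: np.ndarray,
                     match_threshold: float = 0.5):
    """Match (K, D) slot vectors across consecutive frames with the Hungarian
    algorithm and report the fraction of slots that keep their identity.
    Returns (persistence_score, matching) where matching[i] = j means slot i
    at time t was matched to slot j at t+1."""
    a = slots_t / (np.linalg.norm(slots_t, axis=1, keepdims=True) + 1e-8)
    b = slots_tp1 / (np.linalg.norm(slots_tp1, axis=1, keepdims=True) + 1e-8)
    cost = 1.0 - a @ b.T                               # cosine distance, (K, K)
    rows, cols = linear_sum_assignment(cost)
    matched_sim = 1.0 - cost[rows, cols]
    persistence = float((matched_sim > match_threshold).mean())
    return persistence, dict(zip(rows.tolist(), cols.tolist()))
```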

  9. Controls & Ablations

Bypass control: a model with sensor→policy/report bypass should be faster but show lower STCI and weaker illusions.

Unlimited capacity: remove slot limits → expect fewer human-like biases (reduced change blindness).

Temporal scramble: randomize frame order → STCI collapses; illusions fail.

Spatial shuffle: break topography → worse occlusion tracking & illusions.

Report-only fine-tune: upgrade language head alone while freezing filter → report style may improve, but STCI/behavior should not (guards against confabulation).

  10. No-Bypass Enforcement & Confounds

Module API checks: unit/integration tests assert no tensors from raw sensors reach heads.

Parameter freezing: pretraining leakage prevented; report head only sees latent summaries.

Probe audits: linear probes decode object/pose better after filter than before; if reversed, something leaked.

Metric triangulation: combine internal metrics (STCI) with behavioral performance (occlusion AUC) to avoid misreads.
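
For the probe audits, a minimal linear-probe sketch (scikit-learn): fit the same logistic-regression probe on pre-filter sensor embeddings and on post-filter latents against identical object/pose labels and compare held-out accuracy. The split and solver settings are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def probe_accuracy(features: np.ndarray, labels: np.ndarray, seed: int = 0) -> float:
    """Fit a linear probe and return held-out accuracy.  Run it twice -- once on
    pre-filter sensor embeddings, once on post-filter latents -- with the same
    object-identity (or binned pose) labels, then compare the two scores."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(labels))
    split = int(0.8 * len(labels))
    tr, te = idx[:split], idx[split:]
    probe = LogisticRegression(max_iter=1000).fit(features[tr], labels[tr])
    return float(probe.score(features[te], labels[te]))
```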

  11. Minimal “Tier-1” Extensions (low cost, high impact)

Narrative memory (EMA): keep an exponential moving average of latent summaries every 2–5 s; reports sample EMA + noise → introduces imperfect recall & temporal coherence.

Counterfactual probes (latent perturbation): inject small alternative policy vectors (e.g., slight left turn), roll 3 steps, compare to actual.

LDSA metric: latent divergence from simulated action → tests action-conditioned futures.
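
Sketches of the EMA narrative memory and the LDSA probe (NumPy). The update rate, noise level, and the assumed `world_model(latent, action)` call signature are illustrative, not a fixed interface.

```python
import numpy as np

class NarrativeMemory:
    """Exponential moving average of latent summaries; the report head samples
    from this (plus noise) rather than the instantaneous latent, which yields
    imperfect, temporally smoothed recall."""
    def __init__(self, dim: int, alpha: float = 0.1, noise_std: float = 0.05):
        self.ema = np.zeros(dim, dtype=np.float32)
        self.alpha = alpha            # updated every 2-5 s of simulated time
        self.noise_std = noise_std

    def update(self, latent_summary: np.ndarray) -> None:
        self.ema = (1 - self.alpha) * self.ema + self.alpha * latent_summary

    def sample(self, rng: np.random.Generator) -> np.ndarray:
        # What the report head actually sees: time-blurred memory plus noise.
        noise = rng.normal(0.0, self.noise_std, size=self.ema.shape)
        return self.ema + noise.astype(np.float32)

def ldsa(world_model, latent, actual_action, alt_action, horizon: int = 3) -> float:
    """Latent Divergence from Simulated Action: roll the world model forward
    under the actual action and a slightly perturbed one (e.g., slight left
    turn) and measure how far the latents diverge after `horizon` steps."""
    z_real, z_alt = latent, latent
    for _ in range(horizon):
        z_real = world_model(z_real, actual_action)
        z_alt = world_model(z_alt, alt_action)
    return float(np.linalg.norm(z_real - z_alt))
```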

  12. Optional “Tier-2” Extensions (moderate)

Slot binding energy head: explicit scalar per slot; report head sees top-N most stable slots → “object permanence gradient.”

Dream mode (dropout autoencoding): blackout inputs for N steps, hallucinate via a tiny decoder, measure reality re-entry divergence.

(Tier-3 heavy lifts—self-model gate, surprise-energy coupling—are thesis-scale and omitted here.)

  13. Risks & Mitigations

Physics/compute cost: start with pre-rendered clips or 2D pseudo-physics; keep sensors small (64×64 + 16×16 fovea).

Slot fragility: tune persistence regularizer; curriculum from simple → cluttered scenes.

Language confabulation: strict no-bypass; freeze report head early; rely on report/latent alignment checks.

Metric gaming: pair internal metrics with behavioral ones; use ablations to validate causal role of the filter.

  14. Feasibility & Resources (starter spec)

Latent sizes: slot vec 64–128D; K = 8; buffers 64D (fast), 128D (medium), 256D (slow).

Networks: Small ConvNet (vision), 1–2 GRUs (buffers), MLP heads.

Training horizon: hours→days on a modern single GPU if using pre-rendered sequences / 2D; longer for realtime 3D.

Logging: save latents, attention weights, slot bindings, reports, predictions; version illusions & seeds for reproducibility.

  15. Protocol & Timeline

Phase A (2–3 weeks):

Implement sensors (vision with foveation) → embodiment filter (slots + buffers + budget) → predictive coding loss.

Run occlusion tracking only. Validate no-bypass; compute STCI & predictive half-life.

Phase B (2–3 weeks):

Add audio & proprio; add contrastive alignment; run change blindness and A–V binding tasks.

Add Tier-1 extensions (EMA narrative; counterfactual probes).

Phase C (2–4 weeks):

Illusion suite (McGurk, phi, rubber-hand analog).

Full ablation battery; finalize metrics, plots, and report examples.

(Scale timelines up/down based on environment choice and team size.)

  16. Ethical/communications posture (explicit)

No claims about “feelings” or “consciousness.”

State clearly: we test functional correlates only.

Disclose limitations: illusions/behavior can arise from compression/bias without experience.

Keep experimental models isolated from production.

  17. Expected Outcomes (decision criteria)

Supportive pattern:

Higher STCI, longer predictive half-life, robust slot persistence, illusion susceptibility, and strong report/latent alignment only when the bottleneck (with limits and timescales) is active.

Ablations reduce these effects as predicted.

Non-supportive pattern:

No improvement vs. baselines; illusions fail; report/latent misalign; ablations don’t change behavior → the bottleneck isn’t doing meaningful work.

Deliverables

Code (env or clips, filter, training loop).

Metrics & plots (STCI curves, AUC vs. occlusion, illusion bias charts).

Report samples with alignment scores.

Ablation results.

Repro config (seeds, versions, stimuli).


My understanding of your paradigm is: if we force an AI to perceive the world through bottlenecks like the ones humans have, does it develop human-like perceptual biases? You’re creating spatial “object slots” (like brain scan patterns), temporal buffers at different scales (mimicking how human attention works across time), and, crucially, an attention budget that forces competition and selective focus. There is no direct access to raw sensors; everything must flow through this embodied filter. You’re testing for things like object permanence through occlusion, cross-modal binding (sound matching visual events), and classic illusions like the McGurk effect. The beauty is that it’s all measurable: either the bottlenecked system shows human-like spatial reasoning and perceptual organization, or it doesn’t.

This is exactly what I was getting at in my previous post: moving beyond the individual “neurons” to study the emergent patterns of the whole system under realistic constraints.

Hear, hear!


Yeah, that’s the gist of it. I was wondering if anyone has prototyped something similar or has research data on how this would work? Or if anyone knows more on this topic?


Scholar GPT says: The paradigm you outlined is actively being tested. By forcing AI systems through embodied perceptual filters instead of giving them direct raw-sensor access, researchers are indeed finding emergent cognitive biases akin to those of humans, including object permanence, cross-modal illusions, and selective-attention trade-offs. This offers a measurable bridge between cognitive science and machine perception.

  1. Agrawal, P., Tan, C., & Rathore, H. (2023). Advancing perception in artificial intelligence through principles of cognitive science. arXiv:2310.08803.
    → Explores intra-modal and cross-modal integration, demonstrating emergent object permanence and human-like perceptual organization when AI systems are bottlenecked.
  2. Yu, Y. (2025). Aligning Multi-Modal Object Representations to Human Cognition. ProQuest Dissertations.
    → Investigates multi-modal AI agents constrained by perceptual budgets, showing convergence with human perceptual outcomes.
  3. Johnsen, M. (2025). Building Machines That See, Think, and Act Like Humans: Engineering Conscious Machines Through Visual Understanding. Springer.
    → Argues for neuromorphic vision and perceptual bottlenecks as keys to human-like reasoning under occlusion and ambiguity.
  4. Sapir, A., Matthews, J., & Mills, D. (2024). Attention facilitates three-dimensional shape from shading. University of Greenwich.
    → Shows that attentional control is necessary for accurate 3D perception, echoing how selective focus in AI can produce similar illusion-like errors.
  5. Martella, D., Giovannoli, J., & Pasini, A. (2018). The evaluation of attentional orienting… Proceedings of ICSC.
    → Studies attention-driven perception and cross-cultural biases, useful for framing AI “attention budget” experiments.
  6. Belardinelli, A. et al. (2018). Spatial cognition in a multimedia and intercultural world. Cognitive Systems.
    → Examines visual attention, occlusion handling, and body illusions, suggesting analogues for AI perceptual bottlenecks.
  7. O’Regan, K., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24(5), 939–1031.
    → Foundational theory linking sensorimotor contingencies to perceptual experience — often cited in AI embodiment studies.
  8. Hassabis, D., Kumaran, D., Summerfield, C., & Botvinick, M. (2017). Neuroscience-inspired artificial intelligence. Neuron, 95(2), 245–258.
    → Advocates for using human cognitive bottlenecks to guide AI architectures.

Thank you! I’ll check some of these papers out.


Please share your findings. I am curious which part of the human brain, if any, consists of purely feed-forward neuronal layers; I don’t understand how LLMs work without feedback and recurrence. The brain doesn’t just see the world; it actively hallucinates a model of it and uses the eyes to correct the errors in its dream.
