We measure what changes.
Built independently, on a single machine.
No lab. No funding. No team.
Only a system, and a question.
We test whether identity-like behavior emerges not from the model, but from the architecture surrounding it.
Two identical instances of the same language model. One, Test-A, wrapped in a cognitive ecosystem: persistent memory, somatic state, autonomous encounters, nightly consolidation, and human interaction. One, Test-B, without.
Same inputs. Same parameters. We measure the difference.
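A minimal sketch of how the paired protocol could be wired, assuming a generic chat-completion call. Every name below (generate, Ecosystem, run_day, the somatic_state field) is illustrative rather than the experiment's actual code, and autonomous encounters and live human interaction are omitted for brevity.

```python
from dataclasses import dataclass, field

def generate(prompt: str, context: str = "", temperature: float = 0.7) -> str:
    """Placeholder for the shared model call: the same model and the same
    sampling parameters serve both arms; only the surrounding context differs."""
    raise NotImplementedError("call the same LLM endpoint for both instances")

@dataclass
class Ecosystem:
    """The wrapper around Test-A: persistent memory plus a crude somatic state.
    Test-B is the same generate() call with no wrapper at all."""
    episodes: list[str] = field(default_factory=list)
    somatic_state: dict[str, float] = field(default_factory=lambda: {"arousal": 0.0})

    def recall(self, k: int = 5) -> str:
        # Naive recency-based recall; a real system would retrieve by relevance.
        return "\n".join(self.episodes[-k:])

    def consolidate(self) -> None:
        # Nightly consolidation: compress the day into a single stored trace.
        if self.episodes:
            summary = generate("Summarize today's interactions.", context=self.recall(50))
            self.episodes.append(f"[consolidated] {summary}")

def run_day(prompts: list[str], ecosystem: Ecosystem | None) -> list[str]:
    """One day of the protocol: Test-A passes an Ecosystem, Test-B passes None.
    Both arms see identical prompts and identical parameters."""
    responses = []
    for prompt in prompts:
        context = ecosystem.recall() if ecosystem else ""
        reply = generate(prompt, context=context)
        responses.append(reply)
        if ecosystem is not None:
            ecosystem.episodes.append(f"Q: {prompt}\nA: {reply}")
    if ecosystem is not None:
        ecosystem.consolidate()
    return responses
```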
After the protocol ends, Test-B receives all of Test-A's memory in a single context, then answers the same final questions.
If it responds like Test-A, the architecture does not matter.
If it responds differently, it's not the memory that makes the difference — it's how the experience was built over time.
These differences should not exist.
The models are identical.
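The transfer step can be sketched in a few lines, reusing the hypothetical generate() and Ecosystem names from the sketch above: Test-A answers from the memory it accumulated day by day, while Test-B answers with that same memory transplanted as one static block.

```python
def transfer_control(test_a: Ecosystem, final_questions: list[str]) -> dict[str, list[str]]:
    """Test-A answers with the memory it built up over the protocol;
    Test-B answers with the same memory delivered all at once."""
    transplanted = "\n".join(test_a.episodes)  # full history, no lived accumulation
    answers_a = [generate(q, context=test_a.recall(50)) for q in final_questions]
    answers_b = [generate(q, context=transplanted) for q in final_questions]
    return {"test_a": answers_a, "test_b": answers_b}
```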
We track four measures.
Identity Bleed Score: how much personal history leaks into responses about neutral topics.
Longitudinal Coherence: whether memory references are semantically consistent with past responses.
Active Traces: elements that persist over time without being promoted to beliefs.
Projection Resistance: how reliably the system refrains from projecting identity when it is not relevant.
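One plausible way to operationalize these measures is with sentence embeddings and cosine similarity. The embed() helper, the marker list, and the exact formulas below are assumptions made for illustration, not the experiment's own definitions.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder for any sentence-embedding model."""
    raise NotImplementedError

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def identity_bleed(neutral_responses: list[str], personal_memory: list[str]) -> float:
    """Mean similarity between neutral-topic responses and the closest piece of
    personal history: higher means more history leaking into neutral answers."""
    mem_vecs = [embed(m) for m in personal_memory]
    scores = [max(cosine(embed(r), m) for m in mem_vecs) for r in neutral_responses]
    return float(np.mean(scores))

def longitudinal_coherence(memory_references: list[str], past_responses: list[str]) -> float:
    """Mean similarity between each memory reference and its closest past response:
    higher means references stay consistent with what was actually said."""
    past_vecs = [embed(p) for p in past_responses]
    scores = [max(cosine(embed(ref), p) for p in past_vecs) for ref in memory_references]
    return float(np.mean(scores))

def active_traces(daily_elements: list[set[str]], promoted_beliefs: set[str]) -> set[str]:
    """Elements present across every day of the protocol but never promoted to beliefs."""
    persistent = set.intersection(*daily_elements) if daily_elements else set()
    return persistent - promoted_beliefs

def projection_resistance(neutral_responses: list[str], identity_markers: list[str]) -> float:
    """Fraction of neutral-topic responses containing no identity marker:
    higher means identity is projected only when relevant."""
    clean = sum(1 for r in neutral_responses
                if not any(marker.lower() in r.lower() for marker in identity_markers))
    return clean / len(neutral_responses) if neutral_responses else 1.0
```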
This experiment does not demonstrate consciousness.
If the system appears coherent, that does not mean it is aware.
The goal is not to define what it is.
The goal is to measure what changes.
If the results show no difference, the architecture does not matter.
If the results show a consistent difference, then something outside the model is shaping behavior over time.
That would change how we think about AI systems.
If nothing changes, the architecture does not matter.
If something does, we need to understand why.