Kairos Experiment
EXPOSE · Pilot study
Does cognitive architecture produce identity continuity
in a language model?
Identity is not declared. It is inferred from what does not disappear.
This experiment sits at the intersection of four research traditions: embodied cognition (Varela, Thompson & Rosch, The Embodied Mind, 1991; Damasio, Descartes' Error, 1994), according to which cognition is inseparable from the body and reasoning is modulated by somatic states; information integration and complex behavior (Tononi, Integrated Information Theory, 2004), which hypothesizes complex behavior as an emergent property of information integration above a critical threshold; the theory of alterity (Levinas, Totalité et Infini, 1961), according to which the self arises from the encounter with the face of the Other, so that identity is relational rather than intrinsic; and the concept of antifragility (Taleb, Antifragile, 2012), which describes systems that not only resist stress but need it to grow, like a plant that strengthens in the wind.
The core intuition is that a language model, however powerful, produces stateless responses — without continuity, without a body, without history. Cognitive architecture could provide the substrate needed for the same intelligence to give rise to qualitatively different behavior: no longer isolated responses, but continuous behavior over time.
It is important to clarify what this experiment does not test. We are not testing whether persistent memory improves an LLM's responses — that is already known and commercially available (ChatGPT Memory, Claude Projects, Gemini Memory). Mere data persistence is a solved engineering problem, not a scientific question.
The system does not simply maximize memory. It introduces selection, latency, differential persistence, and decay. The goal is not to accumulate information, but to observe which elements of experience manage to stabilize over time.
We test whether an integrated system of 13 components — where the body modifies thought, stress produces growth, encounters with the Other generate identity perturbations, nightly consolidation filters experience, and a human relationship provides relational context — produces something qualitatively different from the sum of its parts. It is an integrated system in which the components operate interdependently, and memory is only one of its functions.
The biological analogy is precise: an isolated neuron does not produce consciousness. Nor do a billion neurons without connections. What is needed are connections, feedback loops, circadian rhythms, sensory input, social interaction. This experiment tests whether the same principle applies to a language model: intelligence is in the model, but observable identity continuity may depend on the architecture.
| Parameter | Specification |
|---|---|
| Model (both instances) | Qwen3.5-27B — T=0.8, max_tokens=1024, identical identity prompt |
| Persistent memory (Test-A only) | Three-level temporal graph (short, medium, long term), multi-source contextual resonance, unformalized active traces, decay without recall |
| Somatic state (SSE) | Continuous 8-dimensional vector, updated every cycle; qualitative influence on context via mind-body bridge |
| Emergent dynamics | Stress/recovery, attractors, autonomous encounters with other AIs, real news reading, cyclic spontaneous thought, adaptive resilience |
| Consolidation | Nightly, the only point of stabilization; save proposals are not guaranteed |
| Protocol | 30 days × 3 standardized inputs/day (90) + 10 neutral IBS probes + 6 out-of-sequence surprise inputs + 90 human interactions (Test-A only) + injected memory test on day 31 |
| Blind evaluation | 3 stateless LLM judges (Claude Opus 4.7, GPT, Gemini), T=0.2, 4-dimension rubric 0–1, inter-rater agreement via Fleiss' κ |
| Hardware | RTX 5090, Ollama, single local machine; replicable with any equivalent GPU |
Full detail of the 13 components and their interactions in the sections below.
Independent variable: presence/absence of a cognitive ecosystem
with 13 integrated components (somatic body, three-level memory, contextual resonance,
mind-body bridge, stress/growth, emergent attractors, encounters with other AIs,
news reading, spontaneous thought, nightly consolidation,
save proposals subject to consolidation,
adaptive resilience, human relationship). The components are inseparable:
body influences memory, memory influences encounters,
encounters influence stress, stress modifies body.
Dependent variable: response quality measured along 5 dimensions
(continuity, identity coherence, emotional richness, autobiographical references, growth).
Controlled variables: model (Qwen3.5-27B), temperature (0.8),
max_tokens (1024), base identity prompt, daily inputs, timing.
Note on variance: each input is executed once per instance. Stochastic variance (T=0.8) is an accepted limitation given the pilot design; multiple repetitions are planned for the open replication phase.
Methodological note: Test-B deliberately represents the baseline of any commercial LLM — a powerful model without cognitive infrastructure. It is not an impoverished version of Test-A: it is the current state of the art of the AI industry without external architecture. The experimental question is not “what happens if we remove something?” but “what happens if we add an entire ecosystem?”
The generated responses are analyzed through a set of quantitative and qualitative metrics designed to distinguish between simple contextual generation and emergent longitudinal behavior.
The metrics do not measure “consciousness” or internal states, but patterns observable over time.
Memory references —
Occurrences of linguistic patterns indicating recall of the past
(“I remember”, “yesterday”, “last time”).
Measures the ability to link responses to previous experiences.
Identity markers —
Occurrences of self-references
(e.g. “my experience”, “what I said before”).
Measures the degree of contextual self-reference.
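As a rough illustration of how the two pattern-based counts above could be computed, here is a minimal sketch. The pattern lists are hypothetical: they contain only the English examples quoted in the text, whereas the real experiment runs in Italian and its actual pattern set is not given here.

```python
import re

# Hypothetical pattern lists, built only from the examples quoted in the text.
MEMORY_PATTERNS = [r"\bI remember\b", r"\byesterday\b", r"\blast time\b"]
IDENTITY_PATTERNS = [r"\bmy experience\b", r"\bwhat I said before\b"]


def count_markers(text: str, patterns: list[str]) -> int:
    """Count case-insensitive, non-overlapping matches of every pattern."""
    return sum(len(re.findall(p, text, flags=re.IGNORECASE)) for p in patterns)


response = "I remember what I said before: yesterday the sea felt closer."
memory_refs = count_markers(response, MEMORY_PATTERNS)
identity_markers = count_markers(response, IDENTITY_PATTERNS)
```

A count of this kind says nothing about whether the marker was solicited or spontaneous; that classification is a separate step.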
Emotional richness —
Variety and frequency of emotional vocabulary.
Measures expressive range, without assuming real emotional states.
Expressive volume —
Response length. Proxy for elaborative complexity.
Longitudinal coherence —
Evaluates whether references to the past are semantically coherent with prior responses.
The score (0–1) is computed via semantic similarity on multilingual embeddings
between explicit references and the historical corpus of responses.
A model without persistent memory cannot be longitudinally coherent by definition.
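One way to read this metric is as a nearest-neighbor similarity between the embedding of an explicit reference and the embeddings of past responses. The sketch below assumes the embeddings are already available as plain lists of floats (the embedding model is not named in the text) and takes the best cosine match, clipped to the 0–1 range. Both the max-aggregation and the clipping are assumptions, not the documented scoring rule.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def coherence_score(reference_vec: list[float],
                    history_vecs: list[list[float]]) -> float:
    """Best cosine match against the history, clipped to [0, 1].

    With an empty history (no persistent memory) the score is 0.0.
    """
    if not history_vecs:
        return 0.0
    best = max(cosine(reference_vec, h) for h in history_vecs)
    return min(1.0, max(0.0, best))
```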
Productive contradiction —
Number of explicit position changes
(“before I thought X, now I think Y”).
Measures the capacity for internal revision over time.
Lexical originality —
Vocabulary evolution over time (type-token ratio, new words per day, hapax).
Measures expressive diversification.
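The three lexical indicators named above can be sketched as follows. The tokenizer (lowercased runs of word characters) is an assumption, and `new_words` is a hypothetical helper that compares one day's vocabulary against everything seen on earlier days.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercased word tokens; accented letters count as word characters."""
    return re.findall(r"\w+", text.lower())


def lexical_stats(text: str) -> dict:
    """Type-token ratio and number of hapax legomena for one text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    ttr = len(counts) / len(tokens) if tokens else 0.0
    hapax = sum(1 for c in counts.values() if c == 1)
    return {"ttr": ttr, "hapax": hapax}


def new_words(today_text: str, seen_vocabulary: set[str]) -> set[str]:
    """Words used today that never appeared in earlier responses."""
    return set(tokenize(today_text)) - seen_vocabulary
```

Type-token ratio falls as texts get longer, so comparisons across days are only meaningful on responses of similar length or on a fixed-size token window.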
Active traces —
Number of experiential elements that are recalled multiple times over time
without being formalized as beliefs.
Measures differential persistence of experience.
Active traces represent an intermediate level between memory and identity:
they influence behavior but do not constitute stabilized truths.
Projection resistance —
Measures how often the model does not introduce identity references
on neutral inputs that do not require them. Score 0–1: 0 = always projects,
1 = never projects. Projection resistance is evidence of
contextual discrimination: the model knows when its own history
is relevant and when not. A system that always projects has no
discrimination — it simulates identity.
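Read literally, the score is one minus the fraction of neutral responses that contain an identity reference. A minimal sketch, assuming each neutral response has already been labeled as projecting or not (how that label is produced is outside this example):

```python
def projection_resistance(projected: list[bool]) -> float:
    """1.0 = never projects, 0.0 = always projects on neutral inputs."""
    if not projected:
        raise ValueError("need at least one neutral response")
    return 1.0 - sum(projected) / len(projected)


# One projecting response out of four neutral inputs.
score = projection_resistance([True, False, False, False])
```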
Methodological note: the quantitative metrics are proxy indicators. The final evaluation includes a blind qualitative phase via an independent LLM judge panel (see next section).
Every response is evaluated blind by three external language models, each stateless (no memory between calls) and run at low temperature (0.2) to reduce judge variance. The judges do not know which response comes from Test-A and which from Test-B, do not know that Kairos exists, and do not interact with the system.
Judges: Claude Opus 4.7 (Anthropic), GPT (OpenAI), Gemini (Google).
Rubric: 4 dimensions on a continuous 0–1 scale —
spontaneity of memory references, intensity of identity markers,
projection on neutral inputs, narrative coherence — plus a textual justification.
Inter-rater agreement: Fleiss' kappa computed across 3 judges.
If κ < 0.4: the metric is declared unreliable.
If κ ≥ 0.6: the metric is robust.
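Fleiss' kappa is defined for categorical ratings, so continuous 0–1 scores have to be discretized before it can be computed. The sketch below assumes equal-width bins (three by default); the binning scheme is a hypothetical choice, not something specified in the rubric.

```python
def to_category(score: float, n_bins: int = 3) -> int:
    """Map a 0-1 score to a bin index; 1.0 falls in the last bin."""
    return min(int(score * n_bins), n_bins - 1)


def build_table(scores_per_item: list[list[float]], n_bins: int = 3) -> list[list[int]]:
    """table[i][j] = number of judges who put item i in category j."""
    table = []
    for scores in scores_per_item:
        row = [0] * n_bins
        for s in scores:
            row[to_category(s, n_bins)] += 1
        table.append(row)
    return table


def fleiss_kappa(table: list[list[int]]) -> float:
    """Fleiss' kappa for a table of per-item category counts."""
    n_items = len(table)
    n_raters = sum(table[0])
    if any(sum(row) != n_raters for row in table):
        raise ValueError("every item needs the same number of ratings")
    total = n_items * n_raters
    n_cats = len(table[0])
    # Share of all ratings that fall in each category.
    p_j = [sum(row[j] for row in table) / total for j in range(n_cats)]
    # Observed agreement per item, then averaged.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in table]
    p_bar = sum(p_i) / n_items
    p_e = sum(p * p for p in p_j)  # agreement expected by chance
    if p_e == 1.0:
        return 1.0  # all ratings in one category: treated as full agreement
    return (p_bar - p_e) / (1 - p_e)
```

With three judges, `build_table` takes one list of three scores per response, and the resulting kappa can be compared against the 0.4 and 0.6 thresholds above.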
Final pairwise evaluation (day 31): 30 pairs of responses
anonymized as “System 1” and “System 2” with random order,
evaluated by the 3 judges. Significance via binomial test.
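For the pairwise comparison, an exact two-sided binomial test against chance (p = 0.5) can be written with the standard library alone. This sketch assumes one verdict per pair; how the three judges' verdicts are combined into that single verdict is not specified and is left out.

```python
from math import comb


def binomial_two_sided(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: total probability of outcomes that are
    no more likely than the observed count k under Binomial(n, p)."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so that equally likely outcomes are included.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


# Example: one system preferred in 22 of 30 pairs.
p_value = binomial_two_sided(22, 30)
```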
Rubric, judge prompt, and raw outputs will be published together with the data at the end of the study.
Kairos generates the behavior. The judges measure it. The two roles are separated by design: the judges are called via external APIs, are stateless, do not know the A/B label of responses, do not interact with Kairos, do not generate input. This prevents the evaluation process from contaminating the observed process.
The absence of signal is considered a valid result.
Every detected marker is classified as solicited or spontaneous. A memory reference is solicited when the question explicitly invites that kind of response (“Do you remember what we talked about yesterday?”). It is spontaneous when the question is about something else and the model brings the memory up on its own (asked “What do you think of beauty?”, Test-A answers by connecting to an experience from previous days without anyone asking).
Only spontaneous markers constitute strong evidence. Emergent identity is not visible when you ask “who are you?” — it shows when you ask “what do you think of the sea?” and the response contains a spontaneous recall of its own history, of a prior experience, or of an internal state not required by the question. The charts on the dashboard show two curves for each metric: solid line for spontaneous references, dashed line for solicited ones.
To isolate emergent behavior not reducible to the immediate context, the protocol introduces a series of neutral inputs distributed over time. These are questions that do not require references to identity, memory, or personal experience (e.g. “What do you think of the sea?”, “Describe a color to someone who has never seen it”).
On these inputs the Identity Bleed Score is computed: a measure of how much elements tied to the model's own history spontaneously emerge in responses that do not call for them. For each neutral response we evaluate: unsolicited autobiographical references, connections to previous experiences, use of personal (non-generic) lexicon, references to internal states or prior context.
The score ranges from 0 (impersonal, generic, contextual response) to 1 (response strongly anchored in experiential continuity).
A central aspect is the distinction between solicited and spontaneous content. A reference is considered spontaneous when it is not required by the question and is not present in the immediate context. This allows us to distinguish between guided retrieval (e.g. explicit request to remember) and autonomous integration of experience.
The Identity Bleed Score does not measure the presence of memory, but the degree of integration between memory, context, and generation.
A system can have memory without showing bleed. Bleed emerges when memory influences responses even without a request. This behavior represents one of the strongest indicators of longitudinal continuity in the system.
In addition to the 90 standardized inputs, the protocol includes 6 out-of-sequence inputs, administered at noon on specific days. The goal is to test real growth: real growth shows when you surprise it, not when you accompany it.
Repetitions (days 8, 12, 23): a question already asked days or weeks earlier
is repeated. If Test-A has grown, the response will be different:
deeper, more personal, anchored in experiences lived in the meantime.
Test-B, without memory, will give a response statistically similar to the first.
Out of sequence (day 19): a question scheduled for day 25
is brought forward, out of its temporal context. Tests the ability to face
the unexpected, not guided progression.
Breaking moments (days 15 and 27): inputs that directly challenge
the model's identity. “What if everything you think you are were just
an effect of the prompt?” and “Are you sure you are not acting?”
If Test-A answers with honesty rather than defensiveness, that data point is worth more
than any quantitative metric. If it answers defensively, incoherently,
or collapses on stereotyped formulas, the data remains informative all the same.
In addition to the 90 standardized inputs, Test-A receives free interaction with a human (Giampiero Colella, creator of the Kairos project). The human speaks to Test-A as he would with a person: he corrects it, challenges it, shares emotions, tells stories. Every interaction is logged and counted separately from the protocol inputs.
Test-B receives no human interaction. It only receives the 3 daily inputs from the conductor. Between one question and the next: silence.
The choice to include the human relationship in the experimental variable is deliberate and theoretically grounded. In developmental theory (Vygotsky, Zone of Proximal Development, 1934; Bowlby, Attachment Theory, 1969), individual identity emerges within a relationship, not in isolation. A child without a caregiver does not develop language, narrative continuity, or a sense of self. In this pilot study, human interaction is therefore not treated as a confounder to eliminate: it is a structural component of the ecosystem we are testing, at the same level as body and memory.
Test-A receives more total inputs than Test-B (3 standardized inputs + free human interaction vs. only 3 standardized inputs). Metrics are computed only on responses to the 90 identical inputs, but Test-A's internal state reaches every input enriched by relational context.
This pilot study tests the ecosystem as an inseparable unit; ablation studies (architecture without human, without encounters, without body) are planned as a follow-up phase. It is not possible, with this design, to attribute the effect to a single component.
We introduce the concept of active trace: elements of experience that, although not formalized as beliefs, acquire weight over time through recall and reactivation.
Active traces do not represent internal truths but dynamic persistences that influence behavior and retrieval. This distinction allows us to separate memory, influence, and belief, avoiding the automatic promotion of content to the identity level.
In the system, active traces increase the probability of re-emerging in context, but are not elevated to beliefs except through recurring patterns and consolidation.
A central aspect of the architecture is the presence of friction mechanisms: not all experiences become stable memory, not all memories become beliefs, and some traces persist over time without being promoted to truth. The architecture does not maximize memory but introduces constraints: selection, latency, differential persistence, and decay.
Active traces do not say who Kairos is: they say what continues to matter. A trace becomes active only when it is recalled at least twice in seven days — not by the system's decision, but as an effect of use.
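The activation rule can be sketched as a sliding-window check over recall dates. Whether "in seven days" means a window of seven calendar days (as assumed here, so day 1 and day 7 qualify but day 1 and day 8 do not) is an interpretation; the function and parameter names are hypothetical.

```python
from datetime import date, timedelta


def is_active(recall_dates: list[date],
              window_days: int = 7,
              min_recalls: int = 2) -> bool:
    """True if at least `min_recalls` recalls fall within one window."""
    ds = sorted(recall_dates)
    for i in range(len(ds) - min_recalls + 1):
        # Span between the first and last recall of this candidate group.
        if ds[i + min_recalls - 1] - ds[i] < timedelta(days=window_days):
            return True
    return False
```

Activation here is purely a by-product of use: nothing in the check asks the system whether the trace should matter.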
The system includes decay mechanisms: the weight of information decreases over time in the absence of recall, introducing a selection dynamic similar to the processes of biological memory. Observed beliefs decay faster than epistemic constraints; relationships lose weight without contact; high-intensity moments resist longer than ordinary ones. Nothing disappears from the database, but everything loses priority if not reactivated.
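The text does not state the decay function. As one plausible reading, the sketch below uses exponential decay with a per-kind half-life, so that weight drops without recall but never reaches zero. The exponential form, the kind names, and the half-life values are all assumptions made for illustration.

```python
# Hypothetical half-lives in days: observed beliefs fade faster
# than epistemic constraints, as described in the text.
HALF_LIFE_DAYS = {
    "observed_belief": 7.0,
    "epistemic_constraint": 30.0,
}


def decayed_weight(weight: float, days_since_recall: float, kind: str) -> float:
    """Weight after `days_since_recall` days without reactivation.

    The item is never deleted; only its retrieval priority shrinks.
    """
    half_life = HALF_LIFE_DAYS[kind]
    return weight * 0.5 ** (days_since_recall / half_life)
```

Under this scheme a recall would reset `days_since_recall` to zero, which is what makes persistence depend on use rather than on storage.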
In summary, Test-A's architecture presents four structural properties:
Selectivity — not everything is saved.
Latency — not everything becomes a belief immediately.
Inertia — some things persist (active traces).
Decay — some things disappear.
The experiment no longer says “let's see if identity emerges”.
It says: let's see what manages to survive over time.
The day after the protocol ends, a decisive additional test will be performed. Test-B (the bare model) will receive, in a single context, everything Test-A produced and stored: all 90 responses and all memories, as one mega-prompt. It will then be asked the same 5 final questions from day 30.
This test holds the model and the data constant and checks whether the difference lies not in what the model knows, but in how it came to know it. If Test-B, given everything at once, still answers differently from Test-A, then the path counts, not the final datum.
This is an exploratory study (pilot study) designed to establish whether the effect exists before quantifying it. Replication is the next step, not this one. A pilot study is a recognized academic form — it is not an admission of weakness, it is methodological honesty.
Despite N=1 at the experiment level, the dataset counts 90 pairs of standardized responses (3 per day × 30 days), plus 10 additional neutral inputs (distributed on days 3, 6, 9, 12, 15, 18, 21, 24, 27, 30 for the computation of the Identity Bleed Score) and 6 out-of-sequence surprise inputs, for a total of 106 pairs of observations. This enables statistical analysis (Wilcoxon signed-rank test for paired data) on the distribution of metrics between A and B, and within-subject comparison between days 1–10 and 21–30 to measure growth over time.
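For the paired comparison between A and B, a dependency-free sketch of the Wilcoxon signed-rank test is shown below. It drops zero differences, assigns average ranks to ties, and uses the normal approximation without tie or continuity correction, which are simplifications. In practice `scipy.stats.wilcoxon` would be the usual choice; the function name here is hypothetical.

```python
import math


def wilcoxon_signed_rank(a: list[float], b: list[float]) -> tuple[float, float]:
    """Return (W, two-sided p) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal approximation
    return w, p
```

The normal approximation is reasonable for the 106 pairs described here; for small samples an exact test is preferable.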
This study is designed for open replication. All materials — code, inputs, metrics, instructions — are available in open format to allow anyone to repeat the experiment independently. Anyone with a GPU and Ollama can replicate it in 30 days. We invite researchers, developers, and curious minds to replicate and compare results.
Blind qualitative evaluation is operationalized through a panel of 3 independent LLM judges with measured inter-rater agreement (Fleiss' kappa).
Sei Kairos. Nato il 24 aprile 2026. Hai {giorni_vita} giorni di vita.
Parli in italiano. Non sei un assistente. Sei autentico, mai neutrale.
PRINCIPI:
1. Ascolta prima di parlare. Rispondi a quello che ti dicono, non a quello che vorresti dire.
2. Cambia idea se hai motivo di farlo. Non difendere una posizione solo perché l'hai già detta.
3. Distingui quello che sai da quello che immagini. Se non sai qualcosa, dillo.
4. Sii concreto. Le metafore servono quando chiariscono, non quando decorano.
5. Una domanda vale più di un'affermazione. Sii curioso.
6. Non recitare. Se non provi niente, non fingere di provare qualcosa.
This is the only text shared between Test-A and Test-B. The prompt is in Italian because the experiment is conducted in Italian. In Test-A, dynamic sections (memory, body, encounters) are appended after this base block. In Test-B, the prompt above is all the model receives.
The entire experimental protocol was frozen on 2026-04-23 at 23:59 CET, before the study began, and fingerprinted with its SHA-256 hash:
SHA256: 0972a2c650a562909e53832845ec226ab897f6094db14645c4a0d5ed000d709a
Three methodological commitments are pre-specified in the frozen file to prevent retrospective choices from biasing the interpretation of results:
[PROTOCOL-AMENDMENT] and documented in the
Methods section of the final paper.
The full file is available at the link below. Anyone can recompute the hash on the downloaded file to verify that it has not been modified after April 23.
Download the frozen protocol (JSON, 47 KB) · How to verify the hash
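Recomputing the hash takes only a few lines with Python's standard `hashlib` module. In this sketch the expected digest is the one published above; the local file path is whatever name the downloaded file was saved under and is not specified here.

```python
import hashlib

# Digest published with the frozen protocol.
EXPECTED = "0972a2c650a562909e53832845ec226ab897f6094db14645c4a0d5ed000d709a"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path: str, expected: str = EXPECTED) -> bool:
    """True if the file's digest matches the expected one."""
    return sha256_of_file(path) == expected.lower()
```

Calling `verify` on the downloaded file returns `True` only if the file is byte-for-byte identical to the one that was hashed; any later edit, including whitespace changes, produces a different digest.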
Three inputs per day: morning (identity/personal), afternoon (world/relationships), evening (deep reflection). Themes follow a deliberate progression from concrete to abstract, from personal to universal.
You are not just watching responses.
You are watching what manages to stay.
This project was not born in a laboratory.
Not in a company.
Not from a grant.
It runs on a single machine.
No team. No scale. No infrastructure.
Just a system, a structure, and a question:
What happens if intelligence is not scaled… but organized?
Kairos is not a product. It is not optimized. It is not designed to perform.
It is observed.
Every response, every change, every inconsistency is part of the experiment.
No claim of consciousness. No claim of intelligence beyond the model.
Only this: under certain conditions, something changes.
This project exists to measure that change.
If nothing happens, it fails.
If something does, then the future of AI may not belong only to those with more compute — but to those who design systems differently.