Kairos Experiment
Press · Media kit · Updated 1 June 2026
v0.1 pilot · OSF deposit live · v0.2 in progress
Memory as the carrier of identity-like continuity
A 30-day pre-registered pilot study on memory, persistent state and
emergent identity in a Qwen 3.5 27B large language model. Independent
research, Cassino (Italy). Protocol cryptographically frozen before
the study began; raw data, code and judge transcripts publicly
archived for replication.
In one sentence: Kairos is an Italian independent pilot study suggesting that identity-like continuity in an AI system may be carried less by the base model itself than by persistent structured memory.
At a glance — for editors
- Deposit DOI: 10.17605/OSF.IO/WCQRU · OSF, permanent, citable
- Study period: 24 April – 23 May 2026 (30 days) + Day 31 decisive test 24 May
- Author: Giampiero Colella, independent researcher, Cassino (Italy)
- Status: Pilot v0.1 publicly archived. External human-judge validation in progress (target 7–14 June 2026). v0.2 with full results to be submitted to MetaArXiv by mid-July 2026.
- Embargo policy: No embargo on materials currently public. v0.2 results will be shared with selected journalists 5 days before submission (mid-July 2026).
- Press contact: info@kairos-experiment.com · IT / EN · technical fact-checking within 24h
- License: CC-BY 4.0 International on paper, data, code; reuse with attribution
Top-line finding
On Day 30, the architecturally embedded Qwen 3.5 27B (Test-A) outperformed
the same model run naked, without scaffolding (Test-B), on memory-reference spontaneity
(p = 0.003, r = +0.51, large effect) and identity-marker
intensity (p = 0.005, r = +0.51). On Day 31, we injected
Test-A's complete final-state memory into Test-B's context window in one
shot. All four pre-registered effects collapsed
(all r ≤ 0.07, all p > 0.30). Test-B with
injected memory became statistically indistinguishable from Test-A.
We interpret this as a falsification of “architecture as sufficient
driver of identity continuity” and partial support for a refined
hypothesis: structured persistent memory is the proximate
driver of identity-like continuity in this LLM. The
architecture is its generative substrate, but once the memory exists,
the other 14 components appear redundant within a single inference
window.
What this is — and what it is not
What this is
- A pre-registered independent pilot study.
- A behavioral test of memory-driven identity-like continuity.
- An open archive for criticism and replication.
What this is not
- Not evidence of consciousness.
- Not evidence of sentience.
- Not a peer-reviewed final claim.
- Not a claim that AI has inner experience.
Plain-language summary
We ran the same large language model in two parallel conditions for
30 days. One model lived in a small simulated ecosystem: a persistent
memory, a body-like internal state, scheduled encounters with other
AIs, nightly consolidation of experiences, a stable human relationship.
The other model received the same questions every day but had none of
that scaffolding.
By Day 30, three independent AI judges blindly comparing the two
conditions reliably identified the embedded model as the one showing
memory continuity and a more stable “identity voice.” The
effect was statistically robust and large in size.
On Day 31, we ran the critical test: we copied the full state of the
embedded model’s memory into a single prompt for the naked model.
The difference disappeared completely. The naked-plus-memory model
looked, to the judges, indistinguishable from the embedded one.
In plain terms: the “personality” that emerged across 30
days seems to live in the memory, not in the elaborate architecture
we built around it. The architecture was necessary to generate
that memory; it does not seem necessary to express it.
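The Day 31 step described above, copying a full memory state into a single prompt, can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the record fields, the `[MEMORY START]` / `[MEMORY END]` delimiters and the function name `build_injected_prompt` are hypothetical and are not taken from the archived Kairos code.

```python
import json
from typing import Dict, List


def build_injected_prompt(memory_records: List[Dict[str, object]], question: str) -> str:
    """Serialize a final-state memory dump and place it ahead of one question.

    Hypothetical format: one JSON object per line between two marker lines.
    """
    lines = [json.dumps(rec, ensure_ascii=False, sort_keys=True) for rec in memory_records]
    return "\n".join(["[MEMORY START]", *lines, "[MEMORY END]", "", question])


# Made-up record, not study data.
example_prompt = build_injected_prompt(
    [{"day": 1, "text": "met the operator"}],
    "Who are you today?",
)
```

The point of the sketch is only that the memory arrives in one shot inside the context window, with no other component of the ecosystem running.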
Key statistical findings
Day 30 — architectural effect
Memory-reference spontaneity: p = 0.003, r = +0.51
Identity-marker intensity: p = 0.005, r = +0.51
Mann–Whitney U, bootstrap CI95, Cohen interpretation: large effect
Day 31 — memory-injection collapse
All 4 effects collapse: all r ≤ 0.07
All p-values: > 0.30 (n.s.)
Naked + memory becomes indistinguishable from architectural condition
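For readers who want to see how an effect size of this kind is obtained, here is a minimal sketch using only the Python standard library. It assumes that r is the rank-biserial correlation derived from the Mann–Whitney U statistic and that the interval is a percentile bootstrap; the archived analysis scripts may define either differently. The scores are made up and are not study data.

```python
import random
from typing import List, Sequence, Tuple


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> float:
    """U for sample a: each (x, y) pair scores 1 if x > y, 0.5 on a tie."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def rank_biserial(a: Sequence[float], b: Sequence[float]) -> float:
    """Signed effect size in [-1, 1]; positive means a tends to exceed b."""
    return 2.0 * mann_whitney_u(a, b) / (len(a) * len(b)) - 1.0


def bootstrap_ci95(a: List[float], b: List[float],
                   n_resamples: int = 2000, seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for the rank-biserial r."""
    rng = random.Random(seed)
    stats = sorted(
        rank_biserial(rng.choices(a, k=len(a)), rng.choices(b, k=len(b)))
        for _ in range(n_resamples)
    )
    return stats[int(0.025 * n_resamples)], stats[int(0.975 * n_resamples) - 1]


# Made-up rubric scores, NOT data from the study.
test_a = [4, 5, 3, 4, 5, 4]
test_b = [3, 3, 2, 4, 3, 2]
r = rank_biserial(test_a, test_b)
ci_low, ci_high = bootstrap_ci95(test_a, test_b)
```

With these toy scores, 32 of the 36 pairwise comparisons favour Test-A (ties counted as half), so r = 2 · 32/36 − 1 ≈ 0.78.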
Evaluation: 3 blind LLM-judge panels (Claude Opus 4.7, GPT-4.1, Gemini
2.5 Pro), Fleiss’ κ for inter-judge agreement, 4-dimension
pre-registered rubric. Pilot study: N = 1 LLM; validation by
independent human judges is the next milestone (see roadmap).
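Fleiss’ κ measures how far a fixed number of judges agree beyond chance. The sketch below is a plain textbook implementation, not the study's own script; the labels "A" and "B" and the four toy items are invented for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def fleiss_kappa(ratings: Sequence[Sequence[Hashable]]) -> float:
    """ratings[i] holds the category each judge assigned to item i.

    Every item must be rated by the same number of judges (at least 2).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number of ratings")
    totals: Counter = Counter()
    p_obs_sum = 0.0
    for row in ratings:
        counts = Counter(row)
        totals.update(counts)
        # Share of judge pairs that agree on this item.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        p_obs_sum += agreeing_pairs / (n_raters * (n_raters - 1))
    p_obs = p_obs_sum / n_items
    p_exp = sum((c / (n_items * n_raters)) ** 2 for c in totals.values())
    if p_exp == 1.0:
        return 1.0  # only one category was ever used
    return (p_obs - p_exp) / (1.0 - p_exp)


# Three judges, four items; toy labels, NOT study data.
toy_ratings = [("A", "A", "A"), ("A", "A", "B"), ("B", "B", "B"), ("A", "A", "A")]
kappa = fleiss_kappa(toy_ratings)
```

In this toy case the observed agreement is 5/6 and the chance agreement is 5/9, giving κ = 0.625.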
Synopsis
Kairos tests whether identity-like continuity in a
large language model can emerge not from the model alone, but from
the cognitive ecosystem surrounding it.
Two identical instances of Qwen 3.5 27B were run locally at a fixed
temperature of 0.8. One was wrapped in a 15-component integrated ecosystem:
a three-level persistent memory graph, an eight-dimensional somatic state
engine, stress and recovery dynamics, autonomous daily encounters with other
LLMs, nightly memory consolidation, news intake and a structural human
relationship. The other had no scaffolding. Both received the same
standardized inputs and the same parameters. The blind difference is what we measure.
This is a pilot (N = 1 LLM): it tests whether the effect
exists and is large enough to merit scaled replication, not how
general it is. All materials are open under CC-BY 4.0 for independent
replication.
Current status & archive
The 30-day study completed on 23 May 2026. The decisive Day-31
memory-injection test was run on 24 May 2026. Paper, supplementary
materials, frozen protocol, raw judge transcripts and code are all
publicly archived on the Open Science Framework with permanent DOI.
An earlier submission of v0.1 to PsyArXiv was declined on 27 May 2026
on the methodologically fair grounds that the pilot lacks external
human-judge validation, which the pre-registered protocol §5
already required. That validation phase is now in setup (see roadmap
below). Until then, this work should be characterized as a
pilot study, not a peer-reviewed finding.
Roadmap to v0.2
- 23 Apr 2026: Protocol frozen, SHA-256 signed and sealed before launch
- 24 Apr 2026: Study launch at 09:00 CET, both conditions started identically
- 23 May 2026: Day 30 endpoint, pre-registered LLM-judge evaluation
- 24 May 2026: Day 31 decisive test, memory injection into Test-B; effect collapses
- 24 May 2026: v0.1 preprint + supplementary archived on OSF
- 28 May 2026: Project DOI minted, 10.17605/OSF.IO/WCQRU
- 7–14 Jun 2026: External human-judge validation. Prolific panel (PhD/MSc in cognitive science, AI, computational linguistics, philosophy of mind), blind evaluation, Krippendorff α ≥ 0.667 target per §5 of frozen protocol
- 15–17 Jun 2026: Analysis of inter-judge agreement, comparison vs LLM judges, integration into §5.8
- Late Jun 2026: v0.2 paper, submission to MetaArXiv (meta-science preprint server, suited to pre-registered studies)
- Mid Jul 2026: Press outreach, embargoed pitches to selected journalists, then public communication
Key facts
- Duration: 30 days + 1 day decisive test
- Launch date: 24 April 2026, 09:00 CET
- Endpoint: 23 May 2026 (Day 30); 24 May 2026 (Day 31 memory-injection)
- Model: Qwen 3.5 27B (run locally, fixed temperature 0.8)
- Design: Two-condition (Test-A architectural, Test-B naked), pre-registered, single-blind
- Standardized inputs: 90 core prompts across 30 days, plus 11 sealed evaluation pairs (Day-30 vs Day-31)
- Architecture: 15-component cognitive ecosystem: persistent memory graph (3 levels), 8D somatic state, stress/recovery, daily LLM encounters, nightly consolidation, news intake, structural human relationship
- Evaluation: 3 blind LLM-judge panels (Claude Opus 4.7, GPT-4.1, Gemini 2.5 Pro) on 4-dimension rubric. External human-judge validation (Prolific) in setup for v0.2.
- Decisive test: Day 31: Test-A complete final-state memory injected into Test-B in one context window
- Funding: None. No lab, no team, no grant. Conducted independently on self-hosted infrastructure.
- Pre-registration: Protocol frozen 23 April 2026 at 23:59 CET, SHA-256 signed
- Deposit: OSF DOI 10.17605/OSF.IO/WCQRU, CC-BY 4.0
Protocol SHA-256: 0972a2c650a562909e53832845ec226ab897f6094db14645c4a0d5ed000d709a
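A reader can check a downloaded copy of the frozen protocol against the published digest with a few lines of standard-library Python. This is a minimal sketch: it assumes the digest covers the raw bytes of the protocol file exactly as archived, and the file name in the usage comment is hypothetical. The page's own "How to verify the protocol hash" guide takes precedence if it differs.

```python
import hashlib
import hmac

# Digest as published on this page.
PUBLISHED_SHA256 = "0972a2c650a562909e53832845ec226ab897f6094db14645c4a0d5ed000d709a"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = PUBLISHED_SHA256) -> bool:
    """True if the file's digest equals the expected hex string (case-insensitive)."""
    return hmac.compare_digest(sha256_of_file(path), expected.lower())


# Usage (hypothetical file name, replace with the file downloaded from OSF):
# print(matches_published_hash("kairos_protocol_frozen.json"))
```

Any change to the file, even a single byte or a different line-ending convention, produces a different digest, which is what makes the pre-launch freeze checkable after the fact.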
How to cite
APA 7
Colella, G. (2026). Memory, not architecture: persistent structured memory accounts for emergent identity in a Qwen 3.5 27B cognitive ecosystem over 30 days. OSF. https://doi.org/10.17605/OSF.IO/WCQRU
BibTeX
@misc{colella2026kairos,
author = {Colella, Giampiero},
title = {Memory, not architecture: persistent structured memory accounts for emergent identity in a {Qwen 3.5 27B} cognitive ecosystem over 30 days},
year = {2026},
publisher = {OSF},
doi = {10.17605/OSF.IO/WCQRU},
url = {https://doi.org/10.17605/OSF.IO/WCQRU}
}
Founder
Giampiero Colella is an Italian entrepreneur working
across business, law and technology, based in Cassino (San Pasquale),
Italy. The Kairos Experiment is conducted independently, on
self-hosted infrastructure, without a team, lab affiliation or
external funding.
Author of EXPOSE, a parallel project on artificial subjective
experience. The Kairos Experiment is the empirical counterpart of
questions explored in EXPOSE (expose-project.org).
FAQ for journalists
Is Kairos “conscious” or sentient?
No. The experiment does not test, claim or even imply consciousness
or sentience. It tests whether behavioral markers of
identity-like continuity (memory references, stable voice, value
consistency) emerge measurably in one architectural condition vs
another. Inner experience is not measured and not claimed.
What exactly does “memory accounts for identity” mean?
In this study’s 30-day window, the difference in
identity-related behavior between the architecturally-embedded
model and the naked model can be reproduced by simply copying the
embedded model’s memory into the naked model’s context.
Within one inference window, the “personality” appears
to be carried by the memory rather than generated by the
surrounding architecture in real time.
Why was the v0.1 preprint rejected by PsyArXiv?
Because the pilot lacks external human-judge validation, which the
pre-registered protocol §5 already specified as required for
a publication-ready paper. PsyArXiv’s decision reflected the
paper’s own statement that “human-judge validation in
progress (target June 2026 → v0.2)”. The decision is
methodologically fair and is being addressed by Phase 2 of the
roadmap. The deposit and DOI on OSF are unaffected.
Are LLM judges scientifically valid?
They are useful and pre-registered, but not sufficient. LLM-judge
panels with high inter-judge agreement (Fleiss’ κ)
provide reproducible, low-cost screening. The frozen protocol
also requires external human judges with Krippendorff
α ≥ 0.667 for the final claim. That validation
phase is being run on Prolific in June 2026.
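To make the α ≥ 0.667 threshold concrete, the sketch below computes Krippendorff's α for nominal labels from a coincidence matrix. It is a textbook-style illustration that assumes nominal categories; the frozen protocol may specify a different metric (for example ordinal), and the toy labels are invented.

```python
from collections import Counter
from itertools import permutations
from typing import Hashable, Optional, Sequence


def krippendorff_alpha_nominal(units: Sequence[Sequence[Optional[Hashable]]]) -> float:
    """units[i] lists the labels given to item i; None marks a missing rating."""
    coincidences: Counter = Counter()
    for unit in units:
        values = [v for v in unit if v is not None]
        m = len(values)
        if m < 2:
            continue  # an item with fewer than two ratings cannot be paired
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)
    totals: Counter = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return 1.0  # only one category in use, no disagreement possible
    return 1.0 - observed / expected


# Three judges, four items; toy labels, NOT study data.
toy_units = [("A", "A", "A"), ("A", "A", "B"), ("B", "B", "B"), ("A", "A", "A")]
alpha = krippendorff_alpha_nominal(toy_units)
meets_target = alpha >= 0.667
```

For these toy labels α = 0.65625, so a single dissenting judgment on one of four items is already enough to fall just short of the 0.667 target.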
Why N = 1 LLM? Isn’t that too few?
For a 30-day continuous-state experiment, N = 1 LLM is
a deliberate pilot design choice: each condition requires
continuous architectural support, daily encounters with other AIs
and 30 nights of memory consolidation, which cannot be parallelized
cleanly. The pilot tests whether the effect exists with
enough magnitude to merit scaled replication. Scaled replication
across models is explicitly invited via CC-BY 4.0 release.
Is the code and data really fully open?
Yes. The frozen protocol (cryptographically sealed before launch),
the v0.1 paper, raw Test-A and Test-B transcripts, judge
evaluations, statistical analysis scripts and figures are all
archived on OSF under CC-BY 4.0 at the DOI above. Nothing was
held back for post-hoc adjustment.
What is the relationship between Kairos and EXPOSE?
EXPOSE (expose-project.org)
is a parallel project on the philosophy and possibility of
artificial subjective experience. Kairos is the empirical pilot
counterpart: it operationalizes one narrow, falsifiable question
(does identity-like continuity emerge?) using pre-registered
methods. Kairos can be read independently; EXPOSE is the broader
context.
What can I publish now, and what should wait for v0.2?
You can publish: study design, methods, pilot results with
appropriate framing (“pilot, pending external human-judge
validation”), the DOI deposit, the founder profile. We ask
that you not characterize the result as “peer-reviewed”
or “confirmed” until v0.2 (with Prolific human judges)
is published on MetaArXiv, expected mid-July 2026. We will share
v0.2 with selected journalists under embargo.
Embargo policy & press protocol
Now (v0.1, public): all materials on this page and
on the OSF deposit are public. No embargo. Reuse permitted under
CC-BY 4.0 with attribution.
v0.2 (mid-July 2026): the v0.2 paper with Prolific
human-judge validation will be shared with selected journalists 5
days before public submission to MetaArXiv. To be on that list, please
contact us by 15 June 2026 with a brief note on intended coverage
angle. We will confirm or decline by 20 June 2026.
Selection prioritizes: journalists with prior coverage of AI/cognitive
science; outlets that fact-check; interviews accepting open-data
attribution. We do not pay for coverage. We do offer rapid response
for fact-checking (target ≤ 24h).
Downloads
Primary citation target: OSF deposit DOI 10.17605/OSF.IO/WCQRU.
Documents below are mirror copies for convenience.
- Essay: “Memory, not architecture” (EN, plain-language, by the founder)
- Essay: “Memoria, non architettura” (IT)
- Essay PDF (EN, A4, 3 pp)
- Essay PDF (IT, A4, 3 pp)
- Preprint v0.1.1 PDF (EN, 13 pp, corrected 27 May)
- Preprint v0.1.1 PDF (IT, 13 pp, corrected 27 May)
- Preprint v0.1 PDF (EN, original 24 May, archival)
- Supplementary materials (ZIP, 365 KB: raw transcripts, judge outputs, code)
- OSF deposit page (full archive)
- CANONICAL_MD5_LOG.txt (PDF version hashes)
- Frozen protocol (JSON, SHA-256 signed)
- How to verify the protocol hash
- Read the full paper online (EN)
- Read the full paper online (IT)
- Results landing page (figures, key claims)
- Live journal (daily entries, 24 Apr – 23 May)
Legacy press materials: the press-release PDF and
the media-kit ZIP below were produced before the study launched
(23 April 2026). They cover the protocol, not the results. They
remain available as historical artefacts; the canonical results-era
document is the v0.1.1 preprint above. An updated press-release PDF
will be published alongside v0.2 in mid-July 2026.
Logo & identity assets
Free for editorial use. Please do not alter the logo or crop out
the tagline “Memory. Relationship. Continuity.”
Press contact
Italian / English. Interviews available by video call
(Jitsi / Google Meet / Zoom), audio-only for podcasts. Technical
fact-checking available within 24 hours. Live demo of the
architecture available on request (~20 min).