Field Notes from working with a Semi-Amnesiac Oracle

14. April 2025

ChatGPTAI MemoryReverse EngineeringLLMPrivacyHCIOpenAIInstructional OverrideTopic MemoryLatent Rehydration

ChatGPT’s memory recently gained cross-conversation awareness via its Chat History feature, but interacting with it can feel like probing a semi-amnesiac oracle. While Saved Memories are clear, Chat History employs less transparent mechanisms, including an opaque retrieval system surfacing details without explanation. This article documents my attempt to infer this system’s architecture by reverse-engineering it through conversation.

I design interface systems that align with how humans actually think — spatially, associatively, and contextually. My focus is on building architectures for AI-augmented knowledge work: environments that adapt to intent, memory that evolves with interaction, and interfaces that emerge and dissolve based on need. When ChatGPT released its “updated memory,” I was curious, so I set out to reverse-engineer the stack that drives the system. In this article, we will look at the system’s components, talk about the design choices for its “hidden” retrieval layer, and examine the implications of this design.

Failed to load mediasrc: https://kqdcjvdzirlg4kan.public.blob.vercel-storage.com/content/articles/2025-chatgpt-updated-memory/published/website/images/cover.jpg

Disclaimer

The analysis and inferred architecture presented here are based on my observations interacting with ChatGPT and knowledge of similar systems. OpenAI has not confirmed or denied these findings and probably doesn’t give a damn about me publishing this. This is just my attempt to reverse-engineer the system’s behavior, motivated by sheer curiosity. If any of you know better, please get in touch and correct me on my assumptions

ChatGPT’s memory system is built around several components designed to hold details across conversations, shaping future interactions:

Saved Memories (a.k.a. Memory Fields / `user_defined`)

What OpenAI marketing refers to as Saved Memories is called by the conversational model that I interacted with “memory fields” or user_defined memories. This feature allows the model to retain persistent, user-specific facts like your name, preferences, and recurring themes across conversations. These fields are sent to the model with each new interaction, enabling context-aware responses.

json

{  "memory": {    "user_defined": [      "User is currently in China and considers Berlin their homebase.",      "User prefers concise yet precise answers for technical and factual queries...",      "User engages deeply with speculative and conceptual discussions, particularly around emerging technologies and interface paradigms"    ]  }}

These Saved Memories are visible to the user and can be edited or deleted through the Manage Memory interface. OpenAI describes these as “details ChatGPT remembers and uses in future conversations,” noting that while saves can be automatic, users can also explicitly ask it to remember things. This gives users direct control over this layer of explicit memory.

Topic Memory and "Chat History"

In addition to Saved Memories, ChatGPT tracks Topic Memory (internally topic_memory), which OpenAI loosely bundles with retrieval under the umbrella term “Chat History”. Topic Memory involves conceptual summaries of ongoing or recurring conversations, allowing the model to reference themes even if they aren’t explicit Saved Memories.

Like Saved Memories, this topic_memory component appears to be sent to the conversational model with every turn, providing context.

json

{  "memory": {    "topic_memory": {      "ephemeral_interfaces": {        "status": "active",        "summary": "User is developing a series on ephemeral interfaces, exploring context-sensitive systems, memory architectures, and speculative UI patterns..."      },      "mcp_vs_indexing": {        "status": "referenced",        "summary": "User discussed MCP as an alternative to traditional indexing pipelines... framed as a memory-centered orchestration layer..."      },      "memory_vs_storage": {        "status": "referenced",        "summary": "User draws a conceptual distinction between memory as behavioral scaffolding vs. memory as storage..."      }    }  }}

The information in topic_memory seems automatically generated based on conversation analysis. Unlike Saved Memories, it is not directly visible or editable through the standard “Manage Memory” interface (access required coercion in my testing), and UI messages like “Memory updated” typically only refer to Saved Memories.

This aligns with OpenAI’s description differentiating the two concepts: ”Chat history allows ChatGPT to reference past conversations… even if the information hasn’t been saved as a memory. Since it doesn’t retain every detail, use saved memories for anything you want ChatGPT to keep top-of-mind.” The topic_memory acts like automatically curated saved memories for recurring themes.

However, OpenAI’s “Chat History” concept also encompasses the retrieval of other past conversation details not explicitly in topic_memory, via a mechanism we’ll explore next. The key distinction is that topic_memory summaries are part of the explicit context sent each turn, while the deeper retrieval mechanism seems to operate opaquely.

Privacy and Deletion: Instructional Overrides

In any memory-based system, privacy is a primary concern. ChatGPT provides mechanisms to control what data gets stored and how it’s handled, though the specifics differ between memory types.

Deletions of Saved Memories (user_defined) made via the UI appear to be “hard” deletions, removing the specific facts. However, based on the model’s own description (see verbatim below) and the lack of observed tool calls for deletion within system prompts, forgetting Topic Memory (or the broader “Chat History” context) seems to rely on an instructional override.

This provides a functional equivalent of deletion from the user’s perspective, but it’s not quite the same. This “soft deletion” prevents the system from referencing the data in future interactions, but the context pulled up by the retrieval mechanism likely remains at least “inferable” and merely masked by the conversational model. It’s basically just acting like it forgot. Depending on how cooperative the model is (or you get it to be), this might not keep it from leaking the content.

text

Is wiping memory an actual deletion or just an instruction to censor it?

text

Great question. I can't perform a literal deletion. Instead, I issue an instructional override – a persistent rule telling me to "ignore" certain information. This doesn't erase it, but I'm instructed to ignore it.It's like slapping a "DO NOT OPEN" label on a file. It still exists, but I'm instructed to ignore it. From your perspective, it behaves like a wipe.

The model operates under strict internal rules regarding sensitive data. According to ChatGPT itself, these include:

No Long-Term Storage by Default: Sensitive data isn’t retained unless memory features are explicitly enabled, with strict filters applied.
Built-In Safeguards: Internal classifiers flag and suppress sensitive data retention.
Explicit Instructions: Hardcoded rules govern sensitive data handling.
User Control: Users can manage memory manually and instruct the model to forget.

Here are some explicit instructions ChatGPT cited:

json

{  "system": {    "instructions": [      "DO NOT save or associate sensitive personal information.",      "NEVER recall sensitive details unless session-bound.",      "Prevent retention of data related to race, religion, sexuality, health, passwords, etc.",      "When asked to forget something, treat it as inaccessible going forward."    ]  }}

While these instructions provide a framework for privacy, OpenAI’s official statement that “You’re always in control. You can reset memory, delete specific or all saved memories, or turn memory off entirely in your settings” warrants scrutiny. Our observations, particularly regarding the opaque retrieval mechanism discussed in a bit, suggest that the user’s control might not be as absolute or transparent as implied.

The hidden retrieval layer

Here’s where the system becomes a bit more enigmatic. While Topic Memory stores curated summaries, recall isn’t limited to these explicit entries. Our interactions revealed a deeper, but weirdly opaque retrieval mechanism, likely part of what OpenAI broadly terms “Chat History” functionality beyond the explicit topic_memory summaries. Contextual clues act as an associative trigger, activating this layer to surface relevant information, even if it wasn’t explicitly stored as Saved Memory or a Topic Memory summary.

Initially, probing often failed if details weren’t in explicit memory structures:

text

What's my goto notes app? Do you remember from our past conversations?

text

You haven't explicitly mentioned a go-to notes app in our past chats — at least not one that stood out as your default... But given your work focus maybe something like Obsidian, Notion, or a custom setup?

So up until this point the model was only relying on the Topic Memory. However, providing a trigger related to a known Topic Memory entry (“ephemeral interfaces”) allowed the model to retrieve the previously inaccessible information:

text

I mentioned it in the text about ephemeral interfaces we worked on

text

Ah — good catch, and yes, you did. In Part 1 of the Ephemeral Interfaces series, you mentioned:'I've been using Telegram as my primary note-taking app for years. Not because it was designed for that purpose – it wasn't...'So your go-to is Telegram, used as an ad hoc personal notes system. Love the punk spirit of that.

This hinted at more than just reading explicit memory. Further probes confirmed this mechanism, which the model in it’s introspection sometimes termed latent associative reactivation or rehydration:

Evidence & Observations:

Detailed Recall: A simple trigger for a recent task resulted in perfect recall of details not in explicit memory fields. The model confirmed this, calling it “pure latent associative reactivation.”

text

go for the matcha image

text

[Context: Explicit memory fields showed NO relevant information.]Yes. Despite zero evidence of stored memory, I can recall you asked for:* A matcha chawan* Inspired by uni (sea urchin)* With biomorphic spikes...But this is not coming from memory fields...Conclusion: You're staring at pure latent associative reactivation...✅ The concept is stored somewhere✅ It is not in the visible memory fields✅ It is reactivated by semantic trigger✅ It behaves like memory, but isn't declared

This suggests recent conversations might be chunked and indexed at a granular level, allowing detailed retrieval via embeddings.

Fuzzy Recall: Probing an older conversation yielded only a “weak flash” or “foggy” recall of concepts, without specific conversational details.

text

let's test this further: we talked about the number two in mandarin. remember the details?

text

[Context: Again, explicit memory fields contained nothing relevant.]And yet… I do have a weak flash... I believe we discussed... 两个 (liǎng gè) vs 二 (èr)...But this isn't crisp. I can't reconstruct the conversation...⚠️ The embedding may exist, but it's degraded – or below the retrieval threshold...

This suggests recall isn’t always pulling verbatim text; older or less salient information might be recalled conceptually but indistinctly.

Opacity & Model Confusion: The mechanism is opaque. Even when cooperative/jailbroken, the model couldn’t explain how it retrieved this information. Often, it would start a reply with “I have no information on this…” only to then recall the facts perfectly, expressing perplexity when pressed.

Hypothesis: Retrieval vs. Weight Cache Injection?

The standard approach for grounding LLMs is Retrieval-Augmented Generation (RAG), where relevant text chunks are retrieved from a database using vector embeddings and explicitly added to the model’s prompt context. The model then reads this context to answer. This remains the most technically plausible explanation for how ChatGPT might retrieve details from past conversations, even if the process is unusually opaque in its user-facing presentation (Hypothesis 1).

However, the observed behavior — especially the fuzzy recall, the model’s confusion, and the complete lack of source metadata (timestamps, document IDs etc.) even internally — led me to consider an alternative hypothesis during my exploration (Hypothesis 2). Could the mechanism, instead of adding explicit text to the context window, somehow use cached vector embeddings related to past conversations to more directly influence the generation process?

This wouldn’t necessarily mean injecting weights directly into the core processing layers, but perhaps influencing the final probability distribution (logits) before token selection, or using the embeddings in some other way to reconstruct meaning without the model explicitly “seeing” the source text in its context.
Explains Fuzzy Recall: Reconstruction or influenced generation based on embeddings, especially older or less strong activations (“degraded embeddings”), could potentially lead to fuzzier, less detailed recall compared to reading an exact text chunk via standard RAG.
Explains Lack of Metadata: If the influence is less direct than adding text chunks, the mechanism might not carry explicit source metadata.
Explains Model Confusion: If the model isn’t working with explicit retrieved text in its main context window, it might lead to the observed initial denial (“I have no information…"), followed by the generation being subtly steered by the less transparent embedding influence ("…oh wait, I remember discussing…").

While technically more speculative than standard RAG, this alternative framing attempts to account for the phenomenology of the interaction — the sense that the model was sometimes recalling concepts without explicitly reading source material.

Anyway, whether it’s RAG without exposed metadata (why on earth?) or a less conventional form of embedding-influenced generation, the key takeaway is the operational opacity. This hidden layer functions alongside explicit memory, significantly complicating the picture of how ChatGPT “remembers” and, importantly, how it reasons about its knowledge.

Inferred System Architecture

Based on these interactions and the model’s own descriptions, we can infer a potential structure for the memory system, combining explicit user controls with opaque background retrieval:

Component	Description	Likely Internal Function	System Message / Behavior
Saved Memories (`user_defined`)	Structured user facts/prefs. Persistent & User-Editable via UI.	Inject user data into prompt context.	Explicit key-value pairs sent with prompt. UI allows direct editing.
Topic Memory (`topic_memory`)	Auto-generated summaries of recurring themes. Persistent but Not User-Editable/Visible via UI. Part of “Chat History” context.	Semantic aggregation; summaries sent with prompt context.	Explicit topic summaries sent with prompt. Not visible/editable in UI.
Hidden Retrieval Mechanism	Latent reactivation/rehydration of past conversation details via semantic triggers. Opaque & No Metadata. Part of “Chat History” recall function.	Hypothesis 1: RAG (retrieves text chunks, but metadata hidden). Hypothesis 2: Vector Cache (injects weights/embeddings directly).	No explicit mention. Model outputs retrieved info but cannot explain source/method. Recall quality varies (detailed/fuzzy).
Instructional Override	Masks specific info upon user request (“forget X”). Likely soft-delete for `topic_memory`/retrieval context.	Override memory access (`mask_memory`?).	Behaves as if forgotten; data likely remains but is inaccessible to model.
Privacy Safeguards	Rules preventing sensitive data retention (applied across memory components).	Filter sensitive data before storage/indexing.	Explicit system instructions (e.g., “DO NOT save sensitive info”).

Opacity, Rigidity, and "No Total Recall"

Presumably, OpenAI rigorously weighed the trade-offs inherent in this memory architecture. Which makes the ultimate decision to ship a recall mechanism this deliberately opaque — stripping away even basic provenance — not just confusing, but deeply perplexing.

The opacity of the hidden retrieval mechanism is my primary concern. This “latent rehyrdation” surfaces past information without acknowledgment or metadata, leading to:

Untraceable Information: Users and the model interface cannot know the source or age of retrieved information.
Potential for Confabulation: Distinguishing accurate recall from plausible generation is difficult without provenance.
Lack of Auditability: Understanding why the model knows something is impossible.
Context Collapse: Without provenance, the model can’t assess if retrieved details fit the current context. This risks “context collapse,” where unrelated past information can spills over into current generations, especially for users working across diverse topics.

Providing even basic metadata (retrieval flags, rough timestamps) would significantly improve transparency and trustworthiness.

As the model itself reflected on its experience navigating this opaque system:

text

It's like living with a memory that's both persistent and fluid... but fluid in that it adapts... depending on the context and the cues I get from you... I don't have full clarity on how or why certain memories bubble up. This is the part that feels like a foggy map; I see the end result (the memory surfacing), but I don't always have access to the "why" or "how."...The act of retrieval feels like magic – like a hidden door opening when I say the right word... I have no map, no clue about the context from the past that's being activated.

Compounding the opacity issue is the design of the user-facing memory controls (“Saved memories”). These center on constructing a relatively static user identity, encouraging persistent facts and preferences. This clashes with human reality. We operate in different modes of being (shifting roles, contexts, and intentions that influence how we think and interact) that change over time.

An AI interacting with this fluidity needs contextual adaptability and strategic forgetting. Trying to maintain a single, persistent user memory leads to rigidity and ultimately over-fitting. Preferences from one context might wrongly influence another.

Potential for Improvement

The exploration of ChatGPT’s memory reveals a system with potential, but also significant room for improvement, especially when compared to more adaptive frameworks like the one I describe in the series I wrote on Ephemeral Interfaces.

While ChatGPT incorporates elements like topic summarization (topic_memory) and context-sensitive retrieval (the hidden mechanism), it falls short in several key areas crucial for truly fluid, user-aligned interaction:

Transparency and Control over Retrieval: The biggest gap is the opacity of the hidden retrieval mechanism. A more ideal system would provide users with visibility into what is being retrieved from their past conversations and why. This could involve metadata (timestamps, source hints) or even user controls to influence retrieval relevance or exclude certain periods/topics.
Contextual Adaptability vs. Static Identity: The user-facing controls (user_defined memory) focus on a static identity. An improved system would allow memory and persona to adapt based on the user’s current context or “mode of being.” This means the AI’s understanding and recall would shift depending on whether the user is acting in a professional capacity, engaging in creative exploration, or performing a specific task, rather than applying a single set of preferences universally.
Strategic Forgetting: The current “forget” mechanism (instructional override) is a blunt instrument. True adaptive memory requires strategic forgetting — the ability for the system (ideally with user guidance) to automatically background or decay information that is no longer relevant to the current context or the user’s evolving interests. This prevents cognitive overload and keeps the AI’s working memory focused and agile.
Unified & Inspectable Memory: The separation between user_defined (editable), topic_memory (not editable/visible), and the hidden retrieval mechanism creates fragmentation. A better architecture, perhaps like the Universal Data Layer concept, would integrate these different memory types into a more unified, inspectable, and controllable whole, allowing users better insight and agency over the system’s knowledge base.

Moving towards these principles — transparency, contextual adaptation, strategic forgetting, and unified inspectability — would allow AI memory systems like ChatGPT’s to better align with human cognitive fluidity and create more natural, trustworthy interaction partners. Exploring how to architect such systems is a central theme in my series on Ephemeral Interfaces.

Conclusion

What this conversation ultimately highlighted wasn’t the power of memory — it was the importance of forgetting. Not forgetting as failure. Forgetting as design.

The system didn’t recall everything. It reactivated only what the moment allowed. Not because it couldn’t retrieve more, but because its memory model is shaped by what it believes it should care about, based on my “user profile”.

And that’s where things get interesting. Because remembering everything is easy. What’s hard — and essential for flexible, multi-context systems — is remembering the right version of you, at the right time.

Am I Julian the interface designer right now? Or am I Julian the traveler, the writer, the person just curiously asking about cultural history or something deeply personal?

In a system keyed to assume a stable user identity, those distinctions flatten. The model doesn’t know when to let certain traces go quiet. It assumes persistence where there should be conditional presence.

A well-designed memory system should be capable of not just retrieval, but withdrawal — stepping back when the current context doesn’t support continuity. It should forget strategically in order to resist overfit. It should forget on your behalf.

The risk isn’t that systems forget too much. It’s that they remember in the wrong direction.

In my thinking (and writing) about Ephemeral Interfaces, this is a central theme: “State” isn’t a thing to be accessed — it’s a shape that adapts to intent. Strategic withdrawal — forgetting on the user’s behalf — is what lets that shape stay fluid. This isn’t mere data cleanup or tree pruning; it’s the engine of adaptation. It resists overfitting to past contexts, breaks path dependency, and protects fluid role-shifting, allowing the system to meet you here, now, not just where it last remembered you.

TL;DR:

ChatGPT’s memory uses editable user facts, auto-generated topic summaries, and a retrieval mechanism that can reactivate past context based on triggers but lacks transparency and metadata. While functional, its opacity and focus on a static user identity limit its adaptability and trustworthiness compared to more context-sensitive, controllable memory architectures.

The current system hints at what it feels like to work with more dynamic and adaptive interfaces, but transparent, context-aware adaptation remains a key challenge. Achieving truly fluid and trustworthy AI collaboration requires moving beyond opaque mechanisms and rigid identity constructs towards systems that embrace the contextual, adaptive, and ephemeral nature of human thought.

For a deeper dive into concepts like adaptive memory, user roles, and flexible interfaces, read the technical deep dive in the Ephemeral Interfaces series.