Highlight

All things memory

What a system should remember, what it should forget, and who stays in control of either.

Memory is what turns a generic assistant into something that feels like it knows you. It is also where a product can quietly overstep. The useful question is not how much you can store. It is what is worth keeping to serve the person, and what is safer to let go.

Appropriate flow, not total recall

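The cleanest way to think about the line comes from the philosopher Helen Nissenbaum, whose framework of contextual integrity holds that privacy is the appropriate flow of information, not its absence. A diagnosis shared with a doctor is an appropriate flow; the same diagnosis reaching an employer is not. Models do not hold that line on their own. The ConfAIde benchmark found that even the most capable models it tested, GPT-4 and ChatGPT, revealed private information in contexts where people would not, 39% and 57% of the time respectively, and that the leakage persisted under privacy-protective prompting.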

Legible, and in the user's hands

The labs have converged on giving users the controls. OpenAI's memory ships with ways to see, edit, and turn off what it keeps. Anthropic scopes memory per project and offers an incognito mode that saves nothing. The research system MemGPT shows the architecture that makes this possible: explicit tiers that a system pages in and out, rather than one opaque store. The pieces below go into where memory earns trust and where it loses it.
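
To make the idea of legible, tiered memory concrete, here is a minimal sketch in Python. It is not MemGPT's code or any lab's API. The class name, the two tiers, and the method names are all hypothetical, chosen only to show how paging between tiers and user controls (see, forget, turn off) can sit in one small structure.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TieredMemory:
    """Toy two-tier memory (hypothetical): a small working context plus an archive."""

    context_limit: int = 3  # assumed cap on items held "in context"
    enabled: bool = True    # user-facing off switch
    context: list[str] = field(default_factory=list)
    archive: list[str] = field(default_factory=list)

    def remember(self, fact: str) -> bool:
        """Store a fact. Returns False and stores nothing when memory is off."""
        if not self.enabled:
            return False
        self.context.append(fact)
        # Page the oldest items out to the archive once the context is full.
        while len(self.context) > self.context_limit:
            self.archive.append(self.context.pop(0))
        return True

    def recall(self, keyword: str) -> list[str]:
        """Page archived facts that mention the keyword back into context."""
        if not self.enabled:
            return []
        hits = [f for f in self.archive if keyword.lower() in f.lower()]
        for fact in hits:
            self.archive.remove(fact)
            self.remember(fact)
        return hits

    def view(self) -> dict[str, list[str]]:
        """Show the user everything that is stored, tier by tier."""
        return {"context": list(self.context), "archive": list(self.archive)}

    def forget(self, fact: str) -> bool:
        """Delete a fact from every tier. Returns True if anything was removed."""
        found = fact in self.context or fact in self.archive
        self.context = [f for f in self.context if f != fact]
        self.archive = [f for f in self.archive if f != fact]
        return found
```

The design choice worth noticing is that every tier is an ordinary list the user can inspect through `view()`, so nothing is remembered in a place the person cannot see or delete. Setting `enabled = False` plays the role of an incognito switch in this toy version: `remember` stores nothing and `recall` returns nothing.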

Sources and further reading

Privacy as Contextual Integrity. Helen Nissenbaum, Washington Law Review, 2004
Can LLMs Keep a Secret? Testing Privacy via Contextual Integrity (ConfAIde). Mireshghallah et al., ICLR 2024
Memory and new controls for ChatGPT. OpenAI, 2024
Bringing memory to teams. Anthropic, 2025
MemGPT: Towards LLMs as Operating Systems. Packer et al., UC Berkeley, 2023

In this topic

March 12, 2025·6 min read

Memory is the trust surface

The moment a system remembers you, it takes on a responsibility it didn't have before.

June 18, 2026·5 min read

When remembering backfires

More memory is not more helpful. Unscoped recall can make a product feel like it is watching you.

Reflective Surfaces

What makes a conversation actually good.
