Research log

Writing

Experiment notes, architecture decisions, and the failures along the way. I write to organize my thinking and to leave a public record of what I'm learning.

01

I Compressed LLM Context 10x and Kept 89% of the Facts

Sentence-level pooled embeddings as a drop-in replacement for raw tokens in LLM context. 89% fact extraction at 10x compression, scaling flat to 500 pools.

research · embeddings · kv-cache
15 min