Anthropic researchers detail natural language autoencoders, which convert LLM activations, the numbers encoding a model's thoughts, into natural language text

When you talk to an AI model like Claude, you talk to it in words. Internally, Claude processes those words as long lists of numbers...