Source: OpenAI engineers earlier this month told some colleagues they had figured out a way to more than halve the cost of inference

We closely track efforts by Anthropic, Google and OpenAI to get access to more server chips to run their models. But we don't talk enough about the work...