That expensive API subscription just lost to a free download

The coding benchmark wars took a turn nobody expected. A 753-billion-parameter model from a Chinese startup beat GPT-5.5 on multiple coding benchmarks, costs roughly one-sixth as much through its API, and ships under an MIT license. You can download the weights right now and run them on your own hardware.

This is not an incremental improvement. The gap between proprietary and open-weight models on coding tasks just evaporated.

What GLM-5.2 Actually Is

GLM-5.2 is a Mixture-of-Experts (MoE) model from Z.ai (formerly Zhipu AI) with 753 billion total parameters, of which 40 billion are active per forward pass. The architecture uses what Z.ai calls "IndexShare": a single indexer shared across each group of four sparse attention layers, which cuts per-token compute by 2.9x at the 1-million-token context limit.
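To put the parameter counts in perspective, here is a rough back-of-the-envelope sketch. The 753B and 40B figures come from the paragraph above; the bytes-per-parameter values are assumptions about common storage precisions, since nothing here says which precision the released weights use. It counts weights only and ignores KV cache and activations.

```python
# Figures quoted in the article.
TOTAL_PARAMS = 753e9   # total parameters
ACTIVE_PARAMS = 40e9   # parameters active per forward pass

# Assumption: bytes per parameter for common storage precisions.
# The article does not say which precision the released weights use.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def active_fraction(active: float, total: float) -> float:
    """Share of the parameters that participate in one forward pass."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active share per token: {share:.1%}")
    for name, width in BYTES_PER_PARAM.items():
        print(f"{name}: {weight_memory_gb(TOTAL_PARAMS, width):,.1f} GB of weights")
```

Only about 5% of the weights do work on any given token, but all 753B still have to sit in memory, so "your own hardware" means a multi-GPU server rather than a laptop.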

The model has two thinking modes: "Max" for complex reasoning (higher token usage), and "High" for latency-sensitive work (halves required output tokens). You can toggle between them via API parameters.

No vision support yet. Text-in, text-out only.

The Benchmark Numbers

Here is where it gets interesting. GLM-5.2 leads GPT-5.5 on all five benchmarks below, by margins ranging from 1.7 points on MCP-Atlas to 9.3 points on PostTrainBench:

Benchmark	GLM-5.2	GPT-5.5
SWE-bench Pro	62.1	58.6
FrontierSWE	74.4%	72.6%
MCP-Atlas	77.0	75.3
Humanity's Last Exam	54.7	52.2
PostTrainBench	34.3%	25.0%
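For readers who want the margins rather than the raw scores, the following sketch computes the point gap per benchmark. The scores are copied from the table; treating all five as being on the same 0-100 scale is an assumption, since the table marks only some of them with a percent sign.

```python
# Scores copied from the table above as (GLM-5.2, GPT-5.5).
# Assumption: every score is on a 0-100 scale, with or without a "%" sign.
SCORES = {
    "SWE-bench Pro": (62.1, 58.6),
    "FrontierSWE": (74.4, 72.6),
    "MCP-Atlas": (77.0, 75.3),
    "Humanity's Last Exam": (54.7, 52.2),
    "PostTrainBench": (34.3, 25.0),
}


def margins(scores: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Point gap (GLM-5.2 minus GPT-5.5) for each benchmark, to one decimal."""
    return {name: round(glm - gpt, 1) for name, (glm, gpt) in scores.items()}


if __name__ == "__main__":
    # Print the largest gaps first.
    for name, gap in sorted(margins(SCORES).items(), key=lambda kv: -kv[1]):
        print(f"{name:<22} +{gap:.1f} points")
```

Four of the five gaps are under four points; PostTrainBench is the outlier.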

On Terminal-Bench 2.1, it scored 81.0%, making it the first open model to cross 80%. It also ranked #1 on Design Arena with an Elo of 1360, beating Claude Fable 5.

A personal benchmark posted on r/ClaudeCode showed GLM-5.2 nearly matching Claude Opus 4.8 and GPT-5.5 on a web app building task, scoring 94.44% compared to GPT-5.5's 95.56%. The catch: GLM-5.2 took 37 minutes versus 17 minutes for GPT-5.5. Speed is still the gap.

The Pricing Math

This is where the real disruption happens:

GLM-5.2 API: $1.40 input / $4.40 output per 1M tokens
GPT-5.5: roughly $8-10 input / $30 output per 1M tokens (estimated)
Claude Opus 4.8: similar to GPT-5.5

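At those estimates, that is roughly a 6-7x cost difference (5.7-7.1x on input, about 6.8x on output). For a team running 50K requests per day on coding tasks, the annual savings at GLM-5.2 pricing versus GPT-5.5 can reach the hundreds of thousands of dollars, depending on how many tokens each request uses.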
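The savings figure depends heavily on tokens per request, which is not specified here. The sketch below works through one scenario: the per-token prices and the 50K requests per day come from the text above, while the 2,000 input and 500 output tokens per request, and the use of $9 as the midpoint of the estimated $8-10 GPT-5.5 input price, are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


GLM_5_2 = Pricing(input_per_m=1.40, output_per_m=4.40)
# Assumption: $9 is the midpoint of the article's estimated $8-10 range.
GPT_5_5 = Pricing(input_per_m=9.00, output_per_m=30.00)

REQUESTS_PER_DAY = 50_000
# Assumptions: the article gives no per-request token counts.
INPUT_TOKENS_PER_REQUEST = 2_000
OUTPUT_TOKENS_PER_REQUEST = 500


def annual_cost(pricing: Pricing, requests_per_day: int,
                in_tokens: int, out_tokens: int, days: int = 365) -> float:
    """Yearly API spend in USD for a fixed daily request volume."""
    requests = requests_per_day * days
    input_cost = requests * in_tokens / 1e6 * pricing.input_per_m
    output_cost = requests * out_tokens / 1e6 * pricing.output_per_m
    return input_cost + output_cost


if __name__ == "__main__":
    args = (REQUESTS_PER_DAY, INPUT_TOKENS_PER_REQUEST, OUTPUT_TOKENS_PER_REQUEST)
    glm = annual_cost(GLM_5_2, *args)
    gpt = annual_cost(GPT_5_5, *args)
    print(f"GLM-5.2: ${glm:,.0f}/year")
    print(f"GPT-5.5: ${gpt:,.0f}/year")
    print(f"savings: ${gpt - glm:,.0f}/year ({gpt / glm:.1f}x)")
```

Under these assumed token counts the gap comes to about $511,000 a year. The result scales linearly with tokens per request, so lighter workloads save proportionally less.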

Enterprise tiers start at $12.60/month for light usage. Self-hosting eliminates per-token fees entirely, leaving hardware and operations as the only costs.

The model weights are on Hugging Face under MIT license. No restrictions on commercial use, modification, or local hosting.

Community Reaction

The response has been split but generally positive. On Hacker News, the top comment highlighted GLM-5.2's performance on the "AA-Omniscience Non-Hallucination Rate" benchmark, one of the few tests that rewards models for admitting uncertainty rather than guessing. GLM-5.2 scored far higher than DeepSeek, GPT-5.5, or Fable on that metric.

From the VentureBeat coverage, AI observer Lisan al Gaib put it bluntly: "Frontier labs are absolutely scamming you on API pricing... [Open-model developers] are operating profitably without relying on the newest fancy Blackwell chips, while proprietary labs are probably at 90%+ margins at this point."

A DEV.to developer who tested GLM-5.2 in OpenCode on a real production feature noted that the model jumped 11 points on the Artificial Analysis Intelligence Index compared to GLM-5.1 (from 40 to 51), despite being the same size. The improvement came from architectural changes, not parameter scaling.

The skeptical take: MoE models struggle with open-ended multi-step reasoning where the model needs to generate novel strategies rather than execute defined plans. On judgment-heavy tasks, closed models still have an edge in "polish."

So What

The numbers that matter most are not the benchmark scores. They are the cost figures and the license.

The MIT license means anyone can self-host this. No API fees, no vendor lock-in, no geographic fencing. For teams in regions where OpenAI or Anthropic do not offer service, or for companies with strict data residency requirements, this is the first open model that actually competes on coding performance.

The speed gap is real: 37 minutes versus 17 minutes for the same task. But the model is improving quickly with each release. GLM-5.1 scored 62.0 on Terminal-Bench. GLM-5.2 scored 81.0. That is a 19-point jump, about 31% in relative terms, in one version.

The uncomfortable question for OpenAI and Anthropic: if an MIT-licensed model from a Chinese startup can match your coding performance at one-sixth the cost, what exactly are customers paying for? The answer is probably "ecosystem integration and polish." That is a valid reason, but it is a much weaker moat than "we are the only ones who can do this."

I expect the next six months to be defined by this exact pressure. The proprietary model tax is shrinking, and it is shrinking fast.

Don't Confuse It

GLM-5.2 is NOT:

GLM-5.1 (the predecessor, scored 62.0 on Terminal-Bench vs 81.0 for 5.2)
GLM-4 (the older generation, significantly weaker)
DeepSeek V4 (different model family, different architecture)

GLM-5.2 IS:

753B parameter MoE from Z.ai (formerly Zhipu AI)
MIT licensed, open weights on Hugging Face
Text-in/text-out only (no vision)
1 million token context window
