6.25.2026

Mistral launches OCR 4, turning document extraction into a full enterprise AI play

Mistral AI on Tuesday released OCR 4, a document intelligence model that moves beyond raw text extraction to return structured representations of entire documents — complete with bounding boxes, block-type classification, and per-word confidence scores. The release marks Mistral's fourth generation of optical character recognition technology in roughly 15 months and lands at a moment when the company's pitch for European AI sovereignty has never been more commercially relevant.

6.24.2026

VibeThinker 3B - Taking on Giant Models

In this video, I look at VibeThinker 3B and how it beats some models 300x its size on certain benchmarks by tuning its reasoning and chain of thought for specific use cases. While the model is not meant for production, it shows what can be done with these techniques.



6.23.2026

Researchers introduce Self-Harness, a framework that lets AI agents rewrite their own rules, boosting performance up to 60%

Researchers at the Shanghai Artificial Intelligence Laboratory have introduced “Self-Harness,” a new paradigm in which an LLM-based agent systematically improves its own operating rules. The agent examines its own execution traces and edits its rules based on what it finds, trading manual guesswork for empirical evidence.

Self-improving harnesses can enable development teams to deploy robust custom agents that continually adapt their own execution protocols to overcome model-specific weaknesses.

6.22.2026

Anthropic ships major Claude Design overhaul with design system imports, code round-trips, and a fix for its token-burning problem

Anthropic is shipping a substantially overhauled version of Claude Design that attempts to fix the extreme token consumption issue while simultaneously repositioning the product from a flashy demo into something far more strategically important: a design system compliance layer that connects to code, connects to the tools enterprises already use, and — critically — keeps everything on brand.

6.18.2026

Kimi K2.7 Code: BEST Open Source Model? REALLY Cheap and Beats Opus 4.8 and GPT 5.5?

Kimi K2.7 Code might be one of the most impressive open-weight coding models released so far. In this video, we fully test Moonshot AI’s latest coding-focused model and see how it compares against Opus 4.8 Max, GPT-5.5, Fable 5, Qwen 3.7 Max, Grok 4.3, and other top coding models.



6.17.2026

Z.ai’s open-weights GLM-5.2 beats GPT-5.5 on multiple long-horizon coding benchmarks for 1/6th the cost

Today, Chinese AI startup Z.ai (formerly Zhipu AI) announced the immediate release of GLM-5.2, a 753-billion-parameter open-weights large language model (LLM) engineered specifically to dominate "long-horizon" autonomous coding and engineering tasks.

Available immediately on Hugging Face, the Z.ai API, and more than 20 third-party coding environments, the model boasts a highly stable 1-million-token context window alongside enterprise subscription tiers starting at just $12.60 per month.

6.16.2026

WWDC26: Run local agentic AI on the Mac using MLX

Run AI agents locally with privacy, low latency, and offline access. Dive into how MLX advancements and Mac hardware make powerful agentic workflows possible entirely on-device. You’ll explore code agents such as OpenCode, see how they integrate into Xcode, learn techniques for multi-Mac scaling, and discover how to integrate tools seamlessly — without ever leaving your machine.