2.12.2026

z.ai's open source GLM-5 achieves record low hallucination rate and leverages new RL 'slime' technique

Chinese AI startup Zhupai aka z.ai is back this week with an eye-popping new frontier large language model: GLM-5.

The latest in z.ai's ongoing and continually impressive GLM series, it retains an open source MIT License — perfect for enterprise deployment – and, in one of several notable achievements, achieves a record-low hallucination rate on the independent Artificial Analysis Intelligence Index v4.0. 

With a score of -1 on the AA-Omniscience Index—representing a massive 35-point improvement over its predecessor—GLM-5 now leads the entire AI industry, including U.S. competitors like Google, OpenAI and Anthropic, in knowledge reliability by knowing when to abstain rather than fabricate information.

2.11.2026

'Observational memory' cuts AI agent costs 10x and outscores
RAG
 
on long-context benchmarks

RAG isn't always fast enough or intelligent enough for modern agentic AI workflows. As teams move from short-lived chatbots to long-running, tool-heavy agents embedded in production systems, those limitations are becoming harder to work around.

In response, teams are experimenting with alternative memory architectures — sometimes called contextual memory or agentic memory — that prioritize persistence and stability over dynamic retrieval.

One of the more recent implementations of this approach is "observational memory," an open-source technology developed by Mastra, which was founded by the engineers who previously built and sold the Gatsby framework to Netlify.

2.10.2026

I tried a Claude Code rival that's local, open source,
and completely free

I was curious if Block's Goose agent, paired with Ollama and the Qwen3-coder model, could really replace Claude Code.

Goose, developed by Jack Dorsey's company Block, is an open-source agent framework, similar to Claude Code. Qwen3-coder is a coding-centric large language model similar to Sonnet-4.5.

2.09.2026

GPT-5.3 Codex Is INSANE! OpenAI’s BEST Model
Might Beat Opus 4.6?

GPT-5.3 Codex is blowing minds in early 2026! In this video, we put OpenAI’s fastest, most interactive coding AI to the test and compare it against Claude Opus 4.6—one of the deepest reasoning, long-horizon planning models out there.



2.06.2026

OpenAI’s GPT-5.3-Codex drops as Anthropic upgrades Claude

OpenAI on Wednesday released GPT-5.3-Codex, which the company calls its most capable coding agent to date, in an announcement timed to land at the exact same moment Anthropic unveiled its own flagship model upgrade, Claude Opus 4.6. The synchronized launches mark the opening salvo in what industry observers are calling the AI coding wars — a high-stakes battle to capture the enterprise software development market.

Anthropic releases Opus 4.6 with new ‘agent teams’

On Thursday, Anthropic released the latest version of Opus — its most advanced model and a particularly important model for Claude Code. Opus 4.5 was only released last November, and with 4.6, the company has sought to broaden its model’s capabilities and appeal, allowing for a greater variety of uses and customers.

Perhaps the most notable addition to the newest version of Opus is the inclusion of what the company calls “agent teams” — teams of agents that can split larger tasks into segmented jobs.

2.05.2026

Apple integrates Anthropic’s Claude and OpenAI’s Codex into Xcode 26.3 in push for ‘agentic coding’

Apple on Tuesday announced a major update to its flagship developer tool that gives artificial intelligence agents unprecedented control over the app-building process, a move that signals the iPhone maker's aggressive push into an emerging and controversial practice known as "agentic coding."

Xcode 26.3, available immediately as a release candidate, integrates Anthropic's Claude Agent and OpenAI's Codex directly into Apple's development environment, allowing the AI systems to autonomously write code, build projects, run tests, and visually verify their own work — all with minimal human oversight.