4.01.2026

Running local models on Macs gets faster with Ollama’s MLX support

Ollama, a runtime for running large language models on a local machine, has introduced support for Apple’s open source MLX machine learning framework. Ollama also says it has improved caching performance and now supports Nvidia’s NVFP4 model-compression format, making for much more efficient memory usage in certain models.

3.31.2026

Anthropic Just Dropped the New Blueprint for Long-Running AI Agents

Anthropic's engineering team just published a deep dive on harness design for long-running agents. Buried in the technical details are some honest admissions and crucial insights that apply to anyone building multi-step AI systems, not just coding agents.



3.30.2026

Claude Skills: Better Code Beats Just Markdown

In this video, I look at how you can improve your Claude/Agent skills by improving the quality of the code in the scripts they contain, using scraping scripts as an example.



3.27.2026

LiteParse - The Local Document Parser

In this video, we look at LiteParse, a new open document parser created by the people at LlamaIndex. This library allows you to parse a variety of different document types and easily output the results as text files or JSON.



3.26.2026

Google’s TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x

Google Research recently revealed TurboQuant, a compression algorithm that reduces the memory footprint of large language models (LLMs) while also boosting speed and maintaining accuracy.

TurboQuant is aimed at reducing the size of the key-value cache, which Google likens to a “digital cheat sheet” that stores important information so it doesn’t have to be recomputed.
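To see why shrinking the key-value cache matters, here is a minimal sketch of cache quantization. This is not TurboQuant's actual algorithm (Google has not published code here); it just illustrates the general idea of storing cached keys and values in a low-bit format, using a simple per-channel int8 scheme on a toy cache.

```python
import numpy as np

# Illustrative only: per-channel int8 quantization of a KV-cache-like
# tensor. NOT TurboQuant's method, just a sketch of why quantizing
# the key-value cache shrinks an LLM's memory footprint.

def quantize_int8(x: np.ndarray):
    """Quantize float32 values to int8 with a per-channel scale."""
    scale = np.abs(x).max(axis=-1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero
    q = np.round(x / scale).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover approximate float32 values from int8 + scales."""
    return q.astype(np.float32) * scale

# A toy cache: 32 attention heads x 1024 cached tokens x 128 dims.
kv = np.random.randn(32, 1024, 128).astype(np.float32)
q, scale = quantize_int8(kv)

full_bytes = kv.nbytes
quant_bytes = q.nbytes + scale.nbytes
print(f"float32: {full_bytes} bytes, int8: {quant_bytes} bytes "
      f"({full_bytes / quant_bytes:.1f}x smaller)")
```

Going from 4-byte floats to 1-byte integers gives roughly a 4x saving in this sketch; lower-bit formats such as 4-bit quantization are how systems push toward the 6x range the article mentions, at the cost of a harder accuracy trade-off.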

3.25.2026

What is DeerFlow 2.0 and what should enterprises know about this new, powerful local AI agent orchestrator?

ByteDance, the Chinese tech giant behind TikTok, last month released what may be one of the most ambitious open-source AI agent frameworks to date: DeerFlow 2.0. It's now going viral across the machine learning community on social media. But is it safe and ready for enterprise use?

DeerFlow 2.0 is a so-called "SuperAgent harness" that orchestrates multiple AI sub-agents to autonomously complete complex, multi-hour tasks. Best of all: it is available under the permissive, enterprise-friendly MIT License, meaning anyone can use, modify, and build on it commercially at no cost.

3.24.2026

Claude Code MASSIVE Update! Claude Code OS, Computer Use, /Schedule, & More!

In this video, we’re breaking down all the latest features — including Claude controlling your computer, recurring cloud jobs with /schedule, DOM element selection, Dispatch from your phone, and way more.