The final hours of March 2026 have delivered a concentrated burst of AI news spanning security incidents, infrastructure breakthroughs, and model releases. Whether you're running inference on Apple hardware, building agentic pipelines, or managing enterprise AI deployments, there's something here demanding your attention. Here are the four developments that matter most right now.
Claude Code Source Code Leaked via NPM Map File
The most consequential story of the past 48 hours is the leak of Claude Code's source code through a source map file published alongside Anthropic's package on the NPM registry. Source map files, intended to aid debugging in development environments, can inadvertently expose the original, human-readable source when shipped with minified JavaScript packages, a well-known but frequently overlooked operational security risk.
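The mechanism is worth seeing concretely. When a bundler emits a source map with the `sourcesContent` field populated, the full original source travels inside the map file itself. A minimal sketch (the file contents below are hypothetical, for illustration only):

```python
import json

# A minimal source map with `sourcesContent` populated. Bundlers that emit
# this field ship the complete original source alongside the minified bundle.
# (File names and contents here are hypothetical.)
source_map = json.loads("""{
  "version": 3,
  "file": "cli.min.js",
  "sources": ["src/cli.ts"],
  "sourcesContent": ["// original, human-readable source\\nconst prompt = buildPrompt();"],
  "mappings": "AAAA"
}""")

# Anyone who downloads the published package can recover the originals directly.
for path, content in zip(source_map["sources"],
                         source_map.get("sourcesContent", [])):
    print(f"--- {path} ---")
    print(content)
```

Omitting `sourcesContent` (or stripping `.map` files from the published package entirely) is the standard mitigation.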
The timing is particularly sharp given a separate revelation: Anthropic has acknowledged that Claude Code users are hitting usage limits far faster than expected. The combination of unexpectedly high adoption and a public source exposure creates a dual pressure point for Anthropic's engineering and security teams. For developers and enterprises relying on Claude Code in their workflows, the leak raises immediate questions about proprietary logic, prompt structures, and any hardcoded configuration that may now be visible to competitors and adversarial actors.
This incident is a timely reminder that AI tooling published through traditional software registries inherits all the classic supply-chain and operational security risks — and that rapid product growth can outpace security review cycles.
Ollama Gains MLX Acceleration on Apple Silicon
In more constructive infrastructure news, Ollama has announced a preview integration powered by MLX, Apple's machine learning framework optimised for Apple Silicon. For developers running local inference on M-series Macs, this is a meaningful performance upgrade. MLX is designed to exploit the unified memory architecture of Apple Silicon chips, reducing data movement overhead and improving throughput for both training and inference workloads.
Ollama has become one of the most widely used tools for running open-weight models locally, and MLX support positions it to compete more directly with GPU-accelerated cloud inference for a large and growing segment of developers who prefer on-device, privacy-preserving workflows. The preview designation suggests rough edges remain, but the direction of travel is clear: local inference on Apple hardware is becoming a first-class citizen in the AI development stack.
For teams evaluating total cost of inference and latency-sensitive applications, this development is worth benchmarking against your current setup.
Google Releases 200M-Parameter Time-Series Foundation Model
Google has quietly shipped a significant specialised model: a 200-million-parameter time-series foundation model supporting a 16,000-token context window. While the broader industry conversation fixates on ever-larger general-purpose models, this release reflects a maturing recognition that domain-specific foundation models can deliver outsized value at a fraction of the compute cost.
A 16k context window for time-series data is substantively useful — it means the model can reason over extended historical sequences, which is critical for use cases including anomaly detection, demand forecasting, financial signal analysis, and infrastructure monitoring. Foundation models for time-series have historically lagged behind their language and vision counterparts in capability and accessibility, making this a notable step forward.
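To make the anomaly-detection use case concrete, here is the kind of classical baseline such a foundation model would be benchmarked against: a rolling z-score detector over a trailing window. This is a generic illustrative sketch, not Google's model or API; all names and thresholds are assumptions.

```python
from statistics import mean, stdev

def rolling_zscore_anomalies(series, window=50, threshold=3.0):
    """Flag indices whose z-score against the trailing window exceeds
    the threshold. A classical baseline, not the foundation model itself."""
    anomalies = []
    for i in range(window, len(series)):
        hist = series[i - window:i]
        mu, sigma = mean(hist), stdev(hist)
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# A steady periodic signal with one injected spike at index 120.
signal = [10.0 + 0.1 * (i % 7) for i in range(200)]
signal[120] = 25.0
print(rolling_zscore_anomalies(signal))  # → [120]
```

A foundation model's pitch over baselines like this is zero-shot generalisation across domains and the ability to condition on far longer histories, which is where the 16k context window matters.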
For ML engineers and data scientists working with sequential or temporal data, this is a model worth evaluating. It also signals that the foundation model paradigm is successfully crossing over from language into structured, numerical domains at production-relevant scales.
Containerized Agent Hosting and the Maturing Agentic Infrastructure Stack
A project called Coasts (Containerized Hosts for Agents) surfaced prominently in developer communities, pointing to a broader and accelerating trend: the industrialisation of agentic AI infrastructure. Running autonomous agents reliably in production requires isolation, resource controls, reproducibility, and observability — exactly the problems containerisation has solved for conventional software services.
Coasts represents one of several emerging approaches to packaging and hosting agents with the same operational discipline applied to microservices. As agentic workflows move from demos into production environments, the infrastructure layer is becoming a genuine differentiator. Questions of how you host, scale, and monitor agents are increasingly as important as which model powers them.
- Isolation: Containers limit blast radius from agent failures or unexpected tool use
- Reproducibility: Declarative environments reduce the "works on my machine" problem for agent behaviour
- Observability: Standard container tooling integrates with existing monitoring stacks
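The bullets above map directly onto standard container controls. As a sketch of what that might look like in a Compose file (the image name, limits, and endpoint are hypothetical and not drawn from the Coasts project):

```yaml
# Hypothetical docker-compose sketch for hosting a single agent.
services:
  agent:
    image: example/agent-runtime:latest   # hypothetical image
    read_only: true                       # isolation: immutable root filesystem
    networks: [agent-net]                 # isolation: scoped network
    deploy:
      resources:
        limits:
          cpus: "1.0"                     # resource controls: cap CPU
          memory: 512M                    # resource controls: cap memory
    environment:
      # observability: export traces to an existing collector
      OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4317
networks:
  agent-net:
```

None of this is agent-specific, which is precisely the point: the agentic stack can inherit a decade of container operational practice rather than reinventing it.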
Early-stage projects like Coasts are worth watching as reference implementations for teams designing their own agent deployment strategies.
Taken together, today's developments paint a picture of an AI ecosystem simultaneously racing forward and grappling with the operational maturity that rapid growth demands. The Claude Code leak is a cautionary tale about security discipline at speed; Ollama's MLX integration and Google's time-series model reflect genuine infrastructure progress; and the emergence of containerised agent hosting suggests the agentic era is moving decisively from experimentation into engineering. Stay sharp — the pace isn't slowing.