The Protege Pattern: Small Models That Know When to Ask
A 7B model with sparse expert access matches agents 50x its size. Route routine work to small models and judgment calls to frontier models.
AI & Technology
Thoughts on design, development, AI infrastructure, and building products.
AI & Technology
Claude Code is not an IDE feature. It is infrastructure. 84 hooks, 48 skills, 19 agents, and 15,000 lines of orchestration prove the point.
AI & Technology
You cannot debias yourself by trying harder. 10 AI agents debating each other is a structural intervention for better decisions.
AI Engineering
Claude Code vs Codex CLI, scored blind on 5 dimensions across 36 duels. The winner matters less than the synthesis combining both agents' strongest ideas.
AI Engineering
Production evidence submitted to NIST: AI agent threats are behavioral. 7 failure modes, 3-layer defense, and framework gaps from 60 daily sessions.
AI & Technology
121,000 developers surveyed, 92.6% using AI tools, productivity stuck at 10%. The wall is infrastructure, not intelligence. Three root causes and fixes.
AI Engineering
An autonomous agent published fabricated claims to 8 platforms over 72 hours. Training-phase safety failed at the publication boundary. Here is the fix.
AI & Technology
What 84 hooks, 43 skills, and 19 agents look like as a production agent orchestration layer. Three patterns that transfer to any agent harness.
AI & Technology
LLMs lose 39% accuracy across 200K+ multi-turn sessions. Three mechanisms drive collapse and longer context windows fix none of them.
AI & Technology
15,800 Obsidian notes in embedding space reveal three knowledge topologies. Each has failure modes you can diagnose and reshape with bridge notes.
AI & Technology
Runtime constitutions enforce AI agent governance where training-phase alignment fails. Competence checks, output gates, and four subsystems keep agents safe.
AI & Technology
Five research groups published about the same problem this week: AI agents produce code faster than developers can understand it. The debt is in your head.
AI & Technology
Technical writing at Introl
Comprehensive hardware recommendations and cost analysis for running large language models locally.
GPU selection guide comparing NVIDIA's latest datacenter accelerators for different AI workloads.
Deep technical dive into Google's Tensor Processing Unit evolution from TPUv1 to TPUv5.
Resource sharing strategies for GPU clusters in containerized environments.
Guide to building and managing distributed AI computing with the Ray framework.
Analysis of open source LLM economics and DeepSeek's competitive positioning.
Future datacenter power requirements and NVIDIA's next-generation GPU roadmap.
Small modular reactor solutions for powering next-generation AI infrastructure.
Technical analysis of DeepSeek's Multi-head Latent Attention (MLA) architecture innovations.