Agents & AI Engineering

AIH03 Local-First AI Development: Sovereignty, Silicon, and the Private Dev Loop

11/19/2026

9:30am - 10:45am

Level: Intermediate

Brian A. Randell

Partner

MCW Technologies

"Run it locally" used to mean a hobby project on a spare GPU. In 2026 it means something else. Teams in regulated industries, sovereign-cloud jurisdictions, and IP-sensitive shops are quietly building real development workflows around models they host themselves, because shipping source code, prompts, and customer data to a frontier API is no longer a decision they get to make. The question stopped being whether to do this and started being how to do it without buying the wrong hardware, picking the wrong model, or building something slower than what you replaced.

This session walks through the three pillars in the title. Sovereignty covers the actual drivers: protecting source code and proprietary prompts, satisfying data residency rules, reducing vendor exposure, and the quieter reason most teams don't say out loud, which is keeping the AI bill predictable. Silicon is where Brian shares results from extensive testing across a wide range of local hardware: NVIDIA cards from older generations through the RTX PRO 6000 Blackwell, Apple Silicon, AMD systems with NPUs, and Windows on ARM. You'll see where each platform earns its price tag and where the marketing doesn't survive contact with a real workload. The private dev loop is where it gets practical: picking models for code, chat, embeddings, and agent workflows; hosting them with Foundry Local, Ollama, or standard OpenAI-compatible endpoints from .NET; and wiring it all into the way your team already works.

Brian also covers the part most talks skip: when local is the wrong answer, when a hybrid pattern wins, and how to tell the difference before you've spent the budget.

You will learn:

  • The real drivers pushing development teams toward local-first AI, and how to evaluate whether your situation actually qualifies
  • How modern hardware across NVIDIA, Apple Silicon, AMD, and Windows on ARM compares on real .NET development workloads, with honest cost and latency numbers
  • Practical patterns for hosting and integrating local models into a .NET development workflow, including when to stay hybrid rather than going fully local