
San Francisco · Platform engineer since 2007

Zane Williamson.

Lead Software Engineer · Salesforce Falcon TIDE

I build the Terraform automation platform at Salesforce — most recently a custom HTTP backend that captures state metadata in-flight across millions of statefiles, fronted by an LLM query interface exposed as both REST and an MCP server. In parallel I'm leading the consolidation of our EKS fleet into a smaller, centralized set.

Outside the day job, I have run a platform-engineering consulting practice for Geniuslink for 13 years and counting, where I own their multi-cloud Kubernetes platform end-to-end. The cadence shows up on GitHub: 165 merged PRs and ~2,000 direct commits in the last 17 months, most of it scaffolded with Claude Code and Cursor and shipped under my review.

The next bet is inference performance: going deep on vLLM, CUDA, and LLM-serving internals for a late-2026 pivot toward labs and inference startups.

165 merged PRs · last 17 months
2,000+ direct commits · last 17 months
602 lifetime merged PRs · on GitHub
1000+ Garmin watch face downloads
19 yrs building infrastructure · since 2007

01 About

I'm a platform engineer with deep Kubernetes, Terraform, and observability chops, currently on the Falcon TIDE (Terraform Infrastructure Developer Experience) team at Salesforce. Before that: Senior Staff SRE at Varo Bank, Senior Staff Engineer at Flexport (greenfield K8s platform), and Principal Engineer at Zillow (led the migration to Kubernetes on AWS, top contributor to Apache solr-operator).

I keep finding the same pattern of work: take an opaque infrastructure surface, instrument it, turn that instrumentation into an API, then build the developer-facing tooling that makes the API useful. I built it at Zillow (cidr-house-rules: Lambda + DynamoDB + API Gateway auditing VPC/CIDR/EIP/NAT across dozens of AWS accounts, feeding data-driven Terraform modules). I'm building it again at Salesforce, now with an LLM and an MCP server on the developer-facing side.

I've been doing this since 2007 — System Administrator → Sales Engineer → Sr. Sysadmin (Xen / Puppet / Cobbler) → Operations Engineer → DevOps → Principal Engineer → Senior Staff → Lead Software Engineer. The titles change. The work is the same shape: make complex infrastructure legible to the people who have to use it.

In parallel with the day job I've been running a 13-year platform-engineering consulting practice for Geniuslink, owning their multi-cloud Kubernetes platform, observability stack, Terraform automation, and developer-environment tooling. On the product side, I ship things: a health-data SaaS, a photo-to-recipe app with an iOS companion, a catalogue of Garmin watch faces with 1000+ downloads on the Connect IQ store, and the dashboard I use to track that catalogue's revenue. All built fast, with Claude in the loop, and shipped.

02 Selected work

Custom Terraform HTTP backend + LLM/MCP frontend

Salesforce · Falcon TIDE · 2024–present

Designed and built a custom Terraform HTTP backend that captures state metadata in-flight during apply and persists it to DynamoDB. Fronted by an LLM query interface exposed as both REST and an MCP server, so engineers can ask natural-language questions of the platform's Terraform footprint without piping statefiles through their terminals.

  • Scale: millions of statefiles
  • Surfaces: REST API · MCP server
  • Storage: DynamoDB metadata index
  • Why it matters: turns Terraform from a write-only system into a queryable one
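
A minimal sketch of the capture step described above, not the production implementation. It assumes the standard Terraform state JSON layout (`serial`, `lineage`, `terraform_version`, `resources`) and uses a hypothetical `InMemoryBackend` class in place of the real HTTP handler and DynamoDB index; the function name, field choices, and state key are illustrative only.

```python
import json
from collections import Counter


def extract_state_metadata(state_body: bytes) -> dict:
    """Pull a small, queryable summary out of a Terraform state document."""
    state = json.loads(state_body)
    resources = state.get("resources", [])
    # Count managed resources only; "data" mode entries are read-only lookups.
    managed = [r for r in resources if r.get("mode") == "managed"]
    type_counts = Counter(r.get("type", "unknown") for r in managed)
    return {
        "lineage": state.get("lineage"),
        "serial": state.get("serial"),
        "terraform_version": state.get("terraform_version"),
        "managed_resource_count": len(managed),
        "resource_types": dict(type_counts),
    }


class InMemoryBackend:
    """Hypothetical stand-in for the storage layer (DynamoDB in the text)."""

    def __init__(self):
        self.states = {}
        self.metadata = {}

    def put_state(self, key: str, body: bytes) -> None:
        # Store the raw state and index its metadata in the same write path,
        # so the summary is captured in-flight rather than by a later scan.
        self.states[key] = body
        self.metadata[key] = extract_state_metadata(body)

    def get_state(self, key: str):
        return self.states.get(key)


# Usage: simulate the state upload Terraform sends after an apply.
sample_state = json.dumps({
    "version": 4,
    "terraform_version": "1.5.7",
    "serial": 12,
    "lineage": "example-lineage",
    "resources": [
        {"mode": "managed", "type": "aws_vpc", "name": "main"},
        {"mode": "managed", "type": "aws_subnet", "name": "a"},
        {"mode": "data", "type": "aws_ami", "name": "base"},
    ],
}).encode()

backend = InMemoryBackend()
backend.put_state("team-a/network", sample_state)
print(backend.metadata["team-a/network"])
```

Indexing on the write path keeps the metadata store in step with each state version without a separate crawler; a real service would also need locking, auth, and error handling, all omitted here.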

EKS fleet consolidation

Salesforce · Falcon TIDE · 2025–present

Leading the consolidation of a sprawled EKS fleet into a smaller, centralized set of clusters. The work is the unglamorous side of platform: fewer clusters, sharper SLOs, less operator toil per engineering hour.

Geniuslink — 13-year platform-engineering consulting practice

Geniuslink · Consulting · 2013–present

I've been Geniuslink's outsourced platform team since 2013, running alongside every full-time role I've held. I own the full Kubernetes platform across EKS, Linode LKE and DigitalOcean, ArgoCD for app + infra GitOps, a Loki / Mimir / OTel-Operator observability stack, Helmfile-driven deploys with per-PR dynamic dev environments, Terraform across the whole footprint, Infisical for secrets, Teleport for access, and Fastly at the edge. In the last 17 months alone: 165 merged PRs and ~1,600 direct commits across 7 active repos (the Geniuslink side of the total).

  • Stack: EKS · LKE · DO · ArgoCD · Helmfile · Loki / Mimir / OTel · Teleport · Infisical · Fastly
  • Cadence: 165 PRs + ~2,000 commits, last 17 months
  • Why it works: the work is the cadence, proof that the practice still ships at full-team speed alongside a full-time role

Kubernetes platform reliability + observability rebuild

Varo Bank · Senior Staff SRE · 2022–2024

Owned the platform reliability stack: deployed Grafana Loki + Tempo with the OpenTelemetry Operator for full-stack observability, established SLOs/SLIs across multiple engineering platforms, and migrated the org off Helm onto ArgoCD for deployment lifecycle management. Migrated K8s auth from kube2iam to AWS client auth. Enabled cluster autoscaling for reliability + cost. Built a Kafka outbox + Postgres → data lake messaging pipeline. Authored and maintained the engineering Go CLI for team-wide automation.

  • Stack: EKS · ArgoCD · Loki · Tempo · OTel Operator · Kafka · Postgres
  • Languages: Go (CLI), Terraform, Java + NodeJS instrumentation

Greenfield Kubernetes platform on AWS

Flexport · Senior Staff Engineer · 2021–2022

Led the greenfield design and rollout of Flexport's modern K8s platform: Terraform AWS + Helm Provider driven EKS, GitOps for infra and apps, Teleport-secured cluster access, Python CLI as the pipeline entrypoint, Terraform modules for rapid platform iteration. Most of the day-to-day was pair-programming with senior engineers ("Voltron") to spread the platform knowledge — that pairing model is something I'd run again.

cidr-house-rules + AWS Terraform automation at scale

Zillow · Principal Engineer · 2015–2021

Designed and open-sourced cidr-house-rules — a Python serverless API (Lambda + DynamoDB + API Gateway) auditing VPC CIDR, EIP and NAT-Gateway utilization across dozens of AWS accounts, exposed as a secure API that powered data-driven Terraform modules. Lead contributor to apache/solr-operator. Designed and open-sourced hyper-kube-config as the K8s config-secret store. Drove the Kubernetes-on-AWS rollout, the Istio sidecar deployment (tracing + mTLS), and the Envoy proxy in front of Varnish.
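
As an illustration of the kind of check a cross-account CIDR audit enables, here is a small sketch using Python's standard `ipaddress` module. It is not taken from cidr-house-rules: the `find_overlapping_cidrs` function, the record keys, and the sample inventory are all hypothetical, and the real service collects its data from AWS rather than from a hard-coded list.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlapping_cidrs(vpcs):
    """Return pairs of VPC IDs whose CIDR blocks overlap.

    `vpcs` is a list of dicts with hypothetical keys: account, vpc_id, cidr.
    """
    parsed = [(v, ip_network(v["cidr"])) for v in vpcs]
    overlaps = []
    # Compare every pair once; fine for an inventory of this size.
    for (a, net_a), (b, net_b) in combinations(parsed, 2):
        if net_a.overlaps(net_b):
            overlaps.append((a["vpc_id"], b["vpc_id"]))
    return overlaps


# Made-up inventory spanning three accounts.
inventory = [
    {"account": "111111111111", "vpc_id": "vpc-a", "cidr": "10.0.0.0/16"},
    {"account": "222222222222", "vpc_id": "vpc-b", "cidr": "10.0.128.0/17"},
    {"account": "333333333333", "vpc_id": "vpc-c", "cidr": "10.1.0.0/16"},
]

print(find_overlapping_cidrs(inventory))  # → [('vpc-a', 'vpc-b')]
```

Exposing a result like this over an API is what would let a Terraform module pick a non-conflicting range from data instead of from a spreadsheet.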

03 The AI-velocity story

Everyone says they "use AI." I want to show the receipts, because the story isn't "I prompt Claude and it writes the code" — it's that an experienced platform engineer with a tuned agent workflow ships at a cadence that closes the gap to a small team.

Shipped to production: an MCP server on Salesforce Terraform infra

The Falcon TIDE backend's LLM/MCP frontend is production AI-native infrastructure, not a side experiment. Almost no platform team in a large Kubernetes + Terraform shop has shipped one of these yet. This is the single artifact I'd point any AI-infra recruiter at.

Cadence: 165 PRs + ~2,000 commits in 17 months

Across 15 active repos (Salesforce work isn't in this count — those repos are private). The work spans platform / observability rollouts, Terraform modules, ArgoCD across EKS / DigitalOcean / Linode, Helm chart maintenance, and a steady stream of personal AI projects.

Receipts: real products in the Products section below

MAHA Healer (health-data SaaS, AI lab parsing), TasteKeeper (photo-to-recipe with an iOS companion), and 1000+ downloads across my Garmin Connect IQ watch faces in Monkey C.

Claude inside platform workflows

I shipped a "Claude skill" plus an escalated service into a production microservice flow on the Geniuslink platform I still maintain. I use Claude Code as the default scaffolding layer, Cursor for editor-level edits, and agents for repeatable platform-ops tasks.

Anthropic console + Cursor numbers (tokens / month, completion-acceptance rate, before-and-after PR throughput) will be added soon.

04 Products shipped with Claude

I treat Claude as a tool for shrinking the gap between "I want this to exist" and "it exists." Four things that came out of that loop:

MAHA Healer

mahahealer.com · Health-data SaaS · Free / $9 / $29 tiers

A health-data platform for families. Upload lab PDFs, photos or scans and Claude parses out biomarkers, values and reference ranges; chart trends across reports; track multiple family members; chat with the data; bring your own model keys for end-to-end encrypted analysis. Production SaaS with three pricing tiers — not a demo.

TasteKeeper

tastekeeper.dev · Web + iOS app

Capture your kitchen memories with AI. Photograph finished dishes and prep shots; the app identifies ingredients, estimates quantities, and generates the recipe — building a personal cookbook with a review/edit/approve flow. Web app plus a companion iOS app built end-to-end with Claude as collaborator.

Garmin Sales Dashboard

Internal tooling · personal · Built with Claude

A custom dashboard I built to track sales and revenue across my own Garmin watch face catalogue — pulling from the Garmin developer API, charting trends, attributing revenue per face. Same pattern from the Salesforce work, smaller domain: instrument an opaque API, surface the data, build the interface I actually want to use.

Garmin Connect IQ watch faces

apps.garmin.com → my developer page · Monkey C · 1000+ downloads

A growing catalogue of watch faces I design and ship to the Garmin Connect IQ Store — Monkey C source, AI-assisted scaffolding, hand-tuned visuals. Recent shipped faces include the Burger Watch (now v2 with a supersize option), Thin Blue Line Tactical, Corgi Tamagotchi, T-Rex, Spooky, Golden Gate, and a Sales Tracker. Hardware-shaped distribution and a 1000+ download footprint across the catalogue.

05 The 2026 bet — inference performance

The next compound move is the same shape I've made before. Take a system I understand at the infra layer. Push it down to where the performance is actually decided.

I'm going deep on vLLM and CUDA at the level of the kernels that matter for LLM-serving throughput — paged attention, continuous batching, KV-cache reuse, speculative decoding, the compile-time choices that decide tail latency on a real serving fleet. Plan: contribute upstream to vLLM, tune a sizeable deployment end-to-end, and pair up with people who already live at this layer.

Talk to me if you're running a large inference fleet, building one, or hiring at a lab / inference startup. The strongest fit is somewhere that values "can ship the platform AND own the perf story end-to-end" over either dimension alone.

06 Open source

Single-PR landings across istio, helm, kubernetes-sigs, cert-manager, hashicorp, DataDog, cloudquery, cloud-custodian, and grafana/grafana-ansible-collection.

07 Get in touch

Best way is email. I read everything; I answer most things within a day.