San Francisco · Platform engineer since 2007
Zane Williamson.
Lead Software Engineer · Salesforce Falcon TIDE
I build the Terraform automation platform at Salesforce — most recently a
custom HTTP backend that captures state metadata in-flight across
millions of statefiles, fronted by an LLM query interface
exposed as both REST and an MCP server. In parallel I'm leading the
consolidation of our EKS fleet into a smaller, centralized set.
Outside the day job, I run a long-running platform-engineering consulting
practice for Geniuslink — 13 years and counting, where I
own their multi-cloud Kubernetes platform end-to-end. The cadence shows
up on GitHub: 165 merged PRs and ~2,000 direct commits in the last
17 months, most of it scaffolded with Claude Code and Cursor and
shipped under my review.
The next bet is inference performance: going deep on vLLM, CUDA, and
LLM-serving internals for a late-2026 pivot toward labs and inference startups.
- 165 merged PRs · last 17 months
- 2,000+ direct commits · last 17 months
- 602 lifetime merged PRs · on GitHub
- 1,000+ Garmin watch face downloads
- 19 yrs building infrastructure · since 2007
01 About
I'm a platform engineer with deep Kubernetes, Terraform, and observability
chops, currently on the Falcon TIDE team at Salesforce —
Terraform Infrastructure Developer Experience. Before that:
Senior Staff SRE at Varo Bank,
Senior Staff at Flexport (greenfield K8s platform), and
Principal at Zillow (led the migration to Kubernetes on
AWS, top contributor to Apache solr-operator).
The work I keep finding follows the same pattern: take an opaque
infrastructure surface, instrument it, turn that instrumentation into an
API, then build the developer-facing tooling that makes the API useful.
I built it at Zillow (cidr-house-rules —
Lambda + DynamoDB + API Gateway auditing VPC/CIDR/EIP/NAT across dozens of
AWS accounts, feeding data-driven Terraform modules). I'm building it
again at Salesforce, now with an LLM and an MCP server on the
developer-facing side.
I've been doing this since 2007 — System Administrator → Sales Engineer →
Sr. Sysadmin (Xen / Puppet / Cobbler) → Operations Engineer → DevOps →
Principal Engineer → Senior Staff → Lead Software Engineer. The titles
change. The work is the same shape: make complex infrastructure legible
to the people who have to use it.
In parallel with the day job I've been running a 13-year
platform-engineering consulting practice for Geniuslink
— owning their multi-cloud Kubernetes platform, observability stack,
Terraform automation and developer-environment tooling. Then on the
product side, I ship things: a health-data SaaS, a photo-to-recipe app
with an iOS companion, a catalogue of Garmin watch faces with 1,000+
downloads on the Connect IQ store, and the dashboard I use to track
that catalogue's revenue. All built fast, with Claude in the loop, and
shipped.
02 Selected work
Designed and built a custom Terraform HTTP backend that captures state
metadata in-flight during apply and persists it to DynamoDB.
Fronted by an LLM query interface exposed as both REST and an MCP server,
so engineers can ask natural-language questions of the platform's
Terraform footprint without piping statefiles through their terminals.
- Scale: millions of statefiles
- Surfaces: REST API · MCP server
- Storage: DynamoDB metadata index
- Why it matters: turns Terraform from a write-only system into a queryable one
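As an illustration only (not the production implementation), in-flight capture could look roughly like the sketch below. It assumes the version 4 Terraform state JSON layout, where an HTTP backend receives the full state document in the body of each update request. The function name, the `workspace` key, and the set of returned fields are hypothetical, and the DynamoDB write is left out.

```python
import json
from collections import Counter


def extract_state_metadata(workspace: str, body: bytes) -> dict:
    """Summarize a Terraform state document as it passes through the backend.

    `workspace` is a hypothetical identifier for the statefile; the real
    key schema is not described in the text above.
    """
    state = json.loads(body)
    resources = state.get("resources", [])
    # Count managed resource blocks by type; data sources have mode == "data".
    by_type = Counter(
        r["type"] for r in resources if r.get("mode") == "managed"
    )
    return {
        "workspace": workspace,
        "serial": state.get("serial"),
        "lineage": state.get("lineage"),
        "terraform_version": state.get("terraform_version"),
        # Number of managed resource blocks, not individual instances.
        "resource_count": sum(by_type.values()),
        "resources_by_type": dict(by_type),
    }
```

A record of this shape is small enough to index per statefile, which is what would let a query layer answer questions without reading the state itself.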
Leading the consolidation of a sprawled EKS fleet into a smaller,
centralized set of clusters. The work is the unglamorous side of
platform — fewer clusters, sharper SLOs, less operator-toil per
engineering hour.
I've been Geniuslink's outsourced platform team since 2013 — running
alongside every full-time role I've held. I own the full Kubernetes
platform across EKS, Linode LKE and DigitalOcean,
ArgoCD for app + infra GitOps, a Loki / Mimir / OTel-Operator
observability stack, Helmfile-driven deploys with per-PR dynamic dev
environments, Terraform across the whole footprint, Infisical for
secrets, Teleport for access, and Fastly at the edge. In the last
17 months alone: 165 merged PRs and ~1,600 direct commits
across 7 active repos (the Geniuslink side of the total).
- Stack: EKS · LKE · DO · ArgoCD · Helmfile · Loki / Mimir / OTel · Teleport · Infisical · Fastly
- Cadence: 165 PRs + ~2,000 commits, last 17 months
- Why it works: the work is the cadence, proof that the practice still ships at full-team speed alongside a full-time role
Owned the platform reliability stack: deployed Grafana Loki + Tempo with
the OpenTelemetry Operator for full-stack observability, established
SLOs/SLIs across multiple engineering platforms, and migrated the org
off Helm onto ArgoCD for deployment lifecycle management. Migrated
K8s auth from kube2iam to AWS client auth. Enabled cluster autoscaling
for reliability + cost. Built a Kafka outbox + Postgres → data lake
messaging pipeline. Authored and maintained the engineering Go CLI for
team-wide automation.
- Stack: EKS · ArgoCD · Loki · Tempo · OTel Operator · Kafka · Postgres
- Languages: Go (CLI), Terraform, Java + NodeJS instrumentation
Led the greenfield design and rollout of Flexport's modern K8s platform:
Terraform AWS + Helm Provider driven EKS, GitOps for infra and apps,
Teleport-secured cluster access, Python CLI as the pipeline entrypoint,
Terraform modules for rapid platform iteration. Most of the day-to-day
was pair-programming with senior engineers ("Voltron") to spread the
platform knowledge — that pairing model is something I'd run again.
Designed and open-sourced
cidr-house-rules —
a Python serverless API (Lambda + DynamoDB + API Gateway) auditing VPC
CIDR, EIP and NAT-Gateway utilization across dozens of AWS accounts,
exposed as a secure API that powered data-driven Terraform modules.
Lead contributor to apache/solr-operator.
Designed and open-sourced hyper-kube-config
as the K8s config-secret store. Drove the Kubernetes-on-AWS rollout,
the Istio sidecar deployment (tracing + mTLS), and the Envoy proxy in
front of Varnish.
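The kind of cross-account CIDR audit that cidr-house-rules performs can be illustrated with a small sketch. This is not the project's code: the `(account, vpc_id, cidr)` tuple shape and the function name are assumptions made for the example, and it relies only on Python's standard `ipaddress` module.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlaps(vpcs):
    """Return pairs of VPCs whose CIDR blocks overlap.

    `vpcs` is a list of (account, vpc_id, cidr) tuples; this input
    shape is hypothetical.
    """
    parsed = [(acct, vpc, ip_network(cidr)) for acct, vpc, cidr in vpcs]
    overlaps = []
    # Compare every pair once; fine for dozens of accounts.
    for (a_acct, a_vpc, a_net), (b_acct, b_vpc, b_net) in combinations(parsed, 2):
        if a_net.overlaps(b_net):
            overlaps.append(((a_acct, a_vpc), (b_acct, b_vpc)))
    return overlaps
```

Exposed behind an API, a check like this is what would let a Terraform module pick a free range from data instead of from a spreadsheet.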
03 The AI-velocity story
Everyone says they "use AI." I want to show the receipts, because the
story isn't "I prompt Claude and it writes the code" — it's that an
experienced platform engineer with a tuned agent workflow ships at a
cadence that closes the gap to a small team.
Shipped to production: an MCP server on Salesforce Terraform infra
The Falcon TIDE backend's LLM/MCP frontend is production AI-native
infrastructure, not a side experiment. Almost no platform team in a
large Kubernetes + Terraform shop has shipped one of these yet. This is the
single artifact I'd point any AI-infra recruiter at.
Cadence: 165 PRs + ~2,000 commits in 17 months
Across 15 active repos (Salesforce work isn't in this count — those
repos are private). The work spans platform / observability rollouts,
Terraform modules, ArgoCD across EKS / DigitalOcean / Linode, Helm
chart maintenance, and a steady stream of personal AI projects.
Receipts: real products in the Products section below
MAHA Healer (health-data SaaS, AI lab
parsing), TasteKeeper (photo-to-recipe with
an iOS companion), and 1,000+ downloads across my Garmin Connect IQ
watch faces in Monkey C.
Claude inside platform workflows
Shipped a "Claude skill" + escalated service into a production
microservice flow on the Geniuslink platform I still maintain. I use
Claude Code as the default scaffolding layer, Cursor for editor-level
edits, and agents for repeatable platform-ops tasks.
04 Products shipped with Claude
I treat Claude as a tool for shrinking the gap between "I want this to
exist" and "it exists." Four things that came out of that loop:
A health-data platform for families. Upload lab PDFs, photos or scans
and Claude parses out biomarkers, values and reference ranges; chart
trends across reports; track multiple family members; chat with the
data; bring your own model keys for end-to-end encrypted analysis.
Production SaaS with three pricing tiers — not a demo.
Capture your kitchen memories with AI. Photograph finished dishes and
prep shots; the app identifies ingredients, estimates quantities, and
generates the recipe — building a personal cookbook with a
review/edit/approve flow. Web app plus a companion iOS app, built
end-to-end with Claude as collaborator.
A custom dashboard I built to track sales and revenue across my own
Garmin watch face catalogue — pulling from the Garmin developer API,
charting trends, attributing revenue per face. Same pattern from the
Salesforce work, smaller domain: instrument an opaque API, surface
the data, build the interface I actually want to use.
A growing catalogue of watch faces I design and ship to the Garmin
Connect IQ Store — Monkey C source, AI-assisted scaffolding,
hand-tuned visuals. Recent shipped faces include the Burger Watch
(now v2 with a supersize option), Thin Blue Line Tactical, Corgi
Tamagotchi, T-Rex, Spooky, Golden Gate, and a Sales Tracker.
Hardware-shaped distribution and a 1,000+ download footprint
across the catalogue.
05 The 2026 bet — inference performance
The next compound move is the same shape I've made before. Take a system
I understand at the infra layer. Push it down to where the performance is
actually decided.
I'm going deep on vLLM and CUDA at the level of the
kernels that matter for LLM-serving throughput — paged attention,
continuous batching, KV-cache reuse, speculative decoding, the
compile-time choices that decide tail latency on a real serving fleet.
Plan: contribute upstream to vLLM, tune a sizeable deployment end-to-end,
and pair up with people who already live at this layer.
Talk to me if you're running a large inference fleet, building one,
or hiring at a lab / inference startup. The strongest fit is somewhere
that values "can ship the platform AND own the perf story end-to-end"
over either dimension alone.