We’ve organized every stage and persona in the AI supply chain, informed by real recruiting at frontier companies.

Summary
Known as: Data Center Operations Lead, Hardware Operations Engineer, Site Reliability Engineer, Facilities Engineer, Power Engineer, Procurement Manager (Hardware)
Manages physical AI infrastructure at scale: hardware deployment, bring-up, maintenance, incident response, site selection, power procurement, and vendor coordination.
Specializations
Covers on-prem infrastructure only. Cloud-native capacity planning (GPU-as-a-service, spot vs. reserved) is a different hiring profile, typically staffed alongside serving teams.
Where the Work Lives
Manages physical infrastructure: site selection, power procurement, cooling, and facilities engineering.
Deploys, maintains, and operates GPU clusters and networking hardware at scale.
Candidate Archetypes
Deploys, repairs, and keeps GPU fleets available under real outage pressure.
Secures megawatt-scale power, cooling, and permitting pathways that make cluster growth possible.
Forecasts demand, manages vendors, and closes timing gaps between model roadmap and hardware arrival.
Company Scale
No longer limited to hyperscalers. Growth-stage companies building their own clusters (xAI, CoreWeave, Anthropic) hire aggressively, and demand is expanding as GPU-intensive training moves in-house.