xelys jobs

AI Infrastructure Engineer

Utilidata

full-remote, senior, permanent, backend, devops | United States | Yesterday via LinkedIn
170,000 - 210,000 USD/annual

Tags

AI Infrastructure, ML Model Serving, MLOps, Distributed Systems, GPU Inference, Kubernetes, Docker, CI/CD, Python, Observability

About the role

Own and build the end-to-end infrastructure that powers Utilidata’s AI/ML models across edge, cloud, and data center deployments.

Responsibilities

  • Lead the design and build of Utilidata’s AI inference platform, including architecture patterns, deployment standards, and operational practices.
  • Own end-to-end model serving infrastructure for both on-prem and data center environments.
  • Build fault-tolerant, high-performance inference systems at scale with emphasis on low latency, reliability, and cost efficiency.
  • Collaborate with algorithms engineers to integrate power/inference data and configuration with power optimization algorithms.
  • Optimize GPU utilization and inference performance across the hardware fleet (including NVIDIA accelerators).
  • Establish MLOps best practices (CI/CD pipelines for model deployment, monitoring, and rollback across environments).
  • Contribute to infrastructure roadmap decisions (build vs. buy, tooling selection, platform evolution).

Minimum Qualifications

  • 5+ years of software engineering with a strong focus on AI infrastructure, backend systems, or distributed systems.
  • Hands-on experience with AI model serving frameworks such as vLLM, SGLang, Triton, TensorRT, or TorchServe (or similar).
  • Container orchestration/cluster management experience (Kubernetes, Docker).
  • Experience deploying and operating infrastructure across both data center and on-prem environments.
  • Strong understanding of GPU workloads and inference vs. training tradeoffs.
  • Proficiency in Python; C++, CUDA, Go, or Rust a plus.
  • Strong communication skills and ability to work cross-functionally in a lean environment.
  • Willingness to travel up to 10% of the time.

Nice to Have

  • Dynamo experience.
  • Edge AI deployments or constrained compute environments.
  • Infrastructure as code (Terraform, Helm).
  • Observability tooling (Datadog, Prometheus, Grafana).
  • Background in energy, utilities, or industrial IoT.

Compensation & Location

  • $170,000–$210,000 base + stock options (commensurate with experience).
  • Fully remote within the United States; periodic travel for retreats and key on-site engagements.

About Utilidata

Utilidata is a fast-growing NVIDIA-backed edge AI company that improves visibility and control of power utilization in energy-intensive infrastructure such as electric grids and data centers. Its distributed AI platform (Karman), powered by a custom NVIDIA module, helps utility companies operate at the grid edge and enables data centers to unlock more compute for the same provisioned power.

Scraped 4/15/2026
