Machine Learning Ops Engineer
Great Value Hiring
senior, contract, backend, data | United States | 4 days ago via LinkedIn
180,000 - 260,000 USD/daily
Tags
MLOps, JAX, PyTorch, Triton, Pallas, Distributed Training, FSDP, GPU Kernels, CI/CD, Technical Writing
About the role
Role Overview
Great Value Hiring is looking for a Machine Learning Ops (MLOps) Engineer with hands-on expertise in JAX, PyTorch, and kernel-level optimization using Pallas/Triton. The role focuses on improving AI training quality and on strengthening both ML training data and MLOps infrastructure.
Responsibilities
- Guide research and engineering teams to close knowledge gaps in MLOps, training infrastructure, and ML framework-level topics
- Design domain-relevant tasks and produce accurate solutions for MLOps/ML systems problems
- Evaluate MLOps tasks/solutions and provide clear written technical feedback
- Create guidelines, rubrics, and evaluation frameworks for:
  - training pipeline design
  - distributed systems reasoning
  - kernel-level optimization
- Collaborate with subject matter experts to ensure consistency and accuracy in training data
- Be reliably available 30+ hours/week on weekdays
Requirements
- 5+ years of professional experience in ML infrastructure, MLOps, or ML systems engineering
- Hands-on production experience with JAX and/or PyTorch at scale, including distributed training approaches such as:
  - FSDP
  - tensor parallelism
  - pipeline parallelism
- Experience with memory optimization and framework-level debugging
- Ability to write or optimize custom GPU kernels using Pallas (JAX) or Triton, including:
  - tiling strategies
  - memory layout design
  - kernel fusion
- Strong written communication skills to explain complex technical decisions clearly
Nice-to-haves
- Demonstrable career progression
Contract / Work Type
- W-2 employment position
- Contingent role
Scraped 5/12/2026