Senior Machine Learning Engineer (AI Platform)
Mozilla
Full remote · Senior · Permanent · Backend · Data
Full remote - New York, US
Today via WTTJ
Tags
Machine Learning, Model Serving, Inference Optimization, Python, GPU, Docker, Kubernetes, CI/CD, Observability, Privacy-Preserving ML
About the role
Role Overview
Join Mozilla’s AI Platform team as a Senior Machine Learning Engineer. You will design, build, and operate core AI platform components, optimize inference systems, and improve the ML model lifecycle, all while meeting strict performance and privacy requirements.
Responsibilities
- Design, build, and operate core AI platform components for training, deployment, and production serving of ML models.
- Own model serving and inference workflows end-to-end, improving reliability, scalability, performance, and operational excellence.
- Lead efforts to optimize inference systems for throughput, latency, and cost efficiency across CPU and GPU workloads.
- Collaborate closely with product, infrastructure, and security teams to enable fast iteration with production-grade constraints.
- Debug and resolve performance and reliability issues in distributed systems.
- Develop and support ML development workflows and CI/CD for reliable deployment.
- Build observability for distributed services (metrics strategy and performance profiling).
Requirements
- Solid understanding of model serving architectures, inference pipelines, and performance tradeoffs (latency, throughput, cost, scaling).
- Ability to independently scope and drive initiatives while balancing product and operational priorities.
- Strong problem-solving skills and the ability to debug performance and reliability issues in distributed systems.
- Clear communication skills and experience collaborating across engineering, product, and infrastructure.
- Hands-on experience with GPU-based workloads and accelerated computing in production.
- 4–6 years of relevant experience with a Bachelor’s degree, or a Master’s degree with significant hands-on experience (or equivalent).
- Strong Python experience for ML systems, backend services, or distributed data processing.
- Proven experience deploying and operating ML workloads in cloud environments.
- Experience with inference optimization strategies such as batching, quantization, compilation, model conversion, or hardware-specific tuning.
- Familiarity with Docker and Kubernetes in production.
- Experience designing observability systems for distributed services.
- Exposure to privacy-preserving ML, security best practices, or responsible AI design.
- Experience designing CI/CD pipelines and development workflows for reliable ML deployments.
Nice to Have
- Contributions to open-source ML infrastructure projects.
- Leadership in building reusable internal ML tooling.
About Mozilla
Mozilla builds internet products and services with a focus on privacy, security, and open-source collaboration. Its AI Platform team develops and operates core AI systems that support training, deployment, and serving of machine learning models in production.
Scraped 6/13/2026