Senior Software Engineer (ML Infrastructure)
Voxel
About the role
Join Voxel as a Senior Software Engineer and take ownership of the ML infrastructure that powers the training and deployment of vision models. You will build systems that enable the applied ML team to train multiple models concurrently, manage experiments, and ship optimized models to production. You will set technical direction, write code, make architecture decisions, and collaborate closely with applied CV, ML Data, and Platform engineers. You will also establish ML experiment tracking and lifecycle management, implement DevOps-for-ML best practices on AWS, and design scalable solutions that support model development.

Key missions:
- Build and maintain the training infrastructure that lets the applied ML team train multiple models concurrently, manage experiments, and iterate quickly on new architectures.
- Own the path from training to deployment: export trained models to optimized inference formats (TensorRT, ONNX) and quantify the impact on accuracy and latency.
- Establish ML experiment tracking and lifecycle management: choose the right tools (Weights & Biases, MLflow, ClearML, or similar) so that researchers can run, compare, and reproduce experiments efficiently.

Profile:
- Hands-on experience building ML training pipelines in PyTorch
- Track record of owning infrastructure end-to-end: scoping, building, shipping, and improving systems that internal teams depend on
- Strong Python. Write performant code that scales well in production environments
- Experience with AWS (S3, EC2, EKS, or similar) for ML workloads
- 4+ years of experience building and shipping large-scale software solutions
- Hands-on experience with ML experiment tracking and lifecycle tools (Weights & Biases, MLflow, ClearML, or similar)
- Strong communication skills
- Bias toward shipping. You'd rather ship something good this week than something perfect next quarter
- Background in computer vision model training
- Familiarity with GPU performance profiling and optimization (Nsight, PyTorch profiler, or similar)
- Experience with modern ML orchestration tools (Ray, Sematic, Flyte, Metaflow, Prefect, or similar)
Scraped 5/13/2026