xelys jobs

DevOps - Start-Up Specializing in Deep Learning - Full Remote (M/F)

Octopus It

full-remote, mid, permanent, devops | Anywhere in the World | 27 days ago via WWR

Tags

Kubernetes, AWS, GCP, CI/CD, Python, Docker, Terraform, MLOps, Kubeflow, Monitoring

About the role

Role Overview

As the second DevOps/MLOps engineer, you'll be responsible for building and maintaining cloud infrastructure, CI/CD pipelines, and deployment automation for a deep learning platform serving agricultural applications.

Key Responsibilities

  • Cloud Infrastructure: Create, maintain, and update cloud infrastructure (AWS/GCP)
  • CI/CD & Automation: Design and implement efficient CI/CD pipelines and automation tools for rapid model iteration and deployment
  • Monitoring & Observability: Set up and maintain comprehensive monitoring, logging, and alerting systems
  • ML/DevOps Bridge: Work closely with the Data Science team to understand their workloads, challenges, and artifacts (model weights, training outputs, predictions, etc.); create or adapt tools to support their needs
  • Technical Support & Enablement: Train the technical team on cloud and DevOps best practices; provide ongoing support
  • Technology Research: Stay current with emerging tools and technologies specific to ML/Deep Learning/Vision workloads to improve infrastructure
  • Incident Response: Plan for failure scenarios and respond quickly to production issues during peak seasons; investigate root causes and build resilience
  • On-Call Support: Seasonal on-call responsibilities during summer harvest periods when operations run 24/7

Requirements

  • At least 2 years of DevOps, SRE, or MLOps experience
  • Kubernetes experience
  • AWS or GCP experience
  • CI/CD toolchain experience
  • Python
  • Docker

Nice-to-Haves

  • Experience with ML/Deep Learning workloads and their operational challenges
  • Familiarity with the current stack components (Kubeflow, Helm, Terraform, ArgoCD, Argo Workflows, Prometheus, Grafana, Datadog)

About Octopus It

A deep learning-focused startup founded in 2018 that provides AI-powered agricultural prediction services through a mobile app. The company builds scalable production solutions using technologies like Kubernetes and Kubeflow to help farmers assess crop quality and manage inventory at scale.

Scraped 3/31/2026
