xelys jobs

DevOps Engineer

Drexel University

full-remote, mid, fixed-term, devops | Philadelphia, PA | 46 days ago via LinkedIn
90,430 - 135,640 USD/annual

Tags

DevOps, Kubernetes, Linux, Ansible, Python, Bash, HPC, iRODS, Globus, Infrastructure-as-Code

About the role

Role Overview

The DevOps Engineer will help build and operate a new shared computing platform for GPU-accelerated workloads, including AI model training, at Drexel’s University Research Computing Facility (URCF). The platform is under active development and is transitioning from a traditional HPC environment toward container-native tools and workflows.

Responsibilities

  • Automation & cluster operations: Develop and maintain automation for provisioning, configuring, and managing the cluster (e.g., Ansible, Warewulf, Kubernetes manifests, shell scripting).
  • Kubernetes platform layer: Contribute to Kubernetes networking, storage integration, security policies, and workload orchestration.
  • Storage infrastructure & integrations: Help build storage systems including iRODS and Globus/Globus Connect Server for data transfer, plus integrations between storage and compute.
  • End-to-end troubleshooting: Diagnose issues across the stack, from bare-metal boot problems to container orchestration bugs.
  • Documentation: Write and maintain operational and user-facing documentation.
  • Coordinate with IT: Work with Drexel IT on shared infrastructure topics such as networking, DNS, and firewall rules.
  • User-facing portal: Contribute to web application development for a portal supporting project management, permissions, and usage tracking.

Requirements

  • Education: Bachelor’s degree in Computer Science/Engineering or related field (or equivalent education and work experience).
  • Experience: 1–3 years of relevant work experience.
  • Skills:
    • Linux systems administration and/or configuration management
    • Containers and/or container orchestration
    • Comfort working in a terminal with Git, SSH, and a text editor
    • Proficiency in at least one scripting language (Python or Bash)
    • Strong written communication
    • Ability to work independently and manage time in a fully remote role

Preferred Qualifications

  • Kubernetes experience
  • Bare-metal provisioning and/or HPC cluster management experience
  • Familiarity with one or more of: Ansible, Warewulf, RKE2, Cilium, Kubeflow, Weka, iRODS, Globus, or infrastructure-as-code tooling in general
  • Web application development experience
  • Experience in an academic/research computing environment

Contract / Funding Notes

  • Grant-funded position through September 1, 2027 (employment contingent on continued funding).

About Drexel University

Drexel University’s University Research Computing Facility (URCF) is building a new shared, GPU-accelerated research computing platform for AI and other workloads. The platform combines GPU/CPU compute nodes, Kubernetes-based orchestration, and large-scale storage/metadata and data-transfer systems to support research projects.

Scraped 4/1/2026
