xelys jobs

Machine Learning Engineer | Remote

Crossing Hurdles

Full-remote | Senior | Freelance | United States | Yesterday via LinkedIn

About the role

Role Overview

Part-time position for PhD-level experts to design challenging STEM benchmark problems and use them to evaluate AI models.

Key Responsibilities

  • Design challenging, real-world STEM benchmark problems in data science, machine learning, finance, and software engineering
  • Implement tasks within an agentic development environment using Python
  • Create reproducible problem setups with clear specifications and executable tests
  • Evaluate and analyze AI model behavior, including reasoning traces and agent workflows
  • Diagnose reasoning failures, logic gaps, and problem-solving limitations in AI systems
  • Contribute to improving benchmark quality and evaluation frameworks for frontier AI models

Requirements

  • Current PhD student or recent PhD graduate
  • Deep expertise in data science, machine learning, finance, and/or Python-based software development
  • Strong research background in advanced STEM topics
  • Ability to commit reliably for 30+ hours per week
  • Demonstrated technical output such as high-quality open-source contributions or research work
  • Ability to analyze agent behavior traces and diagnose failures beyond surface-level errors

About Crossing Hurdles

Crossing Hurdles focuses on AI evaluation and benchmark development, working to improve frontier AI models through rigorous STEM problem design and analysis.

Scraped 3/31/2026
