Machine Learning Engineer | Remote
Crossing Hurdles
Full-remote | Senior | Freelance | United States | Yesterday via LinkedIn
About the role
Role Overview
Part-time position for PhD-level experts who design challenging STEM benchmark problems and use them to evaluate AI models.
Key Responsibilities
- Design challenging, real-world STEM benchmark problems in data science, machine learning, finance, and software engineering
- Implement tasks within an agentic development environment using Python
- Create reproducible problem setups with clear specifications and executable tests
- Evaluate and analyze AI model behavior, including reasoning traces and agent workflows
- Diagnose reasoning failures, logic gaps, and problem-solving limitations in AI systems
- Contribute to improving benchmark quality and evaluation frameworks for frontier AI models
Requirements
- Current PhD candidate or recent PhD graduate
- Deep expertise in data science, machine learning, finance, and/or Python-based software development
- Strong research background in advanced STEM topics
- Ability to commit reliably for 30+ hours per week
- Demonstrated technical output such as high-quality open-source contributions or research work
- Ability to analyze agent behavior traces and diagnose failures beyond surface-level errors
About Crossing Hurdles
Crossing Hurdles focuses on AI evaluation and benchmark development, working to improve frontier AI models through rigorous STEM problem design and analysis.
Scraped 3/31/2026