xelys jobs

Research Engineer

Anthropic

hybrid, mid, permanent, backend, data | Full remote | Today via WTTJ

Tags

Reinforcement Learning, Reward Design, Large Language Models, Data Curation, Fine-tuning, Vendor Management, Distributed Systems, Cloud Infrastructure, Data Pipelines, QA Frameworks

About the role

Role overview

Join Anthropic’s Environment Scaling team to improve the intelligence of public models for novel verticals and use cases. As a Research Engineer, you will build and iterate on reinforcement learning (RL) environments, measure their impact on model performance, and collaborate closely with domain experts.

Key missions

  • Own the end-to-end creation of RL environments for new capabilities, including:
    • Identifying high-value tasks
    • Designing reward signals
  • Manage technical relationships with external data vendors, including:
    • Evaluating data quality
    • Informing reward design
  • Collaborate with domain experts to design:
    • Data pipelines and evaluations
    • Novel approaches for creating RL environments for high-value tasks
  • Explore novel RL-environment creation methods and develop QA frameworks for evaluation quality.

Requirements

  • Bachelor’s degree in a related field or equivalent experience
  • Comfort managing technical vendor relationships and iterating quickly on feedback
  • Strong project management and interpersonal skills
  • Motivated by a mix of ML research, data operations, and project management
  • Domain expertise in an area where models should become more useful
  • Experience with fine-tuning large language models for specific domains or real-world use cases
  • Familiarity with distributed systems and cloud infrastructure
  • Experience with reinforcement learning, reward design, and/or training data curation for LLMs
  • Ability to read and analyze datasets to understand them and spot issues
  • Experience working with external vendors/technical partners
  • Value-driven mindset focused on making AI more useful and accessible
  • Experience training production ML systems

Remote / location policy

  • Listed as full remote, but the posting also states a hybrid expectation: staff should be in an office at least 25% of the time (some roles may require more).

About Anthropic

Anthropic is an AI company focused on building public models. The company works on advancing model capabilities and making AI more useful across industries, including through research and evaluation of model performance in real-world settings.

Scraped 5/12/2026
