Principal Data Scientist (Agent Builder)
Elastic
About the role
Role Overview
Join Elastic as a Principal Data Scientist (Agent Builder) in the Search Conversational Experiences team. You will set the technical direction for evaluating, improving, and scaling chat quality across Elastic’s agentic platform—turning experimental results into product and business decisions.
Key Missions
- Evaluation strategy: Define how product decisions will be guided by evaluation, including:
  - which models to standardize on,
  - how to route requests between agents,
  - which tools to enable.
- Quality metrics & decision frameworks: Lead the design of quality metrics and decision frameworks for:
  - RAG (retrieval-augmented generation),
  - agents and tools,
  - model selection,
  - agent routing,
  - prompt behavior.
- Retrieval improvements: Build, compare, and guide improvements in retrieval and re-ranking approaches, including:
  - sparse vs dense retrieval,
  - vector search,
  - query understanding,
  - context enrichment.
Responsibilities
- Lead evaluation and experimentation for production AI/ML systems.
- Mentor other data scientists and engineers.
- Communicate outcomes via clear documentation and cross-functional reviews.
Requirements
- Proven experience defining and leading evaluation for production AI/ML systems, including:
  - offline metrics,
  - online experimentation,
  - LLM-as-judge methods,
  - groundedness and citation quality,
  - model comparison.
- Practical experience with Elasticsearch (or similar search/distributed data systems). ES|QL familiarity is a plus.
- Strong communication skills: explain complex technical trade-offs to engineering, product, design, and leadership.
- 8+ years applied DS/ML experience with deep expertise in:
  - IR, NLP, ranking,
  - semantic search,
  - RAG,
  - LLM-powered product experiences.
- Strong understanding of retrieval systems and evaluation metrics, including:
  - dense/sparse retrieval,
  - re-ranking,
  - vector search,
  - query understanding,
  - metrics such as nDCG, MRR, Recall@k, precision, and latency/cost trade-offs.
- Collaborative, low-ego mindset; ability to mentor, raise standards, and increase transparency in a distributed team.
- Hands-on engineering ability with:
  - Python,
  - PyTorch/Transformers,
  - Pandas, notebooks,
  - reproducible experiments,
  - versioned datasets,
  - clean, reviewable code.
- Experience partnering with engineering teams to move from prototype to production, including telemetry design, dashboards, CI guardrails, and quality regression tracking.
- Ability to influence product/technical strategy using data, especially in ambiguous/emerging domains.
About Elastic
Elastic builds enterprise search and observability products, leveraging AI to help users find and understand information at scale. The role sits within Elastic’s Search Conversational Experiences and agentic platform, focused on improving conversational/chat quality and evaluation for production AI systems.
Scraped 6/20/2026