Principal Data Scientist (Agent Builder)
Elastic
Full remote | Lead | Permanent | Data | Today via WTTJ
Tags
Principal Data Scientist, RAG, Information Retrieval, Semantic Search, LLM-as-Judge, Ranking, Evaluation Metrics, nDCG, MRR, Elasticsearch
About the role
Role overview
Join Elastic as a Principal Data Scientist (Agent Builder) in the Search Conversational Experiences team. You’ll set the technical direction for evaluating, improving, and scaling chat quality across Elastic’s agentic platform.
Key missions
- Define the evaluation strategy that drives product decisions, including:
  - which models to standardize
  - how to route requests between agents
  - which enabling tools to adopt
- Lead the design of quality metrics and decision frameworks for:
  - RAG, agents, tools
  - model selection
  - agent routing
  - prompt behavior
- Partner with engineering to bring evaluation to production, including:
  - evaluation pipelines and telemetry
  - dashboards and CI guardrails
  - detection and monitoring of quality regressions
Responsibilities
- Turn experimental results into product and business decisions.
- Mentor other data scientists and engineers.
- Share outcomes via clear documentation and cross-functional reviews.
- Collaborate with engineering to move from prototype to production.
Requirements
- 8+ years of applied DS/ML experience.
- Deep expertise in IR/NLP, ranking, semantic search, and RAG/LLM-powered product experiences.
- Strong understanding of retrieval systems, including:
  - dense + sparse retrieval
  - re-ranking
  - vector search
  - query understanding
  - evaluation metrics such as nDCG, MRR, Recall@k, and precision, as well as latency/cost trade-offs
- Proven ability to define and lead evaluation for production AI/ML systems, including:
  - offline metrics and online experimentation
  - LLM-as-judge approaches
  - groundedness and citation quality
  - model comparison
- Hands-on skills with Python and modern ML tooling (e.g., PyTorch/Transformers, Pandas, notebooks, reproducible experiments, versioned datasets, reviewable code).
- Collaborative, low-ego mindset; strong mentoring and ability to raise standards in distributed teams.
- Excellent written and verbal communication across engineering, product, design, and leadership.
- Experience building evaluation-to-production systems (telemetry, dashboards, CI guardrails, regression tracking).
- Practical experience with Elasticsearch or similar search/distributed data systems.
Nice to have
- ES|QL familiarity.
About Elastic
Elastic is a company focused on search, analytics, and observability software built to help organizations find insights from data. In this role, you’ll work within Elastic’s search and conversational/agentic platform efforts, improving how chat and agent experiences are evaluated and scaled.
Scraped 6/20/2026