Senior Software Engineer
recruitAbility
Hybrid · Senior · Permanent · Backend · Product Management · United States · 47 days ago via LinkedIn
Tags
LLMOps, AI Observability, Python, C#/.NET, LLM Evaluation, RAG, Prompt Regression Testing, CI/CD, Vector Search, Langfuse
About the role
Role Overview
Senior Software Engineer — AI Observability (Senior AI Engineer, Observability)
You’ll join a product delivery team to ensure AI-powered features (RAG pipelines, semantic search, and agentic workflows) are instrumented, evaluated, and monitored with production-grade rigor. This is a hybrid engineering-plus-platform role focused on turning trace data into actionable quality signals and scaling observability practices across product lines.
What You’ll Do
Instrumentation & Integration
- Partner with product teams to instrument LLM, RAG, and agent workflows into observability platforms (e.g., Langfuse, Arize)
- Define and enforce standards for tracing, metadata, and token tracking
- Build shared SDKs/libraries to make correct instrumentation easier
- Integrate observability into CI/CD to surface quality signals before production
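To give a flavor of the instrumentation standards described above, here is a minimal, stdlib-only sketch of the kind of shared tracing helper such an internal SDK might provide. All names (`trace_llm_call`, the trace schema, the model/feature values) are illustrative assumptions, not the employer's or Langfuse's actual API; in production the trace would be shipped to an observability backend rather than appended to a list.

```python
import functools
import time
import uuid

TRACES = []  # illustrative sink; a real SDK would export to Langfuse/Arize

def trace_llm_call(model: str, feature: str):
    """Hypothetical decorator enforcing a standard trace schema:
    trace id, model, feature name, latency, and token count."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            # convention (assumed): wrapped call returns {"text": ..., "tokens": int}
            result = fn(*args, **kwargs)
            TRACES.append({
                "trace_id": str(uuid.uuid4()),
                "model": model,
                "feature": feature,
                "latency_s": time.perf_counter() - start,
                "tokens": result.get("tokens", 0),
            })
            return result
        return wrapper
    return decorator

@trace_llm_call(model="gpt-4o-mini", feature="semantic-search")
def answer(query: str) -> dict:
    # stand-in for a real LLM call
    return {"text": f"answer to {query!r}", "tokens": 42}
```

The point of a shared decorator like this is that product teams get consistent metadata and token tracking without hand-rolling it per feature.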
Evaluation & Dataset Development
- Define AI quality metrics with product/engineering teams
- Build and maintain versioned “golden” datasets for real-world and edge cases
- Implement evaluation pipelines including LLM-as-judge, heuristics, and human feedback
- Establish prompt regression testing and support A/B experimentation
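The evaluation duties above can be sketched as a tiny prompt-regression harness: a versioned golden dataset, a heuristic scorer, and a pass/fail gate suitable for CI. The dataset entries, `heuristic_score`, and `run_regression` are hypothetical stand-ins for whatever evaluation pipeline the team actually runs (which could also use LLM-as-judge or human feedback).

```python
# Hypothetical "golden" dataset: input query plus required substrings.
GOLDEN = [
    {"query": "reset password", "must_contain": ["reset", "link"]},
    {"query": "refund policy", "must_contain": ["refund"]},
]

def heuristic_score(output: str, must_contain: list[str]) -> float:
    """Fraction of required substrings present in the model output."""
    hits = sum(1 for s in must_contain if s in output.lower())
    return hits / len(must_contain)

def run_regression(generate, threshold: float = 1.0) -> list[dict]:
    """Run every golden case; return the cases scoring below threshold."""
    failures = []
    for case in GOLDEN:
        score = heuristic_score(generate(case["query"]), case["must_contain"])
        if score < threshold:
            failures.append({"query": case["query"], "score": score})
    return failures

def fake_generate(query: str) -> str:
    # stand-in model returning canned answers for the sketch
    return {"reset password": "We sent a reset link.",
            "refund policy": "See our refund terms."}[query]
```

Wired into CI, a nonempty failure list would block the prompt change before it reaches production, which is the "surface quality signals before production" goal named above.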
Monitoring, Cost & Incident Response
- Own dashboards and alerts for latency, cost, quality, and failure signals
- Implement cost controls (e.g., budgeting, caching, rate limiting, usage visibility)
- Monitor guardrails and content safety as a distinct signal
- Proactively surface issues and track model/provider changes
- Maintain runbooks for common LLM failure modes and incidents
- Deliver regular AI quality reports with business insights
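As one concrete example of the cost controls listed above (budgeting and usage visibility), here is a minimal per-feature token budget guard. The class and cap value are assumptions for illustration; real budgeting would also need persistence, time windows, and alerting.

```python
class TokenBudget:
    """Illustrative per-feature token budget: refuses calls once
    cumulative spend would exceed the configured cap."""

    def __init__(self, cap_tokens: int):
        self.cap = cap_tokens
        self.used = 0

    def charge(self, tokens: int) -> bool:
        """Return True and record usage if within budget, else False
        so the caller can degrade gracefully (cache hit, smaller model)."""
        if self.used + tokens > self.cap:
            return False
        self.used += tokens
        return True

# hypothetical budget for a single feature
budget = TokenBudget(cap_tokens=100)
```

A guard like this gives dashboards a hard usage signal to chart and gives incident runbooks a defined degradation path when a provider's pricing or behavior changes.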
Platform & Enablement
- Administer and evolve the LLMOps platform (access, environments, integrations)
- Evaluate tools to improve efficiency and quality
- Ensure compliance (e.g., SOC 2, ISO 27001, PII handling, data residency)
- Scale observability practices via reusable patterns
- Share knowledge via documentation, workshops, and guidance
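The PII-handling requirement above implies scrubbing sensitive data from traces before storage. As a minimal sketch under stated assumptions, the snippet below redacts email addresses from trace metadata with a simple regex; `scrub_pii` and the pattern are illustrative only, and real compliance work would cover many more identifier types and data-residency rules.

```python
import re

# simple illustrative pattern; not a complete email grammar
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def scrub_pii(trace_metadata: dict) -> dict:
    """Redact email addresses from string fields of trace metadata
    before it is written to the observability backend."""
    return {k: EMAIL.sub("[REDACTED]", v) if isinstance(v, str) else v
            for k, v in trace_metadata.items()}
```

Running scrubbing at instrumentation time, rather than at query time, keeps raw PII out of the observability platform entirely.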
Requirements
- 5+ years of production software engineering experience
- Hands-on experience with LLM-powered features or AI pipelines (instrumentation, evaluation, or monitoring)
- Experience with LLMOps/observability tools (e.g., Langfuse, Arize, W&B, LangSmith)
- Solid understanding of RAG, prompt engineering, vector search, and orchestration frameworks (e.g., LangChain, Semantic Kernel)
- Experience designing LLM evaluations (datasets, scoring, LLM-as-judge, regression testing)
- Proficiency in Python and/or C#/.NET; able to work across production codebases
- Strong observability fundamentals (tracing, logging, metrics, alerting)
Nice-to-haves / Implied Fit
- Strong AI quality mindset: not just shipping features, but ensuring consistent performance, graceful degradation, cost control, and continuous improvement
- Experience scaling standards across multiple product lines
Scraped 4/2/2026