xelys jobs

AI Data Engineer - Remote

ProfitSolv

full-remote · senior · permanent · fullstack · United States · 2 days ago via LinkedIn


Tags

AWS · Apache Iceberg · dbt · dbt Semantic Layer · Airflow (MWAA) · Change Data Capture (CDC) · RAG (Retrieval-Augmented Generation) · OpenSearch · Terraform

About the role

Role Overview

ProfitSolv is hiring an AI Data Engineer (Remote) to build a greenfield centralized data platform on AWS. You’ll combine data engineering and AI engineering—e.g., writing dbt models in the morning and designing a RAG pipeline in the afternoon.

Responsibilities

  • Build and maintain a Medallion Lakehouse (Bronze/Silver/Gold) on S3 using:
    • Apache Iceberg, AWS Glue Data Catalog, dbt Cloud (Athena adapter)
  • Configure and manage AWS DMS for ongoing CDC from ~1,000 SQL Server instances
  • Ingest data using Amazon ECS Fargate tasks for SaaS API ingestion
  • Orchestrate pipelines with Amazon MWAA (Airflow)
  • Develop dbt Cloud transformations from Bronze → Silver → Gold
  • Define business metrics in the dbt Semantic Layer for BI tools and AI agents
  • Manage Redshift Serverless + Spectrum as the read engine
  • Tune Iceberg table layouts, partitioning, and compaction for performance
  • Implement Lake Formation tag-based governance for multi-product data isolation
  • Onboard acquisitions to the platform in weeks, not months
  • Build batch embedding pipelines for legal documents and client records
  • Manage vector storage using OpenSearch Serverless or pgvector on Aurora
  • Design and ship RAG pipelines for legal domain use cases (chunking, retrieval ranking, context management)
  • Build MCP servers exposing the dbt Semantic Layer and platform APIs to AI agents (Claude, internal copilots, customer-facing features)
  • Ensure compliance/security/governance with IAM roles, encryption policies, and metadata cataloging
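To make the RAG responsibilities above concrete, here is a minimal sketch of the chunk → embed → retrieve loop. The hash-based embedding is a toy stand-in for a real embedding model, and the in-memory list stands in for OpenSearch or pgvector; function names are illustrative, not part of the role description.

```python
"""Toy sketch of a RAG retrieval step: chunk a document, embed each
chunk, and rank chunks by cosine similarity against a query. The
hashing embedding below is a placeholder for a real model call."""
import hashlib
import math

def chunk(text: str, size: int = 40) -> list[str]:
    # Fixed-width character chunking; a production pipeline would
    # split on sentence/token boundaries with overlap.
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str, dims: int = 64) -> list[float]:
    # Toy bag-of-words hashing embedding, L2-normalized.
    vec = [0.0] * dims
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank chunks by dot product (cosine, since vectors are unit-norm).
    qv = embed(query)
    scored = sorted(
        chunks,
        key=lambda c: -sum(a * b for a, b in zip(qv, embed(c))),
    )
    return scored[:k]
```

In the stack described here, `embed` would call a hosted model and `retrieve` would query a vector index; the chunking and ranking shape stays the same.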

Requirements

  • 5+ years of hands-on data engineering experience, focused on AWS (S3, Glue, Athena, Redshift or equivalent)
  • Production-grade dbt experience (Core or Cloud), including testing, macros, and documentation best practices
  • Experience implementing CDC patterns with AWS DMS, Debezium, or similar tools
  • Ability to design and operate production Airflow DAGs (MWAA or self-hosted)
  • Hands-on experience building at least one production-ready RAG pipeline (chunking, embeddings, vector storage, retrieval)
  • Strong SQL (primary) and Python (data pipelines + AI workflows)
  • Working knowledge of TypeScript for MCP server development
  • Infrastructure-as-code experience (e.g., Terraform)
  • Comfortable making architectural decisions independently in a high-autonomy environment
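The CDC requirement boils down to folding an ordered stream of change events into a target table. A minimal in-memory sketch of that merge logic, with the event shape loosely modeled on what AWS DMS emits (the field names here are illustrative assumptions):

```python
"""Sketch of applying CDC change events to a target table keyed by
primary key. A real pipeline would land events in the Bronze layer
and MERGE into Iceberg tables; the dict stands in for that target."""

def apply_cdc(table: dict[int, dict], events: list[dict]) -> dict[int, dict]:
    """Fold insert/update/delete events, in log order, into the table."""
    for ev in events:
        pk = ev["pk"]
        if ev["op"] in ("insert", "update"):
            table[pk] = ev["row"]   # upsert the latest row image
        elif ev["op"] == "delete":
            table.pop(pk, None)     # tolerate deletes for unseen keys
    return table
```

The key property to preserve in production is the same as here: events must be applied in commit order per key, or a stale update can overwrite a newer one.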

Nice to Haves

  • Experience building MCP servers or similar AI tool-use integrations
  • dbt Cloud Semantic Layer / MetricFlow experience
  • Experience with Apache Iceberg
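For a sense of what "MCP servers or similar AI tool-use integrations" means in practice: an MCP server exposes named tools that agents invoke with JSON arguments. A minimal Python sketch of that dispatch shape follows; the metric names and expressions are hypothetical, and a real server would proxy the dbt Semantic Layer API rather than hold definitions inline.

```python
"""Sketch of the tool-dispatch shape an MCP-style server exposes:
a registry of named tools, each with a description and handler.
Metric registry contents below are made up for illustration."""

METRICS = {  # hypothetical governed-metric registry
    "monthly_recurring_revenue": "sum(subscription_amount)",
    "active_clients": "count(distinct client_id)",
}

TOOLS = {
    "get_metric_definition": {
        "description": "Return the expression behind a governed metric.",
        "handler": lambda args: METRICS.get(args["metric"], "unknown metric"),
    },
}

def call_tool(name: str, args: dict) -> str:
    # Agents (e.g. Claude) invoke tools by name with JSON arguments.
    tool = TOOLS.get(name)
    if tool is None:
        raise KeyError(f"no such tool: {name}")
    return tool["handler"](args)
```

Serving metric definitions through a tool like this, instead of letting agents write raw SQL, is what keeps AI answers consistent with BI dashboards.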

About ProfitSolv

ProfitSolv is a SaaS business services provider focused on the legal and accounting industry. The company builds platforms that unify data across portfolios and enable AI-driven experiences for customers.

Scraped 4/9/2026
