Data Engineer (AI/ML)
micro1
About the role
Role Overview
Data Engineer (AI/ML) at micro1 (Full-time, Remote). You’ll support data infrastructure and experimentation in an AI research environment by building reliable pipelines and transforming raw data into structured datasets for research and model development.
Responsibilities
- Design, build, and maintain scalable data pipelines to ingest, process, and transform data from multiple sources.
- Collaborate with AI researchers and data scientists to structure and prepare datasets for experimentation and training.
- Develop and maintain data models, schemas, and storage systems optimized for large-scale datasets.
- Write efficient SQL queries and Python scripts to extract, transform, and analyze data.
- Ensure data quality, integrity, and reliability across pipeline and storage layers.
- Implement data validation, monitoring, and automation workflows to support iterative research cycles.
Requirements
- Strong proficiency in Python and SQL.
- Experience designing and maintaining ETL/ELT pipelines.
- Solid data manipulation skills with Pandas and NumPy.
- Experience with structured and semi-structured datasets.
- Familiarity with relational databases such as PostgreSQL or MySQL.
- Strong analytical thinking and the ability to work collaboratively in research-driven environments.
- Excellent written and verbal communication skills.
Nice to Have
- Exposure to AI/ML workflows or research environments.
- Experience with data visualization tools such as Matplotlib, Seaborn, or Plotly.
- Familiarity with LLM-related data workflows (training/evaluation datasets, prompt experimentation).
About micro1
micro1 provides a data engine for AI labs and enterprises building AI agents. It supports foundational model training with frontier evaluations and reinforcement learning environments, and offers tooling for contextual evaluations, including an AI recruiter agent and data-pipeline performance systems. The platform helps produce and monitor high-quality training and evaluation datasets at scale.
Scraped 4/16/2026