Data Engineer
Jellyfish
Full remote, mid, permanent, backend, data. Posted 10 days ago via WTTJ
Tags
Data Engineering, Python, SQL, Terraform, Airflow, Prefect, Dagster, Medallion Data Model, Snowflake, Data Quality
About the role
Role overview
As a Data Engineer at Jellyfish (full remote), you’ll translate architecture into clean, production-grade data pipelines. You’ll build and tune orchestration workflows, deploy infrastructure, collaborate with product developers, and join an incident response rotation.
Key missions
Core Pipeline Engineering
- Write modular, production-grade Python and optimized SQL for daily data transformations
- Implement Medallion-layer data models
Modern Orchestration & Tuning
- Manage and tune workflow orchestration engines
- Optimize execution paths and keep distributed processing jobs efficient
Infrastructure as Code (IaC)
- Own data platform infrastructure deployment using Terraform
- Manage warehouse schemas, permissions, and tables
What you’ll bring
- Strong hands-on production experience with Python, advanced SQL, and data transformation concepts
- Ability to build and schedule workflows using programmatic orchestrators such as Prefect, Dagster, or Airflow
- Experience with enterprise warehouse/catalog platforms such as Snowflake, Databricks, or BigQuery
- Pragmatic judgment: knowing when to optimize distributed jobs and when indexed tables or cached views are enough
- Automation mindset for turning manual fixes/backfills into permanent solutions
- Collaborative engineering habits: readable code, thorough documentation, and clear data lineage
- Experience working in rapidly scaling, multi-tenant B2B SaaS environments
Nice to have
- Data quality testing frameworks such as Great Expectations or Soda
- Experience with cloud cost allocation tracking or token-level spend tracking for LLM/AI integrations
About Jellyfish
Jellyfish provides data and engineering infrastructure for elite engineering organizations. It builds and operates reliable data pipelines, orchestration, and data platform capabilities that help teams deliver and measure outcomes effectively.
Scraped 6/11/2026