Data Engineer

- Location: Remote
- Job type: Permanent
Job description

WHAT YOU’LL DO

As a Data Engineer, you will:
- Design, develop, and maintain ETL/ELT pipelines that transform raw, multi-source data into clean, analytics-ready tables in Google BigQuery, using tools such as dbt for modular SQL transformations, testing, and documentation.
- Safeguard data quality, consistency, and reliability by implementing robust validation checks, including schema drift detection, null/missing-value tracking, and duplicate detection, using tools such as Great Expectations or Soda (a minimal sketch of these checks follows this list).
- Integrate and automate affiliate data workflows, replacing manual processes in collaboration with the relevant stakeholders.
- Proactively monitor and manage data pipelines using tools such as Airflow, Prefect, or Dagster, with proper alerting and retry mechanisms in place (see the DAG sketch after this list).
- Build a Data Consistency Dashboard (in Looker Studio, Power BI, Tableau, or Grafana) to track schema mismatches, partner anomalies, and source freshness, with built-in alerts and escalation logic.
- Ensure timely availability and freshness of all critical datasets, resolving latency and reliability issues quickly and sustainably.
- Control access to cloud resources, implement data governance policies, and ensure secure, structured access across internal teams.
- Monitor and optimize data infrastructure costs, particularly those tied to BigQuery usage, storage, and API-based ingestion (a dry-run cost check is sketched after this list).
- Document all pipelines, dataset structures, transformation logic, and data contracts clearly to support internal alignment and knowledge sharing.
- Build and maintain postback-based ingestion pipelines to support event-level tracking and attribution across the affiliate ecosystem (see the receiver sketch after this list).
- Collaborate closely with Data Scientists and Product Analysts to deliver high-quality, structured datasets for modeling, experimentation, and KPI reporting.
- Act as a go-to resource across the organization for troubleshooting data discrepancies, supporting analytics workflows, and enabling self-service data access.
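The sketches below illustrate a few of these responsibilities in code. They are minimal, hedged examples: every table name, parameter, and threshold is an illustrative assumption, not part of the actual stack.

First, the validation checks (schema drift, null tracking, duplicate detection). This is a hand-rolled pandas sketch of the kinds of checks meant; in practice a framework such as Great Expectations or Soda would own them, and the expected schema and 5% null threshold are assumptions.

```python
import pandas as pd

# Assumed data contract for an affiliate events batch (illustrative only).
EXPECTED_SCHEMA = {
    "click_id": "object",
    "partner_id": "object",
    "revenue": "float64",
    "event_ts": "datetime64[ns]",
}

def validate_batch(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable data-quality issues for one batch."""
    issues: list[str] = []

    # Schema drift: columns missing, retyped, or unexpected vs. the contract.
    for col, dtype in EXPECTED_SCHEMA.items():
        if col not in df.columns:
            issues.append(f"missing column: {col}")
        elif str(df[col].dtype) != dtype:
            issues.append(f"type drift on {col}: {df[col].dtype} != {dtype}")
    for col in df.columns:
        if col not in EXPECTED_SCHEMA:
            issues.append(f"unexpected column: {col}")

    # Null/missing-value tracking against an assumed 5% tolerance.
    for col, rate in df.isna().mean().items():
        if rate > 0.05:
            issues.append(f"null rate {rate:.1%} on {col}")

    # Duplicate detection on the assumed primary key.
    if "click_id" in df.columns:
        dupes = int(df.duplicated(subset=["click_id"]).sum())
        if dupes:
            issues.append(f"{dupes} duplicate click_id rows")

    return issues
```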
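Next, the orchestration pattern: an Airflow DAG with retries, a retry delay, and an on-failure callback for alerting. The DAG id, schedule, and notification hook are assumptions (the `schedule` argument needs Airflow 2.4+); Prefect or Dagster would express the same idea differently.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def notify_on_failure(context):
    # Placeholder alert hook: wire to Slack, PagerDuty, or email in practice.
    print(f"Task failed: {context['task_instance'].task_id}")

default_args = {
    "retries": 3,                          # retry transient failures
    "retry_delay": timedelta(minutes=5),   # back off between attempts
    "on_failure_callback": notify_on_failure,
}

with DAG(
    dag_id="affiliate_ingestion",          # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",                    # Airflow 2.4+ argument name
    catchup=False,
    default_args=default_args,
) as dag:
    ingest = PythonOperator(
        task_id="ingest_postbacks",
        python_callable=lambda: None,      # real ingestion logic goes here
    )
```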
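For cost control, one concrete tactic is a BigQuery dry run, which reports how many bytes a query would scan before any money is spent. The query and the 1 TiB budget below are assumptions.

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses ambient GCP credentials

# Dry run: BigQuery validates the query and estimates bytes scanned
# without executing it or incurring query cost.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
query = "SELECT partner_id, SUM(revenue) FROM analytics.events GROUP BY 1"
job = client.query(query, job_config=job_config)

tib = job.total_bytes_processed / 1024**4
print(f"Would scan {job.total_bytes_processed:,} bytes ({tib:.3f} TiB)")
if tib > 1:  # assumed per-query scan budget
    raise RuntimeError("Query exceeds the 1 TiB scan budget")
```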
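Finally, postback-based ingestion: affiliate networks fire an HTTP request per conversion, and a small receiver validates and buffers each event. The endpoint, parameter names, and in-memory buffer below are assumptions; a production pipeline would deduplicate and stream into BigQuery instead.

```python
from flask import Flask, request

app = Flask(__name__)
buffer: list[dict] = []  # stand-in for a queue or BigQuery streaming insert

@app.route("/postback", methods=["GET"])
def postback():
    # Affiliate networks typically append event fields as query parameters.
    event = {
        "click_id": request.args.get("click_id"),
        "partner_id": request.args.get("partner_id"),
        "payout": request.args.get("payout"),
    }
    if not event["click_id"]:
        return "missing click_id", 400
    buffer.append(event)  # real pipeline: publish to Pub/Sub or BigQuery
    return "ok", 200

if __name__ == "__main__":
    app.run(port=8080)
```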