- 3+ years of experience in data engineering
- Degree in Computer Science, Data Engineering, Computer Engineering, Information Systems, or equivalent technical background
- Solid understanding of the ML training lifecycle and what properties make a dataset suitable for model training
- Familiarity with layered data architecture patterns such as Medallion Architecture (Bronze/Silver/Gold) or Data Mesh
- Proficiency in Python, with a focus on data manipulation, pipeline development, and automation
- Workflow orchestration using code-based tools such as Temporal, Airflow, Prefect, Dagster, or equivalent
- Distributed data processing with Spark, Databricks, or similar
- REST and gRPC API integration
- Strong SQL skills, both for data modeling and query optimization
- Experience with streaming systems and event-driven pipelines (Kafka, Kinesis, or equivalent)