About the role
Help a biopharma company scale secure, high‑quality data pipelines that power manufacturing analytics and AI use cases (yield optimization, deviation prediction, batch release). You’ll own ingestion and transformation from OT/IT systems into a governed lakehouse, ensuring GxP-compliant design and documentation.
What you’ll do
- Build and optimize batch and streaming pipelines (ETL/ELT) from MES/LIMS/ERP/historian sources into a lake/lakehouse (e.g., Delta Lake or Snowflake).
- Engineer reusable frameworks for data quality, lineage, and observability, aligned to ALCOA+ data-integrity principles.
- Partner with Manufacturing, QA, and Data Science to productionize features for ML and BI.
- Implement role-based access, PII handling, and audit trails aligned to GxP/Annex 11.
- Contribute to validation packages (requirements, risk assessments, test evidence, traceability matrices).
Your background
- 5–8+ years in data engineering; pharma/biotech or other regulated industry experience.
- Strong in Spark/PySpark (e.g., on Databricks), Azure/AWS/GCP, SQL, and orchestration (ADF, Airflow, or Prefect).
- Hands-on with Delta Lake/Snowflake and pharma/med‑device data sources: MES (e.g., PAS‑X), LIMS, historians (OSIsoft PI/AVEVA), SAP S/4HANA.
- Working knowledge of GxP, GMP, CSV/CSA, Annex 11, and documentation in regulated environments.
- Familiarity with data governance (cataloging, lineage), e.g., Microsoft Purview or Collibra.
Nice to have
- Ingestion/CDC and transformation tooling (e.g., Fivetran, dbt), Kafka/Event Hubs, infrastructure as code (Terraform).
- Exposure to feature stores, MLflow, or MLOps pipelines.
What’s on offer
- Competitive day rate; remote-first within Europe
- Engagement via compliant models (e.g., B2B/SRL/BV, umbrella company, or EOR), subject to country-specific guidance
- Work that directly accelerates safe medicines to patients
Job Type: Contract
Job Location: Europe
Salary: Competitive