Contract

Principal Data Engineer

City of London
Salary: Negotiable
Posted 2 days ago

Overview

Our client in the life sciences industry is a well-funded startup in stealth mode. They are seeking a Principal Data Engineer to lead the data and infrastructure systems powering the foundation model transforming drug development.

Responsibilities

  • Lead data and infrastructure systems powering foundation model initiatives in drug development.
  • Own data workflows end-to-end, from extraction and transformation to clean Parquet outputs for machine learning teams.
  • Collaborate closely with wet lab teams; practically understand assays and protocol development.
  • Set up cloud data infrastructure from scratch, including compute, storage, networking, and access controls.
  • Build reliable, repeatable pipelines with testing, version control, and clear documentation.
  • Maintain data quality, lineage, and monitoring; implement sound data modeling practices.

Qualifications (Requirements)

  • Principal-level data engineering experience in life sciences is essential.
  • End-to-end ownership of data workflows from extraction to machine learning-ready outputs (Parquet).
  • Hands-on familiarity with genomics data, including raw FASTQ files and Illumina sequencer outputs.
  • Experience with metabolomics data, particularly untargeted mass spectrometry.
  • Strong collaboration with wet lab teams and practical understanding of assays and protocol development.
  • Cloud data infrastructure built from scratch (compute, storage, networking, access controls).
  • Strong Python and SQL skills; proficient in data modeling, data quality, lineage, and monitoring.
  • Ability to design and maintain reliable pipelines with testing and documentation.

Preferences

  • Experience building data lakes or lakehouses and automating batch workflows (e.g., Airflow).
  • Familiarity with NGS pipelines (quality control, alignment/assembly, variant calling) and mass spectrometry data analysis.
  • Use of Infrastructure as Code (Terraform), containerization (Docker), and CI/CD for deploying data systems.
  • Prior 0-to-1 startup experience and close collaboration with ML and biology teams.

Why Join

  • Design and build cloud infrastructure and data pipelines powering distributed ML training and scalable biological data workflows, without legacy constraints.
  • Work with first-of-their-kind, multi-modal datasets to support foundation model training at AlphaFold scale; this is a builder role with deep technical ownership.
  • Join as a founding member of the engineering team with significant equity and end-to-end system ownership.
  • See your work directly enable drug discoveries that will impact millions, collaborating with world-leading scientists in microbiome research and machine learning.

Location:

London - 3 days onsite

Salary:

£80,000 - £120,000 plus ...

