Data Engineer - Bioinformatics

Job expired!

Data Engineer - Bioinformatics Opportunity at Our Future Health

Join Our Future Health, the UK's largest health research program, in an exciting and impactful role as a Data Engineer specializing in Bioinformatics. This position offers up to £60,000 per annum and sits at the heart of a pioneering initiative that is supported by the UK Government, partners with charities and industry, and works closely with the NHS and public authorities across the country.

What You’ll Be Doing

As a Data Engineer, you will be a vital part of a multidisciplinary team tasked with creating and owning innovative data pipelines for a program with global reach. Key responsibilities include:

  • Building and maintaining data pipelines from various providers to our primary data storage and trusted research environments.
  • Developing transformation logic as code to produce curated, accessible, and high-quality data for analysis.
  • Prototyping pipelines for complex data transformations, drawing upon existing workflows in industry and academia.
  • Keeping abreast of best practices across data engineering fields within industry, research, and government, and facilitating the adoption of standards.
  • Providing technical input to the upstream aspects of data pipelines, from specification to data transfer.
  • Engaging in ad-hoc data curation and developing bespoke ETL cleaning scripts, predominantly in Python.
  • Collaborating with researchers to understand their data needs and assisting in the delivery of essential data for projects.

Skills and Requirements

To thrive in this pivotal role, you'll need a robust background in bioinformatics, particularly with tools and methodologies linked to genomic data. The ideal candidate will demonstrate:

  • Experience in an Agile development environment, with a focus on code review and pairing.
  • Familiarity with version control, especially Git/GitHub.
  • Proficiency in designing, building, and testing pipelines across various technologies with a focus on repeatability and reusability.
  • Strong capabilities in managing the storage, search, and filtering of large-scale genomic data.
  • A solid grasp of cloud environments (ideally Azure), distributed computing, and scaling workflows.
  • Experience with Python and workflow management tools such as Nextflow, WDL/Cromwell, Airflow, Prefect, or Dagster.
  • Knowledge of common data transformation and storage formats, such as Apache Parquet, and data lake technologies like Spark and Databricks.
  • Understanding of containerization technologies, e.g., Docker, and data standards like GA4GH and FAIR.
  • Comprehension of information governance and data security strategies pertinent to sensitive health data.

Benefits

Our Future Health offers a generous compensation and benefits package, including:

  • Up to £60,000 annual basic salary.
  • Robust pension package with employer contributions up to 12%.
  • 30 days of annual leave in addition to bank holidays.
  • Continuous opportunities for career development, with regular appraisals.
  • Modern office in Holborn, Central London, with flexible and remote working options.

We are on a mission to prevent disease and improve health for future generations. Our target is to engage 5 million UK volunteers, whose contributed information will support researchers in making new discoveries about health and disease. Your role as a Data Engineer will directly contribute to building one of the most detailed health datasets available. Apply today to make a difference in global health and prevention!