Data Engineer - Bioinformatics Opportunity at Our Future Health
Join Our Future Health, the UK's largest health research program, as a Data Engineer specializing in Bioinformatics. This position offers up to £60,000 per annum and sits at the heart of a pioneering initiative that is supported by the UK Government, partners with charities and industry, and works closely with the NHS and public authorities across the country.
What You’ll Be Doing
As a Data Engineer, you will be a vital part of a multidisciplinary team tasked with creating and owning innovative data pipelines for a program with global reach. Key responsibilities include:
- Building and maintaining data pipelines from various providers to our primary data storage and trusted research environments.
- Developing transformation logic as code to produce curated, accessible, and high-quality data for analysis.
- Prototyping pipelines for complex data transformations, drawing upon existing workflows in industry and academia.
- Keeping abreast of best practices across data engineering fields within industry, research, and government, and facilitating the adoption of standards.
- Providing technical input to the upstream aspects of data pipelines, from specification to data transfer.
- Engaging in ad-hoc data curation and developing bespoke ETL cleaning scripts, predominantly in Python.
- Collaborating with researchers to understand their data needs and assisting in the delivery of essential data for projects.
Skills and Requirements
To thrive in this pivotal role, you'll need a robust background in bioinformatics, particularly with tools and methodologies linked to genomic data. The ideal candidate will demonstrate:
- Experience in an Agile development environment, with a focus on code review and pairing.
- Familiarity with version control, especially Git/GitHub.
- Proficiency in designing, building, and testing pipelines across various technologies with a focus on repeatability and reusability.
- Strong capabilities in storing, searching, and filtering large-scale genomic data.
- A solid grasp of cloud environments (ideally Azure), distributed computing, and scaling workflows.
- Experience with Python and workflow management tools like Nextflow, WDL/Cromwell, Airflow, Prefect, and Dagster.
- Knowledge of common data transformation and storage formats, such as Apache Parquet, and data lake technologies such as Spark and Databricks.
- Understanding of containerization technologies such as Docker, and of data standards and principles such as GA4GH and FAIR.
- Comprehension of information governance and data security strategies pertinent to sensitive health data.
Benefits
Our Future Health offers a generous compensation and benefits package, including:
- Up to £60,000 annual basic salary.
- Robust pension package with employer contributions up to 12%.
- 30 days of annual leave in addition to bank holidays.
- Continuous opportunities for career development, with regular appraisals.
- Modern office in Holborn, Central London, with flexible and remote working options.
We are on a mission to prevent disease and improve health for future generations. Our target is to engage 5 million UK volunteers, whose contributed information will support researchers in making new discoveries about health and disease. Your role as a Data Engineer will directly contribute to building one of the most detailed health datasets available. Apply today to make a difference in global health and prevention!