Join Our Team: Senior Data Engineer at Causaly
About Us
Founded in 2018, Causaly is revolutionizing how humans acquire knowledge and develop insights in Biomedicine. Our cutting-edge generative AI platform enhances research insights and knowledge automation, enabling thousands of scientists to discover vital evidence from millions of academic publications, clinical trials, regulatory documents, patents, and other data sources in just minutes.
We are proud to partner with some of the world's largest biopharma companies and institutions, focusing on use cases such as Drug Discovery, Safety, and Competitive Intelligence. You can read more about how we accelerate knowledge acquisition and improve decision-making on our blog.
Backed by top venture capital firms like ICONIQ, Index Ventures, Pentech, and Marathon, Causaly is on a mission to make a significant impact in the biomedicine industry.
About the Role: Senior Data Engineer
We are looking for an experienced Senior Data Engineer to join and help grow our established Data & Semantic Technologies team. This team is crucial in designing and building the scalable and flexible data backend we need at Causaly to bring our vision to life.
The role involves working on incremental data pipelines for both batch and targeted updates, maintaining massive knowledge graphs and ontologies, and feeding our continuously growing data warehouse. You will collaborate closely with the Applied AI and Application teams to create real business value through data.
Your Responsibilities:
- Gather and comprehend data based on business requirements.
- Import large datasets (millions of records) from formats such as CSV, XML, SQL, and JSON into BigQuery (a minimal example follows this list).
- Process and combine data on BigQuery with external data sources.
- Implement and maintain high-performance data pipelines adhering to industry best practices for scalability, fault tolerance, and reliability.
- Develop tools for monitoring, auditing, exporting, and extracting insights from data pipelines.
- Engage with technical, product, and business stakeholders to deliver backend data solutions.
- Manage data processes related to delivery, curation, and machine learning operations.
- Build a strong data-engineering function, mentor other engineers, shape our technology strategy, and innovate our data infrastructure.
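To give a flavour of the kind of work described above, here is a minimal sketch of loading a CSV file from Cloud Storage into BigQuery with the official Python client. The project, bucket, dataset, and table names are purely illustrative assumptions, and a production pipeline would add schema management, partitioning, monitoring, and error handling.

```python
# Minimal sketch: load a CSV from Cloud Storage into BigQuery.
# Illustrative only -- the bucket, dataset, and table names are placeholders.
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical source file and destination table.
source_uri = "gs://example-bucket/publications/2024-01.csv"
table_id = "example-project.staging.publications"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,   # skip the CSV header row
    autodetect=True,       # infer the schema from the file
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

# Start the load job and block until it finishes.
load_job = client.load_table_from_uri(source_uri, table_id, job_config=job_config)
load_job.result()

table = client.get_table(table_id)
print(f"Loaded {table.num_rows} rows into {table_id}")
```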
Requirements for Success:
Minimum Requirements:
- Master’s degree in Computer Science, Mathematics, or a related technical field.
- 5+ years of experience in backend data processing and data pipelines.
- Proficiency in Python and related libraries (e.g., pandas, Airflow).
- Strong SQL and database skills.
- Solid understanding of modern software development practices (testing, version control, documentation, etc.).
- Product and user-centric mindset.
- Excellent problem-solving and organizational skills, a strong sense of ownership, and high attention to detail.
Preferred Qualifications:
- Experience with NoSQL and big data technologies (e.g., Spark, Hadoop).
- Experience with full-text search databases (e.g., ElasticSearch).
- Experience with knowledge graphs and graph databases (e.g., Neo4J).
- Experience with MLOps / DataOps in production.
- Familiarity with Terraform, Kubernetes, and/or Docker Containers.
Our Benefits:
- Competitive compensation package.
- Private medical insurance.
- Life insurance (4x salary).
- Individual training/development budget through Learnerbly.
- Individual wellbeing budget through Juno.
- 25 days of holiday, plus public holidays and 1 day of birthday leave per year.
- Hybrid working (home + office).
- Potential for real impact and accelerated career growth as an early member of a diverse team.
Be Yourself at Causaly
At Causaly, diversity, equity, and inclusion are more than just words. They guide how we work together, build teams, grow leaders, and celebrate differences. We believe that nurturing these values helps us innovate.