R&D Data Engineer in AI and Computer Vision

Join Eviden as an R&D Data Engineer in AI and Computer Vision

Eviden, part of the Atos Group, is a global leader in data-driven, trusted, and sustainable digital transformation, with an annual revenue of approximately €5 billion. As a next-generation digital business, we hold leading positions worldwide in digital, cloud, data, advanced computing, and security. With deep expertise in more than 47 countries and 47,000 world-class talents, we unite unique high-end technologies to expand the possibilities of data and technology for generations to come.

About the Team

We are developing the Eviden Computer Vision Platform, a cutting-edge real-time video analytics solution applicable across various verticals. We combine AI technologies with Big Data software components to design and enhance the product's end-to-end data operations.

Position Overview: Data Engineer

We are looking for a skilled and motivated Data Engineer to join our team. The role covers end-to-end data pipeline implementation and data lake operations in support of our innovative projects.

Key Responsibilities

  • Build and maintain robust data pipelines for ingesting, transforming, and loading data from diverse sources, ensuring data quality, consistency, and reliability.
  • Implement data transformation logic to convert raw data into structured formats suitable for analysis and reporting, leveraging ETL/ELT processes.
  • Manage data platform infrastructure, optimizing storage utilization and ensuring data accessibility.
  • Implement and enforce data security measures, access controls, and compliance standards to maintain data integrity and privacy.
  • Develop efficient data search and retrieval mechanisms, considering relevance, query performance, and user experience.
  • Monitor and optimize the performance of data pipelines and storage systems to ensure efficient data processing and retrieval.
  • Maintain comprehensive documentation of data pipeline designs, processes, and configurations.
  • Automate the building, testing, and deployment of data lake components following DevOps practices.
  • Implement unit and integration tests, sharing testing knowledge across the team.
  • Securely manage AI assets such as datasets and models.
  • Integrate metadata extraction components leveraging AI models and third-party tools.
  • Collaborate effectively with cross-functional teams including data scientists, data engineers, frontend and backend developers, and product owners.

Education

Bachelor's, Master's, or PhD in Computer Science, Electrical Engineering, or a related field.

Essential Knowledge and Professional Experience

  • Proven experience (3+ years) in designing, building, and maintaining large-scale data pipelines and data lake infrastructure.
  • Strong proficiency in programming languages such as Python.
  • Hands-on experience in REST API development.
  • Experience with Elasticsearch, including data ingestion, indexing, and search capabilities.
  • Knowledge of data modeling, schema design, and ETL/ELT processes.
  • Experience with Docker and Kubernetes for software application deployment.
  • Proficiency with Git and GitHub Actions.
  • Practical experience with agile methodologies.
  • Proficiency in Linux environments, including shell scripting (e.g., Bash).
  • English proficiency at B2 level.

Additional Knowledge

  • Experience with MLOps tools such as MLflow or Kubeflow.
  • Experience with Google Cloud Platform (GCP).
  • Understanding of the trade-offs between CPU and GPU programming.
  • General knowledge of cluster computing.

Competences

  • Autonomy: Ability to seek and read documentation independently.
  • Collaboration: Provide constructive feedback and embrace best practices and guidelines.
  • Fluency in English.
  • Good writing and presentation skills.
  • Strong personal soft skills: communicative, enthusiastic, highly collaborative, proactive, and self-driven.
  • Ability and enthusiasm to learn new technologies quickly.

Benefits

  • Half-day Fridays.
  • Intensive (continuous, shortened) workday schedule during the summer.
  • Personalized training and upskilling programs.