Data Engineer Spark Scala | Devoteam Morocco Nearshore

  • Full Time

Company Description

At Devoteam, we are "Digital Transformakers". Respect, honesty, and passion drive our tribe every day.

Together, we help our clients win the Digital battle: from consulting to the implementation of innovative technologies, through to user adoption.

Cloud, Cybersecurity, Data, DevOps, Fullstack Dev, Low Code, and RPA hold no secrets for our tribe!

Our 10,000+ collaborators are certified, trained, and supported daily to take on new innovative challenges.

Leader in Cloud, Cybersecurity, and Data in EMEA, the Devoteam Group achieved a turnover of 1.036 billion euros in 2022 and aims to double it in the next 5 years.

Devoteam Morocco, a reference in IT expertise for over 30 years (350+ consultants), accelerates its growth by developing its nearshore expertise activities to meet the needs of our French, European, and Middle Eastern clients.

Are you ready to join us and take on this challenge together?

Job Description

Data Engineer Spark Scala @ Devoteam Data Driven.

In a world where data sources are constantly changing, Devoteam Data Driven helps its clients transform their data into actionable information and turn it into greater business value.

Data Driven addresses three major dimensions: Data Strategy, Data for Business, and Data Foundation, providing expert support to make its clients even more efficient and competitive every day.

Within the Nearshore teams of Devoteam Morocco, you will join the Data Foundation tribe: an enthusiastic team of Data Engineers, DataOps, Tech Lead architects, and project managers working on platforms and the Data ecosystem: designing, building, and modernizing Data platforms and solutions, and designing data pipelines with a focus on agility and DevOps applied to Data.

You will be the essential link in providing reliable, valuable data to the business, enabling it to create new products and services. You will also support the Data Science teams by providing the "datalab" environments they need to carry out their exploratory work and to develop and industrialize their models, namely:

  • Design, develop, and maintain efficient data pipelines to extract, transform, and load data from different sources into lakehouse-style storage systems (data lake, data warehouse)
  • Write Scala code, often with Apache Spark for its concise and expressive features, to perform complex transformations on large volumes of data (see the sketch after this list)
  • Use the features offered by Apache Spark, such as distributed transformations and actions, to process large-scale data quickly and efficiently
  • Identify and solve performance issues in data pipelines by optimizing Spark queries, adjusting the Spark configuration, and applying best practices
  • Collaborate with other teams to integrate data pipelines with SQL databases, NoSQL stores, Kafka streaming, bucket-style file systems …
  • If needed, design and implement real-time data processing pipelines using Spark's streaming features
  • Implement security mechanisms to protect sensitive data, using authentication, RBAC/ABAC authorization, encryption, and data anonymization
  • Document code, data pipelines, data schemas, and design decisions to ensure their understanding and maintainability
  • Set up unit and integration tests to ensure code quality and debug any potential issues in data pipelines
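
To give a concrete idea of the day-to-day work described above, here is a minimal sketch of a batch pipeline in Spark Scala: it reads raw orders from a landing zone, filters and aggregates them, and writes the result to the curated layer of the lakehouse. The paths, column names, and schema are purely illustrative assumptions, not an actual Devoteam project.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum, to_date}

object OrdersDailyRevenue {
  def main(args: Array[String]): Unit = {
    // Local session for illustration; on a cluster the master is set by spark-submit
    val spark = SparkSession.builder()
      .appName("orders-daily-revenue")
      .master("local[*]")
      .getOrCreate()

    // Extract: read raw orders from a hypothetical landing zone
    val orders = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/data/landing/orders/*.csv")

    // Transform: keep completed orders and aggregate revenue per day and country
    val dailyRevenue = orders
      .filter(col("status") === "COMPLETED")
      .withColumn("order_date", to_date(col("order_ts")))
      .groupBy(col("order_date"), col("country"))
      .agg(sum(col("amount")).as("revenue"))

    // Load: write partitioned Parquet into the curated storage layer
    dailyRevenue.write
      .mode("overwrite")
      .partitionBy("order_date")
      .parquet("/data/curated/daily_revenue")

    spark.stop()
  }
}

In a real mission, a job like this would typically be packaged and submitted with spark-submit, orchestrated by a tool such as Airflow or Databricks Jobs, and covered by unit tests on the transformation logic.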

You will give your best by mastering the technical fundamentals, knowing the data you process and manipulate inside out, and above all by showing your determination to understand the needs of the business you will work for.

Your playground: retail, energy, finance, industry, healthcare, and transportation, with plenty of use cases and new Data challenges to take on together, especially Data in the Cloud.

What we expect from you.

  • That you believe in Data
  • That you help your colleagues
  • That you are kind to your HR team
  • That you enjoy your mission
  • And that Codingame does not scare you (you will not be alone: we will help you)

And more seriously:

  • That you master the fundamentals of Data: Hadoop technologies, Spark, and data pipelines (ingestion, processing, enrichment, and exposure of data)
  • That you want to invest in the new Data paradigms: Cloud, DaaS, SaaS, DataOps, AutoML, and that you commit to this adventure with us
  • That you enjoy working in an agile mode
  • That you produce efficient data pipelines
  • That you maintain a dual Dev & Infra skill set
  • That you stay close to the business, supporting it in defining its needs and its new products & services: in workshops, by writing user stories, and by testing through POCs
  • And that coding is your passion: you work on your code, you contribute to Open Source, you take part in coding competitions, so come join us

What we will bring you.

  • A manager by your side in all circumstances
  • A Data community where you will find your place: Ideation Lab, Hackathon, Meetup...
  • A training and certification journey via "myDevoteam Academy" on current and future technologies: Databricks, Spark, Azure Data, Elastic.io, Kafka, Snowflake, GCP BigQuery, dbt, Ansible, Docker, k8s ...
  • Strengthening your Data expertise to become a Cloud Tech Lead (Azure, AWS, GCP ...), an architect of future Data platforms, a DataOps expert serving the business (Data as a Service) and Data Science (AutoML), or a Data Office Manager in charge of Data Product projects: in short, plenty of new roles ahead ...
  • The opportunity to get personally involved: become an internal trainer or community leader, take part in candidate interviews, help develop our offerings, and why not manage your own team ...

Some examples of missions.

  • The design, implementation, and support of data pipelines
  • The deployment of data solutions in an Agile and DevOps approach
  • The development of REST APIs to expose data
  • The support and expertise on Data technologies and deployed solutions: Hadoop, Spark, Kafka, Elasticsearch, Snowflake, BigQuery, Azure, AWS ...

Qualifications

What qualities do you need to join the team?

  • Engineering degree or equivalent
  • Expert in the field of Data: 3 to 5 years of post-graduate experience
  • Mastery and proven practice of Apache Spark
  • Mastery and proven practice of Scala
  • Practice of Python and PySpark
  • Knowledge and practice of orchestration tools such as Apache Oozie, Apache Airflow, Databricks Jobs
  • Certifications will be a plus, especially on Spark, Databricks, Azure, GCP
  • Mastery of ETL/ELT principles
  • Practice of ETL/ELT tools such as Talend Data Integration, Apache NiFi, or dbt is a plus
  • Practice of Kafka and Spark Streaming is also a plus
  • A dual competence in development (Java, Scala, Python) and infrastructure (Linux, Ansible, k8s)
  • Good knowledge of REST APIs and microservices
  • Mastery of CI/CD tools (Jenkins, GitLab) and experience working in agile mode
  • Excellent interpersonal skills; you enjoy working in a team
  • A strong sense of service and commitment to your work
  • Ability to communicate and listen in all circumstances, and to write without mistakes …
  • And you are fluent in English, indeed!

Additional Information

  • Position based in Morocco in our offices in Rabat and/or Casablanca, and open only for a permanent contract
  • Hybrid position with possibility of remote work
  • By joining Devoteam, you will have the opportunity to exchange with your peers, share experiences, and develop your skills by joining the Data Driven community, which gathers consultants from the Group's 18 countries

Stay connected:

  • https://www.linkedin.com/company/devoteam
  • https://twitter.com/devoteam
  • https://www.facebook.com/devoteam