Senior Data Engineer (Spark and Java)

  • Full Time
Job expired!
Responsibilities

  • Design, develop, and maintain data pipelines using Apache Spark to efficiently process and transform large volumes of data.
  • Collaborate with data architects and other stakeholders to define data architecture and best practices.
  • Ensure data models and structures align with business requirements and are scalable for future needs.
  • Work on real-time data processing and streaming using Spark Streaming.
  • Optimize Spark jobs and Java code for performance, scalability, and resource utilization.
  • Monitor and troubleshoot data pipeline issues to ensure minimal downtime and maximum efficiency.
  • Implement data quality checks, data validation, and error handling mechanisms to maintain data integrity.
  • Ensure compliance with data governance and security policies.
  • Document data engineering processes, data flows, and configurations for future reference.
  • Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver solutions that meet their needs.
  • Set up monitoring and alerting systems to proactively identify and address data pipeline issues.
  • Perform routine maintenance tasks and keep software and systems up to date.

Requirements

  • A Bachelor's or higher degree in Computer Science, Information Technology, or a related field.
  • Knowledge of Java for software development.
  • Extensive experience with Apache Spark, including Spark SQL and Spark Streaming.
  • Proficiency in big data technologies and frameworks such as Hadoop, HDFS, and related tools.
  • Knowledge of data warehousing concepts and technologies.
  • Experience with database systems (SQL and NoSQL).
  • Strong problem-solving skills and the ability to work in a collaborative, team-oriented environment.
  • Excellent communication and documentation skills.
  • Understanding of data security, privacy, and compliance best practices.