Big Data Engineer
Kaizen Analytix LLC, an analytics products and services company delivering unmatched speed to value through analytics solutions and actionable business insights, is seeking qualified candidates for the role of Big Data Engineer. We are looking for highly skilled and experienced professionals to design, develop, and maintain data pipelines and data warehouses using the Hadoop ecosystem, including HDFS, Spark, Hive, HBase, Sqoop, Pig, and Oozie, or equivalent cloud offerings such as AWS EMR, GCP Dataproc, and Azure HDInsight. The ideal candidate will have a robust understanding of data engineering principles and best practices, along with experience working with massive datasets.
Responsibilities:
Analysis and Design
- Conducts fact-gathering sessions with users.
- Consults with Technical Managers and Business Owners to identify and analyze technological needs and problems.
- Performs data flow diagramming and/or process modeling (code architecture).
- Designs, develops, and maintains data pipelines and data warehouses on the desired cloud platforms (e.g., AWS, GCP, Azure).
- Works with stakeholders to gather requirements and define data models.
- Develops and deploys data pipelines on Cloud Platforms using big data tools and services.
- Implements data quality checks and monitoring (see the pipeline sketch after this list).
- Troubleshoots data issues and performance problems.
- Works with other engineers to develop and maintain the company's data infrastructure.
- Stays up to date on the latest data engineering technologies and trends.
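To give candidates a concrete sense of the day-to-day work described above, here is a minimal sketch of a Spark ETL job in Scala with a simple data quality gate. It is an illustration only: the paths, the orders dataset, and the order_id/amount columns are hypothetical placeholders, not references to any actual Kaizen system.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Minimal ETL sketch: read raw CSV, validate, and write curated Parquet.
// All paths, dataset names, and columns are hypothetical placeholders.
object OrdersPipeline {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("OrdersPipeline")
      .getOrCreate()

    // Extract: load raw files from a landing zone.
    val raw = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("s3a://example-bucket/landing/orders/*.csv")

    // Transform: drop malformed rows and normalize the amount column.
    val cleaned = raw
      .filter(col("order_id").isNotNull && col("amount") > 0)
      .withColumn("amount", col("amount").cast("decimal(12,2)"))

    // Data quality check: fail fast if too many rows were rejected.
    val rejectedRatio = 1.0 - cleaned.count().toDouble / raw.count().max(1L)
    require(rejectedRatio < 0.05, f"Rejected ratio $rejectedRatio%.2f exceeds threshold")

    // Load: write curated data to the warehouse zone.
    cleaned.write.mode("overwrite").parquet("s3a://example-bucket/curated/orders/")

    spark.stop()
  }
}
```

The in-job `require` threshold is one simple approach to the quality-check responsibility; in practice a dedicated framework or external monitoring could fill the same role.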
Strategy Alignment
- Works with other technical team members to continually improve implementation strategies, development standards, and other departmental processes and documentation.
- Provides technical assistance and mentoring to lower-level Data Engineers.
- Communicates plans, status, and issues to management regularly.
- Adheres to department standards, policies, procedures, and industry best practices.
Job Requirements:
- Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.
- 4+ years of experience in data engineering and big data tools.
- Experience with migration projects involving data warehousing or moving databases from one technology to another.
- Strong Scala/Java programming skills for developing ETL scripts.
- Robust understanding of data engineering principles and best practices.
- Solid implementation knowledge of Spark using Scala/Java.
- Proficient in MapReduce, big data file formats, partitioning, replica maintenance, and compression techniques (illustrated in the sketch after this list).
- Experience with any major cloud platform and its Hadoop-equivalent offerings, such as Google Cloud Platform (Dataproc, Cloud Dataflow, Cloud Data Fusion) or AWS Elastic MapReduce.
- Experience with data modeling and data warehousing.
- Experience with data quality checks and monitoring.
- Must be familiar with CI/CD and proficient with tools such as Jenkins, Cloud Build, and TeamCity for creating the required pipelines.
- Self-motivated with the ability to propose solutions and workarounds and meet strict deadlines.
- Ability to troubleshoot key customer implementation issues and drive them to successful resolution.
- Capability to partner with domain architects to develop the end-to-end solution architecture, including application, infrastructure, data, integration, and security domains.
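As a small illustration of the file-format, partitioning, and compression knowledge listed above, the following sketch writes a dataset as Snappy-compressed Parquet partitioned by date. The events dataset, the event_date column, and all paths are invented for the example.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Sketch of partitioned, compressed columnar output, assuming a hypothetical
// events dataset with an event_date column.
object PartitionedWrite {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("PartitionedWrite").getOrCreate()

    val events = spark.read.parquet("s3a://example-bucket/curated/events/")

    // Partition on a low-cardinality column so downstream queries can prune
    // directories, and use Snappy compression to balance size and CPU cost.
    events.write
      .mode(SaveMode.Overwrite)
      .option("compression", "snappy")
      .partitionBy("event_date")
      .parquet("s3a://example-bucket/warehouse/events/")

    spark.stop()
  }
}
```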
Good-To-Have:
- Professional Data Engineer Certification is preferred.
- Knowledge of Python fundamentals and HiveQL/SQL is a plus (a brief HiveQL-style example follows this list).
- Experience with social media data analytics involving high volume and high-frequency data.
- Experience in application development projects focused on data engineering, using programming languages such as Python, SQL, or Java, is desirable.
- Prior experience with big data tools and concepts such as Hadoop, MapReduce, Spark, Hive, HBase, and Apache Airflow (orchestration) is advantageous.
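For candidates weighing the HiveQL/SQL point above, here is a short example of the kind of HiveQL-style query this role might run through Spark SQL. The web_events table and its user_id/event_date columns are assumptions made purely for illustration.

```scala
import org.apache.spark.sql.SparkSession

// Sketch of a HiveQL-style aggregation run through Spark SQL, assuming a
// hypothetical Hive table named web_events registered in the metastore.
object DailyActiveUsers {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("DailyActiveUsers")
      .enableHiveSupport() // resolve tables via the Hive metastore
      .getOrCreate()

    // Standard HiveQL: daily distinct users over the last 7 days.
    spark.sql(
      """SELECT event_date, COUNT(DISTINCT user_id) AS dau
        |FROM web_events
        |WHERE event_date >= date_sub(current_date(), 7)
        |GROUP BY event_date
        |ORDER BY event_date""".stripMargin
    ).show()

    spark.stop()
  }
}
```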