Data Engineer

  • Full Time
Rapsodo Inc. is a sports analytics firm that employs machine learning and computer vision to help athletes maximize their performance. Our proprietary technology applications range from helping PGA Tour golfers optimize their launch conditions to enabling MLB pitchers to improve the efficiency of their breaking balls. Our current partners include all 30 MLB teams, MLB, USA Baseball, Golf Digest, PGA of America, and more than 1000 NCAA athletic departments. We are innovative, focused, and growing rapidly. We are continually on the lookout for highly motivated team members who will stop at nothing to deliver cutting-edge solutions as part of Team Rapsodo.

Requirements

Responsibilities:
- Lead the design, development, and maintenance of our extensive data warehouse architecture, incorporating Google BigQuery, Kafka, GCP Pub/Sub, and other relevant technologies.
- Collaborate closely with business units to gather requirements and convert them into effective, scalable data solutions.
- Develop and enhance ETL processes to extract, transform, and load data from various sources into our data warehouse, ensuring data quality and accuracy.
- Build and manage real-time data streaming pipelines using Kafka and GCP Pub/Sub for fast data ingestion and processing.
- Collaborate with data scientists and analysts to supply them with clean, structured data for analysis and reporting.
- Design and implement data governance strategies to ensure data security, compliance, and privacy.
- Monitor and troubleshoot data pipelines, identifying and resolving performance bottlenecks and data quality issues.
- Stay up to date with emerging technologies and trends in data engineering, proposing innovative solutions to improve our data infrastructure.

Qualifications:
- Bachelor's or higher degree in Computer Science, Data Engineering, or a related field.
- Significant experience as a Data Engineer, specializing in Google BigQuery, Kafka, GCP Pub/Sub, and related technologies.
- Deep knowledge of data warehouse architecture, ETL processes, and data integration methodologies.
- Proficiency in SQL and experience optimizing complex queries for performance.
- Solid understanding of event-driven architecture and real-time data streaming using Kafka and GCP Pub/Sub.
- Familiarity with cloud-based solutions, particularly Google Cloud Platform (GCP).
- Experience designing and implementing data governance and security measures.
- Strong problem-solving abilities and the capacity to troubleshoot and resolve complex data-related issues.
- Excellent communication skills to collaborate effectively with technical and non-technical stakeholders.
- Leadership experience or the ability to guide junior team members is a plus.
- Relevant certifications in GCP, Google BigQuery, and Kafka are highly desirable.

If you believe you have what it takes and are eager to work independently as well as contribute in an innovative, passionate, and driven environment, apply now!