Staff Data Engineer (Generative AI)

Company Description

At NBCUniversal, we create world-class content, distributing it through our extensive portfolio of film, television, and streaming services, as well as bringing it to life in our theme parks and consumer experiences. We own and operate premier entertainment and news brands including NBC, NBC News, MSNBC, CNBC, NBC Sports, Telemundo, NBC Local Stations, Bravo, USA Network, and Peacock, our premium ad-supported streaming service. We produce and distribute top-tier filmed entertainment through Universal Filmed Entertainment Group and Universal Studio Group, and feature internationally renowned theme parks and attractions through Universal Destinations & Experiences. NBCUniversal is a subsidiary of Comcast Corporation.

At NBCUniversal, you can be your authentic self. We are uniquely positioned to educate, entertain, and empower through our diverse platforms. Our commitment to Diversity, Equity, and Inclusion (DEI), along with our Corporate Social Responsibility (CSR) work, is shaped by our employees, audiences, park guests, and communities. We aim to cultivate a diverse, inclusive, and equitable culture where employees feel valued, empowered, and heard. Together, we will continue to create content that mirrors the ever-evolving face of the world.

Job Description

We are seeking a Staff Data Engineer (Generative AI) to build the next generation of data pipelines and applications by leveraging cutting-edge technologies including generative AI and large language models (LLMs). As a Staff Data Engineer, you will design data integration frameworks and pipelines, and remain hands-on throughout development to ensure robust production environments. This role involves collaborating with internal stakeholders, data engineers, visualization experts, data scientists, and other technologists to provide rapid, innovative tech solutions.

The ideal candidate should possess expertise in designing, building, and supporting APIs, machine learning services, LLMs, LangChain, and foundational data warehousing technologies. Enthusiasm for generative AI and its potential to accelerate business processes is essential. This role focuses on creating scalable, efficient pipelines for LLMs and defining our vision for LLM analytics.

Key Responsibilities

  • Design, build, and scale data pipelines across various source systems and streams, distributed environments, and downstream applications.
  • Apply a deep understanding of machine learning best practices and algorithms.
  • Apply solid principles of data modeling, warehousing, and architecture.
  • Implement design patterns optimizing performance, cost, security, scale, and user experience.
  • Work with cross-functional teams to develop efficient data acquisition and integration strategies.
  • Become a subject matter expert for data engineering technologies and designs.
  • Coach others on scalable pipeline creation using foundational principles.
  • Participate in development sprints, demos, and retrospectives.
  • Build and manage relationships with supporting engineering teams.
  • Collaborate with data scientists, business analysts, and ML infrastructure teams to bridge business and technology.
  • Develop automated tests ensuring compatibility across NBCUniversal’s systems and platforms.
  • Create documentation for developers and business users.
  • Troubleshoot application, cloud, and configuration issues as necessary.
  • Utilize tools for code and test generation to accelerate feature delivery.

Qualifications

Essential:

  • 6+ years of data engineering experience, including leadership roles.
  • Critical problem-solving skills and resourcefulness in finding solutions.
  • Proven experience working in an agile development environment.
  • Understanding of REST-based APIs and AI workload components such as vector embeddings.
  • Experience with data modeling, ETL/ELT, cloud development, and data warehousing.
  • Knowledge of AWS, Azure, GCP, and data management fundamentals.
  • Proficiency in Python and SQL, or similar languages.
  • Bachelor’s degree in Computer Science, Data Science, Statistics, Informatics, Information Systems, or a related field.

Desired:

  • Experience integrating large language models and AI-generated content technologies.
  • Familiarity with LLM integration ecosystems like LangChain.
  • Adaptability and strong problem-solving skills in a fast-paced technology environment.
  • Effective communication and collaboration skills in a large organization.
  • Experience with Snowflake, AWS,