Software Engineer, Analytics Data Infrastructure

Join Our Innovative Analytics Data Infrastructure Team at OpenAI!

About the Team: At OpenAI, the Research Platform Analytics team designs, builds, and operates the foundational data and analytics structures essential for AI research. Our objective is singular: propel AI research towards AGI by managing critical components of the research training stack. This includes everything from advanced data processing pipelines to libraries that support our distributed training models, alongside observability and analytics systems that improve research quality and manage the data lifecycle at scale.

Role Overview: Software Engineer, Analytics Data Infrastructure

About the Role: As OpenAI expands, we require dedicated and skilled engineers to support the growing demands of our researchers and engineers. Your role will involve enhancing data processing pipelines, improving observability systems, and executing data lifecycle management projects with a focus on efficiency, security, and scalability. This position is ideal for those experienced in scaling Kubernetes services, debugging Kafka consumer lag, diagnosing distributed systems failures, and developing end-to-end data processing pipelines. You can be based in San Francisco, CA, or work remotely within the US; we offer a flexible hybrid work model and relocation assistance.

Key Responsibilities:

  • Draw on your experience to make high-impact architecture and engineering decisions.
  • Uphold the security, integrity, and compliance of our data in line with industry and company standards.
  • Scale our analytics and data platforms to support substantial growth.
  • Enhance company productivity by developing superior data tooling and systems for our team.
  • Collaborate with various teams to introduce new features and foundational capabilities.
  • Manage system reliability and participate in an on-call rotation for critical incident responses.

Who Should Apply?

You are likely a stellar fit for this role if you:

  • Have built both stream and batch data processing pipelines using tools like Kafka, Spark, or Flink.
  • Are skilled in modern infrastructure management with systems like Kubernetes and Terraform.
  • Possess a strong interest or background in observability systems, particularly in the context of ML training.
  • Have significant experience in ML training organizations, especially with pre-training data transformations.
  • Are an adept software engineer with expertise in Python and are experienced in managing large codebases.
  • Have handled data lifecycle management in large-scale environments, addressing access control, data movement, metadata management, etc.
  • Thrive in fast-paced environments and are a proactive self-starter.

About OpenAI

OpenAI is a leading AI research and deployment company dedicated to ensuring that general-purpose AI benefits all of humanity. As an equal opportunity employer, we are committed to diversity and inclusion and welcome applicants from all backgrounds. We adhere to fair hiring practices and provide accommodations for applicants with disabilities.

Embark on a career that challenges and fulfills you. Join us at OpenAI and help shape the future of AI technology.

Additional Information:

  • Company Name: OpenAI
  • Job Title: Software Engineer, Analytics Data Infrastructure