Tech Lead Manager (TLM) - Supercomputing Scheduling

Join Our Supercomputing Scheduling Team at OpenAI

About the Team: The Supercomputing Scheduling Pillar at OpenAI focuses on reliability, scalability, and user-friendliness in job lifecycle management. We pride ourselves on providing efficient and flexible job scheduling, quota management, and streamlined job execution workflows. Our goal is to enhance researcher productivity by ensuring high goodput, efficient packing, and a consistent, ergonomic training workflow, scaling up to larger supercomputers while minimizing operational load.

Job Role: Tech Lead Manager (TLM) - Supercomputing Scheduling

About the Role: As a Tech Lead Manager (TLM) / Engineering Manager within our Scheduling Pillar, you will lead a team that designs, deploys, and manages job lifecycle management systems for model training on some of the world's largest supercomputers. This role offers immense scale, tight timelines, and the chance to significantly impact OpenAI’s mission. Deep technical expertise is essential, though it need not be in ML/DL specifically.

This position is based in San Francisco, CA, and follows a hybrid work model with three days in-office per week. Relocation assistance is available for qualified candidates.

Key Responsibilities

  • Directly manage individual contributors (ICs) developing our supercomputing scheduling technology.
  • Build and lead high-performing teams to deliver our technology safely and reliably to users globally.
  • Design, implement, and manage crucial components of our job scheduling, quota management, and queuing systems.
  • Collaborate closely with researchers to align supercomputing resources with project demands.
  • Integrate job lifecycle features with cluster infrastructure, storage solutions, and hardware health protocols.

Who Should Apply?

You might be a perfect fit if you:

  • Have extensive experience with hyperscale scheduling systems.
  • Possess strong programming skills and a solid track record in public cloud environments, particularly Azure.
  • Are driven, with a sharp focus on execution and user needs.
  • Can lead technical teams effectively, fostering a diverse, equitable, and inclusive workplace culture.
  • Are proactive in problem-solving and eager to acquire new knowledge as needed.
  • Excel in communication, with a knack for clear expression and attentive listening.

Experience with AI/ML workloads is an asset but not required.

About OpenAI

OpenAI is committed to advancing AI technology that can profoundly benefit all of humanity. Our core mission is to ensure that the development of artificial intelligence is conducted with safety and public welfare in mind. We welcome diverse perspectives and are proud to be an equal opportunity employer.

If you’re ready to shape the future of technology, apply today to join our team at OpenAI!

For more information on our privacy policies and employment regulations, please visit our career page.

Apply now to revolutionize the future of artificial intelligence with OpenAI!