Research Software Engineer, Data Quality

  • Full Time

About the Team

Our team trains the ChatGPT models used by millions of users across both the ChatGPT and API platforms.

About the Role

One of the most important parts of improving ChatGPT is building and training on high-quality datasets. We are looking for engineers who can help us develop data pipelines and conduct research investigations into data quality.

Ideal candidates have a strong technical background and broad knowledge. Because our data systems are closely tied to the underlying large language models, candidates should be familiar with the rapidly evolving LLM stack and have experience in an applied Machine Learning environment.

This role is based in San Francisco, CA. We use a hybrid work model that requires 3 days in the office each week, and we offer relocation assistance to new hires.

In this role, you will:

  • Build systems and pipelines that continuously process and filter large volumes of data.
  • Train and apply Machine Learning classifiers and embeddings to large volumes of data to sort and categorize it by quality and other characteristics.
  • Collaborate with researchers on dataset generation and preparation to kick off experiments.
  • Lead other research projects involving data and data quality.

You might excel in this role if you:

  • Have 3+ years of experience working with data-intensive applications or distributed systems and 6+ years of any software engineering experience (including data engineering).
  • Are highly proficient with Python.
  • Can write, debug, and optimize Spark code, and understand data orchestration and automation tools (e.g. Airflow).
  • Are familiar with Machine Learning, Deep Learning, and Large Language Models and the corresponding infrastructure (e.g. PyTorch).
  • Have experience working with embeddings and vector libraries (a bonus).

We are an equal opportunity employer and do not discriminate based on race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other status protected by law. In compliance with the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations for applicants with disabilities, and requests can be made via this link.

OpenAI US Applicant Privacy Policy

Compensation, Benefits and Perks

Total compensation includes generous equity and benefits.

  • Medical, dental, and vision insurance for you and your family
  • Support for mental health and wellness
  • 401(k) plan with 4% match
  • Unlimited time off and more than 18 company holidays per year
  • 20 weeks of paid parental leave and support for family planning
  • Annual learning & development stipend ($1,500 per year)