Research Scientist, GenAI - Multimodal Audio (Speech, Sound and Music)

Join Our GenAI Team as a Research Scientist in Multimodal Audio at Meta

At Meta, our GenAI organization is pioneering the development of cutting-edge large language models (LLMs) and multimodal generative foundation models. Our mission is to continually raise the bar for open-source foundation models, powering a wide array of Meta products and setting industry standards. We are currently seeking a dynamic Research Scientist to focus on multimodal audio within our GenAI team, enhancing platforms with innovations in speech, sound, and music.

Key Responsibilities

As a Research Scientist, you will be deeply involved in the full research lifecycle for multimodal generative foundation models focused on the audio modality. Your role will include:

  • Conceptualizing and initiating ideas and bringing them through to fruition.
  • Designing, implementing, and refining models and algorithms.
  • Managing the collection and selection of training data and the training/tuning/scaling of models.
  • Evaluating performance, contributing to open-sourcing efforts, and publishing findings.
  • Collaborating seamlessly with language and vision research teams to enhance collective goals and outputs.

Minimum Qualifications

  • A Bachelor's degree in Computer Science, Computer Engineering, or a closely related technical field.
  • A demonstrable track record of research in domains such as audio or vision, evidenced by publications or significant industrial experience.
  • An advanced degree with research experience is preferred: for example, a PhD with 3+ years of experience, or alternatively a Bachelor's degree with at least 5 years of industrial research experience in relevant fields.
  • Expertise in neural networks, with proficiency in ML frameworks like PyTorch, TensorFlow, or JAX.
  • Strong programming skills in Python and solid communication abilities.

Preferred Qualifications

  • Robust publication record in related fields of audio and visual technologies.
  • Experience in audio dataset curation, model scaling, and evaluation of audio generation models.
  • Ability to handle large-scale data processing and tackle complex problems requiring cross-functional collaboration.

About Meta

Meta connects the world in transformative ways through leading platforms like Facebook, Messenger, Instagram, and WhatsApp. Our continuous innovations in augmented and virtual reality are reshaping the possibilities of digital connectivity. By joining Meta, you engage with a legacy of pushing boundaries, envisioning a future where engagement extends beyond screens and distances, into experiences dictated by imagination.

Compensation & Benefits

Our competitive compensation package includes a base salary ranging from $177,000 to $251,000 per year, depending on skills and experience. This is complemented by a bonus, equity, and comprehensive benefits.

Meta is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We also provide accommodations to candidates with disabilities, long-term conditions, or other needs as part of our recruitment process.

Ready to redefine the frontiers of technology and connectivity? Apply now to become a Research Scientist in Multimodal Audio at Meta and help shape the future of social technology.

Location: Meta offers a flexible working location policy, designed to help you do your best work from anywhere.

For further details or to submit an application, please visit our careers page.