Machine Learning Research Engineer

Join Neural Magic as a Machine Learning Research Engineer

About Neural Magic

Located in Somerville, Massachusetts, Neural Magic is a Series A startup backed by top-tier investors, including Andreessen Horowitz, NEA, Pillar, VMware, Verizon Ventures, Comcast Ventures, and Amdocs. At Neural Magic, we're passionate about making AI open and accessible. Our mission is to empower enterprises worldwide with open-source LLMs and vLLM, accelerating AI adoption and simplifying GenAI deployments. As a key contributor to the vLLM project and a pioneer in model quantization and sparsification, Neural Magic offers a robust platform for enterprises to build, optimize, and scale their LLM implementations.

Our Mission

We're on a mission to democratize the power of open-source LLMs and vLLM, bringing them to every enterprise globally.

Your Role

As a Machine Learning Research Engineer at Neural Magic, you will drive innovation by collaborating with our team to solve the most critical challenges in model performance and efficiency. Your work will significantly influence the advancement of our state-of-the-art software platform, shaping the future of AI deployment and utilization.

Be a part of our exciting journey in transforming the landscape of AI!

Responsibilities

  • Research & Innovate: Lead the development of groundbreaking research projects focused on enhancing LLM performance, efficiency, and scalability.
  • Prototype & Experiment: Design and implement prototypes to test new algorithms and techniques, continually pushing the boundaries of model optimization and inference serving.
  • Analyze & Evaluate: Conduct comprehensive experiments and analyses, documenting findings and sharing insights with the team.
  • Collaborate & Communicate: Work closely with product and engineering teams to convert research prototypes into production-ready features, ensuring seamless integration with our platform.
  • Contribute & Share: Stay current with the latest advances in the field, contribute to open-source projects, and share your findings through publications and presentations.

Requirements

  • Research Expertise: Demonstrated experience in conducting independent research or contributing to research projects, especially in LLMs or generative AI.
  • Technical Proficiency: Proficient in Python programming with a deep understanding of PyTorch or similar deep learning frameworks.
  • Optimization Experience: Knowledgeable in model optimization techniques like pruning, quantization, distillation, or other performance-enhancing methods.
  • Problem-Solving Skills: Strong ability to identify and resolve complex technical challenges, applying theoretical knowledge to practical applications.
  • Communication Skills: Excellent written and verbal communication skills for effectively conveying research findings and collaborating with cross-functional teams.

Benefits

  • Competitive compensation and stock option plan
  • Comprehensive healthcare (medical, dental, vision)
  • Retirement plan (401(k), IRA)
  • Generous paid time off (vacation, sick leave, holidays)
  • Family leave (maternity, paternity)
  • Disability coverage
  • Professional development opportunities
  • Flexible work arrangements (remote options)
  • Wellness resources
  • Free food and snacks (in the office)

Neural Magic is an equal-opportunity employer committed to fostering a diverse and inclusive workplace. All applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, or disability status.