Machine Learning Engineer - Inference (Accelerated AI)

Join Together AI as a Machine Learning Engineer - Inference (Accelerated AI)

Together AI is seeking a Machine Learning Engineer to join our Inference Engine team. This role focuses on optimizing the performance of our AI inference systems, which serve state-of-the-art large language models, with an emphasis on efficiency and scalability.

If you are passionate about AI inference, proficient with PyTorch, and skilled in developing high-performance systems, we want to hear from you. This position offers an exceptional opportunity to collaborate with leading AI researchers and engineers in creating cutting-edge AI solutions. Come shape the future of AI with Together AI!

Key Responsibilities

Design and develop production systems that power the Together AI inference engine, ensuring reliability and performance at large scale.

Optimize runtime inference services for large-scale AI applications.

Collaborate with researchers, engineers, product managers, and designers to introduce new features and research capabilities.

Conduct thorough design and code reviews to uphold the highest quality standards.

Create services, tools, and comprehensive developer documentation to support the inference engine.

Implement robust and fault-tolerant systems for data ingestion and processing.

Job Requirements

A minimum of 3 years of experience in writing high-performance, well-tested, production-quality code.

Proficiency in Python and PyTorch.

Demonstrated experience in building high-performance libraries and tooling.

Exceptional understanding of low-level operating system concepts, including multi-threading, memory management, networking, storage, performance, and scalability.

Preferred Qualifications

Knowledge of existing AI inference systems such as TGI, vLLM, TensorRT-LLM, and Optimum.

Familiarity with AI inference techniques, such as speculative decoding.

Experience with CUDA/Triton programming.

Nice to have: Familiarity with Rust, Cython, and compilers.

About Together AI

Together AI is a pioneering artificial intelligence company driven by research. We are committed to open and transparent AI systems that drive innovation and generate the best outcomes for society. Our mission is to significantly reduce the cost of modern AI systems through the co-design of software, hardware, algorithms, and models.

We have contributed to leading open-source research, models, and datasets to advance the frontier of AI. Our team has been instrumental in technological breakthroughs such as FlashAttention, Hyena, FlexGen, and RedPajama. Join our passionate group of researchers and engineers in our journey to build the next-generation AI infrastructure.

Compensation and Benefits

We offer a competitive compensation package, including startup equity, health insurance, and other benefits. The US base salary range for this full-time position is $160,000 - $220,000. Our salary ranges are determined by location, level, and role, with individual compensation based on experience, skills, and job-related knowledge.
