Software Engineer - AI/ML, AWS Neuron Applications

AWS Neuron is the comprehensive software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1 servers that use them. This position is for a senior software engineer on the Machine Learning Applications (ML Apps) team for AWS Neuron. The role is accountable for the development, enablement and performance tuning of a wide variety of ML model families, including large-scale language models such as GPT-2, GPT-3 and beyond, as well as Stable Diffusion, Vision Transformers and more.

The ML Apps team collaborates closely with chip architects, compiler engineers and runtime engineers to design, build and tune distributed training solutions on Trn1. Previous experience training these large models with Python is essential. FSDP, DeepSpeed and other distributed training libraries are central to this role, as is extending them to Neuron-based systems. The role will primarily involve building distributed training and inference support into PyTorch and TensorFlow using XLA and the Neuron compiler and runtime stacks. It also carries responsibility for tuning these models for maximum performance and efficiency when running on customer AWS Trainium and Inferentia silicon and the Trn1 and Inf1 servers. Proficiency in software development and extensive machine learning knowledge are key requirements for this position.

We at AWS cherish our diversity and are committed to fostering an inclusive work culture. We offer flexible working hours to promote work-life balance, and we value knowledge sharing and mentorship in our team.

This position is open to candidates willing to work out of Cupertino, CA, USA or Seattle, WA, USA.

Basic qualifications:
- At least 3 years of professional software development experience
- A minimum of 2 years of experience in the design and architecture of new and current systems
- Experience programming in at least one programming language

Preferred qualifications:
- A minimum of 3 years of experience across the entire software development life cycle
- Bachelor's degree in computer science or a related field

Amazon is an equal opportunity employer and upholds a diverse and inclusive workplace.

Our compensation ranges from $115,000/year to $223,600/year, dependent on factors such as location and job-related skills and experience. Amazon is a total compensation company that may offer equity, sign-on payments, and other forms of compensation in addition to a full range of medical, financial, and/or other benefits.