Specialist, HPC Systems Research & Development

Job expired!

Join KLA: Innovators in Semiconductor Technology

Company Overview

KLA is a global leader in diversified electronics within the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced utilizing our advanced technologies. Whether it's laptops, smartphones, wearable devices, voice-controlled gadgets, flexible screens, VR devices, or smart cars, all have been shaped by KLA’s innovations. KLA creates systems and solutions integral to manufacturing wafers, reticles, integrated circuits, packaging, printed circuit boards, and flat panel displays.

Our success hinges on continuous inspiration, rigorous research, and cutting-edge development. KLA prioritizes innovation, reinvesting 15% of sales back into R&D. Our dedicated teams of physicists, engineers, data scientists, and problem-solvers collaborate with the world’s leading technology providers to fast-track the delivery of future electronic devices. Experience an exciting work life where teams are energized by solving tough challenges every day.

Our Division

KLA maintains a deep-rooted connection with physics and data. Our tools for optical and electron beam inspection and measurement utilize advanced physics models for both hardware design and algorithm development. Artificial Intelligence, including various machine learning techniques and deep learning models, is routinely employed to process data and meet application requirements.

The AI & Modeling Center of Excellence was established to enhance KLA’s traditional strengths in physics and data. This center leads the implementation of solutions for multiple KLA Inspection and Metrology products tailored to the semiconductor manufacturing industry. The AI & Modeling Center is part of KLA’s Central Engineering organization and offers product development expertise in crucial areas for a variety of KLA products.

As a team member, you will collaborate with world-class physicists, HPC system designers, machine learning engineers, and application engineers to innovate models for complex imaging techniques and semiconductor processes. You will also work alongside data scientists and AI infrastructure engineers whose mission is to develop scalable machine learning solutions for our semiconductor customers. If you have a passion for Physics Modeling, High Performance Computing (HPC), Machine Learning, Deep Learning, Data Sciences, or cutting-edge Cloud technologies, KLA is the ideal place for you.

Job Opportunity: Specialist, HPC Systems Research & Development

Job Description

KLA’s AI Advanced Computing Labs is seeking an exceptional HPC Systems R&D Engineer to join our team. You will play a crucial role in developing advanced system-level HPC technologies that form the foundation of next-generation clusters used in KLA tools. These tools leverage AI to push the boundaries of process control in semiconductor manufacturing. The technologies you develop will be tested on on-premises clusters that serve as prototypes for future KLA tools.

Your Day-to-day Roles

  • Identify limitations in existing solutions based on clusters of CPUs & GPUs, and deploy AI-based solutions on on-prem and cloud infrastructures at scale.
  • Develop distributed frameworks and system-level solutions to scale out image processing and AI workloads from a single GPU to multi-node clusters with multiple GPUs.
  • Install and benchmark pre-release hardware, developing relevant workloads for early-stage evaluation and prototyping.

Minimum Qualifications

  • Master's degree or PhD in Computer Science or a related field; Bachelor's degree holders with relevant experience and an extraordinary track record will also be considered.
  • Deep understanding of operating systems, computer networks, and high-performance applications.
  • A strong mental model of modern distributed systems architectures composed of CPUs, GPUs, and accelerators.
  • Experience deploying deep learning frameworks such as TensorFlow and PyTorch on large-scale on-prem or cloud infrastructures.
  • Comprehensive knowledge of modern and advanced C++ concepts.
  • Proficiency in scripting languages such as Bash, Python, or similar.
  • Excellent communication skills.

Additional Qualifications That Impress

  • Experience with heterogeneous programming languages such as CUDA or Triton.
  • Strong track record of model development on DL frameworks such as TensorFlow and PyTorch.
  • Substantial experience building open-source operating systems and software stacks on pre-release hardware.
  • In-depth knowledge of container infrastructure such as Docker or Singularity, and of container orchestration with Kubernetes.
  • Active participation in C++ standards bodies or similar organizations.

KLA offers a competitive and family-friendly total rewards package. Our programs reflect our commitment to creating an inclusive environment, ensuring benefits