Senior Python Deep Learning Automation Engineer, Deep Learning Algorithms

  • Full Time
We are currently seeking a Senior Python Automation Engineer for our Deep Learning Algorithms team! Join the team that is developing software set to be used globally in the world of AI. You'll work alongside top-notch software engineers to create a large-scale toolset that tests deep learning models and frameworks on the most powerful computers. Because the environment is multifaceted and fast-paced, good interpersonal skills are required.

In this role, you'll interface with internal partners, users, and members of the open source community to implement solutions for building, testing, integrating, and releasing NVIDIA AI Services and Deep Learning Frameworks on high-powered, enterprise-grade GPU clusters that can deliver hundreds of PetaFLOPS. This position spans multiple products, such as PyTorch, TensorFlow, JAX, and PaddlePaddle. You will collaborate with internal engineering teams to deploy and operationalize AI models and services at scale, promoting end-to-end Machine Learning and Deep Learning solutions in the cloud and on-premises.

We're looking for passionate and dedicated Python developers to scale our AI and deep learning services, platforms, models, and internal tools. Your duties will include implementing and maintaining tools and infrastructure that help our team productize the NVIDIA software stack: from Deep Learning Frameworks (PyTorch, TF, JAX, PaddlePaddle) to Deep Learning models and AI services. Are you up for this challenge?
Your responsibilities will include:

- Automating and optimizing testing of Deep Learning models and AI Services from different data domains, with a focus on inference
- Developing shared utilities for setting up systems, executing tests, recording results, and visualizing them on dashboards
- Configuring, maintaining, and building upon deployments of industry-standard tools (e.g., GitLab, Docker, Bash)
- Leading best practices for building, testing, and releasing software, including AI Services and Deep Learning models
- Identifying infrastructure needs and converting them into action
- Building tools for automatic content generation mechanisms that save significant engineering hours

Required qualifications:

- BSc or MS degree in Computer Science, Computer Architecture, or a related technical field, or equivalent experience
- 5+ years of work experience in software development
- Excellent Python programming skills, superb coding skills, and a profound understanding of OOP concepts
- Familiarity with DevOps concepts, like CI/CD, Docker, Jenkins, and automation tools
- Experience constructing both front-end (e.g., JS, React, Vue, Dash, Streamlit) and back-end (e.g., Flask, FastAPI, Django) services
- Enough knowledge of Deep Learning to benchmark Deep Learning models
- A willingness to take action and robust analytical skills
- Strong time-management and organizational skills for coordinating multiple initiatives, setting priorities, and implementing new technology and products into very complex projects
- Good communication and documentation habits

To stand out from the crowd, you may have:

- Experience with containerization technologies like Docker
- Experience building monitoring or dashboarding solutions to support CI/CD pipelines
- Hands-on experience configuring complex CI pipelines
- Experience with High-Performance Computing (HPC) based compute clusters and scheduling solutions like Slurm
- Solid understanding of Linux environments

NVIDIA is widely considered to be one of the most desirable employers in the technology sector. We employ some of the most brilliant and forward-thinking individuals globally. If you are creative and autonomous, we would love to hear from you!

The base salary range is 144,000 USD - 270,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits.