Site Reliability Engineer - EMEA Remote

Job expired!

Join Hugging Face as a Site Reliability Engineer - EMEA Remote

At Hugging Face, we are on a mission to advance the state of Machine Learning and make it more accessible to everyone. As part of our journey, we contribute to the development of cutting-edge technology for the greater good.

We are proud to host the world's fastest-growing open-source library of pre-trained models. With over 1 million models and 320K+ stars on GitHub, Hugging Face technology is used by more than 15,000 companies, including top AI organizations such as Google, Elastic, Salesforce, Grammarly, and NASA.

About the Role

We are seeking a Site Reliability Engineer to help maintain and scale our product infrastructure. The ideal candidate will have substantial experience managing large-scale infrastructure for AI workflows and a solid background in helping teams implement best practices for reliability and scalability.

Responsibilities

  • Design, develop, deploy, and maintain reliable and scalable infrastructure
  • Manage large Kubernetes clusters
  • Measure and optimize system performance
  • Patch infrastructure to prevent vulnerabilities
  • Ensure important, revenue-critical systems remain operational despite outages and configuration errors
  • Provide primary operational support and engineering expertise to multiple teams

Qualifications

  • 7+ years of experience in a Site Reliability Engineer or Infrastructure Engineer role
  • Strong knowledge of cloud providers such as AWS and GCP, as well as infrastructure-as-code frameworks and observability tools
  • Excellent communication, collaboration, and documentation skills
  • Proficiency with Linux, Git, containers, networking, and command-line tools
  • Experience collaborating and communicating asynchronously

About You

If you are a passionate Site Reliability Engineer with a strong interest in AI and thrive in dynamic, innovative environments, we'd love to hear from you. Join our team to contribute to the advancement of AI technologies while collaborating with talented professionals in a stimulating environment.

More About Hugging Face

We foster diversity, equity, and inclusivity: We are committed to creating a workplace where everyone feels respected and supported, regardless of their background. We believe this is essential for building a great company and community. Hugging Face is an equal opportunity employer and we do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

We value professional development: You will work with some of the brightest minds in the industry. We prioritize impact and constantly challenge ourselves to grow. We provide reimbursement for relevant conferences, training, and education.

We care about your well-being: We offer flexible working hours and remote options, along with health, dental, and vision benefits for employees and their dependents. Additionally, we offer parental leave and flexible paid time off.

We support our employees regardless of location: Although we have office spaces in NYC and Paris, we are widely distributed. Remote employees are given the opportunity to visit our offices, and we will equip your workstation to ensure your success.

We want our teammates to be shareholders: All employees receive company equity as part of their compensation package. If we succeed in becoming a category-defining platform in machine learning and AI, everyone shares in the success.

We support the community: We believe major scientific advancements come from collaboration within the field. Join us in supporting the ML/AI community.