Site Reliability Engineer

06/27/2024

Join Phaidra - Pioneer the Future of Industrial Automation

Welcome to Phaidra, where we are revolutionizing the world of industrial automation.

About Phaidra

Current industrial infrastructures, such as factories and power plants, rely on outdated control systems that can't adapt to new conditions. At Phaidra, we create AI-powered control systems that enable these infrastructures to automatically learn and improve over time.

Reinforcement Learning Algorithms: Transform raw sensor data into high-value decisions.
Industrial Applications: Ideal for sensorized environments with measurable KPIs.
Code-Free Configuration: Domain experts can set up AI control systems without coding.

Our dedicated team has a proven track record, from achieving superhuman performance with DeepMind's AlphaGo to reducing energy consumption at Google Data Centers.

We are a 100% remote company with a team spread across the USA, Canada, UK, Norway, Italy, Spain, Portugal, and India. We hire globally with the help of our partner, OysterHR.

Open Position: Site Reliability Engineer

Phaidra is looking for a passionate and innovative Site Reliability Engineer to join our engineering team. You will work on building and maintaining world-class infrastructure, ensuring the smooth operation and continuous improvement of our systems.

Location: North America/India

Responsibilities

As a Site Reliability Engineer, your core responsibilities will include:

Managing cloud infrastructure on AWS, GCP, or Azure
Setting up large-scale data ingestion and processing systems
Building distributed model training and evaluation platforms
Automating CI/CD pipelines and system improvements
Supporting deployments across multiple clouds
Utilizing Cloud Native technologies like Kubernetes, Prometheus, and gRPC
Applying SRE principles for observability, automation, and change management

Key Qualifications

5+ years of experience
Bachelor's or Master's in Computer Science or equivalent
Experience with AWS, GCP, or Azure
Proficiency in Linux, Docker, and Kubernetes
Familiarity with Terraform and monitoring stacks like Prometheus
Programming skills in Python, Go, or Bash
Understanding of DevOps and SRE principles

Preferred Skills & Experience

Multi-cloud environment expertise
Software engineering experience
Experience with scalable, multi-tenant systems

Our Tech Stack

Languages and frameworks: Python, Go, JavaScript/TypeScript (React), C#/.NET

PyTorch
Docker, Kubernetes, Terraform, Kapitan
GitLab CI, ArgoCD, Atlantis, Vercel
GCP (GKE, Pub/Sub, Cloud SQL, etc.)
Ray.io, REST, and gRPC microservices
Poetry, Pantsbuild

Your Onboarding Journey

First 30 Days

Introduction to Phaidra and our product
Engage with the Engineering team
Set up your development environment

By 60 Days

Solid understanding of our operations
Complete onboarding exercise

By 90 Days

Fully integrated with the team
Take part in on-call monitoring and contribute improvements
Share knowledge across the organization

Interview Process

Initial Screening: People Operations (30 minutes)
Meeting: Director, Infrastructure Engineering (30 minutes)

Date posted: 06/27/2024
Expiration date: 07/27/2024
Location: Other places
Hours: 30h / week
Experience: Fresh
Gender: Both
