Join Phaidra - Pioneer the Future of Industrial Automation
Welcome to Phaidra, where we are revolutionizing the world of industrial automation.
About Phaidra
Today's industrial facilities, such as factories and power plants, rely on outdated control systems that cannot adapt to new conditions. At Phaidra, we build AI-powered control systems that let these facilities learn and improve automatically over time.
- Reinforcement Learning Algorithms: Transform raw sensor data into high-value decisions.
- Industrial Applications: Ideal for sensorized environments with measurable KPIs.
- Code-Free Configuration: Domain experts can set up AI control systems without coding.
Our dedicated team has a proven track record, from achieving superhuman performance with DeepMind's AlphaGo to reducing energy consumption at Google Data Centers.
We are a 100% remote company with a team spread across the USA, Canada, UK, Norway, Italy, Spain, Portugal, and India. We hire globally with the help of our partner, OysterHR.
Open Position: Site Reliability Engineer
Phaidra is looking for a passionate and innovative Site Reliability Engineer to join our engineering team. You will work on building and maintaining world-class infrastructure, ensuring the smooth operation and continuous improvement of our systems.
Location: North America/India
Responsibilities
As a Site Reliability Engineer, your core responsibilities will include:
- Managing cloud infrastructure on AWS, GCP, or Azure
- Setting up large-scale data ingestion and processing systems
- Building distributed model training and evaluation platforms
- Automating CI/CD pipelines and system improvements
- Supporting deployments across multiple clouds
- Utilizing Cloud Native technologies like Kubernetes, Prometheus, and gRPC
- Applying SRE principles for observability, automation, and change management
Key Qualifications
- 5+ years of experience
- Bachelor's or Master's degree in Computer Science, or equivalent
- Experience with AWS, GCP, or Azure
- Proficiency in Linux, Docker, and Kubernetes
- Familiarity with Terraform and monitoring stacks like Prometheus
- Programming skills in Python, Go, or Bash
- Understanding of DevOps and SRE principles
Preferred Skills & Experience
- Multi-cloud environment expertise
- Software engineering experience
- Experience with scalable, multi-tenant systems
Our Tech Stack
- Languages and frameworks: Python, Go, JavaScript/TypeScript, React, C# .NET
- PyTorch
- Docker, Kubernetes, Terraform, Kapitan
- GitLab CI, ArgoCD, Atlantis, Vercel
- GCP (GKE, Pub/Sub, Cloud SQL, etc.)
- Ray.io, REST, and gRPC microservices
- Poetry, Pantsbuild
Your Onboarding Journey
First 30 Days
- Introduction to Phaidra and our product
- Engage with the Engineering team
- Set up your development environment
By 60 Days
- Solid understanding of our operations
- Complete onboarding exercise
By 90 Days
- Fully integrated with the team
- Take part in on-call monitoring and contribute improvements
- Share knowledge across the organization
Interview Process
- Initial Screening: People Operations (30 minutes)
- Meeting: Director, Infrastructure Engineering (30 minutes)