Senior MLOps Engineer

  • Full Time

About SecurityScorecard:

SecurityScorecard is the worldwide leader in cybersecurity ratings, continuously rating over 12 million businesses operating across 64 nations. SecurityScorecard was founded in 2013 by security and risk experts Dr. Alex Yampolskiy and Sam Kassoumeh. Our innovative, patented rating technology is used by over 25,000 organizations for self-monitoring, third-party risk management, board reporting, and cyber insurance underwriting, making every organization more resilient by allowing them to easily find and fix cybersecurity risks across their digital footprint.

SecurityScorecard, based in New York City, has been recognized by Inc Magazine as a "Best Workplace," by Crain’s NY as one of the "Best Places to Work in NYC," and as one of the top 10 hottest SaaS startups in New York for two consecutive years. Most recently, SecurityScorecard was named to a Fast Company annual list and to the Achievers 50 Most Engaged Workplaces in 2023 award, which recognizes “proactive employers for their unwavering commitment to employee participation.” SecurityScorecard is proud to be supported by world-leading investors including Silver Lake Waterman, Moody’s, Sequoia Capital, GV, and Riverwood Capital.

About the Role:

We are looking for an experienced Senior MLOps Engineer to join our Data Science team. In this role, you will collaborate with a multifunctional team of ML engineers, data engineers, and data science researchers. You will combine efforts with other specialists to design, build, deploy, and operate production pipelines and microservice systems focusing on MLOps best practices. You will manage and streamline infrastructure, including feature stores, data mesh, and our AI platform, automating the training, delivery, and updating of our machine learning models. If you're a problem solver, effective communicator, and enthusiastic about advancing AI and ML in the security sector, we want you on our team.

What You'll Do:

  • Lead the creation, operation, and maintenance of critical infrastructure projects and automation for the data science team.
  • Educate and guide team members in applying best practices in operations and safety.
  • Conduct code reviews and provide feedback on GitHub pull requests.
  • Identify opportunities for technical and process improvement and implementation.
  • Utilize best practices such as immutable containers, Infrastructure as Code, stateless applications, and software observability.
  • Tune large-scale distributed system performance to meet SLA metrics for stability, uptime, scalability, and latency while managing costs.
  • Continually enhance CI/CD processes to automate builds and deployments.
  • Collaborate with scientists and engineers to understand KPIs and set up observability, monitoring, and alerting systems.
  • Oversee the setup of Terraform/Kubernetes and the related tools needed for data pipelines, feature stores, data mesh, and machine learning model delivery.
  • Identify and resolve networking issues or communicate them clearly to centralized IT teams.
  • Analyze system layer abstractions to investigate and resolve complex distributed system performance issues. 

What We Need You To Have:

  • 4-5+ years of experience in MLOps/DevOps in the cloud (AWS, GCP, or Azure).
  • Experience with Apache Spark and big data streaming infrastructure (data lakes, Snowflake, Databricks, S3).
  • Experience in a production environment with Amazon Web Services (AWS) or an equivalent.
  • Experience supporting data stores such as RDBMS (Postgres), KVS (Cassandra/ScyllaDB), and queues/streaming (Kafka).
  • Proficient with Terraform, Git, Python, bash/shell scripting, and Docker containers.
  • Experience with CI/CD processes (Jenkins, Ansible) and automated configuration tools (Terraform, Ansible, etc.).
  • Experience setting up container orchestration (AWS ECS, Kubernetes/K8s).
  • Proficient with dashboard creation and monitoring tools such as Prometheus and Datadog.
  • Ability to plan future infrastructure and project timelines.
  • Ability to work with our highly collaborative team.
  • Strong written and verbal communication skills.
  • Willingness to educate and mentor others.

Preferred Qualifications:

  • Bachelor's degree or higher in computer science, STEM, or a related field.
  • Experience implementing data mesh and feature stores.
  • Strong understanding of networking concepts, including OSI layers, firewalls, DNS, split-horizon DNS, VPN, routing, BGP, etc.
  • Proficient with tools such as Airflow, Argo, Kubeflow, MLflow, and vector databases.

Benefits:

Specific to each country, our benefits include stock options, healthcare benefits, unlimited PTO, parental leave, tuition reimbursements, and much more!

SecurityScorecard values Equal Employment Opportunity and embraces diversity. We believe that our team is improved through hiring and retaining employees from diverse backgrounds, skill sets, ideas, and perspectives. We make hiring decisions based on merit and do not discriminate based on race, color, religion, national origin, sex or gender (including pregnancy), gender identity or expression (including transgender status), sexual orientation, age, marital, veteran, or disability status, or any other protected category in accordance with applicable law.

We also consider qualified applicants regardless of criminal histories, in line with applicable law. We are committed to providing reasonable accommodations for qualified individuals with disabilities in our job application procedures. If you need assistance or accommodation due to a disability, please contact [email protected].

All information you submit to SecurityScorecard as part of your application will be processed in line with the Company’s privacy policy and applicable law.

SecurityScorecard does not accept unsolicited resumes from employment agencies. Please note that we do not provide immigration sponsorship for this position.
