Senior Data Engineer

About You.com:

You.com, a leading AI-powered search platform, is built on the principles of truthfulness, accuracy, and transparency, addressing the common issues of AI hallucinations. Founded by distinguished AI research scientists Richard Socher and Bryan McCann, You.com stands as a groundbreaking force in the realm of natural language processing (NLP).

Richard Socher, formerly the Chief Scientist at Salesforce, is renowned for being the third most-cited researcher in NLP with over 170,000 citations. Bryan McCann was a lead research scientist at Salesforce Research, focusing on deep learning and NLP. Together, their impactful research has revolutionized word vectors, contextual vectors, and prompt engineering. Richard's accomplishments were recently recognized with a place on Time Magazine’s 2023 TIME100 AI list of the “most influential people in AI” and with the 2023 ACL Test-of-Time Paper Award for his seminal 2013 publication.

Since inception, You.com has redefined how users interact with information online through its AI Assistant, addressing everyday needs with unparalleled precision. Acclaimed as one of Fortune Magazine’s 50 AI Innovators for 2023 and featured in Time Magazine’s “Best Inventions of 2022,” You.com has significantly contributed to solving Large Language Model (LLM) challenges concerning trust and accuracy. Notably, You.com introduced the first consumer-facing LLM with internet access, delivering real-time, cited answers. Its API supports other LLM-based chatbots in enhancing their accuracy through real-time web integration.

You.com emphasizes personalized AI chat experiences, tailoring responses based on user preferences while safeguarding privacy and ensuring transparent control over personal data. The platform is accessible on desktop, Chrome web extensions, iOS and Android apps, and WhatsApp.

About the Role:

We are seeking a Senior Data Engineer - Analytics to join our team. In this role, you will work cross-functionally to establish data engineering and data science excellence, enhancing our product growth. Your responsibilities include optimizing data warehouse design and performance, evolving critical product analytics systems, expanding product data use cases, and developing a world-class data culture. The ideal candidate will have dual expertise as both a data engineer and a data scientist, with a passion for understanding user behavior and promoting growth.

Responsibilities:

  • Data Pipeline Development: Design, build, and maintain robust data pipelines and APIs. Collect, process, and serve data from various sources such as backend events, customer interactions, marketing channels, and LLM evaluations to support data-driven product growth.
  • Cross-functional Collaboration: Work collaboratively with product managers, marketing teams, and data scientists. Identify opportunities for significant business impact, understand requirements for data infrastructure, drive engineering decisions, and quantify impact.
  • Scale and Optimize: Design and implement scalable data architectures and ETL processes. Optimize data pipelines for performance, scalability, and reliability in managing our growing user base.
  • Operational Excellence: Efficiently manage cloud resources (AWS/Azure) using tools like Terraform and Kubernetes. Ensure end-to-end event instrumentation, guaranteeing data completeness and correctness.

Qualifications:

  • Educational & Professional Experience: Bachelor’s degree in Computer Science or a related field, or at least 4 years of experience in a Data Engineering role.
  • Technical Expertise: Proficiency in distributed processing frameworks (Databricks/Spark), stream processing, and event-driven technologies (e.g., Kafka). Advanced skills in Python and Spark (Spark SQL, DataFrames, Spark Streaming, RDD caching, Spark MLlib). Experience with infrastructure automation using Terraform and familiarity with cloud platforms (Azure and AWS).
  • Communication and Mindset: A proactive attitude with a track record of using data to drive product improvements. Excellent problem-solving and analytical abilities with the capacity to communicate complex technical topics to diverse audiences.

Our Perks:

  • Remote-first work environment with hubs in California, NYC, and Canada offering monthly in-person gatherings.
  • Unlimited PTO with 11 U.S. holidays observed and a week-long shutdown in December.
  • Competitive health insurance plan covering 100% of the policyholder.
  • 12 weeks of paid paternity leave in the US, with additional time off considered.
  • 401k program.
  • $500 work-from-home stipend valid up to a year from the start date.