Senior Data Scientist - NLP

About Builder.ai

We're on a mission to simplify app building so everyone can do it – no matter their background, technical knowledge, or budget. So far, we've assisted thousands of entrepreneurs, small businesses, and even global brand names, including the BBC, Makro and Pepsi, in achieving their software ambitions - and we're only getting started.

Builder.ai has been awarded 'Most Innovative Company in AI' for 2023 by Fast Company, and 'Scaleup of the Year' at the 2022 Europas. With an international team of over 800 members, a recently announced $250m in Series D funding, and a partnership with Microsoft, there has never been a more exciting time to join us.

Life at Builder.ai

At Builder.ai, we inspire experimentation! Every role at Builder offers unlimited opportunities for growth, challenge, and learning. We aim to become even better at helping our customers take AI app building to new heights.

Our international team is diverse, cooperative, and extremely talented. Our shared belief in Builder’s HEARTT values (Heart, Entrepreneurship, Accountability, Respect, Trust and Transparency) defines us. Above all, we love getting things done.

We offer a host of excellent benefits in return for your skills and dedication. These include hybrid working, variable annual bonuses, company stock options, generous paid leave and overseas trips. #WhatWillYouBuild

Why you should join

As we carry on innovating and growing, our Intelligent Systems team is looking for a dedicated Senior Data Scientist with a special focus on Natural Language Processing (NLP). In this role, you'll closely collaborate with global product and engineering teams, leading landmark initiatives in data science, machine learning, and AI to drive smart decision-making, with the potential for major growth in the coming year and beyond.

This role offers ownership of a variety of existing use cases and a unique opportunity to explore ideas that are currently still at the pure research stage. Key challenges include:

  1. Automatic Speech-to-Text Transcription: Develop advanced models that accurately transcribe audio calls between customers, partners, and team members.
  2. Feature Extraction from Transcripts and Documents: Create innovative techniques for extracting informative entities and features from call transcriptions and documents.
  3. App Template Recommendations: Employ NLP to recommend app templates based on customer ideas and requirements.
  4. Feature Recommendations: Build models to propose app features based on customer-provided descriptions and requirements.
  5. Conversational AI Engagement: Implement conversational AI solutions, such as chatbots, to engage with customers and team members, collect requirements, create project "buildcards," and provide project progress updates.
  6. Custom Speech Recognition Models: Refine and build bespoke speech recognition models to enhance transcription accuracy.
  7. Custom Language Models: Develop custom language models to improve our understanding of language semantics within the Builder domain.
  8. Customer Question Answering: Create models to automatically answer customer questions, enhancing support and engagement.

We are expanding our IS-NLP team, which is responsible for managing NLP services like Template Recommendation, Feature Search, Story Similarity, Feature Tagging, and Natasha. As we broaden our service portfolio to enhance Builder's delivery procedures, we are eager to grow our team in line with our ambitious objectives.

Requirements

  • Entrepreneurial Mindset: Show an entrepreneurial spirit and a "can-do" mentality, thriving in a dynamic and innovative environment.
  • Python Proficiency: Proven proficiency in Python programming, demonstrating your ability to construct solid data science solutions.
  • Data Manipulation Skills: Real-world experience with SQL for data querying, data manipulation, and feature engineering to draw valuable insights from complex datasets.
  • Data Science Libraries: Competence with essential data science libraries such as Pandas, NumPy, SciPy, and Seaborn for data analysis and visualization.
  • Deep Learning Expertise: Practical experience with Deep Learning libraries, particularly PyTorch and Hugging Face, demonstrating your ability in advanced machine learning techniques.
  • NLP Proficiency: Familiarity with NLP toolkits like spaCy, NLTK, and TextBlob, and a track record of solving NLP problems, including text classification, named entity recognition, search, and recommendation.
  • Version Control and CI/CD: Proficiency in using GitHub and CI/CD pipelines for automated deployment of data science solutions.
  • MLOps: Experience in MLOps, including setting up model monitoring and optimization, ensuring models perform at their best.
  • Web Services: Knowledge of web service frameworks like FastAPI and Flask for hosting models and integrating them with existing services through RESTful APIs.
  • Communication Skills: Excellent communication skills with the ability to engage and effectively present to a diverse range of stakeholders.
  • Interdisciplinary Collaboration: Demonstrated ability to work collaboratively within interdisciplinary teams including product, engineering, business, and technology experts.

Desired Qualifications

  • Advanced Education: A PhD or advanced Master's degree in a scientific discipline such as Statistics, Computer Science, Operational Research, Mathematics, or Physics.
  • Machine Learning Expertise: Experience in one or more advanced machine learning areas, including Supervised Learning, Deep Learning, Probabilistic Inference, Statistical Modeling, Bayesian Statistics, Unsupervised Learning, and Reinforcement Learning.
  • Passion for Software Development: A strong passion for software development and engineering, complementing your data science skills.
  • Industry Experience: 2-4 years of industry experience, with a proven record of bringing concepts and models from inception to production, quantifying their business impact.
  • Consumer/Product Expertise: Previous experience in a consumer, product, or eCommerce business is beneficial, demonstrating your ability to handle real-world challenges.
  • Academic Research: Academic research experience is advantageous, showing your ability to propose novel and creative solutions to non-standard machine learning problems.
  • Containeriz