Data Trainer - Machine Learning & NLP


Data Trainer / Data Scientist - Machine Learning & NLP at MindTech

MindTech, a pioneer in delivering comprehensive data security and compliance solutions, is in search of a seasoned Data Trainer / Data Scientist with a focus on Machine Learning and NLP. Our ideal candidate thrives in a high-tech environment, shaping high-quality datasets to enhance data-driven solutions across various business domains.

Description of the Role:

The Data Trainer / Data Scientist will be instrumental in generating and managing robust datasets used by AI/software developers, QA teams, and field engineers. This position primarily focuses on creating and maintaining datasets containing personally identifiable information (PII), which are crucial for training AI models and facilitating QA testing. While the core aim is not to develop new models, modeling capabilities will be considered a valuable addition.

Seniority:

We are seeking a senior team member who can operate independently and inject creativity into our operations, enhancing our business offerings and data solutions.

Key Responsibilities:

  • Development of representative datasets that mimic customer data for training modules, aiding QA and development teams.
  • Extraction of sensitive data elements tailored to specific product and customer requirements.

Requirements:

  • Proven track record in developing complex ETL pipelines, particularly those handling natural language text and patterns.
  • Expertise in Python and tools like pandas, numpy, Gensim, spaCy, NLTK; proficiency with SQL and NoSQL databases.
  • Demonstrated diligence in data quality and a deep understanding of varying business needs.
  • Skilled in writing modular code and participating in collaborative environments including code reviews.
  • Experience liaising with software developers, product managers, and other stakeholders to integrate data solutions and refine business requirements.
  • Strong communication skills with a knack for clear and organized documentation of software and data.

Nice to Have:

  • Experience with text analytics pipelines and machine learning models focused on text classification and entity detection.
  • An interest or background in web scraping, automated content creation, ML or AI life cycles, CI/CD pipelines, and MLOps.
  • Curiosity and eagerness to stay informed about the latest industry trends in machine learning and artificial intelligence.

Other Technologies:

  • Experience with Large Language Models (LLMs) applied in real business scenarios, particularly in content or data generation.
  • Familiarity with cloud computing platforms such as Google Cloud and AWS is preferred.

Benefits:

Join MindTech and enjoy a friendly, professional atmosphere with benefits such as a high-end laptop or workstation, access to the well-being platform "Rozumi" for you and your family, paid sick leave, vacation days, and national holidays. We are committed to your professional growth and advancing your career.

About the Project:

Our product offers a precise master catalog of sensitive data usage, enabling businesses to manage data security and compliance. This fully automated solution caters to all types of data, supporting comprehensive data processing and integration into a single, detailed master catalog.

If you are passionate about leveraging your analytical skills to advance data-driven solutions, apply now to become part of a forward-thinking company dedicated to innovation and quality.