Hands-on experience with feature engineering (including feature validation, feature transformation, feature pipelines, serving and training databases, feature metadata, and artefact collection), as well as the machine learning model lifecycle.
Practical experience with Java full-stack development and Python, with expertise in TensorFlow, ReactJS, and ADO.
Experience with Data as a Service implementation.
Strong statistical knowledge, as well as analytical and problem-solving abilities, are desirable.
Familiarity with cache and vector databases.
A thorough understanding of Responsible AI Workflows and Model Management would be highly advantageous.
Applicable experience with Big Data technologies (including Hortonworks HDP, Apache Hadoop, HDFS, Hive, Sqoop, Flume, ZooKeeper, HBase, Oozie, Spark, NiFi, Kafka, SnapLogic, AWS, and Redshift).
Experience with monitoring tools.
Development capabilities using Python, R, and Spark.
Excellent management and analytical skills.
Strong written and oral communication skills.
A sound understanding of, and experience with, project methodologies (e.g., SDLC, Agile).
Experience designing and implementing ETL pipelines using Apache Spark (including Structured Streaming), Hive, Snowflake, and Python for event stream data processing.
Experience tuning the performance of Apache Spark and Hadoop YARN.
Experience with Java programming.
Capability to provide oversight and guidance to Hadoop and Development teams.
Knowledge of Camunda and Angular.
Ability to debug and modify shell and Python scripts.
Thorough understanding of the Big Data ecosystem.
Solid understanding of Big Data architecture patterns, design patterns, estimation techniques, performance tuning, and troubleshooting.
Availability for on-call support over weekends.
Ability to liaise with multiple application teams and coordinate issue resolution.
Strong analytical and interpersonal skills.
Continuous monitoring and management of the Hadoop cluster.