#WeAreCrowdStrike and our mission is to stop breaches. As a global leader in cybersecurity, CrowdStrike has revolutionized the industry. Our market-leading cloud-native platform offers unmatched protection against sophisticated cyberattacks. We seek passionate individuals with a relentless focus on innovation and a deep commitment to customer satisfaction to join us in shaping the future of cybersecurity.
Recognized as a top workplace, CrowdStrike is dedicated to fostering an inclusive, remote-first culture that enables you to balance work and life while advancing your career. Interested in joining a company that sets industry standards and leads with integrity? Be part of a mission that matters - one team, one fight.
About the Role
We are hiring a Data Analyst for our Generative AI Research Center. This junior/entry-level position offers rapid career growth opportunities. As a Data Analyst, you will focus on data and corpus labeling, supporting our large language models (LLMs) and cybersecurity initiatives. Your role is critical for enhancing our products by ensuring the accuracy and quality of the data used to train models and detect threats. We will mentor and train you in security topics as needed. A strong interest in our mission and a willingness to meet the needs of our product teams and customers are essential.
If you thrive on technical challenges and want to work at scale, apply now!
Interview Process
Our interviewing process includes both online and onsite stages, where applicable.
Responsibilities
- Label and annotate cybersecurity-related datasets for analysis and machine learning tasks.
- Ensure labeling accuracy and consistency across various datasets including threat intelligence data, incident reports, and network logs.
- Gather data from multiple cybersecurity sources, including threat intelligence feeds, logs, and internal reports.
- Clean and preprocess data for analysis and modeling.
- Perform exploratory data analysis to identify patterns and insights related to cybersecurity threats and vulnerabilities.
- Utilize statistical methods to interpret data and identify potential security issues.
- Create and maintain dashboards and reports to communicate findings to cybersecurity stakeholders.
- Develop clear and concise data visualizations, highlighting key security metrics and trends.
- Collaborate closely with analysts, data scientists, and engineers to support their data needs.
- Support MLOps pipeline implementation and optimization, leveraging data insights to deploy, monitor, and scale machine learning models.
- Participate in team meetings and contribute to project planning with data-driven insights.
- Document processes, methodologies, and insights from data analysis activities.
- Maintain clear records of data sources, cleaning steps, and labeling criteria for reproducibility and auditability.
Requirements
- Bachelor's degree in Computer Science or a related STEM field.
- Proficiency in data manipulation and analysis tools (e.g., Python, SQL).
- Familiarity with relevant libraries and frameworks (e.g., TensorFlow, PyTorch).
- Experience with data labeling and annotation tools.
- Strong analytical and problem-solving skills, with an understanding of cybersecurity concepts.
- Excellent communication and collaboration abilities.
- Attention to detail and a commitment to data accuracy.
Tech Stack
(You do not need to know everything; a strong capacity to learn is essential.)
- Python
- SQL
- Data labeling and annotation tools such as Labelbox and Prodigy
- Data analysis and visualization tools such as pandas, NumPy, Matplotlib, and Seaborn
- Docker
- Kubernetes
- AWS
- Kafka
- Git
Bonus Points
- Exposure to Go, AWS, Cassandra, Kafka, Elasticsearch
- Experience with Language Models, Data Science, Data Engineering
- Experience with data labeling and annotation tools, particularly in a cybersecurity context
Benefits of Working at CrowdStrike