Data Engineer
- Data Engineer
- New York
- $86 K - $149 K
- Full Time
About Cybersyn
Cybersyn is a new Data-as-a-Service (DaaS) company backed by Sequoia, Coatue, and Snowflake. Our mission is to make the world's economic data accessible to governments, businesses, and entrepreneurs, enabling a new generation of decision-makers. We acquire unique data assets (corporations, licenses, data rights, consumer dividends) and build derivative products focused on identifying where consumers and businesses spend money. Cybersyn aims to disrupt the traditional market intelligence industry by operating as a hybrid of an investment firm and a data-focused technology company. If we succeed, we could transform an industry worth hundreds of billions of dollars and create a SimCity for the real world.
We have launched a significant number of public datasets on the Snowflake Marketplace, each meticulously cleaned, restructured, and made compatible.
Find our current data.
Test our data on our Streamlit App.
About the role:
Cybersyn is hiring an experienced engineer to improve the technology stack used by our data science and product teams and to build ingestion pipelines for public and private data sources. We are specifically looking for an engineer who is passionate about the Snowflake Data Cloud and about optimizing workloads for cost efficiency.
What you will do:
Help move data from where it originates to where we need it (in Snowflake): this typically means building jobs to extract, download, or transform data as efficiently as possible. You will need to prioritize compute efficiency and build context for what the data actually contains.
Optimize Snowflake for performance and cost
Provide infrastructure guidance on Snowflake capabilities to support business and technical use cases
Provide operational support for data warehouse issues such as data loading problems, transformation issues, and query optimization
Take end-to-end ownership of your projects and enjoy collaborating with different functions across the company
Who you are:
Experience working with multiple (external) datasets, including cleaning, joining, and munging data; experience with public data sources (e.g. US Census, ACS) is a major advantage
Experience with Snowflake is essential
Proficiency in Python and SQL is crucial
Experience with dbt and orchestration systems (Dagster, Prefect, Mage, Kestra, or an equivalent) is a strong plus
Experience building and operating data pipelines for real customers in production systems
What you will gain:
Opportunity to influence Cybersyn’s initial technology decisions
Access to some of the most interesting and extensive economic data in the world, including real-time expenditure, transaction, and clickstream data from both third-party and proprietary sources.
Most of our data is exclusive and not available to any external parties.
Our system is designed with diverse data sources in mind: we aren't confined to data from a single product or theme. We work with data from governments, payment processing systems (like bank records), mobile devices and apps, and SaaS exhaust (data collected by B2B SaaS).
A fast-paced culture, immense responsibility and autonomy from day one.