Lead Data Engineer, DeNexus, US-Remote
September 2021 — present
Solo role. Designed, implemented, and deployed a Data Platform from scratch on AWS and Databricks. Used Delta Lake and MLflow to provide reliable, versioned data to models, with Silver and Gold tables ensuring data consistency. Built a Data Catalog to facilitate access to the Data Lake tables (>100 created), covering sources such as MITRE, CVEs, Threat Actors, anonymized customer data…
This role covered end-to-end data management, from developing connectors (APIs, Web Scraping, Syslog, GitHub…) for data extraction through transformation, table creation, and data modeling, while ensuring the security and confidentiality of sensitive data.