About Me

Hi, I am Iván Gómez :wave:
I’m a Software Engineer with 6+ years of experience, specializing in Data Engineering.
As the saying goes, “When one teaches, two learn”. I truly believe in sharing what I learn from working on projects and overcoming challenges, which benefits both others and myself.
Accustomed to the dynamic and fast-paced environment of startups, I have developed the ability to adapt and effectively manage demanding workloads.
Want to know more about me? Take a look at my personal projects or my list of books :blush:.

Programming

  • Python

  • SQL

  • C#

  • Scala

  • Java

  • JavaScript

Other

  • Data Processing: Spark, Pandas, Hive, Impala

  • Mobile/Web Development: Xamarin, React

  • AWS: Lambda, CloudWatch, DynamoDB, API Gateway…

  • Azure: Data Factory, Functions, Event Grid, Event Hubs…

  • DB: SQLite, Postgres, SQL Server, SAP HANA…

  • Technical writing

  • Databricks, Git, Linux, Docker…

  • Web scraping, asyncio, scikit-learn, NLP…

Certifications

  • Databricks Certified: CRT020 Associate Developer for Apache Spark 2.4 with Scala 2.11

Experience

Lead Data Engineer, DeNexus, US-Remote

September 2021 — present

One-man-army role. Designed, implemented, and deployed a Data Platform from scratch using AWS and Databricks. Used Delta Lake and MLflow to provide reliable, versioned data to models, with Silver and Gold tables ensuring data consistency. Created a Data Catalog to facilitate access to the Data Lake tables (>100 created), such as MITRE, CVEs, Threat Actors, anonymized customer data…
This role involved end-to-end data management, from developing connectors (APIs, Web Scraping, Syslog, GitHub…) for data extraction to transformation, table creation, and data modeling, while ensuring the security and confidentiality of sensitive data.

Data Engineer, BASF, Madrid

March 2020 — September 2021

Automated and orchestrated ETL processes across billions of rows of data, which increased replication frequency from daily to hourly. The number of daily failed jobs decreased by 90%. Provided cloud environments and reliable data to more than 100 different teams. The number of data quality incidents decreased by 70%.
Built data pipelines that replicated data from more than 20 source systems (SAP, Salesforce, APIs, RDBMS, Kafka…) into a Delta Lake.
Created a dedicated logging system for monitoring processes that generated customized alerts and allowed users to query real-time statistics on all processes and the replication status of each table.

Consultant, Bosonit, Logroño+Madrid+London

January 2017 — March 2020

Contracts with Santander bank (Madrid, 1 year) and Santander UK bank (London + remote, 2 years).
Translated business needs into Spark/Hive/Impala code for the creation of an Anti-Money Laundering (AML) tool. Optimized and orchestrated existing data pipelines to decrease execution time by 50%. Configured and deployed on-premises Cloudera clusters with an S3-based Data Lake (Scality).
Built, orchestrated and optimized several data pipelines for the creation of a credit risk forecast model.

Software Engineer, IT Service Universidad de La Rioja, Logroño

January 2016 — December 2016

As the sole developer, created a Java application for remote control, monitoring, and management of university computer classrooms. Implemented real-time data processing using sockets, AWK, and regular expressions, as well as PowerShell scripting with PsTools and SNMP.
This application was later expanded and refined for my final degree project.

Education

Master’s Degree in Visual Analytics & Big Data, UNIR

November 2017 — July 2018

Bachelor’s Degree in Computer Science, Universidad de La Rioja

September 2013 — June 2017

The final degree project was recognized with an award.