Kevin M.

Senior Data Engineer

Tachira State, Venezuela

About Me

Kevin is a driven Data Engineer focused on building real-time ETL pipelines with Python, streaming data with Kafka, and processing and modeling it with Spark to power real-time interactive dashboards. He builds Machine Learning and Deep Learning models in Python for time-series, classification, and NLP problems, and has solid working knowledge of AWS, GCP, and Azure for serverless applications involving data manipulation, storage, processing, and streaming.

Work history

UpStack
Senior Data Engineer
2022 - Present (2 years)
Remote
  • Build and improve databases, data-acquisition workflows, ETL/ELT and big data pipelines, and deploy cloud services across client projects.

  • Administer infrastructure solutions to improve data models, increase data accessibility and foster data-driven solutions for clients.

  • Implement monitoring solutions to ensure data integrity, working closely with engineers, product managers, and other stakeholders.

Clevertech
Senior Data Engineer
2020 - Present (4 years)
Remote
  • Worked on designing and building data pipelines for BI solutions; integrated REST API endpoints from applications such as Shopify and Rutter.

  • Migrated cloud warehouse data from Snowflake and AWS Redshift to Google BigQuery using Airflow. Used Machine Learning to predict the likelihood of an organic visit to one of the client's stores.

  • Automated data upload/aggregation in Postgres using Python for easy extraction from the backend. Set up migration scripts with rollbacks in Knex for the frontend DB using JavaScript. Built an ETL pipeline using AWS Transfer, S3, Lambda (Python), and Postgres RDS.

number8
Data Engineer/Team Lead
2019 - 2020 (1 year)
Remote
  • Worked on the creation of digital twins for supply-chain processes, presenting the data-architecture proposal to management.

  • Streamed real-time data from a MySQL source to Google BigQuery and replicated it to Azure and AWS using a multi-node Kafka cluster.

  • Applied real-time processing to data using Spark for streaming from the database to the cloud. Automated data extraction from a Cognos Framework Manager XML model using Python.

KPMG
Data Scientist
2018 - 2019 (1 year)
Germany
  • Performed web scraping, mining, and validation of data relevant to client requests, ranging from commodity futures and general financial-market data to geolocation data.

  • Worked on data ETL for customer insight, as well as feature engineering and data modeling with supervised and unsupervised learning for forecasting and classification tasks across multiple domains.

  • Created a Python script to upload batches of data directly into Google BigQuery to test a serverless approach to data migration. Performed string matching and database merging by implementing NLP techniques such as edit-based and token-based similarity measures combined with machine learning.

Portfolio

Data Engineer - Parker Financial

The project involved assisting the team in building an ETL pipeline that connected to different endpoints using TypeScript. The transformation and validation of the data were done with EMR, while the pipeline was orchestrated on Kubernetes clusters. I also built most of the views for the dashboard used by the underwriting team, leveraging dbt Cloud.

Data Engineer - KPMG

The project involved building a product to flag potentially fraudulent invoices, for which I leveraged my skills in Machine Learning and feature engineering. Another of my tasks was building a pipeline that converted geolocation data into human-readable addresses for our clients.

Data Engineer - Suffolk

The project involved creating an Airflow ETL pipeline. I worked on connecting it to various endpoints, whether data lakes, API endpoints, or SFTP servers, extracting data and pushing it to AWS S3; from there, I created a Glue job that transformed and validated the data and loaded it into the Redshift warehouse.

Education

Master's degree, Data Engineering
Jacobs University Bremen
2017 - 2019 (2 years)
Master's degree, International Business Management
Universitat Autònoma de Barcelona
2009 - 2010 (1 year)
Bachelor's degree, Economics and Business Administration
Universidad Nororiental Gran Mariscal de Ayacucho
2004 - 2008 (4 years)