Kevin is a driven Data Engineer who builds real-time ETL pipelines in Python, streams data with Kafka, and processes or models it with Spark to power real-time interactive dashboards for visual insight. He builds machine learning and deep learning models in Python for time series, classification, and NLP problems. He also has good knowledge of AWS, GCP, and Azure for serverless applications that involve data manipulation, storage, processing, or streaming.
Designed and built data pipelines for BI solutions. Integrated REST API endpoints from applications such as Shopify and Rutter.
Migrated cloud-based warehouse data from Snowflake and AWS Redshift to Google BigQuery using Airflow. Used machine learning to predict the likelihood of an organic visit to one of the client stores.
Automated data upload and aggregation in Postgres using Python for easy extraction from the backend. Set up migration scripts with rollbacks in Knex for the frontend DB using JavaScript. Set up an ETL pipeline using AWS Transfer, S3, Lambda (Python), and Postgres RDS.
Worked on creating digital twins for supply chain procedures and presented the data architecture proposal to management.
Streamed real-time data from a MySQL source to Google BigQuery (GBQ), then replicated it to Azure and AWS using a multi-node Kafka cluster.
Applied real-time processing with Spark to data streamed from the database to the cloud. Automated data extraction from the Cognos Framework Manager XML model using Python.
Performed web scraping, mining, and validation of data relevant to client requests, ranging from commodity futures and general financial market data to geolocation.
Worked on data ETL for customer insight, as well as feature engineering and data modeling with supervised and unsupervised learning for forecasting and classification of various phenomena and events.
Created a Python script to upload batches of data directly into Google BigQuery for testing a serverless approach to data migration. Performed string matching and database merging by implementing NLP techniques such as edit-based and token-based measures, combined with machine learning.
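To illustrate the edit-based and token-based measures mentioned above, here is a minimal sketch in plain Python. It is not Kevin's actual implementation: the function names are hypothetical, Levenshtein distance stands in for the edit-based measure and Jaccard token overlap for the token-based one, and the step of feeding such scores into a machine learning model is omitted.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit-based measure: minimum number of single-character
    insertions, deletions, or substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def edit_similarity(a: str, b: str) -> float:
    """Normalize the edit distance to a 0..1 similarity score."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def jaccard_similarity(a: str, b: str) -> float:
    """Token-based measure: shared lowercase words over all words."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
```

In a record-matching setting, scores like these could be computed for each candidate pair of names and used as input features for a classifier that decides whether the two records refer to the same entity.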