Keshav M.

About Me

Keshav is a Databricks Certified Data Engineer with 3+ years of experience in ETL development, reporting automation, and cloud data solutions across advertising, finance, and retail. He specializes in pipeline optimization and end-to-end ETL solution development using Python, Spark, SQL, Databricks, Airflow, Apache Superset, PySpark, and AWS (S3, Glue, Lambda, RDS). Some of Keshav's accomplishments include managing 15+ Apache Airflow data pipelines, developing a Python application for end-to-end automation of monthly performance report data extraction from six portfolio companies, architecting a decision analytics pipeline using AWS to ETL financial models to a bitemporal database for audits, and successfully migrating a pipeline to an EKS-hosted framework as an individual contributor.

AI, ML & LLM

Apache Airflow, Generative AI

DevOps

AWS, Amazon CloudWatch, AWS Glue, AWS S3, AWS Lambda, AWS RDS

Work history

MiQ
Senior Analyst
2024 - 2025 (1 year)
Bangalore, India
  • Generated insights from 20+ programmatic campaigns by analyzing log-level data using PySpark and Databricks.

  • Performed audience analysis, shopper segmentation, and sales trend identification, contributing to a 15% ROI uplift and spotlighting categories driving 40% of seasonal sales.

  • Engineered Python automation workflows using python-pptx to dynamically update weekly campaign reports, reducing manual effort by 93% and accelerating insights delivery for hospitality and education clients.

  • Collaborated with account managers to deliver persona and trend insights using shopper behavior data that shaped creative strategy and timing, driving a 20% lift in CTR and 12% VCR improvement across top platforms.

Data Analysis, Python, Big Data, Market Research & Analysis, Analytics, PySpark, Databricks, Customer Segmentation, Workflow Automation, Consumer Behavior, Workflow Optimization
TresVista
Senior Data Analyst
2021 - 2023 (2 years)
Pune, India
  • Managed 15+ Apache Airflow data pipelines, ingesting terabytes of data from 10+ vendors, including credit/debit card transactions, web traffic, location data, and financial metrics via API, S3, and SFTP.

  • Used Python, Apache Spark, and Databricks for transformations, achieving a 40% processing time reduction.

  • Created a retail consumer data dashboard in Apache Superset, using SQL to develop virtual datasets for rolling calculations of key metrics (e.g., retail store foot traffic, sales, and app/website monthly active users) at company and sector levels.

  • Architected a decision analytics pipeline using AWS stack (S3, Lambda, Glue, RDS) to ETL financial models to a bitemporal database for audits and downstream analytics.

  • Built an SMTP-based email alerting system that sent summaries of the models to 1,000+ analysts; each summary included an automated quality check that verified formulas for key metrics and highlighted missing year-periods.

  • Migrated the pipeline to an EKS-hosted framework within 4 weeks as an individual contributor, deciding all timelines and changes and developing the unit tests.

  • Developed a Python application for end-to-end automation of monthly performance report data extraction from six portfolio companies of a private equity firm and populated the extracted data into Tableau for easy viewing and analysis, saving over 80 work hours per month.

  • Improved team efficiency by documenting pipeline tasks, average processing times, and notebook details, and by creating automated QC notebooks on Databricks.

Enactus CVS
R&D Member
2018 - 2019 (1 year)
New Delhi, India
  • Conducted 10 research visits to various Delhi slums to understand ground-level problems and current methods of dealing with them.

  • Researched product-based ideas (cost analysis, local market research, etc.) as an alternative solution to some problems faced by certain communities in and around Delhi.

  • Worked on cost analysis and procurement of material for a pilot project of an aloe vera soap recipe made of waste vegetable oil supplied by restaurants (visited restaurants near the college for that purpose).

R&D, Market Research, Cost Analysis

Education

Databricks Data Engineer Associate & Generative AI Fundamentals (Expires May 2027)
Databricks
2025 - 2025
Machine Learning and Statistical Analysis | Scientific Computing and Python for Data Science
WorldQuant University
2020 - 2020
BSc Computer Science and Mathematics
University of Delhi - India
2018 - 2021 (3 years)