Omkar P.

Thane, Maharashtra, India

About Me

Omkar is a Data Engineer who manages complex, cloud-based data workflows, builds robust ETL pipelines, and optimizes data querying and issue analysis. He works closely with teams and product managers to understand and resolve issues, proposing and developing solutions that ensure end-to-end data accuracy and integrity. His expertise includes Big Data, Spark, Kafka, Databricks, EMR, Airflow, SQL, Python, and Data Warehousing. He is also passionate about AWS cloud computing and enjoys solving SQL problems posted on YouTube and LeetCode.

Database

SQL, Azure SQL Databases

Other

Data Engineering, Big Data, Spark, Kafka, Databricks, ETL Pipelines, Data Warehousing, Data Queries, Talend, Data Modeling, Business Requirements, Data Flows, Hadoop

Work history

Nielsen
Software Engineer
2022 - Present (3 years)
Mumbai, India
  • Redesigned an end-to-end pipeline for orchestrating channel schedules and their formats, replacing a legacy .NET-based system with Apache Airflow, AWS S3, AWS SQS, and AWS Glue, improving efficiency by 50%.

  • Worked on migrating on-premises data from SQL Server to AWS using Talend, ensuring seamless integration and better data accessibility in the cloud.

  • Partnered with the team to architect and document a video popularity scoring pipeline on AWS EMR, Apache Airflow, and AWS S3.

  • Led development on approximately 20% of project modules, driving enhancements and feature development within an Agile framework.

  • Provided technical guidance and mentorship to junior developers and conducted hands-on training for new team members, fostering a collaborative learning environment.

Python, Apache Airflow, AWS, Talend, Data Modeling, SQL, Big Data, Data Warehousing, SQL Stored Procedures, Data Extraction, Data Transformation, AWS SQS, AWS S3, Data Management, Data Processing, ETL Pipelines, SQL Server, XML, PostgreSQL, AWS Glue, Data Migration, AWS EMR, AWS Athena, AWS Lake Formation
GEP Worldwide
Software Engineer
2020 - 2022 (2 years)
Mumbai, India
  • Securely ingested data from on-premises SQL Server into Azure Data Lake Storage (ADLS) using Azure Data Factory.

  • Designed and documented the architecture for key features like Purchase Order Consolidation and an Automated Trigger Workflow, which automatically updates requisition statuses from draft to approved or directly creates orders based on predefined parameters using Azure Databricks and ADLS Gen2.

  • Optimized 40% of existing Databricks pipelines using Spark optimization techniques, reducing costs and improving efficiency.

  • Implemented a CI/CD pipeline for Databricks using GitHub Actions, enabling automated deployments and streamlined workflow management.

  • Contributed to UAT and production support by addressing critical and blocker bugs, providing timely fixes for issues raised by clients.

  • Developed modules and enhancements for smart products according to business requirements.

Azure Data Factory, Azure Logic Apps, Data Flows, Data Modeling, Big Data, Data Warehousing, SQL, Angular, Business Requirements, Azure Data Lake Store, SQL Server, Azure Databricks, Spark, CI/CD Pipelines, GitHub Actions, Workflow Optimization, User Acceptance Testing (UAT), Production Support

Education

Certificate in Cloud Big Data Engineering
TrendyTech
2024 - 2025 (1 year)
BE Information Technology
University of Mumbai - India
2016 - 2020 (4 years)