Mayur is a Data Architect and Engineer with 17 years of IT experience managing and supporting business process-led technology and strategic management initiatives. He builds products from scratch, developing PoCs for CXOs and converting them into production-grade solutions. With a strong analytical background, Mayur is a high-caliber Big Data/Data Warehouse/ETL architect with expertise in data management, focused on Big Data, EDW, Cloud Data Warehouse, real-time analytics, and data lakes. He also brings a good balance of technical and management skills and a proven ability to lead large, complex projects and globally distributed teams.
Conceiving, planning, and developing a project to switch from UKG-managed Cassandra clusters to a fully managed database-as-a-service offering.
Migrating 45+ Cassandra clusters to a fully managed service with zero customer downtime; saving ~$22 million by reducing the organization's Cassandra TCO by one-eighth.
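Zero-downtime migrations of this kind are often executed with a dual-write pattern: the application writes to both the legacy and the new cluster while reads stay on the legacy side until the new cluster is verified and cut over. Below is a minimal Python sketch of that pattern using the DataStax cassandra-driver; the hosts, keyspace, table, and reconciliation helper are all illustrative, not the actual UKG setup.

    # Dual-write sketch for a zero-downtime Cassandra migration.
    # All hosts, keyspace, and table names below are hypothetical.
    from cassandra.cluster import Cluster

    old_session = Cluster(["old-cassandra.internal"]).connect("app_ks")
    new_session = Cluster(["managed-dbaas.example.com"]).connect("app_ks")

    INSERT = "INSERT INTO events (id, payload) VALUES (%s, %s)"

    def log_for_reconciliation(event_id, exc):
        # Stub: in practice, record the miss so a backfill job can reconcile it.
        print(f"reconcile later: {event_id} ({exc})")

    def dual_write(event_id, payload):
        # The legacy cluster stays the source of truth; a failure here is fatal.
        old_session.execute(INSERT, (event_id, payload))
        try:
            # Best-effort write to the managed service; misses are reconciled.
            new_session.execute(INSERT, (event_id, payload))
        except Exception as exc:
            log_for_reconciliation(event_id, exc)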
Planning the architectural runway to support new business features and capabilities, and establishing serviceability and observability of the application.
Worked on various client projects, implementing metadata management, data warehouse modernization, ETL, and syntax and data validation services.
Designed and developed various features: schema translation with target-based optimization, DML/code transformation, automated ETL-to-PySnowSQL/PySpark translation, and translation support for major modern DWs.
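To illustrate the kind of automated dialect translation involved, here is a minimal Python sketch using the open-source sqlglot library (shown purely as an example of the technique, not necessarily the engine behind the features above):

    # SQL dialect translation sketch using the open-source sqlglot library.
    # The source query and dialect pair below are illustrative.
    import sqlglot

    teradata_sql = "SELECT cust_id, ZEROIFNULL(balance) AS bal FROM accounts"

    # Transpile the Teradata query into Snowflake syntax.
    snowflake_sql = sqlglot.transpile(teradata_sql, read="teradata", write="snowflake")[0]
    print(snowflake_sql)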
Reduced the operating cost of a modernization project by 80% with AI-powered frameworks.
Architected and designed a Big Data cluster provisioning tool. Designed and developed the core deployment framework and built provisioning for Hadoop and its ecosystem components (Ganglia, Kafka, Storm, Oozie, ZooKeeper, etc.). Implemented monitoring metrics using Ganglia and Prometheus, and developed log management, service management, and property auditing. Accountable for HA-based deployment of proprietary software on top of HDP and CDH clusters. Owned end-to-end deployment across clouds (AWS, Azure).
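As a small illustration of the Prometheus side of that monitoring work, the sketch below exposes a cluster-health gauge from Python with the prometheus_client library; the metric name, cluster name, and health check are hypothetical:

    # Minimal Prometheus exporter sketch; metric and cluster names are made up.
    import random
    import time

    from prometheus_client import Gauge, start_http_server

    # Hypothetical gauge tracking healthy ecosystem services per cluster.
    HEALTHY_SERVICES = Gauge(
        "cluster_healthy_services",
        "Number of healthy services in the cluster",
        ["cluster"],
    )

    def poll_cluster(cluster):
        # Stand-in for a real health check (e.g., Ambari/Cloudera Manager APIs).
        return random.randint(8, 12)

    if __name__ == "__main__":
        start_http_server(9100)  # Prometheus scrapes :9100/metrics
        while True:
            HEALTHY_SERVICES.labels(cluster="hdp-prod").set(poll_cluster("hdp-prod"))
            time.sleep(15)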
Worked on a metadata catalog with an automated metadata crawler, data observability and quality profiling, data and cross-system lineage for impact analysis, end-to-end column-level lineage, and intra-system lineage. Collaborated with other teams, liaised with stakeholders, consulted with customers, kept them informed of industry trends, and ensured data security. Technologies used: Spark, Spark GraphX/GraphFrames, Java, Spring Boot, Apache Ranger, ANTLR, Solr.
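Lineage of this kind maps naturally onto a graph. The PySpark sketch below models column-level lineage with GraphFrames and runs a reachability query for impact analysis; the tables, columns, and edges are invented for illustration:

    # Column-level lineage as a graph with GraphFrames; all names are made up.
    from pyspark.sql import SparkSession
    from graphframes import GraphFrame

    spark = SparkSession.builder.appName("lineage-sketch").getOrCreate()

    # Vertices are fully qualified columns; edges are "flows into" relations
    # extracted from parsed SQL/ETL code (e.g., via an ANTLR grammar).
    vertices = spark.createDataFrame(
        [("src.orders.amount",), ("stg.orders.amount",), ("dw.revenue.total",)],
        ["id"],
    )
    edges = spark.createDataFrame(
        [("src.orders.amount", "stg.orders.amount", "copy"),
         ("stg.orders.amount", "dw.revenue.total", "aggregate")],
        ["src", "dst", "op"],
    )

    g = GraphFrame(vertices, edges)

    # Impact analysis: which downstream columns does src.orders.amount reach?
    paths = g.bfs("id = 'src.orders.amount'", "id = 'dw.revenue.total'")
    paths.show(truncate=False)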
Designed and developed an EDW code and query log analysis feature in Spark. Implemented a technical debt and dead code analysis feature to identify dormant code and tables in the EDW. Designed ML-based query execution time prediction for target DWs (Spark, Snowflake, Redshift). Developed various features: an effort estimation and project planning module for DW migrations, a target compatibility scope/matrix, SaaS-based cloud deployment for self-serve assessments, and a customized product offering for partner connect (AWS, Azure, GCP). Reduced analysis time by 70% by re-platforming the product on Databricks and Snowflake.
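The execution time prediction piece can be sketched with Spark MLlib as below; the query-log features, training rows, and model choice are hypothetical stand-ins for whatever the production model actually uses:

    # Sketch: predict query execution time from simple query-log features
    # with Spark MLlib; features and training data are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.regression import GBTRegressor

    spark = SparkSession.builder.appName("query-time-sketch").getOrCreate()

    # Hypothetical rows: (#joins, bytes scanned, #output columns, seconds).
    logs = spark.createDataFrame(
        [(2, 1.5e9, 12, 42.0), (0, 2.0e7, 3, 1.2), (5, 8.0e9, 40, 310.0)],
        ["num_joins", "bytes_scanned", "num_columns", "exec_seconds"],
    )

    assembler = VectorAssembler(
        inputCols=["num_joins", "bytes_scanned", "num_columns"],
        outputCol="features",
    )
    train = assembler.transform(logs)

    # Gradient-boosted trees regressor predicting execution time in seconds.
    model = GBTRegressor(labelCol="exec_seconds", featuresCol="features").fit(train)
    model.transform(train).select("exec_seconds", "prediction").show()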