Syed M.

About Me

Muneeb has been in the data management industry for over five years, focusing primarily on data engineering, warehousing, and analytics. He has worked with various companies handling both relational and big data needs. An expert SQL query writer and data developer, Muneeb is passionate about developing databases, ETL pipelines, and visualization dashboards. He also has hands-on experience with multiple cloud platforms, including Alibaba Cloud, Azure, and Google Cloud's BigQuery.

Backend

Database

SQL Databases, SQL Stored Procedures, MySQL, Database Design, Microsoft SQL Server, DuckDB, PostgreSQL

DevOps

Azure, Alibaba Cloud, Azure Stream Analytics, Azure Logic Apps, Cloud Computing

Workflow

Other

Python, ETL, Data Engineering, Data Warehousing, Query Optimization, Business Intelligence (BI), Query Plan, Data Integration, Data Pipelines, Data Migration, Data Analytics, Data Queries, Data Warehouse Design, Realtime Big Data, Apache Hive, MinIO, HDFS (Hadoop Distributed File System), ADF, Talend ETL, Microsoft Data Transformation Services (now SSIS), Data Modeling

Work history

Zoetis
Consultant Data Engineer
2024 - Present (1 year)
Remote
  • Working on a data warehousing project that integrates various source streams using both batch and streaming processes on Databricks with Apache Spark, storing data in ADLS as Parquet for faster retrieval. The project eliminates multiple database and ETL hops, replacing Azure Stream Analytics with Spark on Databricks for improved cost and time efficiency.

  • Designed and optimized ETL pipelines using ADF, ADLS Gen2, Databricks, and Azure SQL/Cosmos DB, while creating Databricks pipelines and Spark notebooks for batch and streaming data from IoT Hub.

  • Developed real-time data streams with Azure Stream Analytics, implemented CI/CD pipelines for ADF using GitHub, and established robust alert and monitoring systems in ADF and Databricks to ensure reliable data workflows and streamlined deployments.
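To illustrate the micro-batching idea behind unifying batch and streaming ingestion as described above, here is a minimal stdlib-Python sketch; the actual pipelines ran on Databricks with Spark Structured Streaming, and the function and event names here are hypothetical:

```python
def micro_batch(events, batch_size):
    """Group an incoming event stream into fixed-size micro-batches,
    so the same batch write logic can serve streaming ingestion."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

# Simulated IoT-style events; in production these would arrive from IoT Hub
stream = ({"device": i % 3, "reading": i} for i in range(7))
batches = list(micro_batch(stream, batch_size=3))
```

Each yielded batch would then be appended to Parquet files in the lake, which is how a single code path can replace separate batch and streaming services.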

Data Engineering, Data Warehousing, ETL Implementation & Design, Data Integration (ELT/ETL), Azure, Azure Cloud Services, Azure Blob Storage, Azure Data Factory, Azure Data Studio, Azure Data Lake, Azure Databricks, Azure Cosmos DB, Python, Azure SQL, Azure Synapse, MSSQL Server, PostgreSQL, SQL Server Integration Services (SSIS), SQL Server Management Studio (SSMS), Big Data Architecture, Big Data Architect, Azure Stream Analytics, Azure Service Bus, Azure Functions
Dataquartz
Lead Data Engineer
2022 - 2024 (2 years)
Remote
  • Led the development of an in-house data ingestion product using Python, Flask, DuckDB, Postgres, Grafana for dynamic visualization, and Prefect for ETL workflow management. Also containerized the entire application with Docker for better scalability and manageability.

  • Orchestrated end-to-end ETL pipelines, incorporating audit logging and data integrity checks along with failure/data discrepancy alerts.

  • Pioneered bug tracking, resolution, and new feature development in the data model.
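The audit-logging and integrity-check pattern mentioned above can be sketched in plain Python as follows; this is a simplified illustration (the real workflows were orchestrated with Prefect), and `run_step` and the dedupe step are hypothetical names:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl_audit")

def run_step(name, func, rows_in):
    """Run one ETL step, emit an audit record, and fail loudly if the
    step produces more rows than it received (a basic integrity check)."""
    start = time.time()
    rows_out = func(rows_in)
    log.info("step=%s rows_in=%d rows_out=%d secs=%.3f",
             name, len(rows_in), len(rows_out), time.time() - start)
    if len(rows_out) > len(rows_in):
        raise ValueError(f"{name}: produced more rows than it received")
    return rows_out

# Example: a dedupe step that preserves first-seen order
clean = run_step("dedupe", lambda rows: list(dict.fromkeys(rows)), [1, 2, 2, 3])
```

In a real orchestrator the audit record would go to a database table and the integrity failure would trigger an alert rather than a bare exception.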

Data Engineering, Data Warehouse Design, Flask, Python 3, SQLAlchemy, DuckDB, PostgreSQL, Grafana, Prometheus, Database Design, Database Development, JIRA, Agile Sprints, ETL Development, Prefect, Apache Airflow, Big Data, MinIO, Cloud Deployment, CI/CD, GitHub
Seeloz
Data Engineering Manager
2022 - 2022
Remote
  • Developed ETL projects with SQL, PySpark, Scala, ADF, ADLS, and Azure Logic Apps, extracting data from ERP systems and loading it into the data models.

  • Implemented streaming data pipelines using Azure Stream Analytics to process real-time data and maintain data integrity, and built a robust monitoring framework with PySpark, Postgres, and Grafana to ensure data correctness.

  • Crafted impactful data visualization reports, providing insights into key business metrics.

Azure Logic Apps, Apache Hive, Google BigQuery, Pyspark, Spark SQL, SQL, Database Design, Scala, Data Warehousing, Azure Blobs, Data Analysis, Data Engineering, Databricks, Query Optimization, Big Data Architecture, Data Pipelines, Data Quality Analysis, IntelliJ IDEA, Shell, Data Integration, Data Queries, Analysis, ETL, Business Intelligence (BI), Python 3, PyCharm, Big Data, Data Warehouse Design, Azure Cloud Infrastructure, ETL Tools, Databases, Python, CI/CD Pipelines, GitHub, Azure SQL, Data Analytics, Database Analytics, RDBMS, Data Processing, Business Intelligence (BI) Platforms, Azure SQL Databases, API Integration, SQL DML, SQL Performance, Performance Tuning, T-SQL (Transact-SQL), Reports, BI Reports, Apache Spark, Relational Databases, Data Modeling, Database Modeling, MariaDB, Business Logic, APIs, Data Architecture, Database Architecture, Logical Database Design, Database Schema Design, Relational Database Design, REST APIs, Azure Service Bus, JSON, Quality Management, MySQL, Dimensional Modeling, ELT, Pandas, Spark, Microsoft Azure, Schemas, Jupyter Notebook, Relational Data Mapping, BigQuery, Reporting, BI Reporting, Windows PowerShell, XML, AnyDesk, Apache Airflow, NoSQL, PostgreSQL, Database Optimization, DuckDB, Apache Flink, Data Extraction, CSV Export, CSV, Scripting, MongoDB, .NET
Daraz | Alibaba Group
Big Data Engineering and Governance Lead
2019 - 2022 (3 years)
Pakistan
  • Built and managed a DWH architecture and wrote automated ETL scripts using HiveQL, HDFS, HBase, Python, and Shell on a cloud platform for data ingestion.

  • Developed BI dashboards on Power BI, vShow, and FBI to gauge important metrics related to domains like customer funnel, marketing, and logistics.

  • Developed and maintained an enterprise data warehouse and monitored data ingestion pipelines on a daily basis using SQL, Python, Flink, ODPS, and ETL flows.

Apache Hive, SQL, Alibaba Cloud Data, Data Engineering, Data Warehousing, Data Governance, Big Data, Python 3, Shell, Data Visualization, Business Intelligence (BI), Query Optimization, Data Integration, PostgreSQL, MySQL, Data Analysis, Big Data Architecture, Data Pipelines, Data Quality Analysis, Data Queries, Analysis, ETL, Database Design, Data Warehouse Design, Azure Cloud Infrastructure, ETL Tools, Databases, Python, CI/CD Pipelines, GitHub, Data Analytics, Database Analytics, RDBMS, Data Processing, Business Intelligence (BI) Platforms, Azure SQL Databases, Microsoft SQL Server, API Integration, SQL DML, SQL Performance, Performance Tuning, T-SQL (Transact-SQL), Reports, BI Reports, Pyspark, Apache Spark, Relational Databases, Data Modeling, Database Modeling, Stored Procedure, Tableau, Dashboards, Dashboard Development, MariaDB, Business Logic, Microsoft Power BI, APIs, Data Architecture, Database Architecture, Logical Database Design, Database Schema Design, Relational Database Design, REST APIs, JSON, MySQL Workbench, Quality Management, IntelliJ IDEA, Dimensional Modeling, ELT, Pandas, Spark, Schemas, Jupyter Notebook, Relational Data Mapping, BigQuery, Reporting, BI Reporting, Windows PowerShell, XML, AnyDesk, Apache Airflow, NoSQL, Database Optimization, Hadoop, HDFS, Docker, Google BigQuery, Apache Flink, Data Extraction, CSV Export, CSV, Scripting, MongoDB
Qordata
Technical Consultant
2019 - 2019
Pakistan
  • Designed and developed end-to-end data ingestion pipelines to ensure reliable daily data flows.

  • Implemented and managed data flow jobs for data modeling solutions relevant to the health and life science industry, using tools like SQL Server Integration Services (SSIS) and Microsoft SQL Server.

  • Developed SQL queries, stored procedures, and dynamic SQL and optimized existing complex SQL queries to speed up day-to-day processes.
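The dynamic-SQL work described above can be illustrated with a small sketch; this uses Python's built-in sqlite3 rather than SQL Server, and the table, function, and filter names are hypothetical. The point it shows is binding values as parameters while composing the WHERE clause dynamically, which keeps queries plan-cacheable and injection-safe:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (id INTEGER, state TEXT, amount REAL)")
conn.executemany("INSERT INTO claims VALUES (?, ?, ?)",
                 [(1, "NY", 120.0), (2, "CA", 80.0), (3, "NY", 45.5)])

def find_claims(conn, filters):
    """Compose a WHERE clause from optional filters, binding every value
    as a parameter; column names come from a fixed, trusted schema."""
    clauses, params = [], []
    for column, value in filters.items():
        clauses.append(f"{column} = ?")
        params.append(value)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    return conn.execute("SELECT id FROM claims" + where, params).fetchall()

rows = find_claims(conn, {"state": "NY"})
```

In T-SQL the equivalent would be `sp_executesql` with typed parameters instead of string concatenation of values.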

SQL, SQL Server Integration Services (SSIS), SQL Server Management Studio, Data Analysis, Data Quality Analysis, Data Queries, Query Plan, Query Optimization, SQL Stored Procedures, Data Engineering, Data Pipelines, Shell, Data Integration, Analysis, ETL, Data Warehousing, Business Intelligence (BI), Database Design, Data Warehouse Design, ETL Tools, Databases, Data Analytics, Database Analytics, RDBMS, Data Processing, Business Intelligence (BI) Platforms, Microsoft SQL Server, SQL DML, SQL Performance, Performance Tuning, T-SQL (Transact-SQL), Relational Databases, Data Modeling, Database Modeling, Business Logic, Data Architecture, Database Architecture, Logical Database Design, Database Schema Design, Relational Database Design, Visual Studio, Quality Management, MySQL, Dimensional Modeling, ELT, Schemas, Jupyter Notebook, Relational Data Mapping, Reporting, BI Reporting, Windows PowerShell, NoSQL, PostgreSQL, Database Optimization, Data Extraction, CSV Export, CSV, Scripting, MongoDB, .NET
Afiniti
Data Engineer
2017 - 2019 (2 years)
Pakistan
  • Designed and developed a database architecture and data model for a business flow using Talend Open Studio, SSIS, and MySQL Workbench.

  • Performed large-scale data conversions, migrations, and optimization to reduce resource and time costs while maintaining data integrity.

  • Wrote SQL stored procedures and Python scripts for data quality checks and ad-hoc analyses.
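A data quality check of the kind mentioned above can be sketched as follows; this is a minimal stdlib illustration, and the rule names (`required`, `ranges`) and sample records are hypothetical:

```python
def quality_check(rows, required, ranges):
    """Return per-rule violation counts for a batch of records:
    missing required fields and numeric values outside allowed ranges."""
    issues = {"missing": 0, "out_of_range": 0}
    for row in rows:
        if any(row.get(col) is None for col in required):
            issues["missing"] += 1
        for col, (lo, hi) in ranges.items():
            val = row.get(col)
            if val is not None and not (lo <= val <= hi):
                issues["out_of_range"] += 1
    return issues

report = quality_check(
    rows=[{"id": 1, "score": 0.4}, {"id": None, "score": 1.7}],
    required=["id"],
    ranges={"score": (0.0, 1.0)},
)
```

In practice the same rules were expressed as SQL stored procedures so the checks could run inside the database against full tables.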

SQL, MySQL, SQL Server Integration Services (SSIS), SQL Server Management Studio, Talend, Talend ETL, Data Engineering, Data Pipelines, Data Analysis, Analysis, Data Visualization, Business Intelligence (BI), Query Optimization, Data Quality Analysis, Shell, Data Integration, Data Queries, SQL Stored Procedures, ETL, Data Warehousing, Database Design, Python 3, Data Warehouse Design, ETL Tools, Databases, Python, Data Analytics, Database Analytics, RDBMS, Data Processing, Business Intelligence (BI) Platforms, Microsoft SQL Server, SQL DML, SQL Performance, Performance Tuning, T-SQL (Transact-SQL), Relational Databases, Data Modeling, Database Modeling, Stored Procedure, Business Logic, MariaDB, Microsoft Power BI, Data Architecture, Database Architecture, Logical Database Design, Database Schema Design, Relational Database Design, Visual Studio, MySQL Workbench, Quality Management, Dimensional Modeling, ELT, Pandas, Schemas, Jupyter Notebook, Relational Data Mapping, Reporting, BI Reporting, Windows PowerShell, AnyDesk, NoSQL, Database Optimization, Data Extraction, CSV Export, CSV, Scripting, .NET

Showcase

Payment Risk Engine | COD Blocking
  • The Payment Risk Engine is a system that blocks the cash-on-delivery (COD) option for customers with poor buying histories, preventing the company from bearing the logistics costs of failed deliveries.

  • Despite potentially shrinking the customer base through COD blocking, the system increases gross-to-net revenue.

  • A detailed data analysis was conducted to assess the business impact of the system, followed by the creation of data pipelines and a performance dashboard to continuously monitor the impact on Daraz's overall business.

Delayed Order Notification System
  • Developed end-to-end data pipelines for an automated alert system notifying customers about delayed orders to enhance customer experience

  • Designed the business flow and created a BI dashboard to gauge logistics performance

  • This system helped improve customer experience, assess logistics performance, and highlight key metrics for improvement

Dashboard Usage Analysis
  • Every data visualization dashboard consumes computing and memory resources; with over 700 dashboards in use at Daraz, this was affecting performance.

  • Identification of the most and least used dashboards was required to decommission unnecessary ones and conserve resources.

  • A meta dashboard was created to rank dashboards by user activity, monitor individual user history, and filter out executive dashboards.
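The core ranking logic behind such a meta dashboard can be sketched in a few lines; this is a hypothetical stdlib illustration (the real analysis ran over platform usage logs), and the dashboard names are invented:

```python
from collections import Counter

def rank_dashboards(view_events, exclude=()):
    """Rank dashboards by view count, excluding protected ones (e.g.
    executive dashboards), so rarely used dashboards surface as
    decommission candidates at the bottom of the list."""
    counts = Counter(e["dashboard"] for e in view_events
                     if e["dashboard"] not in exclude)
    return counts.most_common()

events = [{"dashboard": d} for d in
          ["funnel", "funnel", "logistics", "exec_kpi", "marketing"]]
ranking = rank_dashboards(events, exclude={"exec_kpi"})
```

The least-viewed entries at the tail of `ranking` are the candidates for decommissioning.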

Enterprise Data Warehouse
  • Worked at Afiniti to optimize call center performance through data-driven decisions for customer-agent pairing

  • Identified and resolved issues in the existing enterprise data portal, like lack of change data capture and historical analysis of clients' performance metrics

  • Developed a comprehensive Enterprise Data Warehouse from scratch, enabling historical tracking, providing a holistic view of clients, and fitting different business requirements without architectural changes

Data Pull from Dynamics 365 Using Azure Logic Apps
  • Developed a data integration pipeline with Azure Logic Apps to extract data from Microsoft Dynamics 365 into Seeloz's meta-model

  • Utilized Azure Blob Storage to store data for later use in internal ETL processes; communication orchestrated via Azure Service Bus

  • The application was trigger-based, using HTTP POST requests with JSON payloads, with comprehensive error handling and logging implemented at each step
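The trigger-validation and logging pattern described above can be sketched as a plain-Python handler; the real flow was built in Azure Logic Apps, so the function name, field names, and status codes here are illustrative assumptions:

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("d365_pull")

def handle_trigger(body):
    """Validate an incoming JSON trigger payload and return
    (status, message), logging the outcome at each step."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        log.error("malformed JSON payload")
        return 400, "invalid JSON"
    entity = payload.get("entity")
    if not entity:
        log.error("missing 'entity' field")
        return 422, "missing entity"
    log.info("queued extraction for entity=%s", entity)
    return 202, f"accepted:{entity}"

status, msg = handle_trigger('{"entity": "salesorders"}')
```

In the Logic App, the equivalent branching was done with condition actions, and the "queued" step published a message to Azure Service Bus.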

Education

Master's Degree in Computer Science
National University of Computer and Emerging Sciences
2018 - 2021 (3 years)
Bachelor's Degree in Computer Science
National University of Computer and Emerging Sciences
2013 - 2017 (4 years)