Used Azure Data Factory (ADF) to orchestrate and automate data ingestion pipelines from diverse source systems into Snowflake.
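A minimal sketch of how such an ADF ingestion pipeline can be triggered programmatically, assuming the azure-mgmt-datafactory SDK; the subscription, resource group, factory, pipeline, and parameter names are illustrative placeholders, not the actual project values.

```python
# Trigger a parameterized ADF ingestion pipeline run from Python.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

# Pass runtime parameters such as the source system and target Snowflake schema.
run = adf_client.pipelines.create_run(
    resource_group_name="rg-data-platform",   # illustrative name
    factory_name="adf-ingestion",             # illustrative name
    pipeline_name="pl_ingest_to_snowflake",   # illustrative name
    parameters={"source_system": "crm", "target_schema": "RAW"},
)
print(run.run_id)  # use the run ID to poll pipeline status
```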
Developed robust ADF pipelines and used Databricks with PySpark for scalable data transformation, cleansing, and aggregation.
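A minimal sketch of the kind of PySpark cleansing and aggregation step involved; table and column names are illustrative.

```python
# Databricks/PySpark: cleanse raw events and build a daily aggregate.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("usage_aggregation").getOrCreate()

raw = spark.read.table("raw.usage_events")

cleaned = (
    raw.dropDuplicates(["event_id"])                         # remove duplicate events
       .filter(F.col("event_ts").isNotNull())                # drop rows missing timestamps
       .withColumn("usage_mb", F.coalesce(F.col("usage_mb"), F.lit(0.0)))
)

daily = (
    cleaned.groupBy("customer_id", F.to_date("event_ts").alias("usage_date"))
           .agg(F.sum("usage_mb").alias("total_usage_mb"),
                F.countDistinct("event_id").alias("event_count"))
)

daily.write.mode("overwrite").saveAsTable("curated.daily_usage")
```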
Built and managed Databricks clusters and integrated Kafka for streaming ingestion.
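A minimal sketch of Kafka ingestion with Spark Structured Streaming on Databricks; the broker address, topic, schema, and checkpoint path are illustrative.

```python
# Read a Kafka topic as a stream, parse the JSON payload, and land it in a raw table.
from pyspark.sql import SparkSession, functions as F, types as T

spark = SparkSession.builder.appName("kafka_ingest").getOrCreate()

event_schema = T.StructType([
    T.StructField("event_id", T.StringType()),
    T.StructField("customer_id", T.StringType()),
    T.StructField("usage_mb", T.DoubleType()),
    T.StructField("event_ts", T.TimestampType()),
])

stream = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "usage-events")
         .option("startingOffsets", "latest")
         .load()
         # Kafka delivers the payload as bytes; parse the JSON value into columns.
         .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
         .select("e.*")
)

(stream.writeStream
       .format("delta")
       .option("checkpointLocation", "/chk/usage_events")
       .toTable("raw.usage_events"))
```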
Integrated LLM-based automation within data quality and monitoring workflows using OpenAI APIs to auto-summarize pipeline alerts and anomaly reports in Databricks.
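A minimal sketch of the alert-summarization step, assuming the openai Python SDK (1.x) with the API key supplied via environment variable; the model name and prompt wording are assumptions, not the production configuration.

```python
# Summarize a verbose pipeline alert into a short, actionable message.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def summarize_alert(alert_text: str) -> str:
    """Condense a pipeline alert or anomaly report for on-call review."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model would fit here
        messages=[
            {"role": "system",
             "content": "Summarize data pipeline alerts in two sentences, "
                        "naming the affected pipeline and the likely cause."},
            {"role": "user", "content": alert_text},
        ],
    )
    return response.choices[0].message.content
```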
Designed and deployed AI-driven metadata generation scripts that automatically tagged datasets and lineage in Snowflake, improving data discoverability and governance.
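A minimal sketch of how the generated metadata can be applied as Snowflake object tags, assuming snowflake-connector-python and a pre-created DATA_DOMAIN tag; connection details, tag path, and object names are illustrative.

```python
# Attach a governance tag to a table so it is discoverable by business domain.
import snowflake.connector

conn = snowflake.connector.connect(
    account="<account>", user="<user>", password="<password>",
    warehouse="ETL_WH", database="ANALYTICS",
)
cur = conn.cursor()


def tag_table(schema: str, table: str, domain: str) -> None:
    """Set the DATA_DOMAIN tag on a table (tag value is bound as a parameter)."""
    cur.execute(
        f"ALTER TABLE {schema}.{table} "
        "SET TAG GOVERNANCE.TAGS.DATA_DOMAIN = %s",
        (domain,),
    )


tag_table("CURATED", "DAILY_USAGE", "telecom_usage")
```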
Collaborated with ML engineers to build feature-ready datasets for AI/ML pipelines, ensuring scalable ingestion from streaming and batch data sources.
Partnered with analytics teams to fine-tune LLM prompts for telecom-specific use cases.
Collaborated with data scientists and business stakeholders to design analytical data models in Snowflake that support self-service BI, machine learning, and real-time dashboards.
Led the development of a Kafka-Spark-Snowflake prototype to simulate real-time data ingestion and analytics for Big Data consulting use cases.
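A minimal sketch of the prototype's Snowflake write path, assuming the Spark-Snowflake connector is installed on the cluster; credentials, option values, and table names are illustrative, and `stream` refers to a Kafka-sourced streaming DataFrame like the one in the ingestion sketch above.

```python
# Write each streaming micro-batch into a Snowflake landing table via foreachBatch.
sf_options = {
    "sfURL": "<account>.snowflakecomputing.com",
    "sfUser": "<user>",
    "sfPassword": "<password>",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "RAW",
    "sfWarehouse": "STREAM_WH",
}


def write_to_snowflake(batch_df, batch_id):
    """Append one micro-batch to the Snowflake landing table."""
    (batch_df.write
             .format("snowflake")            # Spark-Snowflake connector
             .options(**sf_options)
             .option("dbtable", "USAGE_EVENTS")
             .mode("append")
             .save())


(stream.writeStream
       .foreachBatch(write_to_snowflake)
       .option("checkpointLocation", "/chk/usage_to_snowflake")
       .start())
```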
Migrated legacy Cosmos DB event sourcing components into a modern Snowflake-based architecture using Snowpipe for real-time ingestion and DBT for data modeling.
Implemented Azure Active Directory (AAD) integration for secure access control across services including Databricks, ADF, and Snowflake.
Supported large-scale SQL environments involving complex queries, stored procedures, triggers, and performance tuning across multiple servers and databases.
Supported microservices deployment and orchestration in containerized environments using Docker and Kubernetes.
Migrated legacy ETL workflows to modern pipelines built on ADF, DBT, and Snowflake, significantly improving maintainability, scalability, and auditability.
Designed and implemented incremental data loading strategies and integrated Azure Key Vault for secure credential management in ADF and Databricks.
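A minimal sketch of the incremental-load pattern with Key Vault-backed secrets, assuming a Databricks secret scope ("kv-scope") backed by Azure Key Vault; secret keys, JDBC details, and table names are illustrative.

```python
# Incremental (high-water-mark) load from a source database, with credentials
# resolved from a Key Vault-backed Databricks secret scope.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# dbutils is provided by the Databricks runtime; secrets never appear in code or logs.
jdbc_url = dbutils.secrets.get(scope="kv-scope", key="source-jdbc-url")
jdbc_user = dbutils.secrets.get(scope="kv-scope", key="source-jdbc-user")
jdbc_password = dbutils.secrets.get(scope="kv-scope", key="source-jdbc-password")

# High-water mark: latest timestamp already loaded into the target table.
last_loaded = (spark.read.table("raw.orders")
                    .agg(F.max("modified_ts"))
                    .first()[0]) or "1900-01-01"

incremental = (spark.read.format("jdbc")
                    .option("url", jdbc_url)
                    .option("user", jdbc_user)
                    .option("password", jdbc_password)
                    .option("query",
                            f"SELECT * FROM dbo.orders WHERE modified_ts > '{last_loaded}'")
                    .load())

incremental.write.mode("append").saveAsTable("raw.orders")
```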
Used Snowflake Streams and Tasks for real-time change data capture (CDC) and developed robust data quality checks and validations using DBT tests.
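A minimal sketch of the Streams-and-Tasks CDC setup, executed here through snowflake-connector-python (reusing a connection like the one in the tagging sketch above); object names, columns, and the schedule are illustrative.

```python
# Create a stream on the raw table and a task that merges captured changes.
cur = conn.cursor()

# The stream records inserts/updates/deletes on the raw table since the last consume.
cur.execute(
    "CREATE OR REPLACE STREAM RAW.USAGE_EVENTS_STREAM ON TABLE RAW.USAGE_EVENTS"
)

# The task runs on a schedule, but only when the stream actually has new data.
cur.execute("""
    CREATE OR REPLACE TASK RAW.MERGE_USAGE_EVENTS
      WAREHOUSE = ETL_WH
      SCHEDULE = '5 MINUTE'
    WHEN SYSTEM$STREAM_HAS_DATA('RAW.USAGE_EVENTS_STREAM')
    AS
      MERGE INTO CURATED.USAGE_EVENTS t
      USING RAW.USAGE_EVENTS_STREAM s ON t.EVENT_ID = s.EVENT_ID
      WHEN MATCHED THEN UPDATE SET t.USAGE_MB = s.USAGE_MB
      WHEN NOT MATCHED THEN INSERT (EVENT_ID, CUSTOMER_ID, USAGE_MB)
        VALUES (s.EVENT_ID, s.CUSTOMER_ID, s.USAGE_MB)
""")

# Tasks are created suspended; resume to start the schedule.
cur.execute("ALTER TASK RAW.MERGE_USAGE_EVENTS RESUME")
```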
Created parameterized and dynamic ADF pipelines and built reusable PySpark modules for complex joins, aggregations, and data enrichment operations.
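A minimal sketch of one such reusable PySpark helper kept in a shared module; column names and the derived metric are illustrative.

```python
# Shared enrichment helper: join events to the customer dimension and derive revenue.
from pyspark.sql import DataFrame, functions as F


def enrich_with_customer(events: DataFrame, customers: DataFrame) -> DataFrame:
    """Left-join usage events to customer attributes and compute revenue.

    Broadcasting the (small) customer dimension avoids shuffling the large
    event table.
    """
    return (events.join(F.broadcast(customers), on="customer_id", how="left")
                  .withColumn("revenue", F.col("usage_mb") * F.col("rate_per_mb")))
```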