Wrote Python routines to log into websites and fetch data for selected options, using the urllib, urllib2, and requests modules for web crawling, and applied ML techniques including clustering, regression, classification, and graphical models.
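A minimal sketch of the log-in-and-fetch pattern described above, using only the standard-library urllib (the endpoints and form fields are placeholders, not the actual sites):

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

LOGIN_URL = "https://example.com/login"  # hypothetical login endpoint
DATA_URL = "https://example.com/data"    # hypothetical data endpoint

def build_login_request(url, fields):
    """Encode form fields into a POST request, as the login form expects."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")

def make_opener():
    """Build an opener that keeps session cookies across requests."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))

if __name__ == "__main__":
    opener = make_opener()
    # Log in once; the cookie jar carries the session to later requests.
    opener.open(build_login_request(LOGIN_URL, {"user": "u", "pass": "p"}))
    page = opener.open(DATA_URL).read().decode("utf-8")
    print(page[:200])
```

The same flow maps directly onto `requests.Session`, which manages cookies automatically.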
Used GCP services including BigQuery, GCS buckets, Cloud Functions, Cloud Dataflow, Pub/Sub, Cloud Shell, the gsutil and bq command-line utilities, Dataproc, and Stackdriver; worked in Confluence/Jira; and produced data visualizations with the Matplotlib and Seaborn libraries.
Worked with dimensional modeling (star schema, snowflake schema), transactional modeling, and slowly changing dimensions (SCD); processed and loaded bounded and unbounded data from Google Pub/Sub topics into BigQuery using Cloud Dataflow with Python.
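The Pub/Sub-to-BigQuery flow above is typically written as an Apache Beam pipeline run on Dataflow. A hedged sketch, assuming placeholder project, topic, and table names; the parse function is the only part specific to the (hypothetical) message schema:

```python
import json

def message_to_row(payload: bytes) -> dict:
    """Decode one Pub/Sub message (JSON bytes) into a BigQuery row dict."""
    record = json.loads(payload.decode("utf-8"))
    return {"event_id": record["id"], "value": record["value"]}

if __name__ == "__main__":
    # Streaming pipeline: Pub/Sub topic -> parse -> BigQuery.
    # Requires apache-beam[gcp]; names below are placeholders.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    opts = PipelineOptions(streaming=True)
    with beam.Pipeline(options=opts) as p:
        (p
         | "Read" >> beam.io.ReadFromPubSub(
               topic="projects/my-proj/topics/events")
         | "Parse" >> beam.Map(message_to_row)
         | "Write" >> beam.io.WriteToBigQuery(
               "my-proj:analytics.events",
               write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
               create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER))
```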
Developed statistical machine learning and data mining solutions to various business problems and generated data visualizations using R, Python, and Tableau.
Developed SOAP web services to send and receive data from external interfaces in XML format.
Developed SQL stored procedures on MySQL to reduce code redundancy, and designed and built a text classification application evaluating several classification models.
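A minimal sketch of one such text classification model, assuming scikit-learn and toy labels ("complaint"/"praise") invented for illustration; the actual application would compare several model families:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

def build_classifier():
    """TF-IDF unigram/bigram features feeding a logistic-regression classifier."""
    return Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
        ("clf", LogisticRegression(max_iter=1000)),
    ])

if __name__ == "__main__":
    # Tiny illustrative training set; real labels/data are assumptions.
    texts = ["refund my order", "card was charged twice",
             "great product, thanks", "love the fast shipping"]
    labels = ["complaint", "complaint", "praise", "praise"]
    model = build_classifier()
    model.fit(texts, labels)
    print(model.predict(["charged me twice"]))
```

Swapping the `clf` step (e.g. for `MultinomialNB` or `LinearSVC`) is how different models are compared under the same feature pipeline.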
Architected and built multiple data pipelines and end-to-end ETL and ELT processes for data ingestion and transformation in GCP, and used AWS components such as EC2 and S3.
Performed data analysis, migration, cleansing, transformation, integration, import, and export using Python.
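A small sketch of a typical cleansing/transformation pass of the kind described above, assuming pandas and illustrative column names (`name`, `amount`):

```python
import pandas as pd

def cleanse(df: pd.DataFrame) -> pd.DataFrame:
    """Typical cleansing pass: drop duplicates, trim text, coerce types, fill gaps."""
    out = df.drop_duplicates().copy()
    out["name"] = out["name"].str.strip().str.title()
    # Non-numeric amounts become NaN, then default to 0.0.
    out["amount"] = pd.to_numeric(out["amount"], errors="coerce").fillna(0.0)
    return out

if __name__ == "__main__":
    raw = pd.DataFrame({"name": ["  alice ", "  alice ", "bob"],
                        "amount": ["10.5", "10.5", "oops"]})
    print(cleanse(raw))
```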
Developed and deployed data pipelines in cloud environments such as AWS and GCP.
Devised PL/SQL stored procedures, functions, triggers, views, and packages.
Implemented Apache Airflow for authoring, scheduling, and monitoring data pipelines.
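An Airflow pipeline is authored as a DAG definition file; a minimal sketch, assuming Airflow 2.4+ and placeholder task bodies and DAG id:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**_):
    # Placeholder: pull rows from the source system.
    print("extracting")

def load(**_):
    # Placeholder: write transformed rows to the warehouse.
    print("loading")

with DAG(
    dag_id="daily_etl",              # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",               # `schedule` replaces schedule_interval in 2.4+
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_extract >> t_load              # extract runs before load
```

The scheduler discovers this file, triggers a run each day, and the web UI provides the monitoring described above.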