Shreyank B.

McLean, VA, United States of America

About Me

Shreyank is a Senior Full-stack Python Engineer with over a decade of experience building data pipelines, APIs, ML workflows, and cloud infrastructure across the banking, fintech, healthcare, and insurance industries.

AI, ML & LLM

Airbyte, Airflow, MLflow, Vertex AI

Database

Data Build Tool (dbt), Spark SQL

DevOps

Azure, AWS, GCP, Azure Event Grid, Cloud Dataflow, AWS SageMaker, Azure Synapse, Google Cloud Spanner, Docker, Kubernetes, AWS EKS, Azure Kubernetes Service (AKS), Google Kubernetes Engine (GKE), Terraform, AWS Cloud Development Kit (CDK), AWS CloudFormation, GitLab CI/CD, Jenkins, Azure Pipelines, Google Cloud Build, AWS IAM, AWS Key Management Service (KMS), AWS CloudWatch, Azure Monitor, AWS S3, AWS Glue

Workflow

GitHub Actions

Work history

Valley Bank
Senior Python Developer
2023 - Present (2 years)
Morristown, NJ, United States of America
  • Led the modernization of financial task workflows by developing distributed data pipelines using PySpark and transforming legacy systems into scalable, real-time processing frameworks.

  • Designed and implemented real-time ingestion flows, schema management strategies, and CI/CD automation using industry-standard orchestration and version control tools.

  • Delivered cost-optimized ETL and ELT pipelines using dbt and PySpark, and built secure, high-performance APIs with FastAPI and Flask.

  • Created observability dashboards to monitor system health and performance using modern visualization platforms.

  • Modernized legacy data pipelines by building distributed PySpark jobs on AWS EMR, reducing job execution time by 40% while enabling cost-efficient parallel processing across large-scale financial queues (a sketch of this pattern follows the list).

  • Migrated application workflows to Amazon DynamoDB for NoSQL workloads and Amazon RDS (PostgreSQL) for relational workloads, using Django ORM and SQLAlchemy.

  • Refactored legacy services into Flask/FastAPI microservices and developed REST APIs with Django REST Framework and Flask, deployed via AWS Fargate.

  • Integrated Amazon SageMaker models for fraud detection with real-time scoring shown in both Vue and React dashboards.

  • Used Terraform and GitHub Actions to manage CI/CD pipelines and deploy EMR clusters, Fargate services, Redshift objects, and API endpoints.

  • Built reusable front-end components using TypeScript, HTML5, and CSS3, improving load performance by 25% and ensuring WCAG-compliant accessibility.

  • Designed and developed back-end infrastructure orchestration services using Python and Terraform to automate cloud resource provisioning on AWS (EC2, IAM, S3, VPC).

  • Integrated ML scoring APIs into dashboards using CloudWatch streaming and React hooks, surfacing insights in near real time.

  • Developed Jest unit tests and snapshot tests for React components, achieving 90%+ coverage and automating regressions via GitHub Actions.
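
The bullets above reference distributed PySpark jobs on AWS EMR. The following is a minimal, hypothetical sketch of that pattern; the S3 paths, column names, and app name are illustrative assumptions, not details from the actual Valley Bank system.

    # Hypothetical PySpark batch job in the style described above.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("task-queue-modernization").getOrCreate()

    # Read a legacy queue export (placeholder path)
    tasks = spark.read.json("s3://example-bucket/legacy/task_queue/")

    # Normalize timestamps and derive a partition column for date-based scans
    enriched = (
        tasks
        .withColumn("created_at", F.to_timestamp("created_at"))
        .withColumn("processing_date", F.to_date("created_at"))
        .filter(F.col("status").isNotNull())
    )

    # Partitioned parallel write keeps downstream reads cheap
    (enriched.write
        .mode("overwrite")
        .partitionBy("processing_date")
        .parquet("s3://example-bucket/curated/task_queue/"))

On EMR, a job like this would typically be submitted as a spark-submit step, with cluster provisioning handled by the Terraform pipelines the bullets describe.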

McKesson
Senior Python Developer
2020 - 2023 (3 years)
Irving, TX, United States of America
  • Architected scalable batch and streaming data pipelines to support claims processing and supply chain analytics across large, high-volume datasets.

  • Built secure, compliant data lake architectures and deployed predictive ML models to improve operational forecasting.

  • Established automated CI/CD workflows for consistent and reliable deployments.

  • Developed secure APIs and interactive dashboards to deliver real-time visibility into inventory, logistics, and alerting metrics for key stakeholders.

  • Engineered scalable batch pipelines using Apache Spark (PySpark) on AWS EMR, improving throughput by 30% while reducing compute costs via auto-scaling and fine-tuned cluster configurations.

  • Developed real-time ingestion architecture with Amazon Kinesis Data Streams, Firehose, and AWS Lambda, enabling 95% classification accuracy for health claim data and reducing ingestion anomalies by 20% (see the sketch after this list).

  • Established a HIPAA-compliant data lake on Amazon S3, centralizing structured and semi-structured data across providers, supply chains, and patient events, with lifecycle policies and encryption at rest.

  • Designed ETL/ELT pipelines using AWS Glue, Airflow (MWAA), and PySpark, and developed automated ingestion pipelines with Snowflake.

  • Created dynamic Python modules and scripts for provisioning secure VPCs, IAM roles, and data pipelines, reducing infra setup time by 60%.

  • Engineered high-performance data access patterns across MySQL and PostgreSQL, including query rewriting, connection pooling, and custom profiling to improve performance by 40%.

  • Embedded monitoring hooks via OpenTelemetry and pushed metrics to Datadog, enabling fine-grained tracking of ingestion and API latencies.

  • Monitored data pipeline health and model performance using CloudWatch Logs, CloudTrail, and Slack-based alerts.
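
The ingestion bullet above references Kinesis Data Streams feeding AWS Lambda. Here is a minimal, hypothetical sketch of such a handler; the payload fields and validation rule are illustrative assumptions, not the actual claim schema.

    # Hypothetical Lambda handler for Kinesis records; field names are placeholders.
    import base64
    import json

    def handler(event, context):
        valid, anomalies = [], []
        for record in event["Records"]:
            # Kinesis delivers each record's data base64-encoded
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            # Minimal schema check; the real classifier would be far stricter
            if "claim_id" in payload and "provider_id" in payload:
                valid.append(payload)
            else:
                anomalies.append(payload)
        # Valid records continue to Firehose/S3; anomalies go to a review queue
        return {"valid": len(valid), "anomalies": len(anomalies)}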

Venmo
Python Developer
2018 - 2020 (2 years)
New York, NY, United States of America
  • Built real-time and batch data pipelines to support financial transaction analytics using distributed messaging, orchestration tools, and scalable processing frameworks.

  • Developed robust ETL workflows using Spark and workflow automation platforms, trained and deployed ML models for fraud detection and behavioral insights, and exposed secure APIs using FastAPI.

  • Automated infrastructure provisioning through IaC practices and deployed containerized services on a managed orchestration platform for scalable, resilient operations.

  • Collaborated with analysts and data scientists to define ingestion cadence, reporting KPIs, and real-time metrics visualized through React-based dashboards, Amazon QuickSight, and Kibana.

  • Built cloud-native data warehouse solutions using Amazon Redshift, applying star and snowflake schema designs, partitioning, and sort key strategies to improve query performance by 50%+.

  • Engineered real-time and batch ingestion pipelines using Amazon Kinesis, Apache Kafka, and AWS Glue, capturing high-volume payment event streams into Amazon S3 for downstream processing.

  • Played a key role in back-end modernization by building FastAPI-based APIs and migrating legacy Django logic to async-first services with Pydantic models and SQLAlchemy ORM (see the sketch after this list).

  • Deployed containerized microservices using Docker and managed deployments using CI/CD (GitHub Actions) with pre-deployment PyTest checks and rollback strategies.

  • Trained and deployed fraud and churn models using SageMaker and implemented NLP pipelines using spaCy and NLTK to classify and tag user support chats and memos.

  • Monitored health of data pipelines using CloudWatch, CloudTrail, and Slack-based alerting, reducing MTTR and improving SLA compliance.

  • Built scalable data models in Redshift using dbt, applying star/snowflake schema strategies to reduce query latency by 50%.

  • Created reusable widgets in TypeScript, enhanced with CSS animations and responsive design using HTML5 flex/grid layouts.

  • Employed Gradle for front-end tooling and Maven to handle hybrid integrations with external Java-based fraud scoring systems.

  • Maintained dashboard scalability using React Context, Redux, and lazy-loaded components, reducing initial load time by 40%.
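
The back-end modernization bullet above mentions async-first FastAPI services with Pydantic models. The following is a minimal, hypothetical sketch of that shape; the Transaction model, routes, and in-memory store are illustrative stand-ins for the SQLAlchemy-backed services described.

    # Hypothetical async-first FastAPI service; all names are placeholders.
    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel

    app = FastAPI()

    class Transaction(BaseModel):
        transaction_id: str
        amount: float
        currency: str = "USD"

    _store: dict[str, Transaction] = {}  # stand-in for a SQLAlchemy repository

    @app.post("/transactions", status_code=201)
    async def create_transaction(txn: Transaction):
        _store[txn.transaction_id] = txn
        return txn

    @app.get("/transactions/{transaction_id}")
    async def get_transaction(transaction_id: str):
        if transaction_id not in _store:
            raise HTTPException(status_code=404, detail="transaction not found")
        return _store[transaction_id]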

Python, Data pipelines, FastAPI, Spark, ETL, Machine Learning, Infrastructure as Code (IaC), Kibana, React, Amazon QuickSight, Amazon Redshift, Data Warehouse, AWS S3, AWS Kinesis, Apache Kafka, AWS Glue, SQLAlchemy, Django, Pydantic, PyTest, Microservices, Docker, CI/CD, GitHub Actions, AWS SageMaker, Natural Language Processing (NLP), spaCy, Natural Language Toolkit (NLTK), AWS CloudWatch, AWS CloudTrail, AWS Redshift, Data Build Tool (dbt), HTML5, TypeScript, Widgets, CSS, Responsive Design, Gradle, Maven, Java, React Context, Redux
Pacific Life
Python Developer
2016 - 2018 (2 years)
Newport Beach, CA, United States of America
  • Engineered scalable ETL and real-time streaming pipelines to transform and process large-scale insurance datasets for analytics, reporting, and business insights.

  • Designed and maintained high-performance data warehouse solutions and deployed ML inference pipelines to support predictive analytics use cases.

  • Automated infrastructure provisioning using IaC methodologies and managed containerized applications on a secure, orchestrated platform to ensure high availability and scalability.

  • Built batch ETL jobs using Apache Spark (PySpark) on Azure HDInsight, improving processing speed by 40% across multi-TB datasets.

  • Automated cloud infrastructure provisioning with Terraform and ARM templates, reducing deployment time and ensuring consistent resource creation.

  • Optimized data models in Azure SQL Database, PostgreSQL, MySQL, and Oracle (on Azure VMs) using indexing and caching.

  • Integrated and deployed ML models with Azure Machine Learning and Spark SQL, supporting advanced data science workflows with 90%+ accuracy.

  • Containerized services using Docker and AKS and implemented CI/CD with Azure DevOps pipelines.

  • Wrote reusable transformation logic in Python, PySpark, and Spark SQL, enabling scalable, repeatable data operations (a sketch of this pattern follows the list).
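
A minimal, hypothetical sketch of the reusable PySpark/Spark SQL transformation logic referenced in the last bullet; the storage path, column names, and view name are illustrative assumptions.

    # Hypothetical reusable transform applied via DataFrame.transform.
    from pyspark.sql import SparkSession, DataFrame, functions as F

    spark = SparkSession.builder.appName("policy-etl").getOrCreate()

    def standardize_policies(df: DataFrame) -> DataFrame:
        """Trim identifiers, normalize dates, and drop malformed rows."""
        return (
            df
            .withColumn("policy_id", F.trim("policy_id"))
            .withColumn("effective_date", F.to_date("effective_date"))
            .dropna(subset=["policy_id", "effective_date"])
        )

    raw = spark.read.parquet("abfss://raw@examplelake.dfs.core.windows.net/policies/")
    clean = raw.transform(standardize_policies)
    clean.createOrReplaceTempView("policies_clean")

    # The cleaned view is then queryable from Spark SQL
    spark.sql(
        "SELECT effective_date, COUNT(*) AS policy_count "
        "FROM policies_clean GROUP BY effective_date"
    ).show()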

Python, ETL Pipelines, Machine Learning, Data Warehouse Design, ML Pipelines, Infrastructure as Code (IaC), Apache Spark, PySpark, Azure HDInsight, Terraform, ARM, Azure SQL Database, PostgreSQL, MySQL, Oracle, Azure Machine Learning, Spark SQL, Docker, Azure Kubernetes Service (AKS), Azure DevOps, CI/CD Pipelines
The Cigna Group
Python Developer
2013 - 2016 (3 years)
Bloomfield, CT, United States of America
  • Built high-throughput batch and streaming frameworks using Spark, Hadoop, and Kafka for healthcare data pipelines.

  • Developed real-time ETL workflows with Informatica, exposed internal APIs using Flask, and created business dashboards in Tableau and Power BI.

  • Automated infrastructure deployments using AWS CloudFormation and Jenkins CI/CD.

  • Engineered scalable Big Data processing frameworks using Apache Spark, Apache Hadoop, and Apache Flink.

  • Built and optimized a cloud-native data warehouse using Snowflake, with schema evolution and performance tuning automated through Python-based orchestration scripts.

  • Managed and tuned high-volume Oracle databases and developed secure RESTful APIs with Flask (a sketch of this API pattern follows the list).

  • Created Python-based data loaders and enrichment services to support interactive dashboards built in Tableau and Power BI.

  • Deployed and managed containerized microservices using Docker and Kubernetes.

  • Provisioned infrastructure using Terraform and AWS CloudFormation.

  • Implemented CI/CD pipelines with Jenkins and developed custom logging and monitoring tools in Python for Datadog.
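
The Flask bullet above references secure RESTful APIs. Below is a minimal, hypothetical sketch of that pattern; the claims resource and in-memory store are illustrative stand-ins for the Oracle-backed services described.

    # Hypothetical Flask REST API; resource names and data are placeholders.
    from flask import Flask, abort, jsonify, request

    app = Flask(__name__)

    _claims = {"c-100": {"claim_id": "c-100", "status": "OPEN"}}  # demo data

    @app.route("/claims/<claim_id>", methods=["GET"])
    def get_claim(claim_id):
        claim = _claims.get(claim_id)
        if claim is None:
            abort(404)
        return jsonify(claim)

    @app.route("/claims", methods=["POST"])
    def create_claim():
        body = request.get_json(force=True)
        if not body or "claim_id" not in body:
            abort(400)
        _claims[body["claim_id"]] = body
        return jsonify(body), 201

    if __name__ == "__main__":
        app.run(port=8080)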

Education

AWS Certified Data Engineer – Associate (Expires Apr 2028)
Amazon Web Services (AWS)
2025 - 2025
Microsoft Certified: Azure Data Engineer Associate (Expires Mar 2026)
Microsoft
2025 - 2025
Bachelor’s Degree, Computer Science & Engineering
CVR College of Engineering, Hyderabad - India
2006 - 2010 (4 years)