Shreyank B.

McLean, VA, United States of America

About Me

Shreyank is a Senior Full-stack Python Engineer with over a decade of experience building data pipelines, APIs, ML workflows, and cloud infrastructure across the banking, fintech, healthcare, and insurance industries.

AI, ML & LLM

Airbyte, Airflow, MLflow, Vertex AI

Database

Data Build Tool (dbt), Spark SQL

DevOps

Azure, AWS, GCP, Azure Event Grid, Cloud Dataflow, AWS SageMaker, Azure Synapse, Google Cloud Spanner, Docker, Kubernetes, AWS EKS, Azure Kubernetes Service (AKS), Google Kubernetes Engine (GKE), Terraform, AWS Cloud Development Kit (CDK), AWS CloudFormation, GitLab CI/CD, Jenkins, Azure Pipelines, Google Cloud Build, AWS IAM, AWS Key Management Service (KMS), AWS CloudWatch, Azure Monitor, AWS S3, AWS Glue

Workflow

GitHub Actions

Work history

Valley Bank
Senior Python Developer
2023 - Present (2 years)
Morristown, NJ, United States of America
  • Led the modernization of financial task workflows by developing distributed data pipelines using PySpark and transforming legacy systems into scalable, real-time processing frameworks.

  • Designed and implemented real-time ingestion flows, schema management strategies, and CI/CD automation using industry-standard orchestration and version control tools.

  • Delivered cost-optimized ETL and ELT pipelines using dbt and PySpark, and built secure, high-performance APIs with FastAPI and Flask.

  • Created observability dashboards to monitor system health and performance using modern visualization platforms.

  • Modernized legacy data pipelines by building distributed PySpark jobs on AWS EMR, reducing job execution time by 40% while enabling cost-efficient parallel processing across large-scale financial queues (a sketch of this pattern follows the list).

  • Migrated application workflows to Amazon DynamoDB for NoSQL workloads and Amazon RDS (PostgreSQL) for relational workloads, using Django ORM and SQLAlchemy.

  • Refactored legacy services into Flask/FastAPI microservices and developed REST APIs with Django REST Framework and Flask, deployed via AWS Fargate.

  • Integrated Amazon SageMaker models for fraud detection with real-time scoring shown in both Vue and React dashboards.

  • Used Terraform and GitHub Actions to manage CI/CD pipelines and deploy EMR clusters, Fargate services, Redshift objects, and API endpoints.

  • Built reusable front-end components using TypeScript, HTML5, and CSS3, improving load performance by 25% and ensuring WCAG-compliant accessibility.

  • Designed and developed back-end infrastructure orchestration services using Python and Terraform to automate cloud resource provisioning on AWS (EC2, IAM, S3, VPC).

  • Integrated ML scoring APIs into dashboards using CloudWatch streaming and React hooks, surfacing insights in near real time.

  • Developed Jest unit tests and snapshot tests for React components, achieving 90%+ coverage and automating regressions via GitHub Actions.
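
The bullets above reference distributed PySpark jobs on AWS EMR. The following is a minimal, hypothetical sketch of that pattern; the S3 paths, column names, and app name are illustrative assumptions, not details from the actual Valley Bank system.

    # Hypothetical PySpark batch job in the style described above.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("task-queue-modernization").getOrCreate()

    # Read a legacy queue export (placeholder path)
    tasks = spark.read.json("s3://example-bucket/legacy/task_queue/")

    # Normalize timestamps and derive a partition column for date-based scans
    enriched = (
        tasks
        .withColumn("created_at", F.to_timestamp("created_at"))
        .withColumn("processing_date", F.to_date("created_at"))
        .filter(F.col("status").isNotNull())
    )

    # Partitioned parallel write keeps downstream reads cheap
    (enriched.write
        .mode("overwrite")
        .partitionBy("processing_date")
        .parquet("s3://example-bucket/curated/task_queue/"))

On EMR, a job like this would typically be submitted as a spark-submit step, with cluster provisioning handled by the Terraform pipelines the bullets describe.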

McKesson
Senior Python Developer
2020 - 2023 (3 years)
Irving, TX, United States of America
  • Architected scalable batch and streaming data pipelines to support claims processing and supply chain analytics across large, high-volume datasets.

  • Built secure, compliant data lake architectures and deployed predictive ML models to improve operational forecasting.

  • Established automated CI/CD workflows for consistent and reliable deployments.

  • Developed secure APIs and interactive dashboards to deliver real-time visibility into inventory, logistics, and alerting metrics for key stakeholders.

  • Engineered scalable batch pipelines using Apache Spark (PySpark) on AWS EMR, improving throughput by 30% while reducing compute costs via auto-scaling and fine-tuned cluster configurations.

  • Developed real-time ingestion architecture with Amazon Kinesis Data Streams, Firehose, and AWS Lambda, enabling 95% classification accuracy for health claim data and reducing ingestion anomalies by 20% (see the sketch after this list).

  • Established a HIPAA-compliant data lake on Amazon S3, centralizing structured and semi-structured data across providers, supply chains, and patient events, with lifecycle policies and encryption at rest.

  • Designed ETL/ELT pipelines using AWS Glue, Airflow (MWAA), and PySpark, and developed automated ingestion pipelines with Snowflake.

  • Created dynamic Python modules and scripts for provisioning secure VPCs, IAM roles, and data pipelines, reducing infra setup time by 60%.

  • Engineered high-performance data access patterns across MySQL and PostgreSQL, including query rewriting, connection pooling, and custom profiling to improve performance by 40%.

  • Embedded monitoring hooks via OpenTelemetry and pushed metrics to Datadog, enabling fine-grained tracking of ingestion and API latencies.

  • Monitored data pipeline health and model performance using CloudWatch Logs, CloudTrail, and Slack-based alerts.
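
The ingestion bullet above references Kinesis Data Streams feeding AWS Lambda. Here is a minimal, hypothetical sketch of such a handler; the payload fields and validation rule are illustrative assumptions, not the actual claim schema.

    # Hypothetical Lambda handler for Kinesis records; field names are placeholders.
    import base64
    import json

    def handler(event, context):
        valid, anomalies = [], []
        for record in event["Records"]:
            # Kinesis delivers each record's data base64-encoded
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            # Minimal schema check; the real classifier would be far stricter
            if "claim_id" in payload and "provider_id" in payload:
                valid.append(payload)
            else:
                anomalies.append(payload)
        # Valid records continue to Firehose/S3; anomalies go to a review queue
        return {"valid": len(valid), "anomalies": len(anomalies)}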

Venmo
Python Developer
2018 - 2020 (2 years)
New York, NY, United States of America
  • Built real-time and batch data pipelines to support financial transaction analytics using distributed messaging, orchestration tools, and scalable processing frameworks.

  • Developed robust ETL workflows using Spark and workflow automation platforms, trained and deployed ML models for fraud detection and behavioral insights, and exposed secure APIs using FastAPI.

  • Automated infrastructure provisioning through IaC practices and deployed containerized services on a managed orchestration platform for scalable, resilient operations.

  • Collaborated with analysts and data scientists to define ingestion cadence, reporting KPIs, and real-time metrics visualized through React-based dashboards, Amazon QuickSight, and Kibana.

  • Built cloud-native data warehouse solutions using Amazon Redshift, applying star and snowflake schema designs, partitioning, and sort key strategies to improve query performance by 50%+.

  • Engineered real-time and batch ingestion pipelines using Amazon Kinesis, Apache Kafka, and AWS Glue, capturing high-volume payment event streams into Amazon S3 for downstream processing.

  • Played a key role in back-end modernization by building FastAPI-based APIs and migrating legacy Django logic to async-first services with Pydantic models and SQLAlchemy ORM (see the sketch after this list).

  • Deployed containerized microservices using Docker and managed deployments using CI/CD (GitHub Actions) with pre-deployment PyTest checks and rollback strategies.

  • Trained and deployed fraud and churn models using SageMaker and implemented NLP pipelines using spaCy and NLTK to classify and tag user support chats and memos.

  • Monitored health of data pipelines using CloudWatch, CloudTrail, and Slack-based alerting, reducing MTTR and improving SLA compliance.

  • Built scalable data models in Redshift using dbt, applying star/snowflake schema strategies to reduce query latency by 50%.

  • Created reusable widgets in TypeScript, enhanced with CSS animations and responsive design using HTML5 flex/grid layouts.

  • Employed Gradle for front-end tooling and Maven to handle hybrid integrations with external Java-based fraud scoring systems.

  • Maintained dashboard scalability using React Context, Redux, and lazy-loaded components, reducing initial load time by 40%.
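
The back-end modernization bullet above mentions async-first FastAPI services with Pydantic models. The following is a minimal, hypothetical sketch of that shape; the Transaction model, routes, and in-memory store are illustrative stand-ins for the SQLAlchemy-backed services described.

    # Hypothetical async-first FastAPI service; all names are placeholders.
    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel

    app = FastAPI()

    class Transaction(BaseModel):
        transaction_id: str
        amount: float
        currency: str = "USD"

    _store: dict[str, Transaction] = {}  # stand-in for a SQLAlchemy repository

    @app.post("/transactions", status_code=201)
    async def create_transaction(txn: Transaction):
        _store[txn.transaction_id] = txn
        return txn

    @app.get("/transactions/{transaction_id}")
    async def get_transaction(transaction_id: str):
        if transaction_id not in _store:
            raise HTTPException(status_code=404, detail="transaction not found")
        return _store[transaction_id]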

Python, Data pipelines, FastAPI, Spark, ETL, Machine Learning, Infrastructure as Code (IaC), Kibana, React, Amazon QuickSight, Amazon Redshift, Data Warehouse, AWS S3, AWS Kinesis, Apache Kafka, AWS Glue, SQLAlchemy, Django, Pydantic, PyTest, Microservices, Docker, CI/CD, GitHub Actions, AWS SageMaker, Natural Language Processing (NLP), spaCy, Natural Language Toolkit (NLTK), AWS CloudWatch, AWS CloudTrail, AWS Redshift, Data Build Tool (dbt), HTML5, TypeScript, Widgets, CSS, Responsive Design, Gradle, Maven, Java, React Context, Redux
Pacific Life
Python Developer
2016 - 2018 (2 years)
Newport Beach, CA, United States of America
  • Engineered scalable ETL and real-time streaming pipelines to transform and process large-scale insurance datasets for analytics, reporting, and business insights.

  • Designed and maintained high-performance data warehouse solutions and deployed ML inference pipelines to support predictive analytics use cases.

  • Automated infrastructure provisioning using IaC methodologies and managed containerized applications on a secure, orchestrated platform to ensure high availability and scalability.

  • Built batch ETL jobs using Apache Spark (PySpark) on Azure HDInsight, improving processing speed by 40% across multi-TB datasets.

  • Automated cloud infrastructure provisioning with Terraform and ARM templates, reducing deployment time and ensuring consistent resource creation.

  • Optimized data models in Azure SQL Database, PostgreSQL, MySQL, and Oracle (on Azure VMs) using indexing and caching.

  • Integrated and deployed ML models with Azure Machine Learning and Spark SQL, supporting advanced data science workflows with 90%+ accuracy.

  • Containerized services using Docker and AKS and implemented CI/CD with Azure DevOps pipelines.

  • Wrote reusable transformation logic in Python, PySpark, and Spark SQL, enabling scalable, repeatable data operations (a sketch of this pattern follows the list).
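
A minimal, hypothetical sketch of the reusable PySpark/Spark SQL transformation logic referenced in the last bullet; the storage path, column names, and view name are illustrative assumptions.

    # Hypothetical reusable transform applied via DataFrame.transform.
    from pyspark.sql import SparkSession, DataFrame, functions as F

    spark = SparkSession.builder.appName("policy-etl").getOrCreate()

    def standardize_policies(df: DataFrame) -> DataFrame:
        """Trim identifiers, normalize dates, and drop malformed rows."""
        return (
            df
            .withColumn("policy_id", F.trim("policy_id"))
            .withColumn("effective_date", F.to_date("effective_date"))
            .dropna(subset=["policy_id", "effective_date"])
        )

    raw = spark.read.parquet("abfss://raw@examplelake.dfs.core.windows.net/policies/")
    clean = raw.transform(standardize_policies)
    clean.createOrReplaceTempView("policies_clean")

    # The cleaned view is then queryable from Spark SQL
    spark.sql(
        "SELECT effective_date, COUNT(*) AS policy_count "
        "FROM policies_clean GROUP BY effective_date"
    ).show()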

Python, ETL Pipelines, Machine Learning, Data Warehouse Design, ML Pipelines, Infrastructure as Code (IaC), Apache Spark, PySpark, Azure HDInsight, Terraform, ARM, Azure SQL Database, PostgreSQL, MySQL, Oracle, Azure Machine Learning, Spark SQL, Docker, Azure Kubernetes Service (AKS), Azure DevOps, CI/CD Pipelines
The Cigna Group
Python Developer
2013 - 2016 (3 years)
Bloomfield, CT, United States of America
  • Built high-throughput batch and streaming frameworks using Spark, Hadoop, and Kafka for healthcare data pipelines.

  • Developed real-time ETL workflows with Informatica, exposed internal APIs using Flask, and created business dashboards in Tableau and Power BI.

  • Automated infrastructure deployments using AWS CloudFormation and Jenkins CI/CD.

  • Engineered scalable Big Data processing frameworks using Apache Spark, Apache Hadoop, and Apache Flink.

  • Built and optimized a cloud-native data warehouse using Snowflake, with schema evolution and performance tuning automated through Python-based orchestration scripts.

  • Managed and tuned high-volume Oracle databases and developed secure RESTful APIs with Flask (a sketch of this API pattern follows the list).

  • Created Python-based data loaders and enrichment services to support interactive dashboards built in Tableau and Power BI.

  • Deployed and managed containerized microservices using Docker and Kubernetes.

  • Provisioned infrastructure using Terraform and AWS CloudFormation.

  • Implemented CI/CD pipelines with Jenkins and developed custom logging and monitoring tools in Python for Datadog.
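
The Flask bullet above references secure RESTful APIs. Below is a minimal, hypothetical sketch of that pattern; the claims resource and in-memory store are illustrative stand-ins for the Oracle-backed services described.

    # Hypothetical Flask REST API; resource names and data are placeholders.
    from flask import Flask, abort, jsonify, request

    app = Flask(__name__)

    _claims = {"c-100": {"claim_id": "c-100", "status": "OPEN"}}  # demo data

    @app.route("/claims/<claim_id>", methods=["GET"])
    def get_claim(claim_id):
        claim = _claims.get(claim_id)
        if claim is None:
            abort(404)
        return jsonify(claim)

    @app.route("/claims", methods=["POST"])
    def create_claim():
        body = request.get_json(force=True)
        if not body or "claim_id" not in body:
            abort(400)
        _claims[body["claim_id"]] = body
        return jsonify(body), 201

    if __name__ == "__main__":
        app.run(port=8080)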

Education

AWS Certified Data Engineer – Associate (Expires Apr 2028)
Amazon Web Services (AWS)
2025 - 2025
Microsoft Certified: Azure Data Engineer Associate (Expires Mar 2026)
Microsoft
2025 - 2025
Bachelor’s Degree, Computer Science & Engineering
CVR College of Engineering, Hyderabad - India
2006 - 2010 (4 years)