Pragya P.

About Me

Data Engineer with 5+ years of experience designing and maintaining scalable data pipelines and cloud-based data architectures. Expertise in Python, SQL, and relational databases (PostgreSQL, MSSQL, Oracle), with a deep understanding of data modeling and transformation. Strong hands-on experience with AWS Cloud Services (S3, Glue, Redshift, Lambda, Athena) for large-scale data processing. Proficient in ETL/ELT development, query optimization, and orchestration using Airflow and DBT. Experienced in working with structured and unstructured data, applying advanced data validation and transformation logic. Excellent understanding of ORMs (SQLAlchemy), OOP principles, and Python performance optimization. Proven ability to collaborate with cross-functional teams in agile environments, delivering business-driven data solutions.

AI, ML & LLM

Apache Airflow

Backend

REST APIs, Django, Python

Database

DevOps

Workflow

Git, GitHub Actions, GitLab CI

Other

Regex, Snowflake, Data Pipelines, Query Optimization, Data Transformation, Data Modeling, Performance Tuning, Multiprocessing, NumPy, Pandas, Agile, Data Governance, Functional Programming, Clustering, Caching, Partitioning, Dimensional Modeling, Talend, Informatica, Matillion, Prefect, QuickSight, IAM, Athena, Lambda, Redshift, S3, BigQuery, Kafka, Kinesis, RabbitMQ, Datadog

Work history

Jawam Infotech
Insight360 Data Platform
2025 - 2025
Remote
  • Developed a centralized analytics platform that enables real-time business insights by integrating multiple data sources into a unified warehouse.

  • Designed and deployed scalable ETL pipelines using AWS Glue and Airflow for data ingestion and transformation; implemented optimized schemas in Redshift and automated query tuning.

  • Built data validation layers and batch jobs using Python and AWS Batch; integrated unstructured datasets and transformed them into analytics-ready tables; developed Terraform scripts.

Zecdata Technologies
Databridge Integration Hub
2021 - 2025 (4 years)
Remote
  • Developed Databridge, a unified integration system for migrating enterprise data from legacy systems to cloud data warehouses.

  • Built ETL workflows to ingest and transform large datasets from Oracle and Salesforce into AWS Redshift; developed DBT models for consistency and version control.

  • Utilized Pandas and NumPy for complex data transformation; implemented Lambda functions for event-based data ingestion and monitoring; set up CI/CD pipelines.

Accenture
Nova Analytics Engine
2020 - 2021 (1 year)
Remote
  • Developed and optimized SQL queries for data extraction, aggregation, and reporting.

  • Created and maintained ETL scripts in Python to automate data refresh cycles; designed relational data models in Oracle for dashboards.

  • Collaborated with analysts to deliver clean, structured data; automated recurring reports; contributed to performance tuning.

Zid
1 - 1
Remote
  • Maintained MySQL and Redshift schemas and executed Redshift SQL tuning for large analytical queries.

  • Developed ELT workflows using DBT with automated deployments via Terraform.

  • Designed and deployed AWS Lambda functions, managed Docker image creation, and built user interfaces using Django.

Education

Bachelor of Computer Applications