Naga is a passionate Data Engineer with over 8 years of experience designing, developing, and optimizing end-to-end data pipelines, including running data science algorithms in production at scale with operational reliability. He has extensive experience with Python, shell scripting, Big Data, Hadoop, Hive, Spark, Kafka, and Airflow, as well as cloud environments including AWS and Azure. Naga's expertise includes assessing organizational data needs and architecting the right approach, including migrating traditional data warehouse platforms to Big Data where necessary.
Delivers data warehouse and ETL solutions as part of agile teams, applying advanced machine learning techniques to improve performance and processes.
Helps build and improve infrastructure, application development, and performance, and ensures tight security, including data encryption, security groups, and environment scanning.
Ensures high-quality deliverables and implements CI/CD and security best practices in fast-paced environments.
Worked on-site with a client to build a data platform for online education. Migrated legacy Talend ETL jobs to AWS Lambda orchestrated by AWS-managed Airflow. Developed frameworks for Lambda to integrate with Salesforce, an efficient data ingestion process, and custom operators in Airflow.
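A minimal sketch of the kind of custom Airflow operator this work involves, assuming a Lambda function that pulls a Salesforce object into S3; the operator name, function name, and payload fields are illustrative, not taken from the actual project:

```python
# Illustrative sketch only: a custom Airflow operator that invokes an AWS Lambda
# ingestion function. All names here are hypothetical.
import json

import boto3
from airflow.models import BaseOperator


class SalesforceToS3LambdaOperator(BaseOperator):
    """Triggers a Lambda function that pulls a Salesforce object into S3."""

    template_fields = ("payload",)  # allow Jinja templating of the payload

    def __init__(self, function_name: str, payload: dict, **kwargs):
        super().__init__(**kwargs)
        self.function_name = function_name
        self.payload = payload

    def execute(self, context):
        client = boto3.client("lambda")
        response = client.invoke(
            FunctionName=self.function_name,
            InvocationType="RequestResponse",
            Payload=json.dumps(self.payload).encode("utf-8"),
        )
        result = json.loads(response["Payload"].read())
        if response.get("FunctionError"):
            raise RuntimeError(f"Lambda ingestion failed: {result}")
        return result
```

Wrapping the Lambda call in an operator keeps retries, templating, and logging inside Airflow while the ingestion logic itself stays in Lambda.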
Created a configuration-based, self-service data orchestration platform covering Airflow scheduling, ingestion, and key data pipelines in Kafka, Spark, and Hive.
Supported AI/ML squads in enhancing and enriching data services by embedding feature extraction into pipelines for machine learning and model building.
Designed and implemented a data ingestion and analytics pipeline for customer trials. Redeveloped data science algorithms to run in production at scale, meeting performance and operational stability needs.
Implemented an AWS S3- and Redshift-based data service for use by data science teams. Reimplemented a core data science algorithm in Python to match Scala performance, enabling the business to maintain a single version of the code across trials and production.
Implemented various Pandas optimizations to speed up data science and data engineering code execution.
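A typical example of the kind of Pandas optimization this refers to, assuming a row-wise apply replaced with vectorized arithmetic and categorical dtypes; the column names and data are illustrative:

```python
# Illustrative sketch of a common Pandas optimization: replace a row-wise apply
# with a vectorized expression and use categorical dtypes for low-cardinality
# string columns. Column names are hypothetical.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "usage_gb": np.random.rand(1_000_000) * 100,
    "plan": np.random.choice(["basic", "plus", "max"], size=1_000_000),
})

# Slow: Python-level function called once per row.
# df["over_quota"] = df.apply(lambda r: r["usage_gb"] > 50, axis=1)

# Fast: vectorized comparison evaluated in C.
df["over_quota"] = df["usage_gb"] > 50

# Categorical dtype cuts memory use and speeds up groupby on repeated strings.
df["plan"] = df["plan"].astype("category")
summary = df.groupby("plan", observed=True)["usage_gb"].mean()
```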
Implemented and enhanced a Big Data ingestion framework using Spark (PySpark), Solr, Hive, and Kafka (real-time data and stats streaming).
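A minimal sketch of a real-time ingestion path of this shape, assuming PySpark Structured Streaming reading from Kafka and landing parsed records for Hive; the topic, schema, broker address, and paths are hypothetical:

```python
# Illustrative sketch: consume a Kafka topic with PySpark Structured Streaming
# and write parsed records to a Hive-queryable location. Requires the
# spark-sql-kafka connector package on the Spark classpath.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("ingest-events").enableHiveSupport().getOrCreate()

schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
    # Kafka values arrive as bytes; cast to string and parse the JSON payload.
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream.format("parquet")
    .option("path", "/warehouse/events")
    .option("checkpointLocation", "/checkpoints/events")
    .trigger(processingTime="1 minute")
    .start()
)
```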
Designed and developed a tool to recommend the right plan to customers based on product and usage behaviour, increasing customer satisfaction scores and revenue (~$5M annually).
Developed a tool to help consultants quickly process business-segment customer requests related to cost centre hierarchy allocation.
Reduced defects in pre-production test environments by 50% through quality code and automated unit testing.
Optimized Oracle PL/SQL code, reducing it from 40k to 25k lines for easier maintainability, better performance, and faster delivery of future enhancements.
Improved the performance of a critical invoicing module, reducing processing time from 3 hours to 2 hours 15 minutes. Developed a tool that cut test data creation effort from 30 minutes to 5 minutes per customer in the test environment.
Telstra is Australia’s leading telecommunications and information services company. Performed various roles simultaneously, including Technical Delivery Lead, Billing Architect, Iteration Manager, Pipeline Manager, and Developer, based on team needs. Reduced vendor costs by 10-30% (on average, across 70% of projects) through thorough, detailed reviews of estimations. Developed prototype Digital Billing APIs to support the long-term strategy of migrating business applications from monolithic architectures to modern microservices/APIs.
ONZO is a global leader in the creation of meaningful analytics and insight derived from smart meter consumption data. Through its ATLAS platform, ONZO analyzes consumers' energy consumption, creating insight that delivers a personalized and enhanced customer experience and enables energy companies to tackle key business challenges by offering the right tariffs, products, and services.
Education
Bachelor of Technology in Computer Science & Engineering