Yuriy is a data specialist with over 15 years of experience in data warehousing, data engineering, feature engineering, big data, ETL/ELT, and business intelligence. As a big data architect and engineer, Yuriy specializes in AWS and Azure frameworks, Spark/PySpark, Databricks, Hive, Redshift, Snowflake, relational databases, tools like Fivetran, Airflow, DBT, Presto/Athena, and data DevOps frameworks and toolsets.
Designed and built a feature engineering data mart and customer 360° data lake in AWS S3.
Designed and developed a dynamic S3-to-S3 ETL system in Spark and Hive.
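For illustration only, a minimal PySpark sketch of this kind of S3-to-S3 ETL step; the bucket names, table name, and columns are hypothetical placeholders rather than details of the original system:

```python
# Minimal sketch of an S3-to-S3 ETL step in Spark with Hive table registration.
# Bucket names, table names, and columns are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("s3_to_s3_etl")
    .enableHiveSupport()          # expose outputs as Hive tables
    .getOrCreate()
)

# Read raw events from the source S3 prefix (hypothetical path and format).
raw = spark.read.parquet("s3a://source-bucket/raw/events/")

# Example transformation: keep valid rows and add a load date for partitioning.
clean = (
    raw.filter(F.col("event_id").isNotNull())
       .withColumn("load_date", F.current_date())
)

# Write back to the target S3 prefix and register the result as a Hive table.
(clean.write
      .mode("overwrite")
      .partitionBy("load_date")
      .option("path", "s3a://target-bucket/curated/events/")
      .saveAsTable("curated.events"))
```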
Completed various DevOps tasks, including an Airflow installation, development of Ansible playbooks, and history backloads.
Worked on a feature engineering project which involved Hortonworks, Spark, Python, Hive, and Airflow.
Built a one-to-one marketing feature engineering pipeline in PySpark on Microsoft Azure and Databricks (using ADF, ADL, and Databricks Delta Lake, with ADW as the source).
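As a hedged illustration of what such a Databricks/Delta Lake feature build can look like; the JDBC source, credentials, table names, and feature columns below are assumptions, not the client's actual pipeline:

```python
# Sketch of a customer-level feature build on Databricks with Delta Lake.
# The JDBC source, credentials, and schema are placeholders for illustration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("feature_pipeline").getOrCreate()

# Pull source transactions from a warehouse table (hypothetical connection details).
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://example-dw.database.windows.net;databaseName=dw")
    .option("dbtable", "dbo.orders")
    .option("user", "etl_user")
    .option("password", "********")
    .load()
)

# Aggregate simple per-customer features.
features = (
    orders.groupBy("customer_id")
          .agg(
              F.countDistinct("order_id").alias("order_count"),
              F.sum("order_amount").alias("lifetime_value"),
              F.max("order_date").alias("last_order_date"),
          )
)

# Persist as a Delta table for downstream one-to-one marketing models.
(features.write
         .format("delta")
         .mode("overwrite")
         .saveAsTable("feature_mart.customer_features"))
```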
Worked on full data warehouse implementations for multiple clients.
Provided big data training and support for consulting partners.
Engineered and built an ETL pipeline for an AWS S3 data warehouse using AWS Kinesis, Lambda, Hive, Presto, and Spark. The pipeline was written in Python.
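A minimal sketch of the Kinesis-to-S3 landing step of a pipeline like this, written as a Python Lambda handler; the bucket, key layout, and record fields are illustrative assumptions:

```python
# Lambda handler sketch: decode a batch of Kinesis records and land them in S3
# as JSON lines. Bucket name, key layout, and record shape are hypothetical.
import base64
import json
import uuid

import boto3

s3 = boto3.client("s3")
LANDING_BUCKET = "example-dw-landing"  # placeholder bucket name


def handler(event, context):
    """Decode Kinesis records from the event and write one JSON-lines object to S3."""
    lines = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        lines.append(json.loads(payload))

    if not lines:
        return {"written": 0}

    # One object per invocation; downstream Hive/Presto/Spark tables read this prefix.
    key = f"raw/events/{uuid.uuid4()}.json"
    body = "\n".join(json.dumps(line) for line in lines)
    s3.put_object(Bucket=LANDING_BUCKET, Key=key, Body=body.encode("utf-8"))
    return {"written": len(lines), "key": key}
```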
Delivered data warehouses, data lakes, data lakehouses, feature marts, BI systems, migrations, and integrations.
Managed two data warehouses and their BI teams for both PriceGrabber and Shopzilla (Connexity is also known as PriceGrabber, Shopzilla, and BizRate).
Handled operational support for the PriceGrabber data warehouse and recovered the data warehouse after a data center migration.
Merged one data warehouse into another and retired the redundant one. Designed the business and data integration architecture hands-on; developed data validation scripts and ETL integration code. Managed the transition of the BI reporting system from Cognos to OBIEE and Tableau.
Defined the technology platform change strategy for the combined data warehouse.
Created SQL and PL/SQL stored procedures, packages, and anonymous blocks for ETL and data validation.
Completed an Amazon Redshift project.
Worked on and completed a Cloudera Impala project.
Oversaw the company's data services and defined the overall and technical strategy for the data warehousing, business intelligence, and big data environments.
Hired and managed a mixed on-shore (US)/off-shore (India) engineering team.
Replatformed a data warehouse to an Oracle Exadata X3/Oracle ZFS combination and added big data and machine learning components to the data warehousing environment.
Supported 24x7x365 operations in compliance with the company's top-level production SLA.
Wrote thousands of lines of PL/SQL, PL/pgSQL, MySQL, and HiveQL code.
Wrote ETL scripts in Perl, Python, and JavaScript embedded within Kettle.
Worked with big data on multiple types of projects (Hadoop, Pig, Hive, and Mahout).
Developed a tool-based ETL for a Pentaho (Kettle) CE ETL redesign project.
Worked on various machine learning projects (Python, SciPy, NumPy, and pandas).
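For illustration, a small pandas/NumPy/SciPy snippet showing the kind of model-prep work this typically involves; the dataset and column names are made up:

```python
# Illustrative-only model-prep sketch with pandas, NumPy, and SciPy.
# The toy dataset and columns are not from any client project.
import numpy as np
import pandas as pd
from scipy import stats

# Toy dataset standing in for a client feature table.
df = pd.DataFrame({
    "revenue": [120.0, 340.5, 89.9, 560.0, 205.3],
    "visits": [3, 10, 2, 14, 6],
    "segment": ["a", "b", "a", "c", "b"],
})

# Standardize numeric features and one-hot encode the categorical one.
numeric_cols = ["revenue", "visits"]
df[numeric_cols] = df[numeric_cols].apply(stats.zscore)
features = pd.get_dummies(df, columns=["segment"])

# Quick sanity check: standardized columns should be approximately zero-mean.
assert np.allclose(features[numeric_cols].mean(), 0.0, atol=1e-9)
print(features)
```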
Senior Principal Consultant (Professional Services, Essbase Practice)
1999 - 2001 (2 years)
Remote
Led a practice for a consulting company, serving multiple clients.
Developed Essbase satellite systems: relational data warehouses and data marts, reporting systems, ETL systems, CRMs, EPPs, and ETL into, out of, and within Essbase itself.
Worked on multiple PL/SQL projects, providing full support of the team's Oracle project pipeline.
Helped develop SQL Server solutions for multiple Transact-SQL and Analysis Services projects.
Developed a tool-based ETL for an Informatica project.
Worked on Hyperion projects involving Essbase, Enterprise, Pillar, Planning, Financial Analyzer, and VBA.
Consulted on implementing the data management lifecycle components for Carbon 38's data lake and data warehouse project, handling the implementation of real-time streaming replication for the solution.
Designed, built, and deployed a feature engineering platform supporting data scientists, migrating the platform from MSSS to Spark. Authored and maintained professional documentation describing the data architecture, design specifications, source-to-target mappings, and other client deliverables as required.
Delivered a data warehouse solution for an ad-tech company, building a new Snowflake data warehouse with a real-time data pipeline, Kafka streaming, heterogeneous database replication, and a migration from MSSS to Snowflake.
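A hedged sketch of a Kafka-to-Snowflake micro-batch loader of the kind such a pipeline might use, built with kafka-python and the Snowflake Python connector; the topic, table, and credentials are placeholders, not the client's implementation:

```python
# Kafka-to-Snowflake micro-batch loader sketch. Topic, table, and credentials
# are placeholders; this is not the original project's code.
import json

from kafka import KafkaConsumer
import snowflake.connector

consumer = KafkaConsumer(
    "ad_events",                                   # hypothetical topic
    bootstrap_servers=["localhost:9092"],
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

conn = snowflake.connector.connect(
    account="example_account",                     # placeholder credentials
    user="etl_user",
    password="********",
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="RAW",
)
cur = conn.cursor()

INSERT_SQL = "INSERT INTO ad_events_raw (event_id, campaign_id, payload) VALUES (%s, %s, %s)"

batch = []
for message in consumer:                           # blocks and consumes indefinitely
    event = message.value
    batch.append((event["event_id"], event["campaign_id"], json.dumps(event)))
    if len(batch) >= 500:                          # flush in micro-batches
        cur.executemany(INSERT_SQL, batch)
        conn.commit()
        batch.clear()
```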
Education
Diploma (Master of Science equivalent) in Applied Mathematics
Odessa I.I. Mechnikov University
1975 - 1980 (5 years)
Certificate of Completion in Oracle Database Administration
UCI Extension
Certificate of Completion in Cloudera Developer Training for Apache Hadoop
Cloudera University
Certificate of Completion in Data Science and Engineering with Apache Spark