Rohit G.

About Me

Rohit G. is a Software Engineer with experience in data engineering and machine learning operations. He has optimized data ingestion pipelines, migrated pipelines, and transitioned tables to Databricks Unity Catalog. He works with Azure, Databricks, Kubernetes, and various other technologies, and holds certifications from Databricks, AWS, and Dataiku, among others.

Work history

BP
Data Engineer II
2023 - Present (2 years)
Pune (Hybrid), India
  • Optimized data ingestion pipelines, achieving cost savings and improved processing efficiency.

  • Migrated pipelines from a shared ADF to a dedicated ADF and set up CI/CD to automate code deployment.

  • Streamlined oil refining data ingestion from various sources with Azure Data Factory and Databricks.

ZS
Senior Data Engineer | Business Technology Analyst
2020 - 2023 (3 years)
Pune, India
  • Optimized long-running Spark jobs on the Kubernetes clusters.

  • Automated migration of projects from EMR to EKS using Dataiku APIs and Jupyter Notebooks.

  • Prepared a one-click solution for deploying a project and checking code quality using Azure DevOps, SonarQube, Pylint, and Trufflehog.

  • Deployed the NLP framework Flair as an ECS service.

  • Created a generic Python library used across pharmaceutical clients.

  • Dockerized Jenkins and ran a CI/CD pipeline on it.

  • Automated execution of an Airflow data ingestion pipeline.

  • Standardized and refactored Pharma project data.

  • Worked with the Stardog API to create microservices.

Skills: Spark, Kubernetes, Dataiku, Azure DevOps, SonarQube, Pylint, Data Engineering, AWS EMR, AWS, AWS EKS, Jupyter Notebook, Amazon Elastic Container Service (ECS), Natural Language Processing (NLP), Python, Jenkins, Docker, CI/CD Pipelines, Airflow, Data Pipelines, Microservices, Stardog, MySQL, Azure Databricks, Data Analysis, Semantics, GitLab CI/CD, Big Data

ITC Infotech
Deep Learning Research Intern
2019 - 2019
Bangalore, India
  • Applied OpenCV DNN for face detection.

  • Calculated embeddings and Euclidean distances of images using FaceNet and TensorFlow.

  • Clustered similar images using the Chinese Whispers Algorithm.

Education

Developer, Core Designer, and Advanced Designer
Dataiku
2023 - 2023
Cloud Practitioner Essentials
AWS
2020 - 2020
Fundamentals of Big Data and Delta Lake
Databricks
2020 - 2020
Machine Learning
Coursera
2019 - 2019
B.Tech Computer Science
Jaypee University of Information Technology - India
2016 - 2020 (4 years)