Divya J.

Senior Big Data Engineer

Mumbai, India

About Me

Divya has been designing and developing Core Java (J2EE) applications for over a decade, specializing in data-oriented and performance-intensive platforms. She has extensive hands-on experience in Spring Boot, Hibernate, and Dropwizard frameworks, as well as Kafka, JBoss Drools, Hadoop, and Spark. Divya has also worked with SQL and NoSQL databases.

Java, SQL, Java 6, PostgreSQL, Microsoft SQL Server, Spring, MongoDB, Hibernate, Spring Boot, Hadoop, Apache Kafka, Talend, Natural Language Processing (NLP), Web Services, MySQL

Work history

Freelance
Software Engineer
2020 - Present (4 years)
Remote
  • Developed an Elasticsearch-based caching system to enable faster searches.

  • Designed and implemented a microservices-based, system-wide data access layer used across platforms to save and retrieve data.

  • Developed a custom authentication mechanism for Elasticsearch using the Open Distro plugin.

Jakarta EE, Elasticsearch, Apache Kafka, MongoDB, Spring, Spring Boot, Spring Integration, Security, Authentication
CIGNEX Datamatics
Big Data Lead Consultant
2012 - 2020 (8 years)
Remote
  • Developed an advanced file processing system that enables users to configure the structure, processing rules (using Drools and KIE), third-party API integrations, and other delivery configurations for processing data (with SSIS and Kafka).

  • Developed a social media analytics solution used by a legal-industry firm to collect data from different social networks, compute sentiment, and report on the results. This solution won the company's Innovation of the Year award in 2014.

  • Built a sales acceleration and reporting platform for sales reps and business leaders to understand the relationships and trends across service lines, customers, and regions.

RESTful Development, REST APIs, Web Services, Spring Security, Security, Microservices Architecture, Spring Boot, Stanford NER, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Alteryx, Talend ETL, Apache Kafka, Elasticsearch, Spring Data MongoDB, Spring Data JPA, Hibernate, Java 8, SQL Server Integration Services (SSIS), PostgreSQL, JBoss Drools, ETL Tools, Hadoop, Apache Hive, MongoDB, Microsoft SQL Server, Spring, Java, Talend, SQL
Persistent Systems
Software Engineer
2010 - 2012 (2 years)
Remote
  • Developed a platform for secondary and tertiary analysis of next-generation sequencing data from DNA samples produced with SOLiD technology, including mapping and alignment to the reference human genome.

  • Built a platform for performance testing of the KB Basecaller next-generation sequencing algorithm for both diagnostic and research purposes; the platform is used for genome data collection and fragment analysis.

  • Gathered requirements and developed tools, such as small RNA count, coverage, and extraction utilities, involving extensive input/output (I/O) operations.

Portfolio

Dynamic File Processing Engine

The platform enables users to process and analyze large amounts of incoming and outgoing data. Since every dataset is unique in quantity, quality, and format, the platform lets users configure the format, specifications, and business rules for processing and managing the data directly in the UI. The engine interprets these configurations at various stages of processing. The underlying data is kept in a data mart in properly normalized form to maximize processing throughput and data consistency. In this way, clients avoid format-to-format hard coding, and different platforms across the systems are integrated in one place.

On the UI, the user configures the file layout and processing rules, and the processing logic can be ordered to create a workflow. Internally, these rules are converted into DRLs. With Drools, user-driven processing is controlled by rules created on the UI and executed at runtime in KIE sessions. In these sessions, data is processed in chunks, enriched with third-party APIs (as configured), and delivered according to the configured frequency and FTP/SFTP delivery settings.
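The engine's actual code is not part of this profile, but a minimal sketch of the rule-execution step described above, using the standard Drools/KIE runtime API, could look like the following. The FileRecord fact class, the classpath KIE container, and the default session name are illustrative assumptions; in the real system the DRLs are generated from the UI configuration.

    import org.kie.api.KieServices;
    import org.kie.api.runtime.KieContainer;
    import org.kie.api.runtime.KieSession;
    import java.util.List;

    public class ChunkProcessor {

        // Hypothetical fact class representing one parsed record from an incoming file.
        public static class FileRecord {
            public String payload;
            public FileRecord(String payload) { this.payload = payload; }
        }

        // Runs the user-configured rules (packaged as DRLs on the classpath) over one chunk of records.
        public static void processChunk(List<FileRecord> chunk) {
            KieServices kieServices = KieServices.Factory.get();
            KieContainer container = kieServices.getKieClasspathContainer();
            KieSession session = container.newKieSession();   // default session from kmodule.xml
            try {
                chunk.forEach(session::insert);                // insert each record as a fact
                session.fireAllRules();                        // apply the configured processing rules
            } finally {
                session.dispose();                             // always release session resources
            }
        }
    }

In such a design, the third-party API enrichment and FTP/SFTP delivery described above would sit outside the session, after fireAllRules() returns for each chunk.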

Real-time Production Control Solution

Unlike legacy systems, this platform provides a production control solution designed for Lean construction projects. Created by construction managers and engineers, it offers innovative, easily understood visualizations, can replace the sticky-note process with no loss of Lean principles, concepts, benefits, or control, and is available to all stakeholders in the cloud. It serves as a single tool for master, phase, production, look-ahead, and daily production schedules and takes advantage of Takt time planning.

Sales Acceleration and Reporting Platform

The platform enables sales representatives and business leaders to understand the relationships and trends across service lines, customers, regions, and more. It also helps users evaluate, for each opportunity, the ratio of serviceable market opportunity to bids to conversions by integrating data from various internal and external platforms.

Lead Generation and Analysis Product

The product is a B2B customer service intelligence engine that targets and tracks the products, companies, and profiles of interest to its users. It aggregates public information from various media sources and performs social listening and natural language processing on it. The information is also validated across different platforms, and the system generates targeted business intelligence and insights using advanced graph technologies and algorithms.

Social Media Analytics Product for Legal Industry

The product collects, processes, analyzes, and reports on data from different social networks such as Facebook, Twitter, Google+, YouTube, and Instagram. Data collection is based on keywords that are pre-specified or selected by users and other people on social media. Registered users can view sentiment analysis and trends for different keywords, populated from the data collected across these platforms.
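The profile lists Stanford NER and NLP among the skills but does not state how sentiment was computed, so the following is only an assumed illustration using Stanford CoreNLP's sentiment annotator; the pipeline settings and class names are standard CoreNLP, while their use in this product is a guess.

    import edu.stanford.nlp.ling.CoreAnnotations;
    import edu.stanford.nlp.pipeline.Annotation;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;
    import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
    import edu.stanford.nlp.util.CoreMap;
    import java.util.Properties;

    public class PostSentiment {

        // Builds a CoreNLP pipeline with the sentiment annotator enabled.
        private static StanfordCoreNLP buildPipeline() {
            Properties props = new Properties();
            props.setProperty("annotators", "tokenize, ssplit, pos, parse, sentiment");
            return new StanfordCoreNLP(props);
        }

        // Prints a sentiment label (e.g., "Positive", "Negative") for each sentence of a post.
        public static void printSentiments(String postText) {
            StanfordCoreNLP pipeline = buildPipeline();
            Annotation annotation = new Annotation(postText);
            pipeline.annotate(annotation);
            for (CoreMap sentence : annotation.get(CoreAnnotations.SentencesAnnotation.class)) {
                String label = sentence.get(SentimentCoreAnnotations.SentimentClass.class);
                System.out.println(label + "\t" + sentence);
            }
        }

        public static void main(String[] args) {
            printSentiments("The verdict is a big win for consumers. The appeal process, however, was frustrating.");
        }
    }

Per-keyword trending, as described above, would then aggregate these sentence-level labels over the posts collected for each keyword.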

Education

TOGAF® Certified Enterprise Architect
The Open Group
2019 - 2019
Confluent Certified Developer for Apache Kafka
Confluent
2019 - 2021 (2 years)
MongoDB Certified DBA Associate (C100DBA); MongoDB Certified Developer, Associate (C100DEV)
MongoDB
2014 - 2016 (2 years)
M.Sc. Bioinformatics
University of Pune
2008 - 2010 (2 years)
B.Sc. Biotechnology
Banasthali Vidyapith
2005 - 2008 (3 years)
Cloudera Certified Specialist in Apache HBase (CCSHB); Cloudera Certified Hadoop Developer
Cloudera