Dipanshu is a highly skilled engineer with a strong foundation in Python, specializing in data processing and transformation using Pandas and NumPy. He is proficient in SQL, machine learning, and cloud technologies (AWS), and is experienced in building efficient data pipelines and optimizing query performance. Dipanshu is also committed to writing clean, maintainable, and scalable code.
Working on multiple software solutions focused on solving the problem of document-based information overload for the financial services industry.
Leading new projects using NLP, machine learning, data mining of unstructured data from PDF documents, intelligent scraping (goal-oriented and config-based), asynchronous APIs, microservices, GNU/Linux (RHEL, Ubuntu), and on-premise deployments.
Enhancing the functionalities of current software systems and creating predictive models for ML-based features.
Bank Statements Hub extracts transaction information from any bank’s statement PDF without relying on prior format or template information. It analyzes the extracted information to generate a financial health model for use by credit rating agencies. Gathered user requirements and built internal Python libraries to augment the open-source packages available for PDF-related data mining. Developed a full PDF processing library that is more robust and easier to extend than open-source options such as Tabula. Participated in the integration with a client's existing systems through an on-premise deployment.
This AI-powered bot sends Telegram messages containing affirmations on fully customizable topics for users to read to themselves. Users can add, remove, or list topics at any time and can also request new affirmations outside the schedule. The goal is to help users change their self-beliefs and work toward becoming the best version of themselves every day.
KEngine is a self-improving application that extracts key-value pairs from any PDF document (scanned or digital) based on initial training provided by a user for that document type. Developed large parts of the back end, the self-learning mechanisms, and the user-led training mechanism. Technologies used: Python, PDF processing libraries, HTML+CSS+JS, OCR tools, spaCy, NLTK.
News Analytics sources financial news from the web, tags the companies mentioned in each article, classifies the article into a news category, and then predicts whether a given user would like to see that news item. Worked with Python, NLTK, spaCy, TensorFlow, Selenium, BS4, Flask.
A SaaS tool for auto-generating hyper-personalized videos at scale for a company's entire customer base. Focused on the finance industry, it automatically converts reports (PDFs) and other data into videos.
Developed an AI-based web app for YouTube creators to get instant detailed feedback on their thumbnails and titles before publishing, so they can maximize their chances of high view counts.