Denis is a senior full-stack AI engineer and data scientist, highly skilled in modern generative AI tools (GPT-4, Midjourney, etc.), machine learning, ETL pipelines, data analysis, mathematical modeling, big data, and MLOps. He has a Ph.D. in mathematics, and his data science expertise includes probabilistic risk modeling, revenue forecasting, geospatial data analysis, handwriting recognition, anomaly detection in time series, data engineering, and team leadership.
Created a BI dashboard to query and summarize a vast amount of semi-structured data.
Set up and tuned an Elasticsearch cluster and Kibana on AWS cloud.
Developed an ETL pipeline to ingest a terabyte of raw data into Elasticsearch.
Architected and directed the creation of a core Similarity engine to score candidates.
Created a Big Data pipeline in Databricks and Spark to enrich the input data and prepare the features for ML.
Used pre-trained NLP deep neural networks to create semantic text embeddings, which significantly improved the quality of the Similarity engine's results.
Developed a score using Spark GraphX to measure the company's attractiveness in the job market.
Prepared custom deep learning models to build richer embeddings, including various data sources and metadata.
Led communications with external data providers and created infrastructures to interface with their APIs.
Drove implementation of DevOps and MLOps best practices to improve the reliability and reproducibility of the ETL, feature generation, model training, and inference subsystems.
Built a foundational end-to-end machine learning solution that predicts fair prices of real-estate properties, eliminating the need for manual assessment and enabling the company to run its business by providing quick responses to its customers.
Designed and implemented an automatically refreshing ETL pipeline that ingests, cleans, joins, and enriches new data from AWS S3 storage daily.
Developed an interpretable machine learning model with Scikit-learn, CatBoost, Lifelines, FBProphet, FAISS, and SHAP that consists of several submodels and satisfies business monotonicity constraints.
Set up a continuous machine learning procedure for daily model retraining and redeploying based on the newly collected data.
Designed and implemented an automatic model promotion mechanism to ensure that models produced via the daily retraining process get deployed to production only if they have sufficiently good performance metrics and satisfy business constraints.
Created a historical data simulation system to generate synthetic data before the company’s launch and enable backtesting capabilities.
Architected and built the required infrastructure in AWS cloud: EC2 instances and VPCs, Docker environments for development, testing, and production, an Airflow pipeline for ETL and ML, and MLFlow model storage.
Created various dashboards for data exploration, data quality management, model performance monitoring, and prediction visualization.
Supervised other data science team members and coordinated with the engineering team.
Created a machine learning model that predicted revenues for a retail store chain based on store location, local demographic data, GIS features, seasonality, and other factors.
Developed and deployed an interpretable machine learning model that scored B2B customers for payment default risks and provided explanations for the scores. The model massively reduced the workload for weekly risk assessment.
Built a probabilistic Bayesian machine learning model to predict which apartment buildings still under construction would fail to be commissioned in time. The model helped halve the funds needed to hedge risks.
Developed and deployed NLP models to automatically label a vast body of housing contracts by contract type and extract contractor party names, address entities, and other attributes.
Constructed and deployed a model to predict the problematic clogging of the evaporator in a chemical factory. This allowed for the timely preemptive service of the unit before it broke down, saving millions of dollars in production time.
Led and mentored a team of junior and mid-level data scientists in the projects mentioned above.
Communicated with clients, ensuring business goals were correctly translated into data science and machine learning tasks, and explained insights and models to clients.
Architected ETL pipelines, including data acquisition, data ingestion, merging internal and external datasets, data cleaning and validation, data transformation, and feature engineering on several distinct projects.
Designed model performance metrics and their measurement protocols on several distinct projects.
Developed an ML system for a retail bank to recommend bank products to clients based on their past transaction patterns. This included building an ETL pipeline and an ML recommendation system.
Developed an MVP of an app to query enterprise data in natural language. Given access to a database and a question in natural language about the data, the app would output the answer as a plot or a small table.
Engineered and fine-tuned the prompts to improve the quality and correctness of SQL code generation.
Created an automatic annotator for the database columns and the final table.
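As a minimal sketch of how annotated database columns might feed SQL generation in an app like the one described above: the snippet below assembles a text prompt from a schema whose columns carry short descriptions. The schema layout, the function names (`describe_schema`, `build_sql_prompt`), the prompt wording, and the example table are all hypothetical illustrations, not the actual implementation, and no language model is called here.

```python
from typing import Dict

# Hypothetical schema: table name -> {column name: human-readable annotation}.
Schema = Dict[str, Dict[str, str]]


def describe_schema(schema: Schema) -> str:
    """Render tables and their annotated columns as plain text for the prompt."""
    lines = []
    for table, columns in schema.items():
        lines.append(f"Table {table}:")
        for column, note in columns.items():
            lines.append(f"  - {column}: {note}")
    return "\n".join(lines)


def build_sql_prompt(question: str, schema: Schema) -> str:
    """Combine instructions, the annotated schema, and the user's question."""
    return (
        "Write a single read-only SQL query that answers the question.\n"
        "Use only the tables and columns listed below.\n\n"
        f"{describe_schema(schema)}\n\n"
        f"Question: {question}\n"
        "SQL:"
    )


# Illustrative data only; the table and its columns are made up.
example_schema = {
    "orders": {
        "order_id": "unique order identifier",
        "amount": "order total in US dollars",
        "created_at": "timestamp when the order was placed",
    }
}
prompt = build_sql_prompt("What was the total revenue last month?", example_schema)
print(prompt)
```

The resulting string would then be sent to a language model of choice, and the returned SQL validated before it is run against the database.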
Pandas
Big Data
Generative Pre-trained Transformers (GPT)
GPT
Natural Language Processing (NLP)
Text Generation
Code Generators
SQL
Language Models
Fine-tuning
CSV
ChatGPT
OpenAI
API Integration
OpenAI GPT-3 API
Chatbots
Databases
Blockchain Security Company
Senior Machine Learning Engineer
Present
Remote
Created a machine learning model to automatically detect malicious smart contracts before they can cause harm.
Built a visualization tool for model output to audit its decisions.
Deployed the model to AWS as a serverless Lambda function.
Blockchain
Ethereum Smart Contracts
Smart Contracts
AWS Lambda
Generative Pre-trained Transformers (GPT)
GPT
Natural Language Processing (NLP)
Linear Regression
Decision Tree Regression
CSV
Databases
Skillbox
Data Science Evangelist
Present
Remote
Reviewed and improved core courses in mathematics, data science, and machine learning.
Supervised the creation of new courses, including video lectures and exercises, on Data Science, Analytics, SQL, Power BI, and Tableau.
Recruited, interviewed, and screened lecturers and tutors for new courses.
Developed a machine learning model to predict retail store chain revenues.
Led a data science team in the entire model lifecycle, including data extraction, web scraping, ETL, analysis, feature engineering, and model deployment.
Implemented a dashboard to visualize and present the revenue prediction model.