Yoe H.

Python Developer | Data Scientist

Medellín, Colombia

About Me

Yoe is a Python Developer and Data Scientist with extensive experience working on Machine Learning/Python projects and translating real-world problems into requirements and solution plans. With a solid background and practical knowledge in ML/AI, research, mathematics, and statistical analysis, he delivers solutions that help businesses achieve more.

Work history

UpStack
Python Developer | Data Scientist
2022 - Present (3 years)
Remote
  • Delivering data warehouse and ETL solutions as part of an Agile team using advanced ML techniques to improve performance and processes.

  • Helping build and improve infrastructure, applications, and performance, and ensuring tight security, including data encryption, security groups, and environment scanning.

  • Ensuring high-quality deliverables and implementing DevOps and security best practices in fast-paced environments.

  • Building pipelines with transformation/aggregation phases using PySpark and working with Databricks for collaborative analytics.

Turing
Engineering Manager
2024 - Present (1 year)
Remote
  • Supervising and managing a team of 30+ business analysts, including 5+ team leads, ensuring output targets are met and guidelines followed.

  • Identifying training needs, conducting sessions, and ensuring high-quality training datasets.

  • Performing evaluations and implementing improvement plans.

  • Conducting regular QA checks, identifying process gaps, and improving workflows to enhance quality.

  • Overseeing daily operations, ensuring timely, within-budget project delivery.

  • Managing resources and monitoring performance metrics.

Engineering Management, Training, Evaluation, Key Performance Metrics, Resource Management, Python 3, Google Sheet API, BigQuery, AWS, Flask, MongoDB, MySQL, SQL, Machine Learning Algorithms
Mercor
Mathematics Expert
2024 - 2024
Remote

Wrote solutions to advanced math problems to be fed into an LLM.

Large Language Models (LLMs), Complex Problem Solving, Mathematics
Turing
AI Trainer
2024 - 2024
Remote
  • Evaluated model responses.

  • Worked on prompt engineering.

  • Built applications using OpenAI’s GPT models for chat and RAG in Flask.

Python 3, Data Science, Machine Learning, Artificial Intelligence (AI), Training, Prompt Engineering, OpenAI GPT-3 API, OpenAI, GPT, Flask, Retrieval-augmented Generation (RAG)
Darwin AI
Senior Software Engineer
2023 - 2024 (1 year)
Remote
  • Created an API for collecting and labelling visual assets from ad platforms such as Google Ads, Meta, and TikTok.

  • Worked on CI/CD processes using GitHub Actions and Bitbucket Pipelines to automate testing/deployment for APIs and data pipelines.

  • Containerized apps with Docker Compose for local testing and deployment to AWS EC2.

  • Integrated PyTest into CI workflows for Python services, ensuring code reliability.

  • Deployed serverless apps via Lambda/API Gateway.

  • Ensured robustness by writing unit/integration tests (PyTest) and monitoring performance with CloudWatch metrics.

Python, Amazon S3 (AWS S3), AWS Lambda, AWS CloudWatch, MongoDB, API Applications, Google Ads API, Graph API, Creative Problem Solving, Query Optimization, Bitbucket, GitHub Actions, CI/CD Pipelines, AWS EC2, Docker Compose, PyTest, Amazon API Gateway, Integration Testing, Unit Testing
Meta4Capital
Data Scientist/Analyst
2022 - 2022
Remote
  • Worked for an NFT startup and built a Flask web app for an algorithmic trading strategy.

  • Developed a credit risk model for NFTs using Python and Flask among other technologies and libraries.

  • Gathered data from primary or secondary data sources and maintained databases/data systems.

Technology Institute of Antioquia
Python Developer | Research Professor
2021 - 2022 (1 year)
Medellín, Colombia
  • Worked on pragmatic prevention guidelines for SARS-CoV-2 and COVID-19 in Latin America, inspired by mixed machine learning techniques and artificial mathematical intelligence.

  • Used Python and TensorFlow to build a sentiment analysis classifier for tweets.

  • Used Python and TensorFlow to build a Long Short-Term Memory (LSTM) neural network to forecast the Colombian coffee price.

Universidad Autónoma de Bucaramanga
Researcher
2015 - 2022 (7 years)
Bucaramanga, Colombia
  • Published numerous research papers on topics including statistical mechanics in portfolio optimization with Kusuoka’s representation, and conceptual computation in artificial mathematical intelligence as a paradigm-shifting technique in physics and mathematics.

  • Reviewed methods and teaching materials and gave recommendations for improvement.

  • Worked on research, fieldwork, investigations, and writing up reports.

University of Oklahoma
Graduate Research & Teaching Assistant
2008 - 2013 (5 years)
Oklahoma, United States of America
  • Conducted research, prepared new materials, and read scientific papers, deriving new data-driven algorithms and creating computational models.

  • Contributed to the development of research documentation for publications, presentations, and applications.

  • Worked on creating new concepts, techniques, and standards.

  • Fine-tuned models on code generation, logical reasoning, and domain-specific Q&A.

Portfolio

Engineering Manager - SFT Advanced Reasoning

Leading a 40-person team on the SFT Advanced Reasoning project (since Sep 2024), designing scalable architectures for curating complex training datasets and fine-tuning LLMs. This includes defining workflows for prompt engineering, response evaluation, and model iteration. Designed a Flask-based app integrating OpenAI APIs to test the trainers' model-breaking prompts. In previous roles, worked with systems using AWS Lambda, API Gateway, and EC2 to automate data collection/labeling pipelines for ad optimization, and with containerized services via Docker for reproducibility (studying their configuration carefully rather than creating them). Monitoring trainer performance with engineered ETL processes in BigQuery, automating data pulls from Google Sheets/AppScript and streamlining reporting.

Colombian Coffee Price Forecast via LSTM Neural Networks

Used Python and TensorFlow to build a Long Short-Term Memory (LSTM) neural network to forecast the Colombian coffee price. Also used TensorFlow and Keras for LSTM time-series forecasting and classification tasks, and used MLflow experimentally to track a BERT-based Q&A system, logging metrics and hyperparameters while A/B testing different fine-tuning strategies in small projects and job-application projects.

Data Scientist/Data Analyst - Meta4.Capital

Meta4 Capital is a crypto-focused fund that identifies and invests in unique NFT projects with an emphasis on collectibles, art, gaming, and virtual land. Worked on two projects: NFT trading methods and NFT credit risk modeling, deploying a Flask web app that uses clustering algorithms to produce credit scores for NFT lending. Technologies used: Python, MongoDB, Pandas, NumPy, Sklearn, Data Analysis, NFT, Flask.

Semantic and Morpho-Syntactic Prevention’s Guidelines for COVID-19 Based on Cognitively Inspired Artificial Intelligence and Data Mining

Case Study: Europe, North America, and South America. Used Python and TensorFlow to build a sentiment analysis classifier for tweets.

Some Pragmatic Prevention’s Guidelines Regarding SARS-CoV-2 and COVID-19 in Latin-America

Inspired by Mixed Machine Learning Techniques and Artificial Mathematical Intelligence. Case Study: Colombia. Used Python and TensorFlow to build a sentiment analysis classifier for tweets.

API for Visual Assets Collection

Contributed to the creation and maintenance of APIs to collect visual assets from ad platforms such as Google Ads, Meta, and TikTok.

API for Labelling Visual Assets

Contributed to the creation and maintenance of APIs to label visual assets from ad platforms such as Google Ads, Meta, and TikTok.

Education

MA Mathematics
University of Oklahoma
2008 - 2012 (4 years)

PhD Mathematics
University of Oklahoma
2008 - 2013 (5 years)

MSc Mathematics
Universidad Nacional de Colombia
2005 - 2007 (2 years)

Bachelor’s Degree, Mathematics
Universidad Nacional de Colombia
1996 - 2004 (8 years)