Engineering Manager - SFT Advanced Reasoning
Leading a 40-person team on the SFT Advanced Reasoning project since Sep 2024, designing scalable architectures for curating complex training datasets and fine-tuning LLMs. This includes defining workflows for prompt engineering, response evaluation, and model iteration. Designed a Flask-based app that integrates OpenAI APIs to test trainers' model-breaking prompts. In previous roles, worked with systems that used AWS Lambda, API Gateway, and EC2 to automate data collection and labeling pipelines for ad optimization, and with containerized services in Docker for reproducibility (I did not create them, but I studied their configuration carefully). Engineered ETL processes with BigQuery to monitor trainer performance, automating data pulls from Google Sheets/Apps Script and streamlining reporting.
Colombian Coffee Price Forecast via LSTM Neural Networks
Used Python and TensorFlow to build a Long Short-Term Memory (LSTM) neural network that forecasts the Colombian coffee price. Used TensorFlow and Keras for LSTM time-series forecasting and classification tasks, and experimented with MLflow to track a BERT-based Q&A system, logging metrics and hyperparameters while A/B testing different fine-tuning strategies. These were small personal projects or job-application projects.
Data Scientist/Data Analyst - Meta4.Capital
Meta4 Capital is a crypto-focused fund that identifies and invests in unique NFT projects, with an emphasis on collectibles, art, gaming, and virtual land. Worked on two projects: NFT trading methods and NFT credit risk modeling. For the latter, deployed a Flask web app that uses clustering algorithms to produce credit scores for NFT lending. Technologies used: Python, MongoDB, Pandas, NumPy, scikit-learn, Data Analysis, NFT, Flask.
API for Visual Assets Collection
Contributed to the creation and maintenance of APIs that collect visual assets from ad platforms such as Google Ads, Meta, and TikTok.
API for Labelling Visual Assets
Contributed to the creation and maintenance of APIs that label visual assets from ad platforms such as Google Ads, Meta, and TikTok.