Felipe is an expert Data Scientist who keeps pace with the latest trends in big data and technology, developing and delivering practical insights and actionable recommendations on projects. He combines dataset acquisition, statistical modelling, exploratory data analysis, and software engineering best practices to address a wide range of technical challenges and to identify additional opportunities where data science can bring real business value.
Built a home detection algorithm for Intellinum Analytics, applying listwise learning-to-rank (LTR) algorithms to GPS trace/geolocation data from 10,000 users across the US.
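As an illustration of the listwise LTR setup (a minimal sketch, not the production algorithm), candidate stay-points could be ranked per user with LightGBM's lambdarank objective; the feature names and toy labels below are hypothetical:

```python
# Hypothetical sketch: rank candidate home locations per user with
# LightGBM's LGBMRanker (lambdarank objective). Features are illustrative.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
n_users, cands = 100, 8  # 8 candidate stay-points per user
X = rng.random((n_users * cands, 3))   # e.g. night_visit_ratio, dwell_time, visit_days
y = np.zeros(n_users * cands, dtype=int)
y[np.arange(n_users) * cands] = 1      # one "true home" per user (toy labels)
group = [cands] * n_users              # list sizes for the ranking objective

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50)
ranker.fit(X, y, group=group)
scores = ranker.predict(X[:cands])     # score the first user's candidates
print(scores.argmax())                 # index of the predicted home stay-point
```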
Analyzed and modelled spatiotemporal point processes to identify clusters of mobile users and to create new advertising audiences for the client based on users' digital behaviour and the places they visited.
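A minimal sketch of the spatial-clustering step, assuming density-based clustering (scikit-learn's DBSCAN with a haversine metric) over visit coordinates; the parameters and toy points are illustrative, not the project's actual model:

```python
# Cluster user visit points into place-based groups with DBSCAN.
# eps and min_samples are illustrative choices.
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0
visits = np.array([                      # (lat, lon) of visits, toy data
    [40.7580, -73.9855], [40.7585, -73.9850],  # Times Square area
    [40.7484, -73.9857], [40.7486, -73.9860],  # Empire State area
])
coords_rad = np.radians(visits)          # haversine expects radians
eps_km = 0.2                             # ~200 m neighbourhood
db = DBSCAN(eps=eps_km / EARTH_RADIUS_KM, min_samples=2, metric="haversine")
labels = db.fit_predict(coords_rad)      # -1 marks noise points
print(labels)                            # e.g. [0 0 1 1]
```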
Led Intellinum's data science team on projects that reduced computing and storage costs by 52% and optimized the performance and quality of data ingestion and processing, cutting run time from 15 hours to 2 hours.
Designed, implemented and analyzed offline and online experiments using A/B tests, factorial experiments, and Thompson sampling to inform content generation, advertising, and web development, increasing traffic, conversion rates, reach, and ROI for a customer base of 50+ SMEs.
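Of these, Thompson sampling is the most algorithmic; a minimal Beta-Bernoulli bandit sketch (with simulated conversion rates, not client data) might look like:

```python
# Thompson sampling for choosing between two page variants.
import numpy as np

rng = np.random.default_rng(42)
true_rates = [0.04, 0.06]          # hidden conversion rates of variants A and B
alpha = np.ones(2)                 # Beta posterior: successes + 1
beta = np.ones(2)                  # Beta posterior: failures + 1

for _ in range(10_000):
    samples = rng.beta(alpha, beta)     # draw one plausible rate per variant
    arm = int(samples.argmax())         # show the variant that looks best
    converted = rng.random() < true_rates[arm]
    alpha[arm] += converted             # update posterior with the outcome
    beta[arm] += 1 - converted

print(alpha / (alpha + beta))      # posterior mean conversion rate per variant
```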
Built Python-based machine learning models that ingest historical purchase data and multichannel interactions for customer segmentation, lead qualification, cross-selling and up-selling, and time-series forecasting.
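The segmentation step could be sketched as k-means over RFM (recency/frequency/monetary) features; the column names and toy data below are hypothetical, not the actual models:

```python
# Illustrative customer segmentation: k-means on RFM features.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

purchases = pd.DataFrame({       # toy purchase history
    "customer_id": [1, 1, 2, 3, 3, 3],
    "days_ago":    [5, 40, 200, 2, 9, 30],
    "amount":      [50.0, 20.0, 15.0, 300.0, 120.0, 80.0],
})
rfm = purchases.groupby("customer_id").agg(
    recency=("days_ago", "min"),
    frequency=("amount", "size"),
    monetary=("amount", "sum"),
)
X = StandardScaler().fit_transform(rfm)  # put features on a common scale
rfm["segment"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(rfm)
```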
Developed a reporting tool that integrated data from several marketing platforms into customized Tableau dashboards, providing real-time updates on KPIs for websites, eCommerce sites, and search and social ad spending.
Provided expertise in forecasting crime rate trends in Python, using network time series analysis, deep learning models, and clustered time series to identify similarities between states.
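One way to sketch the clustered-time-series idea: group states whose crime-rate series move together, using correlation distance and hierarchical clustering. The series below are simulated stand-ins, not the actual data:

```python
# Cluster states by similarity of their (simulated) crime-rate series.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(1)
n_months = 48
base = np.sin(np.linspace(0, 8, n_months))     # shared seasonal signal
series = np.vstack([base * rng.uniform(0.5, 2.0) + rng.normal(0, s, n_months)
                    for s in (0.1, 0.1, 0.1, 1.5, 1.5, 1.5)])  # 6 "states"

corr = np.corrcoef(series)                     # state-by-state correlation
dist = 1.0 - corr                              # correlation distance
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")
print(fcluster(Z, t=2, criterion="maxclust"))  # cluster label per state
```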
Assessed the impact of different legislative measures on the inmate population, aiming to reduce prison overcrowding, using Monte Carlo simulation and variance reduction techniques.
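A hedged illustration of Monte Carlo with one classic variance-reduction technique (antithetic variates), estimating a toy quantity; the actual prison-population model is not reproduced here:

```python
# Antithetic variates: pair each uniform draw u with 1 - u to reduce
# estimator variance for a monotone response function.
import numpy as np

rng = np.random.default_rng(7)
n = 50_000

def occupancy(u):
    # Hypothetical monotone response: occupancy given a uniform driver u.
    return 100 * np.exp(0.5 * u)

u = rng.random(n)
plain = occupancy(u)                                  # standard MC estimator
antithetic = 0.5 * (occupancy(u) + occupancy(1 - u))  # paired estimator

print(plain.mean(), plain.std(ddof=1) / np.sqrt(n))
print(antithetic.mean(), antithetic.std(ddof=1) / np.sqrt(n))  # smaller s.e.
```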
Performed market basket analysis and clustering in R on historical arrest data to better understand criminal behaviour in Colombia and to identify spatial and temporal changes in how crimes are committed.
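Although the original analysis was done in R, a minimal Python sketch of market basket analysis via mlxtend (with invented toy "transactions") conveys the idea:

```python
# Frequent itemsets and association rules over toy crime-attribute baskets.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [
    ["theft", "night", "downtown"],
    ["theft", "night"],
    ["assault", "night", "downtown"],
    ["theft", "downtown"],
]
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions),
                      columns=te.columns_)
itemsets = apriori(onehot, min_support=0.5, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```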
Developed and maintained a pipeline to manage 2+ billion daily location data events using structured streaming.
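Assuming Spark Structured Streaming (the framework the phrase usually refers to), a stripped-down sketch of such a pipeline could look like the following; the Kafka topic, schema, and paths are hypothetical placeholders:

```python
# Location-event pipeline sketch with Spark Structured Streaming.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("location-events").getOrCreate()

schema = (StructType()
          .add("user_id", StringType())
          .add("lat", DoubleType())
          .add("lon", DoubleType())
          .add("ts", TimestampType()))

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
          .option("subscribe", "location-events")            # placeholder topic
          .load()
          .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

query = (events
         .withWatermark("ts", "10 minutes")        # tolerate late events
         .writeStream
         .format("parquet")
         .option("path", "/data/locations")        # hypothetical sink path
         .option("checkpointLocation", "/chk/locations")
         .start())
query.awaitTermination()
```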
Implemented and productionized an optimization algorithm for geospatial data processing and enrichment, reducing data volumes by 75% and storage costs by $500,000 annually.
Improved data structures, tuned parameters, and implemented QA systems to monitor data streams and algorithm output, resulting in a core component for new data products.
Developed a fraud detection solution for a client, enabling an increase in the data volume received from data partners.
Implemented new fraud detection mechanisms utilizing different data providers to enhance data quality.
Utilized several machine learning models to identify and rectify fraudulent data, reducing storage and processing costs by 25% and improving data reliability.
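The source does not name the specific models; as one plausible stand-in, an unsupervised anomaly detector such as an isolation forest could flag suspect records. Features and thresholds below are invented:

```python
# Illustrative anomaly-detection step for flagging fraudulent records.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(3)
normal = rng.normal(loc=[0.5, 30.0], scale=[0.1, 10.0], size=(500, 2))
fraud = rng.normal(loc=[5.0, 0.5], scale=[0.5, 0.2], size=(20, 2))
X = np.vstack([normal, fraud])      # e.g. speed_km_s, dwell_minutes

clf = IsolationForest(contamination=0.05, random_state=0).fit(X)
flags = clf.predict(X)              # -1 = anomalous (candidate fraud)
print((flags == -1).sum(), "records flagged")
```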
Gathered requirements from stakeholders, creating technical specifications for data and software engineers.
Improved query efficiency, cutting execution time from 30 minutes to 2-3 minutes and enabling independent data exploration and product innovation.
Implemented data aggregation and format changes to enhance query performance and scalability, and created a self-service tool for customer success representatives.
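A sketch of the aggregation and format change, assuming a Spark batch job: pre-aggregate raw events and write partitioned Parquet so downstream queries scan far less data. Paths and column names are hypothetical:

```python
# Daily roll-up written as partitioned Parquet for fast, prunable queries.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-rollup").getOrCreate()

raw = spark.read.json("/data/raw_events")          # hypothetical raw source
daily = (raw.groupBy("customer_id", F.to_date("ts").alias("day"))
            .agg(F.count("*").alias("events"),
                 F.countDistinct("device_id").alias("devices")))

(daily.write
      .mode("overwrite")
      .partitionBy("day")                          # enables partition pruning
      .parquet("/data/rollups/daily"))
```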
Education
MSc. Operations Research
Columbia University in the City of New York
2018 - 2019 (1 year)
Big Data: Data Analytics for Business and Beyond
(Joint summer programme between the London School of Economics and Peking University)