Felipe C.

Senior Data Scientist

New York, United States of America

About Me

Felipe is an expert Data Scientist who keeps current with the latest trends in big data and technology and delivers practical insights and actionable recommendations on projects. He combines dataset acquisition, statistical modelling, exploratory data analysis, and software engineering best practices to address a wide range of technical problems and to identify additional opportunities where data science can bring real business value.

Work history

UpStack
Senior Data Scientist
2020 - Present (4 years)
Remote
  • Create and develop innovative software solutions for clients across a broad range of industries.

  • Participate in scrums with cross-functional teams spanning both software and hardware.

  • Ensure that features are delivered efficiently and on time.

Intellinum Analytics Inc
Lead Data Scientist
2019 - Present (5 years)
New York, United States of America
  • Built a home detection algorithm for Intellinum Analytics that applies listwise learning-to-rank (LTR) algorithms to GPS trace and geolocation data from 10,000 users throughout the US.

  • Analyzed and modelled spatiotemporal point processes to identify clusters of mobile users and create new advertising audiences for the client based on their digital behaviour or places visited.

  • Led Intellinum's data science team on projects that reduced computing and storage costs by 52% and improved the performance and quality of data ingestion and processing, cutting run time from 15 hours to 2 hours.

SpicyMinds – Digital Business Lab
Lead Data Scientist
2015 - 2018 (3 years)
Mexico City, Mexico
  • Designed, implemented, and analyzed offline and online experiments using A/B tests, factorial experiments, and Thompson sampling to inform content generation, advertising, and web development processes and to increase traffic, conversion rates, reach, and ROI for a customer base of 50+ SMEs.

  • Built Python-based machine learning models to ingest historic purchase data and multichannel interactions for efficient customer segmentation, lead qualification, cross-selling and up-selling, and time-series forecasts.

  • Developed a reporting tool that integrated data from several marketing platforms into customized Tableau dashboards, providing real-time updates on KPIs for websites, eCommerce sites, and search and social ad spending.

Ministry of Justice and Law (Colombia)
Data Analyst - Criminal and Prison Policy
2015 - 2017 (2 years)
Bogota, Colombia
  • Provided expertise in forecasting crime rate trends in Python using network time series analysis and deep learning models, and clustered time series to identify similarities between states.

  • Assessed the impact of different legislative measures to reduce prison overcrowding on the inmate population using Monte Carlo simulation and variance reduction techniques.

  • Performed market basket analysis and clustering in R on historical arrest data to better understand criminal behaviour in Colombia and to identify spatial and temporal changes in how crimes are committed.

Portfolio

Lead Data Scientist - GeoSpatial Analytics

Led the creation of a pipeline that manages 2+ billion location events per day, using structured streaming to receive data from different data providers, extract only the most relevant information, and prepare it in a format more convenient for downstream consumption. Optimized and implemented the production algorithm that processes geospatial data and enriches it with external information, which reduced data volumes by 75%, saved 50% on storage and computing costs of $500,000 per year, and reduced the time analysts need to create custom reports. Improved the algorithm's data structures, tuned its parameters to satisfy customer requirements, and implemented QA systems to monitor the data streams and the algorithm's output. The solution was implemented in production and has become the core of new data products.

Lead Data Scientist - Fraud Detection in AdTech

Worked on a fraud detection solution for a client whose volume of data received from data partners had increased to the point that the pipeline was unable to keep up with the new loads. Implemented new fraud detection mechanisms on the old solution and worked with different data providers to improve the quality of the data they sent. Employed several machine learning models to identify and rectify most of the fraudulent data, reducing storage and processing costs by 25% and improving the reliability of insights from the data received. The new algorithm was successfully implemented to update the client's databases by deleting bad data.

Lead Data Scientist - Visualization Tool for Customer Success

Background: All incoming requests and questions from customer success representatives were forwarded to the Engineering team. They took several days to process and created a high load on the engineers. We created a self-service tool that the customer success team could use to analyze the most common requests and questions and to query the data efficiently.

Contributions: Gathered requirements from different stakeholders, created technical requirements for data and software engineers, and provided documentation and training to all stakeholders within the company.

Achievements: Efficient querying of the most relevant data for customer success. The average running time of the queries decreased from 30 min to 2-3 min. The tool allowed the analyst and sales teams to explore the data independently, and they later proposed new products that could leverage the insights they found. The tool fostered innovation within the company.

Improvements: The level of aggregation was customized to the most common requirements, and the data was partitioned and stored in a different format, which made the queries more efficient.

Status: The product is still in beta, and we are currently improving the UI to make it more accessible and easier to integrate with the entire data infrastructure of the company.

Education

MSc. Operations Research
Columbia University in the City of New York
2018 - 2019 (1 year)
Big Data: Data Analytics for Business and beyond (Joint summer programme between London School of Economics and Peking University)
Peking University
2016 - 2016
BEng. Chemical Engineering
Universidad Nacional de Colombia
2007 - 2013 (6 years)