Michael is a Machine Learning Consultant and Educator with a technical focus on machine learning and statistical analysis using Python, R, SQL and Shiny. He provides expertise in machine learning, regression analysis and statistics, text mining and social media analysis, time series analysis, Shiny web applications, and machine learning applications built with frameworks such as Keras and TensorFlow. He is a keen researcher and participant in international machine learning conferences.
Conducted automated financial ratio analysis for companies across a range of industries using Python’s Quandl library.
Identified and reported on undervalued stocks based on free cash flow analysis.
Wrote and edited investment commentaries, statistical analysis white papers, sponsored commentaries, quarterly outlooks and customized RFP response materials.
Developed an extensive video course on the application of data manipulation, regression analysis and machine learning techniques in R with O'Reilly Media.
Instructed students on how to use R for extensive data manipulation on datasets with over 750,000 observations.
Worked with a team of four video content editors on the creation and production of the course material, with over 2 hours of instructional video for students.
Built a time series model in Python for a company in the energy industry that automatically selected moving average parameters by minimising RMSE.
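A minimal sketch of how moving average parameter selection by RMSE minimisation can work, assuming a simple moving average used as a one-step-ahead forecast. The function names (`rmse`, `moving_average_forecasts`, `select_window`) and the candidate windows are hypothetical illustrations, not the model delivered to the client.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def moving_average_forecasts(series, window):
    """One-step-ahead forecasts: the mean of the previous `window` points.

    The first forecast corresponds to series[window].
    """
    return [sum(series[i - window:i]) / window
            for i in range(window, len(series))]


def select_window(series, candidate_windows):
    """Return (best_window, scores), where scores maps window -> RMSE.

    Every window is scored on the same points (from index max(candidates)
    onward) so that the RMSE values are comparable.
    Assumes len(series) > max(candidate_windows).
    """
    start = max(candidate_windows)
    actual = series[start:]
    scores = {}
    for w in candidate_windows:
        preds = moving_average_forecasts(series, w)[start - w:]
        scores[w] = rmse(actual, preds)
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    demand = [12, 14, 13, 15, 16, 15, 17, 18, 17, 19, 20, 19]  # made-up data
    best, scores = select_window(demand, [2, 3, 4])
    print(best, scores)
```

Scoring all candidates on a shared evaluation range avoids favouring short windows merely because they are evaluated on more points.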
Utilized regression analysis in engineering solutions that influenced traffic policy for a local Canadian government.
Designed an OLS regression model with an autocorrelation feature to identify a statistically significant increase of 6.569 units in road collisions under adverse weather conditions.
Developed a forecasting model in Python using an LSTM (long short-term memory) network to predict page views for the term “Brexit” on the Wikipedia platform. The model achieved an RMSE of 0.33 on the test set and was consistent in predicting anomalies across page view trends.
Classified traffic patterns for the City of London using a Scikit-learn K-Means clustering algorithm. Segmented traffic by route based on the density of cyclists and cars/taxis, identifying six separate route clusters.
Used a seasonally-adjusted SARIMA model to predict temperature fluctuations for Dublin, Ireland. Obtained a mean percentage error of 7% with over 70% of observations showing less than 10% deviation from actual values.
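The two accuracy figures above can be computed along these lines. This is a sketch under the assumption that "percentage error" means the absolute deviation of each forecast from the actual value, expressed as a percentage of the actual value; the helper names and sample numbers are hypothetical.

```python
def percentage_errors(actual, predicted):
    """Absolute percentage deviation of each prediction from its actual value.

    Assumes no actual value is zero (division by zero otherwise).
    """
    return [abs(p - a) * 100 / abs(a) for a, p in zip(actual, predicted)]


def summarize_errors(actual, predicted, threshold=10.0):
    """Return (mean percentage error, share of points below `threshold` percent)."""
    errors = percentage_errors(actual, predicted)
    mean_error = sum(errors) / len(errors)
    share_below = sum(e < threshold for e in errors) / len(errors)
    return mean_error, share_below


if __name__ == "__main__":
    actual = [10, 20, 10, 5]      # made-up temperatures
    predicted = [11, 19, 10, 4]   # made-up forecasts
    print(summarize_errors(actual, predicted))
```

Percentage-based metrics become unstable when actual values are close to zero, which is worth keeping in mind for temperature data measured in degrees Celsius.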
Hotel cancellations cause problems for many businesses in the industry: they reduce revenue and also complicate the coordination of bookings and the adjustment of revenue management practices. To investigate how machine learning can help, ExtraTreesClassifier, logistic regression, and support vector machine models were built in Python to determine how accurately cancellations can be predicted. Used Pandas to process, sort, and aggregate 20,000 booking entries by total cancellations per week; identified lead time, country of origin, and deposit type as cancellation drivers and obtained an AUC of 0.74. Implemented an LSTM model to predict weekly cancellations, yielding a mean directional accuracy of 90%. Deployed the machine learning models using AWS CodeCommit and AWS SageMaker, interviewed major hotel chain owners, and presented findings to inform potential marketing strategies.
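For reference, mean directional accuracy measures how often a forecast moves in the same direction as the actual series. The sketch below assumes the common definition that compares the sign of the actual change with the sign of the forecast change, both measured from the previous actual value; the function names and sample numbers are hypothetical and not taken from the project.

```python
def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def mean_directional_accuracy(actual, predicted):
    """Share of periods where the forecast direction matches the actual one.

    Direction is measured relative to the previous actual value, so the
    first observation only serves as a starting point.
    """
    hits = [
        sign(actual[t] - actual[t - 1]) == sign(predicted[t] - actual[t - 1])
        for t in range(1, len(actual))
    ]
    return sum(hits) / len(hits)


if __name__ == "__main__":
    weekly_actual = [10, 12, 11, 15, 14]     # made-up weekly cancellations
    weekly_forecast = [10, 13, 12, 16, 13]   # made-up forecasts
    print(mean_directional_accuracy(weekly_actual, weekly_forecast))
```

A high directional accuracy says the forecast gets rises and falls right, but says nothing about the size of the error, so it is usually reported alongside a magnitude metric such as RMSE.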
The purpose of the image recognition solution was to build a classifier that can distinguish images of cars from images of planes. Implemented data augmentation with a pre-trained VGG16 network to prevent overfitting and trained the models for 30 epochs. Validation loss fell from 0.9405 to 0.2567, and the model correctly categorized 93% of the test images.