Mariel B.

Data Scientist

Ireland

About Me

Qualified Data Scientist specialized in Artificial Intelligence with 7+ years of experience in creative modeling of complex data for classification and prediction using Python and R. Proficient in data analysis and in communicating results to audiences with diverse backgrounds. Contributed to multiple research projects while a Bioinformatics Ph.D. candidate in the School of Mathematics, Statistics and Applied Mathematics at the National University of Ireland, Galway, with peer-reviewed publications in Artificial Intelligence and Bioinformatics.

Work history

UpStack
Data Scientist
2020 - Present (4 years)
Remote
  • Create and implement data analysis pipelines, including data access, ingestion, munging / manipulation / cleansing, analysis / modelling, testing, deployment / integration into business applications and services.

  • Enhance operational aspects of businesses by increasing control of the company's data.

  • Work in cross-functional teams to provide data-driven solutions for increased efficiency and productivity.

National University of Ireland, Galway
PhD Graduate Student
2017 - 2021 (4 years)
Ireland
  • Worked on a project to improve risk prediction for kidney transplant patients using gene expression data.

  • Performed data pre-processing and analysis using Python to develop an Ensemble Learning System with a Particle Swarm Optimisation (PSO) approach for hyper-parameter optimization.

  • Achieved higher predictive performance with the Ensemble-PSO than with the Ensemble alone when using a random oversampling technique.

Federal University of Rio Grande do Sul
Course Instructor
2017 - 2017
Brazil
  • Taught a Python for Bioinformatics course to undergraduate and graduate students.

  • Developed the instructional plan, lessons, and assignments.

  • Kept up to date with changes and innovations in the field and published research and analysis in books and academic journals.

National University of Ireland, Galway
Research Intern
2016 - 2016
Ireland
  • Researched automated classification of ultrasonic vocalization patterns in a genetic mouse model of autism spectrum disorder.

  • Applied information-theoretic and statistical approaches to the study of vocalization patterns in wild-type and heterozygous Tbx1 mice, which show distinct phenotypes in terms of neuronal development and social communication. Supervisor: Pilib Ó Broin, PhD.

  • Performed data analysis, interpreted the results, and co-wrote the paper.

Federal University of Rio Grande do Sul
Undergraduate Student Researcher / Course Instructor
2013 - 2016 (3 years)
Brazil
  • Researched protein structure prediction algorithms and tools, applying a data mining approach to the Protein Data Bank.

  • Researched the evolutionary history of the RAGE family using a systems biology approach, and contributed to the development of the Geneplast R package.

  • Taught Python for Bioinformatics courses to undergraduate and graduate students.

Portfolio

PhD Candidate / Data Scientist / AI specialist / Python Developer - Risk prediction of kidney transplant patients

The aim of this project is to improve risk prediction for kidney transplant patients using gene expression data. It is based on the development of an Optimised Ensemble Learning System composed of five supervised learning algorithms, using a binary comparator, for antibody-mediated rejection prediction. I performed data pre-processing and analysis, applied two random oversampling approaches to deal with unbalanced data, and developed in Python an Ensemble Learning System with a Particle Swarm Optimisation (PSO) approach for hyper-parameter optimization. I also performed the statistical analysis and interpretation of the results and co-wrote the paper. Using publicly available data, the Ensemble-PSO led to increased predictive performance when compared with the Ensemble alone when using a random oversampling technique. Technologies used in the project: Python, Sklearn, NumPy, Pandas.
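As an illustration of the random oversampling step mentioned above, here is a minimal NumPy sketch. It is not the project's code: the function name `random_oversample` and the toy arrays are hypothetical, and it assumes the simplest variant, in which minority-class samples are duplicated with replacement until every class matches the majority count.

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows (with replacement) until all classes
    have as many samples as the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Draw the missing number of samples from this class at random.
        extra = rng.choice(idx, size=target - count, replace=True)
        keep.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep)
    return X[keep], y[keep]

# Toy data (hypothetical): 6 samples of class 0 and 2 samples of class 1.
X = np.arange(16, dtype=float).reshape(8, 2)
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])
X_bal, y_bal = random_oversample(X, y)
print(np.bincount(y_bal))  # class counts after balancing
```

In a cross-validation setting, oversampling is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data.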

PhD Candidate / Data Scientist / AI specialist / R and Python Developer - Automated classification of vocalization patterns

The project is an R Shiny app that takes as user input two folders of vocalization files from two study groups and a file with the complete vocalization alphabet for the specific model, and generates results for Entropy analysis, Markov Models, Linear Models, and Classification. The user can select levels of Entropy and pseudocount values for analysis and download all the results as a PDF report. I developed a Python implementation of Shannon’s Entropy function in four levels of complexity and used the Reticulate R package to generate the Python interface for the R Shiny code. Using the Reticulate interface and Python-generated transition probability matrices, I developed a Markov Chain model in R to analyze call transitions, using Jensen-Shannon divergence as a metric of similarity of transition probabilities. I also developed in R a mixed-effects linear model to assess the appropriate entropy level for classification analysis, and a classification analysis using Boruta, a wrapper around the Random Forest approach. I developed the R Shiny dashboard application. The application is already being used in collaboration with a Psychiatry research group studying autism spectrum disorder. Technologies used in the project: R, R Shiny, Boruta, Reticulate, Python, NumPy, Pandas.
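The entropy and Markov steps described above can be sketched in Python as follows. This is an illustrative sketch, not the app's code: all function names are hypothetical, the entropy "level" is assumed here to mean the length of the call n-grams being counted, the pseudocount is assumed to be simple additive smoothing, and the Jensen-Shannon divergence is computed row by row between two transition matrices.

```python
import numpy as np
from collections import Counter

def shannon_entropy(calls, level=1):
    """Shannon entropy (bits) of the n-grams of length `level` in a call sequence."""
    ngrams = [tuple(calls[i:i + level]) for i in range(len(calls) - level + 1)]
    counts = np.array(list(Counter(ngrams).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def transition_matrix(calls, alphabet, pseudocount=1.0):
    """First-order transition probabilities; each row sums to 1.
    A positive pseudocount keeps every probability strictly positive."""
    index = {c: i for i, c in enumerate(alphabet)}
    counts = np.full((len(alphabet), len(alphabet)), pseudocount, dtype=float)
    for a, b in zip(calls[:-1], calls[1:]):
        counts[index[a], index[b]] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def js_divergence(p, q):
    """Jensen-Shannon divergence (bits) between two strictly positive distributions."""
    m = 0.5 * (p + q)
    kl = lambda a, b: float((a * np.log2(a / b)).sum())
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Toy call sequences for two groups (hypothetical data).
alphabet = ["A", "B", "C"]
group1 = list("ABABABCA")
group2 = list("AACCBBAC")
P = transition_matrix(group1, alphabet)
Q = transition_matrix(group2, alphabet)
row_jsd = [js_divergence(P[i], Q[i]) for i in range(len(alphabet))]
print(shannon_entropy(group1, level=1), shannon_entropy(group1, level=2))
print(np.mean(row_jsd))  # average divergence between matching rows
```

With a base-2 logarithm the divergence lies between 0 (identical rows) and 1, which makes the per-row values easy to compare across call types.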

Research Assistant / Machine Learning specialist / Python developer - APL: An angle probability list to improve knowledge-based metaheuristics for the three-dimensional protein structure prediction

Tertiary protein structure prediction is one of the most challenging problems in structural bioinformatics because of the combinatorial explosion of plausible shapes that a protein can assume. In this project, a new computational approach to this problem was proposed using information from the Protein Data Bank regarding the neighborhood of amino acids (protein building blocks) and their propensity to assume a certain three-dimensional structure. The project involved the Python development of two metaheuristics to optimize the physicochemical function that governs the folding mechanism, which is related to the active form of the protein, a possible therapeutic target for personalized medicine. I developed a Python implementation of a Particle Swarm Optimisation (PSO) approach for Protein Structure Prediction, an NP-complete problem, using information from the Angle Probability List. I performed data analysis, interpreted the results, and co-wrote the paper. My work on the PSO approach helped demonstrate the improvement in predictive performance obtained by using the angle probability list. The work was published in the peer-reviewed journal Computational Biology and Chemistry and has 31 citations. Technologies used in the project: Python, SciPy, NumPy, Artificial Intelligence, Optimization, Metaheuristics.
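For readers unfamiliar with PSO, the sketch below shows a basic global-best variant in NumPy. It is an illustration under stated assumptions, not the published method: the objective `toy_energy` is a made-up stand-in for a physicochemical energy function, the inertia and acceleration values are common textbook defaults, particles are initialised uniformly at random rather than from an Angle Probability List, and the name `pso_minimize` is hypothetical.

```python
import numpy as np

def pso_minimize(f, bounds, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO: minimise f over a box given as (low, high) pairs."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    pos = rng.uniform(lo, hi, size=(n_particles, lo.size))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                        # personal best positions
    best_val = np.array([f(p) for p in pos])     # personal best values
    g = best_pos[best_val.argmin()].copy()       # global best position
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Inertia + pull towards personal best + pull towards global best.
        vel = w * vel + c1 * r1 * (best_pos - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)
        val = np.array([f(p) for p in pos])
        improved = val < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = val[improved]
        g = best_pos[best_val.argmin()].copy()
    return g, float(best_val.min())

# Toy objective over three torsion-like angles (hypothetical, not a real energy).
target = np.array([-1.0, 2.5, 0.5])
def toy_energy(angles):
    return float(np.sum(1.0 - np.cos(angles - target)))

best_angles, best_energy = pso_minimize(toy_energy, [(-np.pi, np.pi)] * 3)
print(best_angles, best_energy)
```

In a knowledge-based setting, the uniform initialisation is the natural place to inject prior information, for example by sampling starting angles from an empirical distribution instead.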

Education

Ph.D. Bioinformatics
National University of Ireland, Galway
2017 - 2021 (4 years)

Undergraduate exchange scholarship
National University of Ireland, Galway
2015 - 2016 (1 year)

Bachelor's degree, Biotechnology - Major in Bioinformatics
Federal University of Rio Grande do Sul
2012 - 2016 (4 years)

Bachelor's degree, Biomedical Sciences, General
Federal University of Health Sciences of Porto Alegre
2008 - 2011 (3 years)