Fabio A.

About Me

Fabio is a Senior Data Engineer with 10+ years of experience in data analysis, data science, and wrangling large datasets across multiple projects. He also has extensive experience in business intelligence and data analytics, delivering actionable insights for businesses with a focus on scalability, data infrastructure, and efficient development.

Work history

Caylent (via UpStack)
Senior Data Engineer
2021 - Present (4 years)
Remote
  • Building and improving databases, acquiring data, ETL/ELT, Big Data pipelines, and deploying cloud services on projects.

  • Administering infrastructure solutions to improve data models, increase data accessibility, and foster data-driven solutions for clients.

  • Implementing monitoring solutions to ensure data integrity, working closely with engineers, product managers, and other stakeholders.

BairesDev
Data Analyst
2020 - 2021 (1 year)
Remote
  • Transitioned production ELT to a new architecture using new AWS stacks, PySpark, and streaming services like Kafka, Polaris, and Kinesis, and also worked with file versions.

  • Built data producers using Python and Flask with Kubernetes to connect to data sources and stream data.

  • Analyzed data in Snowflake for custom reports built on dbt views scheduled by Airflow.

Albert Einstein Hospital
Data Engineer | Data Analyst
2020 - 2021 (1 year)
Brasília, Brazil
  • Worked on developing an analytics environment with Impala and Hive, and used Spark/Python for Machine Learning.

  • Supported the architecture of the environment in AWS with Elastic, MongoDB, Glue, Kafka, QuickSight, R, Jupyter Notebook, Presto, and Apache Hue.

  • Delivered insights on the public health system using Power BI and Jupyter Notebook.

  • Gathered datasets from many types of sources, built pipelines, and performed exploratory data analysis.

Onne Empresas
Data Scientist
2019 - 2020 (1 year)
Brasília, Brazil
  • Worked on a platform aimed at faster delivery, lower operational costs, and improved processes.

  • Used multiple models like SARIMAX, Decision Tree, LSTM, and GMM in the food and restaurant segment.

  • Analyzed large amounts of information to discover trends and patterns.

Python, Business Intelligence, Data Analysis, Data Science, Decision Trees, Long Short-term Memory (LSTM), Time Series Analysis, Financial Forecasting
CAIXA
Data Scientist
2019 - 2019
Brasília, Brazil
  • Worked in cloud and on-premises environments on financial fraud, financial default turnover, IT capability, and legal document categorization.

  • Tuned models and integrated them with computational capabilities.

  • Performed anomaly detection for illegal financial operations such as money laundering.

  • Mapped relationships between transactions with Spark and PySpark, using IsolationForest and NetworkX.

One Way Solution
Data Analyst
2018 - 2019 (1 year)
Brasília, Brazil
  • Built a Big Data fast-lane architecture for a client in the events & productions sector.

  • Worked on data wrangling and data discovery and combined legacy data with new business data.

  • Used Python and Scala to create Machine Learning algorithms for customer profile consumption, promotion directions, and event consumption in real time.

Comp Line Services Solutions
Big Data Engineer
2018 - 2018
Brasília, Brazil
  • Implemented data analytics and reporting with Power BI and worked on data warehouse architecture.

  • Gathered data from SQL and imported it into Azure Blob Storage using Azure Data Factory.

  • Created a messaging service from MSSQL 2014 on Azure to AWS RDS and MongoDB using Kafka and Broker.

  • Queried data and generated reports for Power BI using Hive and Pig.

Autotrac Comércio e Telecomunicações S.A.
Data Analyst
2016 - 2018 (2 years)
Brasília, Brazil
  • Conducted data analysis and delivered lectures on T-SQL tuning for a major geolocation company in Brazil.

  • Worked on a billing automation system, T-SQL tuning, data consistency, new billing rules based on traffic signals, and client attendance program.

  • Handled BI support, ETL with SSIS, and billing team support.

Data Analysis, SQL, T-SQL, Business Intelligence, ETL, MS SSIS, PL/SQL, Tuning, Automation, SQL Server Integration Services (SSIS), Geolocation

Showcase

Data Engineer - Self-service Data System
  • Developed a self-service data system featuring our own CDC tool, Kinesis, Glue, Trino, and Cube.js.

  • Constructed an algorithm to streamline unstructured data like JSON and build relationships between nodes.

  • Employed Python and Kinesis Data Stream in overall development.

Data Engineer - On-premises Cluster
  • Created an on-premises cluster with Hortonworks tools.

  • Processed and analyzed a bank's financial data.

  • Worked on detecting anomalies in the financial data.

Data Engineer - Pipeline Creation
  • Built a data-consuming pipeline centered around vaccinations.

  • Utilized Spark, Kafka, and AWS tools such as Glue Catalog and QuickSight.

  • Constructed a COVID portal using the aforementioned tools.

Data Engineer - GenAI
  • Assisted in adding new features to a GenAI Agent.

  • Contributed to maintaining existing features of the GenAI Agent.

  • The GenAI Agent was tasked with delivering legal content to users.

Education

Applied Machine Learning in Python
Coursera
2019 - 2019
Postgraduate Course in IT Governance
Universidade Católica de Brasília - Brazil
2011 - 2013 (2 years)
Bachelor's Degree, Computer Science
UniCEUB - Brazil
2004 - 2010 (6 years)