Tiago Santos

Lisbon

Summary

Data professional with hands-on experience delivering scalable data pipelines, deploying cloud-based workflows, and developing machine learning solutions. Experienced in fine-tuning models and applying NLP and Generative AI techniques. Strong foundation in data transformation, automation, and advanced analytics, supported by a Master’s degree in Data Science and Advanced Analytics. Currently transitioning into Data Science roles, combining ML and data engineering experience to build production-ready solutions.

Overview

2 years of professional experience

Work History

Data Engineer

CGI
01.2025 - Current
  • Datahub Maintenance: Collaborate with a cross-functional team within an Agile framework for a major enterprise client, designing and modifying entity tables and views in SQL across multiple data layers.
  • ETL Implementation: Developed ETL pipelines on Databricks, ensuring accurate data transformation and integration across the Datahub.
  • Cloud Workflow Orchestration & CI/CD: Orchestrated data pipelines with Azure Data Factory and Synapse Analytics, overseeing deployments across multiple environments using Azure DevOps.

Professor

NOVA IMS
09.2024 - 02.2025
  • Course Creator: Designed, developed, and delivered a 14-week capstone course for final-year Data Science Bachelor students. Took full responsibility for curriculum design, learning objectives, assessment strategies, and organizing course topics.
  • Startup Simulation: Mentored 11 student groups in the development of simulated startups, guiding them through identifying real-world problems. Each group built a chatbot solution, using LangChain and the OpenAI API, to address one of these problems.

AI Engineer

Remynd (Startup)
03.2024 - 10.2024
  • ASR Development: Fine-tuned pre-trained Automatic Speech Recognition (ASR) models for pt-PT using PyTorch and Transformers on the Databricks platform.
  • Web Service Design: Architected and implemented a scalable web service using WebSockets and FastAPI for real-time audio streaming and transcription, leveraging AWS for CI/CD deployment.
  • RAG for Conversation Insights: Built and integrated a Retrieval-Augmented Generation (RAG) system using LangChain and the OpenAI API, extracting insights from transcribed conversations.

Summer Intern

Energias de Portugal (EDP)
07.2022 - 08.2022
  • LiDAR Data Analysis: Processed and analyzed LiDAR data to identify anomalies in electrical lines, enhancing inspection processes.
  • MVP Development: Created a Python-based MVP to automate and streamline visual inspections, improving defect detection efficiency.

Education

Master's Degree - Data Science and Advanced Analytics

NOVA IMS, Universidade Nova De Lisboa
01.2024

Bachelor's Degree - Electrical and Computer Engineering

Instituto Superior Técnico, Universidade De Lisboa
01.2022

Skills

  • Programming: Python, C, SQL
  • Machine Learning: PyTorch, Transformers, Scikit-learn
  • Generative AI & NLP: OpenAI API, LangChain, Pinecone, Prompt Engineering, RAG
  • MLOps & Deployment: Databricks, MLflow, GitHub, AWS, GCP, Docker, FastAPI
  • Big Data: Data modeling, Data warehousing, Databricks, Azure Data Factory, Azure Synapse Analytics, Azure DevOps
  • Data Visualization: Pandas, NumPy, Matplotlib, Seaborn
  • Languages: Native Portuguese, Fluent English

Publications

Accepted Papers

  • Perezhohin, Y., Santos, T., Costa, V., Peres, F., & Castelli, M. (2024). Enhancing Automatic Speech Recognition: Effects of Semantic Audio Filtering on Models Performance. IEEE Access, 12, 155136-155150. Advance online publication. https://doi.org/10.1109/ACCESS.2024.3482970

Submitted Papers

  • Santos, T., Perezhohin, Y., Peres, F., Costa, V., & Castelli, M. (Under Review). LongSemAnnotator: A Longformer Framework for Column Type Annotation. Pattern Recognition (Elsevier).

Work Preference

Work Type

Full Time

Location Preference

Remote, Hybrid
