
Pedro Casal Ribeiro

Lisbon

Summary

Over 17 years of experience in diverse industries, combining business acumen and technical expertise to solve complex challenges. Specializes in advanced language model integration, enhancing data curation processes, and implementing Retrieval-Augmented Generation (RAG) for user-facing applications.

Expertise in metadata management, significantly improving metadata quality and generation processes. Directed enhancements across multiple projects, leveraging language models to refine and optimize metadata.

Managed the OMICS DeepLife catalog with meticulous curation and continuous enhancement, supporting data-driven decisions. Developed robust single-cell analysis pipelines, contributing significantly to genomics research by analyzing and interpreting complex biological data.

Specializes in creating insightful visual reports, simplifying complex data into actionable insights. These reports enhance research outcomes and support strategic decision-making. Utilizes proven machine learning techniques to tackle challenges in the business and healthcare sectors, driven by a passion for continuous learning and innovation.

Overview

17 years of professional experience
1 Certification

Work History

NLP Data Engineer

DeepLife
05.2024 - Current
  • Advanced Language Model Integration: Leading the integration of OpenAI and other LLMs, enhancing data curation and transforming raw data into structured ontologies for streamlined normalization and integration.
    Developing and deploying Retrieval-Augmented Generation (RAG) strategies and advanced chatbots, both with and without knowledge graphs, improving user interactions and operational efficiency across front-facing applications.
  • Metadata Management: Leading the strategic enhancement and curation of metadata across multiple projects, setting high standards for data accuracy and utility. Applying LLMs to metadata curation, normalization, and refinement to streamline workflows and improve data quality.
  • Catalog Management and Enhancement: Holding primary responsibility for the meticulous curation, maintenance, and enhancement of the OMICS DeepLife Catalog. Ensuring continuous improvement and upholding stringent data integrity standards, significantly enhancing decision-making processes within the organization and for its clients.

Bioinformatics Software Engineer

DeepLife
09.2022 - 04.2024
  • Held primary responsibility for the OMICS DeepLife Catalog, overseeing its meticulous curation, maintenance, and continuous improvement. Ensured the catalog met high standards of data integrity and fostered data-driven decision-making within the organization and for its clients.
  • Assisted in the development of robust single-cell analysis pipelines, leveraging advanced programming and bioinformatics techniques. Played a pivotal role in analyzing and interpreting complex biological data, which contributed significantly to groundbreaking research in genomics.
  • Specialized in crafting insightful visual reports to represent data, including detailed quality control (QC) metrics. Employed sophisticated visualization tools and statistical methods to transform complex raw data into clear, actionable insights, thereby enhancing research outcomes and supporting strategic decisions.
  • Implemented advanced machine learning techniques for better understanding of complex language patterns, leading to more accurate predictions.
  • Designed and executed comprehensive evaluation metrics, ensuring the effectiveness of implemented NLP solutions.
  • Conducted thorough research on cutting-edge NLP advancements, contributing to innovative product development.

Bioinformatics Software Developer

Centro Laboratorial Germano De Sousa
01.2019 - 08.2022
    • Created and maintained ETL pipelines for NGS data annotation from various databases (ClinVar, Ensembl, dbSNP, VarSome, OMIM, COSMIC, gnomAD). That data is used in diagnosing germline and somatic mutations.
    • Created and maintained end-to-end pipelines to analyze and report on metagenomic data for the purpose of obesity treatment.
    • Created and maintained a software application using Docker, Django and PostgreSQL for data management, analysis and clinical interpretation that is currently in production and under continuous development. It is responsible for increasing team productivity and department revenue.

Full Stack Web Developer

Freelance
01.2020 - 12.2023
    • Used Python, Django, JavaScript, SQL, HTML5 and CSS3 to develop full-stack, end-to-end web solutions.
    • Increased website performance by optimizing front-end and back-end code for faster loading times.

Bioinformatics Intern

Centro Laboratorial Germano De Sousa
10.2018 - 12.2018
    • Investigated the feasibility of a bioinformatics pipeline able to extract, manage and process microbiome data.
    • Applied data science and leveraged an understanding of metagenomics and its data to create a population database.
    • Established efficient workflows for handling high-throughput sequencing data, expediting progress towards research goals.

LeadGen Specialist

LeadGenius
10.2014 - 03.2018
    • Developed sales leads for B2B customers through marketing lists and internet leads.
    • Followed all company policies and procedures to deliver quality work.

Director of Sales and Marketing

JOSE Gourmet
11.2010 - 08.2014
    • As the first employee, was responsible for all daily activities, from warehouse management to sales and marketing.
    • Drove 50% year-over-year sales growth between 2010 and 2013 by pursuing door-to-door sales and adapting marketing and business strategies to a period of overall economic contraction.
    • Developed comprehensive sales plans for business growth, resulting in significant market share expansion.

Customer Support Analyst & Supervisor

SYKES
01.2008 - 11.2010
    • Monitored team performance, helped the team achieve company goals, and managed staff expectations and needs.
    • Assessed individual performance, provided feedback and implemented improvement plans.
    • Troubleshot problems with software, hardware and networking for users.

Education

Master of Science - Artificial Intelligence

Munster Technological University
Cork
08.2022

Bachelor of Science - Bioinformatics

Polytechnic Institute of Setúbal
Barreiro
12.2019

High School Diploma

Escola Secundária Quinta Do Marquês
Oeiras
05.2002

Skills

  • Programming Languages: Python, R, SQL, JavaScript, Bash
  • Frameworks & Tools: Django, FastAPI, Jinja2, React, Vue.js, PostgreSQL, Docker, LangChain, LlamaIndex, Haystack, Ollama, Streamlit, Weaviate, Neo4j
  • Machine Learning: Expertise in Natural Language Processing (NLP) and Large Language Models (LLMs), including OpenAI, scikit-learn, TensorFlow and Keras
  • Data Management: Advanced experience in metadata management, catalog enhancement, and data integrity assurance
  • Bioinformatics: use of HL7 messaging services, understanding of standards, data sources and tools used in NGS and clinical genomics, single-cell
  • ETL development
  • CI/CD best practices using Git, unit testing and DRY principles
  • Data Modeling and Pipeline design
  • Collaborating and communicating with key multidisciplinary stakeholders to ensure on-time delivery of objectives

Certification

  • Hands-on Torrent Variant Caller Training. Jan 2021
  • Python for Data Science and Machine Learning Bootcamp & NLP - Natural Language Processing with Python & Python and Django Full Stack Web Developer Bootcamp. Udemy - 2019-2020
  • Algorithmic Design and Techniques, University of California San Diego. edX - 2018
  • Bioinformatic Methods I & II, University of Toronto. Coursera - 2018
  • Biology Meets Programming: Bioinformatics for Beginners, University of California San Diego. Coursera - 2018
  • Gut Check: Exploring Your Microbiome, University of Colorado Boulder, University of Colorado System & University of California San Diego. Coursera - 2018
