
MASTER'S THESIS DEFENSE
Student: Heloísa Oss Boll
Advisor: Prof. Dr. Mariana Recamonde Mendoza Guerreiro
Title: Graph Neural Networks for Clinical Risk Prediction Based on Patient Similarity Graphs
Research Area: Machine Learning, Knowledge Representation and Reasoning
Date: July 25, 2024
Time: 3:30 p.m.
Location: This defense will be held remotely. Public access is available at: https://mconf.ufrgs.br/webconf/00195037
Examining Committee:
– Prof. Dr. Viviane Pereira Moreira (UFRGS)
– Prof. Dr. Bruno Pereira Nunes (UFPel)
– Prof. Dr. Anderson Rocha Tavares (UFRGS)
Committee Chair: Prof. Dr. Mariana Recamonde Mendoza Guerreiro
Abstract: Electronic health records (EHRs) are a comprehensive source of information about a patient’s health history. Due to the interconnected nature of clinical events, these records contain data that can be expressed as graphs; for example, patients can be represented as nodes in a similarity network that connects individuals with multiple shared health events, such as diagnoses and medications. Traditional machine learning (ML) models for clinical risk prediction, which covers tasks such as forecasting diagnoses, readmissions, and mortality, usually do not use this structured information, and their predictive power is hindered as a result. In contrast, graph neural networks (GNNs) are a new deep learning (DL) approach that has shown superior results in predicting clinical risks based on graphs, helping to improve patient care and medical decision-making. This study aims to provide a comprehensive overview of the most recent GNNs used for predicting clinical risks using EHRs and, in particular, to investigate the relevance of patient similarity graphs for diagnosing heart failure. First, we present an extensive review of 50 papers on the topic, which identified the graph attention network (GAT) as the most widely used GNN, diagnosis prediction as the most investigated task, and MIMIC-III as the most popular EHR dataset. Next, we introduce three new GNN solutions based on GraphSAGE, GAT, and Graph Transformer (GT) that address the challenges related to three identified literature gaps: multimodality, patient similarity, and interpretability. Our best model, the GT, obtained an F1 score of 0.5361, a 35.7% increase over the highest score among the baseline methods, as well as a balanced accuracy of 0.7166 and an AUROC of 0.7930.
In addition, we evaluate the importance of four data modalities for predicting heart failure and introduce new techniques to improve the explainability of our GT model, including a descriptive statistical analysis of the connectivity of patient nodes in the graph, their attention profiles, and patterns in their medical features and those of their neighbors. Finally, our results reinforce the potential of GNNs to optimize clinical risk prediction and highlight the importance of using structured patient data to improve medical outcomes.
Keywords: graph neural networks. electronic health records. clinical risk prediction. patient similarity.
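
Illustrative note: the abstract describes a similarity network that connects patients with multiple shared health events. The sketch below shows one simple way such a graph could be built. It is not the method used in the dissertation: the function name `build_similarity_edges`, the `min_shared` threshold, the count-based edge weight, and the toy patient records are all assumptions made for illustration.

```python
from itertools import combinations


def build_similarity_edges(patient_events, min_shared=2):
    """Connect two patients when they share at least `min_shared` events.

    patient_events: dict mapping a patient id to a set of event codes.
    Returns a dict mapping (patient_a, patient_b) to the shared-event count.
    This thresholded-overlap rule is an assumption, not the thesis method.
    """
    edges = {}
    # Visit every unordered pair of patients exactly once.
    for a, b in combinations(sorted(patient_events), 2):
        shared = len(patient_events[a] & patient_events[b])
        if shared >= min_shared:
            edges[(a, b)] = shared
    return edges


# Hypothetical toy records: patient id -> diagnosis and medication codes.
patients = {
    "p1": {"dx:I50", "dx:E11", "rx:furosemide"},
    "p2": {"dx:I50", "rx:furosemide", "rx:metformin"},
    "p3": {"dx:J45", "rx:salbutamol"},
}

edges = build_similarity_edges(patients)
print(edges)
```

In this toy example only p1 and p2 share two events, so they are the only connected pair. The resulting edge list could then be passed to a GNN library as the graph structure, with patient features attached to the nodes.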