Thesis Proposal in Parallel and Distributed Processing

UNIVERSIDADE FEDERAL DO RIO GRANDE DO SUL

INSTITUTO DE INFORMÁTICA

PROGRAMA DE PÓS-GRADUAÇÃO EM COMPUTAÇÃO

____________________

THESIS PROPOSAL DEFENSE

Student: Eduardo Henrique Molina da Cruz

Advisor: Prof. Dr. Philippe Olivier Alexandre Navaux

Title: Improving Memory Locality Using the Memory Management Unit

Research Area: Parallel and Distributed Processing

Date: 18 August 2014

Time: 13:00

Location: Building 43412, Room 215 (Videoconference Room), Instituto de Informática

Examining Committee:

Prof. Dr. Antônio Carlos Schneider Beck Filho (UFRGS)

Prof. Dr. Lucas Mello Schnorr (UFRGS)

Prof. Dr. Ronaldo Augusto de Lara Gonçalves (UEM)

Committee Chair: Prof. Dr. Philippe Olivier Alexandre Navaux

Abstract:

One of the main challenges for modern parallel shared-memory architectures is the cost of accessing main memory. In current systems, the performance and energy efficiency of memory accesses depend on their locality: accesses to remote caches and NUMA nodes are more expensive than accesses to local ones. Increasing locality requires knowledge of how the threads of a parallel application access memory pages. With this information, pages can be migrated to the NUMA nodes that access them (data mapping), and threads that access the same pages can be migrated to processing cores chosen so that locality improves even further (thread mapping).
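To make the idea of data mapping concrete, here is a minimal software sketch, assuming a recorded trace of (thread, page) accesses and a fixed thread-to-node assignment. The function name `data_mapping` and the trace are hypothetical illustrations and are not taken from the proposal; a real system would migrate pages rather than return a dictionary.

```python
from collections import Counter, defaultdict


def data_mapping(accesses, thread_to_node):
    """Return, for each page, the NUMA node whose threads access it most."""
    per_page = defaultdict(Counter)
    for thread, page in accesses:
        per_page[page][thread_to_node[thread]] += 1
    # Pick the node with the most accesses; ties go to the lowest node id
    # so that the result is deterministic.
    return {
        page: min(counts, key=lambda node: (-counts[node], node))
        for page, counts in per_page.items()
    }


# Hypothetical trace of (thread id, page number) pairs.
trace = [(0, 10), (1, 10), (2, 10), (2, 11), (3, 11), (0, 12)]
# Hypothetical placement: threads 0 and 1 run on node 0, threads 2 and 3 on node 1.
thread_to_node = {0: 0, 1: 0, 2: 1, 3: 1}

placement = data_mapping(trace, thread_to_node)
print(placement)
```

In this toy trace, page 10 is accessed twice from node 0 and once from node 1, so it is assigned to node 0.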

In this work, we propose two mechanisms, called LAPT and SAMMU, that dynamically detect the memory access pattern of parallel applications. They are implemented in the memory management unit of processors and extend the page table to store page sharing information. The operating system uses this information to perform an optimized thread and data mapping during the execution of the parallel application. In contrast to previous work, our mechanisms require neither prior knowledge of the behavior of the applications nor changes to the source code or runtime libraries. Experiments with the NAS Parallel Benchmarks (NPB) showed performance and energy efficiency improvements of up to 19.2% and 12.5%, respectively.
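The way page sharing information can drive a thread mapping may be illustrated with a small software simulation. This is only a sketch under stated assumptions: LAPT and SAMMU are hardware mechanisms whose details are not given in this announcement, and the code below does not reproduce them. The helper names `communication_matrix` and `pair_threads`, the greedy pairing heuristic, and the trace are all hypothetical.

```python
from itertools import combinations


def communication_matrix(accesses, num_threads):
    """Count, for every pair of threads, how many pages both of them touch."""
    sharers = {}
    for thread, page in accesses:
        sharers.setdefault(page, set()).add(thread)
    matrix = [[0] * num_threads for _ in range(num_threads)]
    for threads in sharers.values():
        for a, b in combinations(sorted(threads), 2):
            matrix[a][b] += 1
            matrix[b][a] += 1
    return matrix


def pair_threads(matrix):
    """Greedily pair the threads that share the most pages (assumed heuristic)."""
    n = len(matrix)
    candidates = sorted(
        combinations(range(n), 2),
        key=lambda pair: (-matrix[pair[0]][pair[1]], pair),
    )
    paired, pairs = set(), []
    for a, b in candidates:
        if a not in paired and b not in paired:
            pairs.append((a, b))
            paired.update((a, b))
    return pairs


# Hypothetical trace of (thread id, page number) pairs.
trace = [(0, 1), (2, 1), (0, 2), (2, 2), (1, 3), (3, 3), (0, 4), (1, 4)]

matrix = communication_matrix(trace, num_threads=4)
pairs = pair_threads(matrix)
print(pairs)
```

Each resulting pair could then be placed on cores that share a cache or a NUMA node. The greedy pairing is chosen here for brevity and is not guaranteed to find the best overall mapping.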

Keywords: Thread mapping, data mapping, parallel computer architectures, shared memory, communication, cache memory, cache coherence protocols, memory management unit, virtual memory, NUMA.

____________________

PPGC Communications