Student: Janaína Schwarzrock
Advisor: Prof. Dr. Antonio Carlos S. Beck
Title: Integration framework for dynamic concurrency throttling and thread and page mapping
Research Area: Embedded Systems
This defense will exceptionally be held fully remotely. Those interested in attending may access the virtual room through the link: https://meet.google.com/uhs-yvew-zux
– Prof. Dr. Marco Antonio Zanata Alves (UFPR)
– Prof. Dr. Mariza Ferro (LNCC)
– Prof. Dr. Gabriel Luca Nazar (UFRGS)
Committee Chair: Prof. Dr. Antonio Carlos S. Beck
Abstract: Technology scaling has allowed a growing number of cores in a processor to satisfy the increasing demand of new applications, which need to process huge amounts of data in High Performance Computing (HPC). However, because of scalability issues in the memory system, these multicore processors usually employ some form of Non-Uniform Memory Access (NUMA). Therefore, to take full advantage of these systems, choosing the right thread-to-core allocation and page placement is essential to decrease the number of remote memory accesses, improving performance and reducing energy consumption. Moreover, since many parallel applications have limited scalability, applying thread throttling (i.e., artificially reducing the number of active threads) will often further optimize these non-functional requirements. Consequently, choosing the ideal configuration (thread mapping, page mapping, and number of threads) to deliver the best outcome in energy and performance (represented by the Energy-Delay Product, or EDP) is not straightforward. The ideal configuration may change according to the system (e.g., microarchitecture, number of cores), the application, the input set, or even during execution (since a parallel application may comprise many different parallel regions). Because so many variables are involved, previous research has not combined these three optimization techniques while still providing full adaptability (automatically adjusting to applications at runtime) and transparency (without demanding changes to the original code or the use of unconventional tools). For such a problem, adaptability is key: because the ideal configuration keeps changing, offline strategies are infeasible or, at best, cannot deliver satisfactory solutions.
In light of this, this work proposes a framework that synergistically integrates thread throttling with thread mapping and page mapping into a single, efficient, online approach. By exploiting different heuristics, it aims to optimize any dynamically linked OpenMP application without requiring any code transformation or recompilation. In this work, we discuss the main challenges in developing this proposal and present some directions we have already identified to overcome them.
Keywords: Parallel applications, NUMA systems, Thread throttling, Thread mapping, Page mapping.