UNIVERSIDADE FEDERAL DO RIO GRANDE DO SUL
INSTITUTO DE INFORMÁTICA
PROGRAMA DE PÓS-GRADUAÇÃO EM COMPUTAÇÃO
———————————————-
Student: Jimmy Fernando Tarrillo Olano
Advisor: Prof. Dr. Fernanda Gusmão de Lima Kastensmidt
Title: Exploring the Use of Multiple Modular Redundancies for Masking Accumulated Faults in SRAM-based FPGAs
Research Area: Test and Reliability of Integrated Hardware and Software Systems
Date: 27/06/2014
Location: Building 43412, Room 215 (Videoconference Room), Instituto de Informática
Examination Committee:
Prof. Dr. Fabian Luis Vargas (PUCRS)
Prof. Dr. Sérgio Bampi (UFRGS)
Prof. Dr. Silvio Manea (INPE), via videoconference
Abstract:
Soft errors in the configuration memory bits of SRAM-based FPGAs are an important issue because they persist and can cause functional failures in the implemented circuit. Once a configuration memory bit is flipped, the soft error can be corrected only by reloading the correct configuration bitstream. If the correct bitstream is not reloaded, persistent soft errors can accumulate in the configuration memory, provoking a functional failure in the user’s design and, consequently, a potentially catastrophic situation. This scenario gets worse in the event of multi-bit upsets, whose probability of occurrence is increasing in new nanometric technologies. Traditional strategies to deal with soft errors in configuration memory are based on some form of triple modular redundancy (TMR) combined with scrubbing of the memory to repair faults and avoid their accumulation. The high reliability of this technique has been demonstrated in many studies; however, TMR is aimed at masking single faults. As technology scaling shrinks transistor dimensions, susceptibility to faults increases. In this new scenario, multiple faults in the configuration memory of the FPGA become more common than single faults, so TMR is inappropriate for high-reliability applications. Furthermore, since the fault rate is increasing, the scrubbing rate also needs to increase, leading to higher power consumption. Aiming to cope with massive upsets between sparse scrubbing operations, this work proposes a multiple redundancy system composed of n identical modules, known as n-modular redundancy (nMR), operating in tandem, together with an innovative self-adaptive voter able to mask multiple upsets in the system. The main drawback of modular redundancy is its high cost in terms of area and power consumption.
However, area overhead is becoming less of a problem due to the higher density of new technologies. On the other hand, high power consumption has always been a handicap of FPGAs. In this work we also propose a model to prevent the power overhead caused by the use of multiple redundancy in SRAM-based FPGAs. The capacity of the proposal to tolerate multiple faults has been evaluated through radiation experiments and fault injection campaigns on case-study circuits implemented in a commercial FPGA manufactured in 65 nm technology. Finally, we demonstrate that the power overhead generated by the use of nMR in FPGAs is much lower than what is discussed in the literature.
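The masking argument in the abstract can be illustrated with a small software sketch. This is not the voter developed in the thesis: the names `majority_vote` and `adaptive_vote` are hypothetical, and the adaptive policy shown here (a module that disagrees with the voted result is excluded from later votes) is an assumption made only for illustration.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the value held by a strict majority of modules, or None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None

def adaptive_vote(outputs, trusted):
    """Hypothetical adaptive voter (assumed policy, not the thesis design).

    Votes only among the modules whose indices are in `trusted`, then
    drops every module that disagreed with the voted result.
    Returns (result, updated_trusted_set).
    """
    active = sorted(trusted)
    result = majority_vote([outputs[i] for i in active])
    if result is None:
        return None, set(trusted)
    return result, {i for i in active if outputs[i] == result}

# The correct output is 1; faulty modules output 0.
tmr_two_faults = majority_vote([0, 0, 1])        # two of three modules faulty
five_mr_two_faults = majority_vote([1, 1, 1, 0, 0])

# Faults accumulate one module at a time between scrubs.
trusted = {0, 1, 2, 3, 4}
results = []
for outputs in ([1, 1, 1, 1, 0], [1, 1, 1, 0, 0], [1, 1, 0, 0, 0]):
    result, trusted = adaptive_vote(outputs, trusted)
    results.append(result)
plain_three_faults = majority_vote([1, 1, 0, 0, 0])
```

In this sketch, plain majority voting over n modules masks at most (n-1)/2 faulty modules (rounded down), which is why two faults defeat TMR but not a five-module system. The assumed adaptive policy keeps producing the correct value as faults accumulate one at a time, because modules already seen to disagree no longer take part in the vote.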
Keywords: SRAM-based FPGA; radiation effects; fault tolerance; modular redundancy.
______________
PPGC Communications