
Thesis of Mateus Grellert da Silva


Event Details


Student: Mateus Grellert da Silva
Advisor: Prof. Dr. Sergio Bampi
Co-advisor: Prof. Dr. Bruno Zatt

Title: Machine Learning Mode Decision for Complexity Reduction and Scaling in Video Applications
Research Area: Architecture and Design of Computing Systems

Date: 28 March 2018
Time: 14:00
Location: Building 43412 – Room 218 (videoconference room), Instituto de Informática

Examination Committee:
Prof. Dr. Altamiro Amadeu Susin (UFRGS)
Prof. Dr. Carla Liberal Pagliari (IME)
Prof. Dr. Luciano Volcan Agostini (UFPel)

Committee Chair: Prof. Dr. Sergio Bampi

Abstract: Recent innovations in Machine Learning techniques have led to the widespread use of intelligent models to solve complex problems that are especially hard to compute with traditional data structures and algorithms. In particular, current research on Image and Video Processing shows that it is possible to design Machine Learning models that perform object recognition and even action recognition with high confidence levels. In addition, the latest progress on training algorithms for Deep Learning Neural Networks was also an important milestone in Machine Learning, leading to prominent discoveries in Computer Vision and other applications. Recent studies have also shown that it is possible to design intelligent models capable of drastically reducing the optimization space of mode decision in video encoders with minor losses in coding efficiency. All these facts indicate that Machine Learning for complexity reduction in visual applications is a very promising field of study. The goal of this thesis is to investigate learning-based techniques to reduce the complexity of HEVC encoding decisions, focusing on fast video encoding and transcoding applications. A complexity profiling of HEVC is first presented to identify the tasks that must be prioritized to accomplish this objective. Several variables and metrics are then extracted during the encoding and decoding processes to assess their correlation with the encoding decisions associated with these tasks. Next, Machine Learning techniques are employed to construct classifiers that use this information to accurately predict the outcome of these decisions, eliminating the time-consuming operations required to compute them. The fast encoding and transcoding solutions were developed separately, as the source of information differs in each case, but the same methodology was followed in both. In addition, mechanisms for complexity scalability were developed to provide the best rate-distortion performance given a target complexity reduction. Experimental results demonstrated that the designed fast encoding solutions achieve average time savings from 37% to 78%, with Bjontegaard Delta Bitrate (BD-BR) increments between 0.04% and 4.8%. In the transcoding results, complexity reductions of 43% and 67% were observed, with average BD-BR increments between 0.34% and 1.7%. Comparisons with the state of the art confirm the efficacy of the designed methods, as they outperform the results achieved by related solutions.

Keywords: Video coding, Video transcoding, Complexity reduction, Complexity scaling, Machine Learning, HEVC.