I am interested in database systems, artificial intelligence, and big data processing to support data-intensive science.
You can check my LinkedIn profile for a full professional history.
You can check my Big Five personality traits test results here.
SOME INTERESTING WORKS:
[new] Optimizing Sort in Hadoop using Replacement Selection: This thesis implements and evaluates an alternative sorting component for Hadoop based on the replacement-selection algorithm. Hadoop is an open-source implementation of the MapReduce framework. MapReduce’s popularity arises from the fact that it provides distribution transparency, linear scalability, and fault tolerance. This work proposes an alternative to the existing load-sort-store solution that generates a smaller number of longer runs, resulting in a faster merge phase. The replacement-selection algorithm usually produces runs that are larger than the available memory, which in turn reduces the overall sorting time.
Extensible Operator Models on Data Processing Systems: This work investigates the state of the art of operator models in data management systems. Nowadays, data volume is scaling faster than computing resources, leading us to build bigger and more complex computational clusters. The tools we have to analyze data flows provide only basic operators for simple, SQL-like analysis, most of them inherited from the RDBMS era. Big Data analytics requires more complex tasks, which today are embedded in user-defined functions hidden from the query compiler and optimizer. Reliance on user-driven program optimizations is likely to lead to poor cluster utilization, and system-driven holistic optimization will require not just database query optimization, but also optimization of the whole data flow of these applications. In this direction, extensible operator models permit programmers to add new (possibly sophisticated) functionality to data analysis tools. The query compiler can access the semantics of these new operators and potentially optimize the data flow, since application-specific functions are treated as first-class operators.
Intelligent Agent for Energy Estimation: This energy agent is called IAEE. It aims to estimate the energy consumed by a machine based on its usage metrics. This way, it is not necessary to attach sensors to a whole cluster of computers to know each machine's individual power consumption.
A Monitoring System for WattDB: An Energy-Proportional Database Cluster: This is my bachelor thesis, developed during my internship in Germany under the supervision of Dipl.-Inf. Daniel Schall. Check this word cloud of my text!
A Monitoring System for WattDB: presentation. Take a look at my bachelor thesis presentation!
Memory Evolutive Systems: um Modelo Categorial para Descricao de Sistemas Hierarquicos Complexos (voted best paper): This text gives an overview of Memory Evolutive Systems (MES) and how they can model natural, open, self-organizing systems, such as biological, sociological, or neural systems. Click here to download the paper (pt-BR) and here for the paper presentation (pt-BR).
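
A SMALL ILLUSTRATION:
To make the replacement-selection idea from the Hadoop sorting thesis more concrete, here is a minimal Python sketch of run generation. It is not the code from the thesis and not Hadoop's sorting component; the function name `replacement_selection` and the `memory_size` parameter are hypothetical names chosen for this example, and the records are plain integers held in a small in-memory heap.

```python
import heapq
from itertools import islice

_END = object()  # marker for "input exhausted" (hypothetical helper)


def replacement_selection(records, memory_size):
    """Yield sorted runs while holding at most `memory_size` records in memory.

    Illustrative sketch only: real systems write runs to disk and work on
    serialized key/value pairs rather than Python integers.
    """
    it = iter(records)
    # Each heap entry is (run_id, value), so records of the current run
    # always come out before records deferred to the next run.
    heap = [(0, value) for value in islice(it, memory_size)]
    heapq.heapify(heap)

    current_run = 0
    run = []
    while heap:
        run_id, value = heapq.heappop(heap)
        if run_id != current_run:
            # Only records of the next run remain: close the current run.
            yield run
            run = []
            current_run = run_id
        run.append(value)

        incoming = next(it, _END)
        if incoming is not _END:
            # A record smaller than the one just written cannot extend the
            # current run, so it is tagged for the next run instead.
            target = run_id if incoming >= value else run_id + 1
            heapq.heappush(heap, (target, incoming))
    if run:
        yield run
```

With the input `[5, 1, 9, 2, 8, 3, 7, 4, 6, 0]` and `memory_size=3`, this sketch produces three runs, the first of which holds five records even though only three fit in the heap at once. A load-sort-store approach with the same memory would produce four runs of at most three records each, which is the difference that shortens the merge phase.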