On Friday, November 27, 2020, Dinei Rockenbach (GMAP master’s student) passed his master’s thesis defense with distinction at the School of Technology of the Pontifical Catholic University of Rio Grande do Sul (PUCRS). The thesis was advised by Dr. Luiz Gustavo Fernandes (PUCRS) and Dr. Dalvan Griebler (PNPD/PUCRS). The examining board consisted of Dr. Avelino Zorzo (PUCRS) and Dr. Marco Aldinucci (University of Turin). The defense was held by videoconference due to the Covid-19 pandemic.
Rockenbach describes his experience in the master’s program:
The master’s in Computer Science was a big challenge in my career. Since my bachelor’s degree is in Information Systems, a course focused on business, I had to learn many things to keep up with the topics taught in the master’s program. And I had to do this while keeping my full-time job, so it was a long period of great dedication. Finally, once the course credits were completed, there came the last (and biggest) challenge: the master’s thesis. Thanks to all the preparation and dedication, I was able to celebrate being approved with distinction. The foundation built during this period has great potential to bear much more fruit. I certainly do not regret having chosen to do the master’s degree, because when I look back today I can see how much I learned during this period.
Dinei Rockenbach
The title and abstract of the master’s thesis are as follows:
Title: HIGH-LEVEL PROGRAMMING ABSTRACTIONS FOR STREAM PARALLELISM ON GPUS
Abstract: The growth and spread of parallel architectures have driven the pursuit of greater computing power with massively parallel hardware such as the Graphics Processing Units (GPUs). This new heterogeneous computer architecture composed of multi-core Central Processing Units (CPUs) and many-core GPUs became usual, enabling novel software applications such as self-driving cars, real-time ray tracing, deep learning, and virtual reality (VR), which are characterized as stream processing applications. However, this heterogeneous environment poses an additional challenge to software development, which is still in the process of adapting to the parallel processing paradigm on multi-core systems, where programmers are supported by several APIs (Application Programming Interfaces) that offer different abstraction levels. The parallelism exploitation in GPU is done using both CUDA and OpenCL for academia and industry, whose developers have to deal with low-level architecture concepts to efficiently exploit GPU parallelism in their applications. There is still a lack of parallel programming abstractions when: 1) parallelizing code on GPUs, and 2) needing higher-level programming abstractions that deal with both CPU and GPU parallelism. Unfortunately, developers still have to be expert programmers on system and architecture to enable efficient hardware parallelism exploitation in this architectural environment. To contribute to the first problem, we created GSParLib, a novel structured parallel programming library for exploiting GPU parallelism that provides a unified programming API and driver-agnostic runtime. It offers Map and Reduce parallel patterns on top of CUDA and OpenCL drivers. We evaluate its performance comparing with state-of-the-art APIs, where the experiments revealed a comparable and efficient performance. 
For contributing to the second problem, we extended the SPar Domain-Specific Language (DSL), which has been proved to be high-level and productive for expressing stream parallelism with C++ annotations in multi-core CPUs. In this work, we propose and implement new annotations that increase expressiveness to combine the current stream parallelism on CPUs and data parallelism on GPUs. We also provide new pattern-based transformation rules that were implemented in the compiler targeting automatic source-to-source code transformations using GSParLib for GPU parallelism exploitation. Our experiments demonstrate that SPar compiler is able to generate stream and data parallel patterns without significant performance penalty compared to handwritten code. Thanks to these advances in SPar, our work is the first on providing high-level C++11 annotations as an API that does not require significant code refactoring in sequential programs while enabling multi-core CPU and many-core GPU parallelism exploitation for stream processing applications.