2022
Garcia, Adriano Marques; Griebler, Dalvan; Schepke, Claudio; Fernandes, Luiz Gustavo: Um Framework para Criar Benchmarks de Aplicações Paralelas de Stream. Inproceedings, Anais da XXII Escola Regional de Alto Desempenho da Região Sul, pp. 97–98, Sociedade Brasileira de Computação, Curitiba, Brazil, 2022. doi: 10.5753/eradrs.2022.19180

Abstract: This work presents SPBench, a framework for developing stream processing benchmarks in C++. SPBench provides a set of realistic applications through high-level abstractions and allows customizing the input data and the performance metrics.
Garcia, Adriano Marques; Griebler, Dalvan; Schepke, Claudio; Fernandes, Luiz Gustavo: Evaluating Micro-batch and Data Frequency for Stream Processing Applications on Multi-cores. Inproceedings, 30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP'22), pp. 10–17, IEEE, Valladolid, Spain, 2022. doi: 10.1109/PDP55904.2022.00011

Abstract: In stream processing, data arrives constantly and is often unpredictable. It can show large fluctuations in arrival frequency, size, complexity, and other factors. These fluctuations can strongly impact application latency and throughput, which are critical factors in this domain. Therefore, there is a significant amount of research on self-adaptive techniques involving elasticity or micro-batching as a way to mitigate this impact. However, there is a lack of benchmarks and tools to help researchers investigate the implications of micro-batching and data stream frequency. In this paper, we extend a benchmarking framework to support dynamic micro-batching and data stream frequency management. We used it to create custom benchmarks and compare latency and throughput aspects of two different parallel libraries. We validate our solution through an extensive analysis of the impact of micro-batching and data stream frequency on stream processing applications using Intel TBB and FastFlow, two libraries that leverage stream parallelism on multi-core architectures. Our results demonstrated up to a 33% throughput gain over latency using micro-batches. Additionally, while TBB ensures lower latency, FastFlow ensures higher throughput in the parallel applications for different data stream frequency configurations.
Vogel, Adriano; Griebler, Dalvan; Danelutto, Marco; Fernandes, Luiz Gustavo: Self-adaptation on Parallel Stream Processing: A Systematic Review. Journal Article, Concurrency and Computation: Practice and Experience, 34 (6), pp. e6759, Wiley, 2022. doi: 10.1002/cpe.6759

Abstract: A recurrent challenge in real-world applications is the autonomous management of executions at run-time. In this vein, stream processing is a class of applications that compute data flowing in the form of streams (e.g., video feeds, images, and data analytics), where parallel computing can help accelerate the executions. On the one hand, stream processing applications are becoming more complex, dynamic, and long-running. On the other hand, it is unfeasible for humans to monitor and manually change the executions continuously. Hence, self-adaptation can reduce costs and human effort by providing a higher-level abstraction with autonomic, seamless management of executions. In this work, we provide a literature review of self-adaptation applied to the parallel stream processing domain, using a systematic literature review method. Moreover, we propose a taxonomy to categorize and classify the existing self-adaptive approaches. Finally, applying the taxonomy made it possible to characterize the state of the art, identify trends, and discuss open research challenges and future opportunities.
Hoffmann, Renato Barreto; Löff, Júnior; Griebler, Dalvan; Fernandes, Luiz Gustavo: OpenMP as runtime for providing high-level stream parallelism on multi-cores. Journal Article, The Journal of Supercomputing, pp. 7655–7676, Springer, New York, United States, 2022. doi: 10.1007/s11227-021-04182-9

Abstract: OpenMP is an industry and academic standard for parallel programming. However, using it for developing parallel stream processing applications is complex and challenging. OpenMP lacks key programming mechanisms and abstractions for this particular domain. To tackle this problem, we used a high-level parallel programming framework (named SPar) for automatically generating parallel OpenMP code. We achieved this by leveraging SPar's language and its domain-specific code annotations for simplifying the complexity and verbosity added by OpenMP in this application domain. Consequently, we implemented a new compiler algorithm in SPar for automatically generating parallel code targeting the OpenMP runtime using source-to-source code transformations. The experiments in four different stream processing applications demonstrated that the execution time of SPar was improved by up to 25.42% when using the OpenMP runtime. Additionally, our abstraction over OpenMP introduced at most 1.72% execution time overhead compared to handwritten parallel codes. Furthermore, SPar significantly reduces the total source lines of code required to express parallelism with respect to plain OpenMP parallel codes.
Müller, Caetano; Löff, Junior; Griebler, Dalvan; Eizirik, Eduardo: Avaliação da aplicação de paralelismo em classificadores taxonômicos usando Qiime2. Inproceedings, Anais da XXII Escola Regional de Alto Desempenho da Região Sul, pp. 25–28, Sociedade Brasileira de Computação (SBC), Porto Alegre, RS, Brasil, 2022. doi: 10.5753/eradrs.2022.19152

Abstract: The classification of DNA sequences using machine learning algorithms still has room to evolve, both in the quality of the results and in the computational efficiency of the algorithms. In this work, we carried out a performance evaluation of two machine learning algorithms from the Qiime2 tool for DNA sequence classification. The results show that performance improved by up to 9.65 times when using 9 threads.
Parallel Applications Modelling Group
GMAP is a research group at the Pontifical Catholic University of Rio Grande do Sul (PUCRS). Historically, the group has conducted research on modeling and adapting robust, real-world applications from different domains (physics, mathematics, geology, image processing, biology, aerospace, and many others) to run efficiently on High-Performance Computing (HPC) architectures, such as clusters.
In the last decade, new parallelism abstractions have been created through domain-specific languages (DSLs), libraries, and frameworks for the next generation of computer algorithms and architectures, such as embedded hardware and servers with accelerators like Graphics Processing Units (GPUs) or Field-Programmable Gate Arrays (FPGAs). This work has been applied to stream processing and data science-oriented applications. Concomitantly, since 2018, the group has conducted research that uses artificial intelligence to optimize applications in areas such as Medicine, Ecology, Industry, Agriculture, Education, and Smart Cities.
Research Lines
Applied Data Science
Parallelism Abstractions
The HSPA (High-level and Structured Parallelism Abstractions) research line aims to create programming interfaces for users and programmers who are not versed in the parallel programming paradigm. The idea is to offer a higher level of abstraction without compromising application performance. The interfaces developed in this research line target specific domains and can later be extended to other areas. The scope of the study is broad regarding the technologies used to develop the interfaces and to exploit parallelism.
Parallel Application Modeling
Team
Prof. Dr. Luiz Gustavo Leão Fernandes
General Coordinator
Prof. Dr. Dalvan Griebler
Research Coordinator
Last Papers
2022

Um Framework para Criar Benchmarks de Aplicações Paralelas de Stream. Inproceedings, Anais da XXII Escola Regional de Alto Desempenho da Região Sul, pp. 97–98, Sociedade Brasileira de Computação, Curitiba, Brazil, 2022.
Evaluating Micro-batch and Data Frequency for Stream Processing Applications on Multi-cores. Inproceedings, 30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 10–17, IEEE, Valladolid, Spain, 2022.
Self-adaptation on Parallel Stream Processing: A Systematic Review. Journal Article, Concurrency and Computation: Practice and Experience, 34 (6), pp. e6759, 2022.
OpenMP as runtime for providing high-level stream parallelism on multi-cores. Journal Article, The Journal of Supercomputing, pp. 7655–7676, 2022.
Avaliação da aplicação de paralelismo em classificadores taxonômicos usando Qiime2. Inproceedings, Anais da XXII Escola Regional de Alto Desempenho da Região Sul, pp. 25–28, Sociedade Brasileira de Computação (SBC), Porto Alegre, RS, Brasil, 2022.
Projects
Software
Last News
Contact us!