2023
Garcia, Adriano Marques; Griebler, Dalvan; Schepke, Claudio; Fernandes, Luiz Gustavo: Micro-batch and data frequency for stream processing on multi-cores. Journal Article, doi: 10.1007/s11227-022-05024-y. The Journal of Supercomputing, in press, pp. 1-39, Springer, 2023.
Abstract: Latency and throughput are often critical performance metrics in stream processing. An application's performance can fluctuate depending on the input stream, an unpredictability that stems from variations in data arrival frequency, data size, complexity, and other factors. Researchers are constantly investigating new ways to mitigate the impact of these variations on performance with self-adaptive techniques involving elasticity or micro-batching. However, there is a lack of benchmarks capable of creating test scenarios to further evaluate these techniques. This work extends and improves the SPBench benchmarking framework to support dynamic micro-batching and data stream frequency management. We also propose a set of algorithms that generates the frequency patterns most commonly used for benchmarking stream processing in related work, which allows the creation of a wide variety of test scenarios. To validate our solution, we use SPBench to create custom benchmarks and evaluate the impact of micro-batching and data stream frequency on the performance of Intel TBB and FastFlow, two libraries that exploit stream parallelism on multi-core architectures. Our results show that our test cases did not benefit from micro-batches on multi-cores. For different data stream frequency configurations, TBB delivered the lowest latency, while FastFlow achieved higher throughput in shorter pipelines.
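To make the micro-batching idea concrete, the sketch below groups stream items into fixed-size batches before they flow through a parallel pipeline, using the standard oneTBB parallel_pipeline API (one of the two libraries evaluated in the paper). It is a minimal illustration under stated assumptions: the item type, batch size, and dummy processing stage are hypothetical and do not reproduce the SPBench benchmarks.

// Minimal micro-batching sketch with oneTBB's parallel_pipeline.
// Assumption: items are plain integers and the middle stage does dummy work;
// the real SPBench benchmarks are not reproduced here.
#include <tbb/parallel_pipeline.h>
#include <vector>
#include <cstdio>

int main() {
    const std::size_t batch_size = 64;   // micro-batch size (tunable)
    const int total_items = 100000;
    int produced = 0;

    tbb::parallel_pipeline(
        /*max_number_of_live_tokens=*/8,
        // Source stage: emit one micro-batch per token instead of one item.
        tbb::make_filter<void, std::vector<int>>(
            tbb::filter_mode::serial_in_order,
            [&](tbb::flow_control& fc) -> std::vector<int> {
                if (produced >= total_items) { fc.stop(); return {}; }
                std::vector<int> batch;
                batch.reserve(batch_size);
                for (std::size_t i = 0; i < batch_size && produced < total_items; ++i)
                    batch.push_back(produced++);
                return batch;
            }) &
        // Middle stage: process a whole batch at once (replicated workers).
        tbb::make_filter<std::vector<int>, std::vector<int>>(
            tbb::filter_mode::parallel,
            [](std::vector<int> batch) {
                for (int& x : batch) x = x * 2 + 1;   // stand-in for real work
                return batch;
            }) &
        // Sink stage: consume batches in arrival order.
        tbb::make_filter<std::vector<int>, void>(
            tbb::filter_mode::serial_in_order,
            [](std::vector<int> batch) {
                std::printf("consumed batch of %zu items\n", batch.size());
            }));
    return 0;
}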
2022
Rockenbach, Dinei A.; Löff, Júnior; Araujo, Gabriell; Griebler, Dalvan; Fernandes, Luiz G.: High-Level Stream and Data Parallelism in C++ for GPUs. Inproceedings, doi: 10.1145/3561320.3561327. XXVI Brazilian Symposium on Programming Languages (SBLP), pp. 41-49, ACM, Uberlândia, Brazil, 2022.
Abstract: GPUs are massively parallel processors that make it possible to solve problems that are not viable on traditional processors such as CPUs. However, implementing applications for GPUs is challenging for programmers, since it requires parallel programming to exploit the GPU resources efficiently. In this sense, parallel programming abstractions, notably domain-specific languages, are fundamental for improving programmability. SPar is a high-level Domain-Specific Language (DSL) that allows expressing stream and data parallelism in serial code through annotations using C++ attributes. This work elaborates a methodology and tool for GPU code generation by introducing new attributes to the SPar language and new transformation rules to the SPar compiler. Besides gains in simplicity and code reduction compared to CUDA and OpenCL, these contributions enabled SPar to achieve higher throughput when exploiting combined CPU and GPU parallelism and when using batching.
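SPar expresses parallelism through C++ attribute annotations placed on sequential code. The sketch below illustrates that annotation style for a three-stage stream pipeline on multi-cores; the stage bodies and helper functions are hypothetical, and the new GPU-oriented attributes introduced in this paper are not shown. The attributes are only meaningful to the SPar source-to-source compiler; a regular C++ compiler ignores them (with warnings) and runs the code sequentially.

// Schematic sketch of SPar's annotation style for a 3-stage stream pipeline.
// Helpers stand in for a real source, operator, and sink (assumed, not from the paper).
#include <string>
#include <iostream>

static int counter = 0;
std::string read_item() { return counter++ < 10 ? "frame" + std::to_string(counter) : ""; }
std::string filter(const std::string& s) { return s + ":filtered"; }
void write_item(const std::string& s) { std::cout << s << "\n"; }

void pipeline() {
    [[spar::ToStream]]                     // marks the stream region
    while (true) {
        std::string item = read_item();    // first stage: read from the stream
        if (item.empty()) break;
        [[spar::Stage, spar::Input(item), spar::Output(item), spar::Replicate(4)]]
        {
            item = filter(item);           // middle stage, replicated 4 times
        }
        [[spar::Stage, spar::Input(item)]]
        {
            write_item(item);              // last stage: ordered sink
        }
    }
}

int main() { pipeline(); return 0; }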
Vogel, Adriano: Self-adaptive abstractions for efficient high-level parallel computing in multi-cores. PhD Thesis, School of Technology - PUCRS, Porto Alegre, Brazil, 2022.
Abstract: Nowadays, a significant part of computing systems and real-world applications demand parallelism to accelerate their executions. Although high-level and structured parallel programming aims to facilitate parallelism exploitation, there are still issues to be addressed in existing parallel programming abstractions. Usually, application developers still have to set non-intuitive or complex parallelism configurations. In this context, self-adaptation is a potential alternative for providing a higher level of autonomic abstraction and runtime responsiveness in parallel executions. However, a recurrent problem is that self-adaptation is still limited in terms of flexibility, efficiency, and abstractions. For instance, there is a lack of mechanisms for applying adaptation actions and of efficient decision-making strategies for deciding which configurations to enforce at run-time. In this work, we are interested in the abstractions achievable when self-adaptation transparently manages executions while the parallel programs are running (at run-time). Our main goals are to increase the adaptation space to be more representative of real-world applications and to make self-adaptation more efficient through comprehensive evaluation methodologies, providing use cases that demonstrate the true potential of self-adaptation. This doctoral dissertation therefore provides the following scientific contributions: I) a Systematic Literature Review (SLR) providing a taxonomy of the state of the art; II) a conceptual framework to support designing and abstracting the decision-making process within self-adaptive solutions, which is then employed in the technical contributions to make the solutions more modular and potentially generalizable; III) mechanisms and strategies for self-adaptive replicas in applications with single and multiple parallel stages, supporting multiple customizable non-functional requirements; IV) a mechanism, strategy, and optimizations for self-adaptation of parallel patterns and applications' graph topologies. We apply the proposed solutions to stream processing applications, a representative paradigm present in several real-world applications that compute data flowing in the form of streams (e.g., video feeds, images, and data analytics). Part of the proposed solutions is evaluated with SPar and another part with the FastFlow programming framework. The results demonstrate that self-adaptation can provide efficient parallelism abstractions and autonomous responsiveness at run-time while achieving performance competitive with the best static executions. Moreover, when appropriate, we compare against state-of-the-art solutions and demonstrate that our highly optimized decision-making strategies achieve significant performance and efficiency gains.
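The self-adaptive replicas described above rest on a monitor-analyze-act feedback loop. The sketch below is a generic, standalone illustration of such a loop adjusting the number of replicas toward a throughput target; the target value, bounds, and the simulated measure_throughput() are hypothetical and do not reproduce the dissertation's actual strategies inside SPar and FastFlow.

// Generic feedback-loop sketch for self-adaptive replicas: monitor throughput,
// compare it with a user-defined target, and add or remove replicas.
#include <algorithm>
#include <cstdio>

// Stand-in for a runtime query of items/s processed with `replicas` workers.
double measure_throughput(int replicas) {
    return 120.0 * replicas / (1.0 + 0.05 * replicas);  // diminishing returns
}

int main() {
    const double target = 900.0;   // desired items/s (non-functional requirement)
    const int min_replicas = 1, max_replicas = 16;
    int replicas = 1;

    for (int cycle = 0; cycle < 10; ++cycle) {            // adaptation cycles
        double observed = measure_throughput(replicas);   // monitor
        if (observed < 0.95 * target)                      // analyze: below target
            replicas = std::min(replicas + 1, max_replicas);   // act: add a replica
        else if (observed > 1.10 * target)                 // analyze: above target
            replicas = std::max(replicas - 1, min_replicas);   // act: remove a replica
        std::printf("cycle %d: %.1f items/s, next degree = %d replicas\n",
                    cycle, observed, replicas);
    }
    return 0;
}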
Vogel, Adriano: Self-adaptive abstractions for efficient high-level parallel computing in multi-cores. PhD Thesis, Computer Science Department - University of Pisa, Pisa, Italy, 2022. (The abstract is the same as in the PUCRS record above.)
Andrade, Gabriella; Griebler, Dalvan; Fernandes, Luiz Gustavo: Avaliação do Esforço de Programação em GPU: Estudo Piloto. Inproceedings, doi: 10.5753/eradrs.2022.19179. Anais da XXII Escola Regional de Alto Desempenho da Região Sul, pp. 95-96, Sociedade Brasileira de Computação, Curitiba, Brazil, 2022.
Abstract: Developing applications for GPUs is not an easy task, since it demands deeper knowledge of the architecture. In this work, we conducted a pilot study to assess the effort of non-expert programmers when developing GPU applications. The results showed that GSParLib requires less effort than the other parallel programming interfaces. However, further investigation is needed to complement the study.
Grupo de Modelagem de Aplicações Paralelas (Parallel Applications Modeling Group)
Over the last decade, the group has been creating new parallelism abstractions through domain-specific languages (DSLs), libraries, and frameworks for the next generation of algorithms and computer architectures, such as servers with accelerators (Graphics Processing Units, GPUs, or Field-Programmable Gate Arrays, FPGAs) and embedded hardware. This work has been applied to stream processing and data science applications. In parallel, since 2018, the group has been conducting research that uses artificial intelligence to optimize applications in areas such as Medicine, Ecology, Industry, Agriculture, Education, and Smart Cities, among others.
Research Lines
Applied Data Science
Parallelism Abstractions
Parallel Applications Modeling
Team
Prof. Dr. Luiz Gustavo Fernandes
General Coordinator
Prof. Dr. Dalvan Griebler
Research Coordinator
Latest Publications
2023
Garcia, Adriano Marques; Griebler, Dalvan; Schepke, Claudio; Fernandes, Luiz Gustavo: Micro-batch and data frequency for stream processing on multi-cores. Journal Article, The Journal of Supercomputing, in press, pp. 1-39, 2023.
2022
Rockenbach, Dinei A.; Löff, Júnior; Araujo, Gabriell; Griebler, Dalvan; Fernandes, Luiz G.: High-Level Stream and Data Parallelism in C++ for GPUs. Inproceedings, XXVI Brazilian Symposium on Programming Languages (SBLP), pp. 41-49, ACM, Uberlândia, Brazil, 2022.
Vogel, Adriano: Self-adaptive abstractions for efficient high-level parallel computing in multi-cores. PhD Thesis, School of Technology - PUCRS, 2022.
Vogel, Adriano: Self-adaptive abstractions for efficient high-level parallel computing in multi-cores. PhD Thesis, Computer Science Department - University of Pisa, 2022.
Andrade, Gabriella; Griebler, Dalvan; Fernandes, Luiz Gustavo: Avaliação do Esforço de Programação em GPU: Estudo Piloto. Inproceedings, Anais da XXII Escola Regional de Alto Desempenho da Região Sul, pp. 95-96, Sociedade Brasileira de Computação, Curitiba, Brazil, 2022.
Projects
Software
Latest News
Contact us!
Or feel free to use the form alongside to contact us.