Projects

Exploiting Parallelism in Stream Processing Applications Relevant to the Technology Industry

Status : Ongoing
Nature : Research
Duration :
Institutions : PUCRS
Funders : FAPERGS

Many real-world applications can be characterized as a continuous processing flow: the so-called stream processing applications. They are present in several segments of the technology industry, contributing to innovative solutions that add value to products with a high impact on people's daily lives. In general, the applications that most need parallelism can be categorized as: I) image, video, and audio processing applications, which typically perform operations such as encoding, playback, filtering, and capture; II) backup, compression, and deduplication applications over data stored and received from different sources (sensors and mobile devices, patient and system monitoring, log records, etc.); and III) deep learning applications, which involve recognizing patterns and information in videos, audio, images, and other datasets. In this context, some applications require low latency and high throughput. In most stream applications, these requirements can only be met by exploiting parallelism. Therefore, this project aims to exploit parallelism in a set of real-world stream processing applications using the SPar domain-specific language. SPar was created by the GMAP research group at PUCRS to simplify the exploitation of parallelism in stream applications. Besides improving the performance of stream applications, this project will help validate SPar in a real-world setting.

An Annotation-Based DSL for MapReduce Applications on Multi-Core Systems

Status : Ongoing
Nature : Research
Duration :
Institutions : PUCRS
Funders : CNPq

Parallel programming has been a major challenge for many application developers. One way to exploit data parallelism is to use the MapReduce pattern, which gives rise to the term MapReduce applications. The popularity of this pattern has grown with the massive increase in data, and it has been widely used over the last decade, especially in distributed environments. More recently, MapReduce has also begun to be explored in shared-memory (multi-core) environments. The main challenge in this scenario is providing high-level abstractions for this kind of application. While current state-of-the-art solutions abstract details for the system programmer, application developers still struggle with details tied to parallel programming, such as implementing the MapReduce pattern itself, load balancing, and performance-oriented customizations for the target architecture. Another problem is the use of compiler directives or language extensions that are not part of the standard host language, since they have their own syntax that is unfamiliar to application developers. The goal of this project is to propose a different, specialized abstraction approach: the creation of an internal domain-specific language based on standard C++17 annotations, supporting MapReduce application developers with higher-level abstractions that are familiar to their development environment.

Evaluation of a Domain-Specific Language for Stream Parallelism

Status : Concluded
Nature : Research
Duration : 1 year
Institutions : PUCRS
Funders : FAPERGS

Stream-based applications have become widely used in recent years. Most of them require immediate responses and high throughput when processing stream elements. Examples include video and image processing, real-time data analytics, network applications, simulations, and reactive systems. In this context, we present SPar, a domain-specific language embedded in C++ for stream parallelism. It introduces a new approach that allows users with little parallel programming knowledge to annotate and identify the parallel region in sequential code using only five attributes and two types of annotations. In addition, the SPar compiler generates parallel code targeting multi-core systems. However, a broader evaluation of its performance on real applications is still needed, as well as software experiments measuring programming effort against state-of-the-art frameworks. Therefore, the goal of this project is to implement different kinds of applications and compare SPar's effort and performance with other frameworks. The expected contributions are: (I) new SPar performance results to optimize code generation; (II) information about the required effort in order to improve the programmer interface; (III) the discovery of new stream applications and SPar limitations.

A Compiler for Automatic Parallel Code Generation in MapReduce Applications

Status : Concluded
Nature : Research
Duration : 1 year
Institutions : PUCRS
Funders : FAPERGS

Aiming at performance, simplicity, and scalability in large-scale data processing, Google proposed the MapReduce parallel pattern. This pattern has been implemented in various ways for different architecture levels, achieving significant results in high-performance computing. However, developing optimized code with such solutions requires specialized knowledge of each solution's interface and programming language. In the master's thesis of a member of GMAP (Parallel Applications Modeling Group), a unified MapReduce programming interface was created, offering code transformation rules that start from a high-level language and generate optimized solutions for shared- and distributed-memory architectures. The proposed interface avoids performance losses, since code generated through the rules achieved results similar to manually parallelized applications, while reducing code size and programming effort and thus increasing developer productivity. This project aims to continue this research by developing a code generator that performs the transformations automatically, and by improving the interface through the study of other current tools such as Spark, Storm, and Flink.

Automatic Elasticity in MapReduce Applications on Cloud Computing

Status : In Progress
Nature : Research
Duration : 2016-2018
Institutions : PUCRS
Funders : PUCRS

With increasing access to the Internet, the amount of data produced has grown significantly, especially in the last 10 years. The Zettabyte Era, a survey conducted by Cisco, presents a study and projection of the data created on the Internet. It shows that in 1992, 100 GB of data were created per day; in 2013, this amount was around 28,875 GB per second; and the projection for 2018 is 50,000 GB per second. This growth in data production is already a mature concern in computing, so companies and several research groups have been working to create mechanisms to understand and extract more information from what is generated on the Internet. Currently, the MapReduce pattern is one of the mechanisms used for processing large volumes of data. In most cases, mechanisms such as MapReduce require massive computational power, so companies need to invest significant financial resources. To avoid such high costs, companies have a positive view of cloud computing. The cloud computing model allows provisioned computing resources to be selected by users according to their needs, and the cost of the service is charged according to use, following the demand of the application. An important feature of cloud environments is elasticity: the ability to increase or decrease resources according to the required demand. This context motivates the proposal of a mechanism that provides automatic runtime elasticity for MapReduce applications. Resources can then be dynamically provisioned according to the needs of the system/application, allowing infrastructure providers to offer more efficient processing services and enabling users to pay only for what was used. This will allow cloud customers to run their MapReduce applications without worrying about resource allocation, which will be handled by a middleware.
The proposed middleware analyzes the required resources and performs allocation automatically, without user intervention, simplifying not only software execution but also the development process, considering that customers currently need to use the OpenStack APIs to take advantage of the elasticity that clouds offer. The project proposes the creation of a middleware that provides automatic provisioning of computing resources (memory, storage, and processing) for MapReduce applications in the OpenStack cloud computing tool.

Comparative Performance Evaluation of a Pattern-Oriented Parallel Programming Interface

Status : Completed
Nature : Research
Duration : 2014 - 2015
Institutions : PUCRS
Funders : CAPES, FAPERGS

In order to teach parallel programming details and guide developers toward the use of patterns for modeling applications, a domain-specific language named DSL-POPP was created. Its main objective is to increase programmer productivity and reduce the effort of parallelizing applications for multi-core architectures. Another important factor is performance, which cannot be significantly compromised in this scenario. In this sense, this research conducted an evaluation and comparison of DSL-POPP's performance against traditional parallel programming interfaces for multi-core architectures, such as Pthreads, OpenMP, FastFlow, TBB, and Cilk. The interface was built to provide skeletons for the Master/Slave and Pipeline patterns: the user simply selects the appropriate pattern to model the application and fills it with sequential code. When the program is compiled with the DSL-POPP compiler, parallel code is automatically generated. The performance evaluation used applications from different domains; each was implemented from its sequential version and then run on a machine with a multi-core processor. The results showed that DSL-POPP had no significant losses, demonstrating that the code generation and the abstraction created are efficient. Even where small losses occurred, the outcome is positive, as previous effort evaluations demonstrated a significant gain.

High-Performance Parallel Programming for Cloud Computing Environments

Status : Completed
Nature : Research
Duration : 2014 - 2015
Institutions : PUCRS
Funders : CAPES, FAPERGS

The essential characteristics of cloud computing are virtualization and abstraction. Virtualization is the main enabler of cloud computing management and allows a single piece of hardware to run multiple operating systems simultaneously. There are several virtualization methods: emulation, full virtualization, paravirtualization, and operating-system-level virtualization. Building cloud computing environments makes it possible to form ecosystems that determine cloud service models, namely IaaS (Infrastructure as a Service), PaaS (Platform as a Service), and SaaS (Software as a Service). The project objectives are to address parallel programming in the cloud computing scenario and to check whether the performance of applications running on particular hardware is maintained when they migrate to a cloud environment. To this end, a private cloud was deployed with the OpenNebula tool on top of the KVM virtualizer, and virtual instances were created with the Ubuntu Server operating system. The NPB (NAS Parallel Benchmarks) suites were then executed, exploring shared-memory and distributed scenarios in both cloud and native environments. After collecting the execution logs, a statistical approach was used to analyze how significant the differences between environments were. The results showed that the performance of parallel applications is significantly affected when they are migrated to the cloud.

A Domain-Specific Language for Large-Scale Data Visualization

Status : In Progress
Nature : Research
Duration : 2015 - 2016 (12 months)
Institutions : PUCRS
Funders : CAPES, FAPERGS

In the last decade, data production around the world has grown exponentially. These data have very diverse origins, since the amount of data grows as more producing agents adopt technology. Examples of massive data producers are social networks, simulations, sensors, telephone call records, and customer activity information. Analyzing these data can provide information to build knowledge about trends and behaviors and assist decision-making in corporate, scientific, and academic environments. However, analyzing these data in raw form is complex, and in some cases impossible, due to their large volume. In this context, visualization techniques become an alternative to aid the perception of the data. Different display modes provide the analyst with a presentation of the data that favors perception, since graphics and images are used to represent it. It is well established that images and shapes facilitate human perception: image processing is performed in parallel by the human perceptual system, unlike data in text form, which is limited to a sequential reading process. Some visualizations have a complex creation process, since they require programming. When applied to a massive amount of data, a visualization adds the complexity of parallelization or optimization, both in its generation and in supporting interactions. This becomes a difficult task for users, for scientists from areas outside computing, and for those inexperienced in developing visualizations, as they have to worry about managing a large amount of data in addition to the visualization and the programming itself. Domain-Specific Languages (DSLs) are languages that aim to solve problems of a particular domain, and they have been used for the application and generation of visualization techniques.
Because they allow simplified creation, DSLs provide a high-level interface that abstracts programming details, and some include parallelism. This reduces the effort and time required to generate a visualization. However, there remains an open opportunity to create a DSL that meets the following requirements: a user-friendly interface for users without advanced programming knowledge; quick and easy creation of visualizations; and support for large amounts of data. This project aims to address the parallel processing of large amounts of data and the inclusion of different types of visualizations in the DSL. The goal is to identify and develop visualizations that can be added to the language by studying the available and most commonly used types. The validation of the proposed language will then make it possible to evaluate data processing time and programming effort.

DSL-POPP - Domain-Specific Language for Pattern-Oriented Parallel Programming

Status : Concluded
Nature : Research
Duration : 2012 - 2013
Institutions : PUCRS
Funders : CAPES, FAPERGS

This project aims to collaborate in the development of DSL-POPP, a Domain-Specific Language for Pattern-Oriented Parallel Programming, which offers the programmer a high-level interface. The main contribution is the evaluation of effort and performance in different synthetic applications. The results showed that DSL-POPP reduced the effort of parallelizing applications without compromising performance.

GREEN-GRID - Sustainable High Performance Computing

Status : Active
Nature : Research
Duration : 2010 - Current
Institutions : UFRGS, PUCRS, UFPEL, UFSM
Funders : CNPQ, FAPERGS

    The recent awareness of the energy cost of computational systems clashes with the growing demand for ever more powerful computational resources. On the one hand, different segments of the information society need automated treatment of their processes; on the other, increasing CPU clock frequencies is becoming infeasible because of the associated energy cost. The solution proposed by manufacturers is the introduction of parallelism within the CPUs themselves (multi-core chips, and soon many-core), as well as the use of volunteer computational resources (volunteer computing) or resources aggregated through grid computing. Both solutions maximize the utilization rate of existing hardware, increasing processing capacity without increasing the energy cost. This energy-conscious processing context is associated with Green Computing, or Sustainable Computing, as referred to in this text. The proposal presents a Sustainable Computing working group composed of four teams distributed across four higher-education institutions of the state of Rio Grande do Sul: the Federal University of Rio Grande do Sul, the Federal University of Pelotas, the Federal University of Santa Maria, and the Pontifical Catholic University of Rio Grande do Sul. The partners propose uniting their abilities to create a grid computing infrastructure for Sustainable Computing, based on the exploitation of underutilized resources (e.g., laboratories idle during night shifts) and on energy-efficient multi-core processors. The research will focus on issues related to supporting application execution on this infrastructure; there is also concern with developing support tools for building applications for this infrastructure, considering the requirements of sustainable computing.

High Performance Computing with Hybrid Programming for Real Applications

Status : Completed
Nature : Research
Duration : 2010 - Current
Institutions : PUCRS
Funders : PUCRS

Various fields of knowledge increasingly have applications that require quick responses drawn from the computation of a large volume of data. Examples include meteorology (weather forecasting), physics (simulations with high computational load), bioinformatics (data mining of protein chains), and computer graphics (representation of three-dimensional volumes). Parallel processing appears as a viable alternative to improve the performance of such applications, given the relatively small cost of acquiring a parallel environment with high computational power. This computational power increases significantly with the growing use of clusters of multiprocessor machines (hybrid architectures). However, conventional parallel programming techniques are not enough to extract the full processing power of this kind of architecture. In this scenario, the project is justified by the need to research new methodologies for properly programming high-performance hybrid environments, covering programming practices, analysis of target application characteristics, and ways to evaluate the quality of a hybrid implementation. Researchers from various fields of knowledge will benefit, since investigating and mastering all these aspects is a phase that usually consumes considerable time in implementations.

Solving Sparse Linear Systems in High Performance Environments with Verified Computing

Status : Active
Nature : Research
Duration : 2010 - Current
Institutions : PUCRS
Funders : CNPQ, FAPERGS

    This research proposal focuses on the use of high-performance computing techniques to accelerate the resolution of large sparse systems of linear equations, using verified computing to guarantee the correct numerical representation of the obtained results.

HPVC - High Performance Verified Computing

Status : Completed
Nature : International Cooperation
Duration : 2008 - 2009
Institutions : PUCRS, USP, University of Karlsruhe, University of Wuppertal
Funders : CAPES, DAAD

    In scientific and technological problems, reliability is becoming an increasingly important requirement. Extremely simple examples show that, due to the finite representation of numbers, numerical results using traditional floating-point arithmetic may be completely wrong. Interval mathematics makes it possible to capture uncertainties in the modeling and formulation of problems, in parameter estimation, in rounding, and in the interpretation of models. When solving systems of linear equations, control over the error generated in the computation is a prerequisite for obtaining correct results. Building solvers that treat large sparse matrices, representing a set of linear equations, with verified computing on high-performance platforms allows the resolution of problems arising from different areas of knowledge that require both high processing power and high accuracy in their results. Thus, high-performance computing with automated verification of results is a key tool for critical applications in fields such as space technology, bioinformatics, automotive engineering, or the validation of nuclear plants without real physical experiments. In order to address not only academic problems, away from the constraints imposed by industrial needs, software with high reliability and efficiency should be developed for parallel architectures. The main scientific result sought with this project is a methodology that allows solving large sparse linear equation systems with verified computation on high-performance platforms. This methodology represents an important tool for researchers from different scientific areas.

PROPRIO - Profiling for Optimized Parallel Ripping Orchestration

Status : Completed
Nature : Company Cooperation
Duration : 2008 - 2009
Institutions : PUCRS, HP Brazil, HP Laboratories Palo Alto, UFPE
Funders : HP Brazil

   The goal of this project is to investigate new optimization techniques to accelerate the ripping process of previously rendered documents, by developing strategies to allocate a set of RIPs based on profile analysis of jobs with PDF (Portable Document Format) content. The main difficulties are related to identifying a strategy that ensures job processing takes maximum advantage of the capacity of the distributed print environment through job scheduling policies. It is also necessary to ensure that job priorities are maintained and that the Print Service Provider has the flexibility to submit small jobs while larger jobs are being processed.

FOP/MAUI - High Performance in the Layout Composition for Variable Data Printers

Status : Completed
Nature : Company Cooperation
Duration : 2007 - 2007
Institutions : PUCRS, HP Brazil, HP Laboratories Palo Alto, UFPE
Funders : HP Brazil

   This project aims to investigate the use of high-performance techniques to accelerate the layout composition process for variable data printing. The biggest challenge of this research project is to produce a self-configurable tool capable of identifying the best environment for rendering a document according to the characteristics described in its format.

ADR-VDP - High Performance Techniques for the XSL-FO Documents Ripping to VDP

Status : Completed
Nature : Company Cooperation
Duration : 2006 - 2006
Institutions : PUCRS, HP Brazil, HP Laboratories Palo Alto, UFPE
Funders : HP Brazil

   The main objective of this project is the creation of a robust, portable, scalable tool with good usability for the parallel rendering of VDP documents in industrial printing environments (i.e., environments that require high throughput in rendering documents). The use of high-performance computing techniques within the context of Document Engineering can be considered a second innovative aspect of this research. The biggest challenge of this research project is to investigate and refine the optimizations to be made in the parallel FOP tool efficiently, so that the result is a finished, tested product with satisfactory performance.

GerpavGrid - Distributed System for Supporting Conservation and Maintenance of Public Road Pavements

Status : Completed
Nature : Development Research
Duration : 2005 - 2006
Institutions : PUCRS, PROCEMPA, UFCG
Funders : FINEP

  The Secretaria Municipal de Obras e Viação (SMOV), through its Supervision of Conservation of Urban Roadways, the agency responsible for the conservation and maintenance of public roads in the city, has faced in recent years an increasing need for updated methodologies and management systems to perform its institutional functions efficiently. This is due, for example, to the boom in demand for services, a consequence of accelerated urban expansion, and to the solutions adopted to respond to that expansion, such as the greater use of asphalt pavement. At the same time, the government is undergoing a strong retraction, grappling with massive budget constraints and with society's increasingly prominent demands for greater efficiency in the use of public resources. In this context emerged the demand for a system that monitors and supervises the use of public resources in the control of the urban road network. The target system of this project has a high computational demand, performing conservation indicator calculations on segments of a street axis called arcs. An arc is the segment between two points of discontinuity, interruption, or partition of a public way. There are approximately 32,000 registered arcs in Porto Alegre. Moreover, many simulations can be performed by projecting the conditions of these arcs over time (considering the evolution of traffic flow, weather, etc.) to allow a more efficient allocation of resources.

SDE-FED - Dynamic Simulation of Electrons in an FED Device using the Boundary Element Method

Status : Completed
Nature : Company Cooperation
Duration : 2005 - 2005
Institutions : PUCRS, HP Brazil, UNICAMP
Funders : HP Brazil

   This project aims to develop high-performance software to simulate electron dynamics in FED devices. The software has to be robust and provide data in good agreement with experimental results, in order to aid the understanding of desirable features for the final device as well as the modeling of new geometries for the construction of these devices.

CAP - Research and Development Center of Parallel Applications

Status : Completed
Nature : Company Cooperation
Duration : 2003 - 2006
Institutions : PUCRS, HP Brazil
Funders : HP Brazil

   The Research and Development Center of Parallel Applications (CAP - PUCRS/HP Brazil) is dedicated to the study of high-performance computing techniques and methodologies applied to the development of solutions for computationally intensive applications. The CAP project has two main research lines: (a) support techniques for parallel program design and (b) development of high-performance applications for distributed-memory architectures. (a) Support techniques for the design of parallel programs are not directly related to the parallel solution of a given problem, but they offer alternatives to simplify and optimize the process of developing parallel programs. In this scenario, the CAP team is interested in research topics such as: (i) analytical modeling of parallel programs using Stochastic Automata Networks (SAN); (ii) formal verification of parallel and distributed program properties using Object-Based Graph Grammars (GGBO); (iii) load-balancing algorithms for dynamic irregular applications over distributed-memory platforms; (iv) structural test methodologies for parallel programs. (b) The development of parallel applications for different categories of distributed-memory architectures, such as heterogeneous clusters or computational grids, is the other major research line of the CAP team. Our focus is the development of new high-performance algorithms and/or programs for scientific or industrial problems using the message-passing programming paradigm. Recently, the CAP group has been working on parallel solutions for the following applications: (i) a document rendering tool for high-speed printers (FOP - Formatting Objects Processor); (ii) visualization of medical data for image-based diagnosis; (iii) simulation of electron trajectories on Field Emission Displays.

GMAP © - 2018 - PUCRS - All Rights Reserved.