12

DMGA: A generic brokering-based Data Mining Grid Architecture

Alberto Sánchez, María S. Pérez, Pierre Gueant, José M. Peña and Pilar Herrero

ABSTRACT

The concept of a data mining grid has become one of the most challenging topics in both the grid and data mining areas. Indeed, grid environments seem to be an answer to the great demand for computational power and data facilities required by current data mining applications.

The Data Mining Grid Architecture (DMGA) is a generic brokering-based architecture for deploying data mining services in a grid. This approach presents two different composition models: (i) horizontal composition, which offers workflow capabilities, and (ii) vertical composition, which increases the performance of inherently parallel data mining services. This scheme is especially significant for services that access a large volume of data, which can be distributed across diverse locations.

This chapter describes DMGA and addresses both kinds of composition. Additionally, two use cases are shown with the aim of demonstrating the advantages of combining service functionalities and processing distributed data in data-intensive applications.
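The two composition models can be illustrated with a minimal sketch. It is not DMGA's actual interface: the helpers `horizontal` and `vertical`, the toy services `clean`, `count_items` and `merge_counts`, and the use of local threads in place of grid nodes are all assumptions made for illustration. Horizontal composition is modelled as chaining services into a workflow, and vertical composition as running one service over several data partitions concurrently and merging the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def horizontal(*services):
    """Chain services so that each output feeds the next one (a workflow)."""
    def pipeline(data):
        return reduce(lambda acc, service: service(acc), services, data)
    return pipeline


def vertical(service, merge):
    """Apply one service to every data partition concurrently, then merge."""
    def parallel(partitions):
        # Threads stand in for grid nodes here; this is only an assumption.
        with ThreadPoolExecutor() as pool:
            partial_results = list(pool.map(service, partitions))
        return merge(partial_results)
    return parallel


# Hypothetical toy services, used only to exercise the two helpers.
def clean(records):
    return [r.strip().lower() for r in records if r.strip()]


def count_items(records):
    return Counter(records)


def merge_counts(counters):
    return sum(counters, Counter())


# Horizontal: clean, then count. Vertical: do that on each partition.
count_partition = horizontal(clean, count_items)
count_all = vertical(count_partition, merge_counts)

partitions = [["A ", "b", ""], ["a", "B", "c "]]
result = count_all(partitions)
```

In this sketch each partition could represent data stored at a different location, so only the small partial counts need to be brought together for the merge step.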

12.1 Introduction

Nowadays, the amount of information required by many business and scientific applications is unmanageable. Data mining has become a critical task for understanding these large volumes of data. This paradigm has evolved in several directions, trying to tackle the limitations of the original data mining systems. As ...
