Effective keyword search for software resources installed in large-scale Grid infrastructures

George Pallis*, Asterios Katsifodimos, Marios D. Dikaiakos

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

4 Citations (Scopus)

Abstract

In this paper, we investigate the problem of supporting keyword-based searching for the discovery of software resources that are installed on the nodes of large-scale, federated Grid computing infrastructures. We address a number of challenges that arise from the unstructured nature of software and the unavailability of software-related metadata on Grid sites. We present Minersoft, a Grid harvester that visits Grid sites, crawls their file-systems, identifies and classifies software resources, and discovers implicit associations between them. The results of Minersoft harvesting are encoded in a weighted, typed graph, named the Software Graph. A number of IR algorithms are used to enrich this graph with structural and content associations, to annotate software resources with keywords, and to build inverted indexes to support keyword-based searching for software. Using a real testbed, we present an evaluation study of our approach, using data extracted from a production-quality Grid infrastructure. Experimental results show that our approach achieves high search efficiency.

Original language: English
Title of host publication: Proceedings - 2009 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2009
Pages: 482-489
Number of pages: 8
Volume: 1
Publication status: Published - 2009
Externally published: Yes
Event: 2009 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2009 - Milano, Italy
Duration: 15 Sept 2009 → 18 Sept 2009

Conference

Conference: 2009 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2009
Country/Territory: Italy
City: Milano
Period: 15/09/09 → 18/09/09
