Modern scientific experiments often involve multiple storage and computing platforms, software tools, and analysis scripts. The resulting heterogeneous environments make data management operations challenging: the large number of events and the absence of data integration make it difficult to track data provenance, manage sophisticated analysis processes, and recover from unexpected situations. Current approaches often require costly human intervention and are inherently error-prone.
The difficulties inherent in managing and manipulating such large, highly distributed datasets also limit automated sharing and collaboration. We study a real-world e-Science application involving terabytes of data, three different analysis and storage platforms, and a number of applications and analysis processes. We demonstrate that, using a specialized data life cycle and programming model, Active Data, we can easily implement global progress monitoring and sharing, recover from unexpected events, and automate a range of tasks.
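The core idea of a data life cycle programming model can be illustrated with a minimal sketch: a data item moves through explicit life-cycle states, and clients attach handlers that fire on transitions, which is what enables progress monitoring and automated reactions. All names below (`LifeCycle`, `on`, `advance`) are hypothetical illustrations under our assumptions, not the actual Active Data API.

```python
# Minimal sketch of an event-driven data life-cycle model in the spirit of
# Active Data. The class and method names are hypothetical, for illustration.

class LifeCycle:
    """A data item's life cycle: states, allowed transitions, and
    user handlers attached to transitions."""

    def __init__(self, transitions):
        self.transitions = transitions      # {state: {allowed next states}}
        self.state = "created"
        self.handlers = {}                  # {(src, dst): [callable, ...]}

    def on(self, src, dst, handler):
        """Register a handler to run when the item moves from src to dst."""
        self.handlers.setdefault((src, dst), []).append(handler)

    def advance(self, dst, item):
        """Fire the transition to dst, running all registered handlers."""
        if dst not in self.transitions.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state} -> {dst}")
        for handler in self.handlers.get((self.state, dst), []):
            handler(item)
        self.state = dst

# Example: monitor progress and automate a cleanup step for one data item.
lc = LifeCycle({"created": {"transferred"},
                "transferred": {"analyzed"},
                "analyzed": {"deleted"}})
log = []
lc.on("created", "transferred", lambda item: log.append(f"{item} transferred"))
lc.on("analyzed", "deleted", lambda item: log.append(f"{item} cleaned up"))

lc.advance("transferred", "file-42")
lc.advance("analyzed", "file-42")
lc.advance("deleted", "file-42")
# log now records the transfer and the cleanup for file-42
```

Because every platform-level event is expressed as a transition on a shared model, the same mechanism supports global monitoring, sharing, and recovery hooks without per-platform glue code.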