
Grid Computing
Grid projects
The Distributed Computing Systems group in HRL is currently working on two Grid-related projects: an OGSA reporting service and an enhancement for GridFTP for efficient transfer of large data sets. In addition, we are part of the IBM IntraGrid - a geographically distributed supercomputer linking IBM research and development laboratories in the US, Israel, Switzerland, Japan, and England.
The Reporting Grid Service (ReGS)
The Globus Project has developed fundamental technologies needed to build computational grids: environments that enable sharing of widespread computational resources, which may be managed by different organizations. The Open Grid Services Architecture (OGSA), a joint effort between IBM and the Globus Project, is a proposed evolution of the Globus technology towards a system architecture that integrates Grid and Web services concepts and technologies into a notion of Grid Services.
IBM envisions OGSA as a means to enable standards-based heterogeneous distributed computing, as shown in the following figure. Heterogeneous platforms are virtualized by OGSA Meta-OS services, e.g., logging and clustering, on top of which higher-level functions and consolidated management middleware are defined.
ReGS is an OGSA-based Meta-OS Grid Service for logging, tracing and monitoring applications in a distributed, heterogeneous computer environment. It provides standard logging interfaces for use by other Grid Services and applications. It is also capable of virtualizing existing logging systems, including z/OS logging, NT events, and Unix syslogs. Filtered log messages are managed by ReGS for synchronous and/or asynchronous delivery to consumers. Filtering can be based on different criteria, e.g., message severity or application-specific criteria, and filter definitions are extensible. ReGS exploits OGSI interfaces and will be implemented on top of existing messaging systems (e.g., MQSI). ReGS is the plumbing behind more advanced functions such as end-to-end problem determination. Its architecture is depicted by the following figure:
A ReGS demo is part of the IBM ETTK (Emerging Technology ToolKit), formerly known as WSTK (Web Services ToolKit).
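The filtering and delivery model described for ReGS can be illustrated with a small sketch. This is not the ReGS or OGSI interface: the class `ReportingService`, the `LogMessage` fields, the numeric severity scale, and the helper filters are all hypothetical names chosen for illustration. The sketch only shows the idea of extensible filters feeding synchronous (callback) and asynchronous (queued) consumers.

```python
import queue
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class LogMessage:
    severity: int  # hypothetical scale: higher means more severe
    source: str    # name of the emitting application
    text: str


# A filter is any predicate over messages, so new criteria can be added freely.
Filter = Callable[[LogMessage], bool]


def min_severity(level: int) -> Filter:
    return lambda msg: msg.severity >= level


def from_source(name: str) -> Filter:
    return lambda msg: msg.source == name


def all_of(*filters: Filter) -> Filter:
    return lambda msg: all(f(msg) for f in filters)


class ReportingService:
    """Toy stand-in for a reporting service (assumed design, not the ReGS API)."""

    def __init__(self) -> None:
        self._sync: List[Tuple[Filter, Callable[[LogMessage], None]]] = []
        self._async: List[Tuple[Filter, "queue.Queue[LogMessage]"]] = []

    def subscribe_sync(self, flt: Filter, callback: Callable[[LogMessage], None]) -> None:
        # Synchronous consumers are called inside log().
        self._sync.append((flt, callback))

    def subscribe_async(self, flt: Filter) -> "queue.Queue[LogMessage]":
        # Asynchronous consumers drain their own queue whenever they like.
        inbox: "queue.Queue[LogMessage]" = queue.Queue()
        self._async.append((flt, inbox))
        return inbox

    def log(self, msg: LogMessage) -> None:
        for flt, callback in self._sync:
            if flt(msg):
                callback(msg)
        for flt, inbox in self._async:
            if flt(msg):
                inbox.put(msg)
```

A consumer interested only in severe database messages would, under these assumptions, call `subscribe_async(all_of(min_severity(3), from_source("db")))` and read from the returned queue at its own pace.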
DirectorY Network Analyser and MOver - DYNAMO
GridFTP is the protocol proposed for all data transfers on the Grid. It extends the standard FTP protocol with facilities such as multistreamed transfer, autotuning and Globus-based security.
DYNAMO is a tool built on top of GridFTP to address the problem of moving large data sets consisting of very large numbers of small objects. A particular instance of the problem that we address is moving a directory subtree of a file system. There are many reasons why one might need to move a directory of files. For example, given a utility model, a distributed enterprise (or even an SSP) might want to move the file data of a set of users from storage in one city to storage in another city. This could be done to take advantage of available capacity in the second city if there was insufficient capacity in the first city. Or it could be done to improve the end user’s experience if the network distance between the user and the second city was shorter. Another scenario might be that of replicating modified files to a remote site for purposes of disaster recovery.
DYNAMO directly leverages the work done on moving large data objects to move file systems consisting of large numbers of small files. In addition, it:
1) imposes constant memory overhead on the client and server systems.
2) is independent of the actual transfer mechanism used, so it can easily take advantage of advances in technologies for transferring large files. (It has the benefits of GridFTP, in particular with respect to parallel data transfer and restartability, as well as security, third-party control, etc.)
3) works well even for very large collections of very small files (DYNAMO outperforms GridFTP for this task, especially over a Metropolitan Area Network).
4) is a complete solution, i.e., by the end of the transfer the directory tree has been reconstructed at the server.
Experiments with DYNAMO transferring large data sets of files (e.g., 400KB from Toronto to NY) indicate a 40% improvement in transfer time over a simple tar(to file)-GridFTP-untar approach. Moreover, they indicate a 30% improvement over a tar(to pipe)-GridFTP-untar approach. A paper describing the DYNAMO protocol and performance results was presented at the GRID2002 workshop.
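For readers unfamiliar with the baseline, the tar(to pipe)-transfer-untar approach can be sketched as follows. This is not the DYNAMO protocol, and no GridFTP call is made: a local OS pipe stands in for the network transfer, and the function name `copy_tree_via_pipe` is hypothetical. The sketch shows why the pipe variant avoids writing an intermediate archive file: the archive is produced and consumed as a stream.

```python
import os
import tarfile
import threading


def copy_tree_via_pipe(src_dir: str, dst_dir: str) -> None:
    """Stream src_dir into dst_dir as a tar archive without a temporary file.

    The pipe is a local stand-in (assumption) for the transfer step.
    """
    read_fd, write_fd = os.pipe()

    def produce() -> None:
        # "w|" writes an uncompressed tar stream that never needs to seek.
        with os.fdopen(write_fd, "wb") as sink:
            with tarfile.open(fileobj=sink, mode="w|") as archive:
                for name in sorted(os.listdir(src_dir)):
                    archive.add(os.path.join(src_dir, name), arcname=name)

    producer = threading.Thread(target=produce)
    producer.start()

    # "r|" reads the stream sequentially and rebuilds the directory tree.
    with os.fdopen(read_fd, "rb") as source:
        with tarfile.open(fileobj=source, mode="r|") as archive:
            archive.extractall(dst_dir)

    producer.join()
```

Only pipe-sized buffers are held in memory at any time, regardless of how many files the tree contains. Error handling (for example, a failure on the producing side) is deliberately omitted from this sketch.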