Abstract

We present a method for efficient access to parts of remote files. The efficiency comes from a compact pattern description that is independent of the file format and allows several parts of a file to be requested in a single operation. Compared to common solutions, this drastically reduces the number of remote operations and the network latency they incur. We measured the time to access parts of remote files with compact patterns, compared it with normal GridFTP remote partial file access, and observed a significant performance increase. In the paper mentioned below, we also discuss how the presented pattern access can be used for an efficient read from multiple replicas and how it can be integrated into a data management system to support the storage of partial replicas for large-scale simulations.

Nested FALLS

Line Segments

The basic building block of FALLS is the line segment (LS). A line segment (l, r) is a contiguous part of a byte array, defined by the offset of its left-most byte (l) and the offset of its right-most byte (r). The first four highlighted boxes in Fig. 1 can be described by the line segment (3, 6).

Family of Line Segments

A FAmiLy of Line Segments (FALLS) represents a set of equally spaced and equally sized line segments. A FALLS is defined by a 4-tuple (l, r, s, n), where l and r define the first line segment, s is the stride between two consecutive line segments, and n is the number of repetitions. In Fig. 1, the highlighted boxes can be described by (3, 6, 7, 4): (3, 6) describes the first line segment, 7 is the stride, and 4 specifies how often this pattern is repeated.

Nested Family of Line Segments

A nested family of line segments (nested FALLS) is defined using FALLS. A line segment selects a contiguous part of a byte array, and the result is again a byte array. We can therefore apply one FALLS to a byte array and then apply another FALLS to each element of the result. The result is always a set of line segments.
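
The definitions above can be sketched in a few lines of Python. This is only an illustration, not the implementation used by the software: the names `Falls` and `expand` are hypothetical, the stride is assumed to be measured from the start of one line segment to the start of the next, and the offsets of an inner pattern are assumed to be relative to the first byte of each outer line segment.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Falls:
    """A FALLS (l, r, s, n) with an optional inner pattern (hypothetical model)."""
    l: int  # offset of the left-most byte of the first line segment
    r: int  # offset of the right-most byte of the first line segment (inclusive)
    s: int  # stride, assumed to be measured start-to-start
    n: int  # number of line segments in the family
    inner: Optional["Falls"] = None  # pattern applied inside each line segment


def expand(p: Falls, base: int = 0) -> List[Tuple[int, int]]:
    """Return the absolute, inclusive (left, right) line segments selected by p."""
    segments = []
    for i in range(p.n):
        left = base + p.l + i * p.s
        right = base + p.r + i * p.s
        if p.inner is None:
            segments.append((left, right))
        else:
            # Assumption: inner offsets are relative to the start of this
            # segment, and the inner pattern fits inside the segment.
            segments.extend(expand(p.inner, base=left))
    return segments


# The pattern (3, 6, 7, 4) used for Fig. 1.
print(expand(Falls(3, 6, 7, 4)))
# [(3, 6), (10, 13), (17, 20), (24, 27)]

# Two 8-byte blocks, 16 bytes apart, with two 2-byte pieces taken from each.
print(expand(Falls(0, 7, 16, 2, Falls(0, 1, 4, 2))))
# [(0, 1), (4, 5), (16, 17), (20, 21)]
```

Under these assumptions, a single tuple stands in for every line segment it expands to, which is what allows many parts of a file to be requested in one operation.
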
Nested FALLS are defined recursively: the first four parameters have the same meaning as for FALLS, whereas the last parameter is either a FALLS or another nested FALLS, for example (l_0, r_0, s_0, n_0, (l_1, r_1, s_1, n_1)).

Example

This software is well suited for extracting regular subsets of remote data. For Fig. 2, we extracted several subsets of a 512x512x512 dataset (2 GB) stored in Amsterdam while the viewer ran in Berlin.

Documents

T. Schütt, A. Merzky, A. Hutanu, F. Schintke: Remote Partial File Access Using Compact Pattern Descriptions. Proceedings of CCGrid 2004, April 2004.
Figure 1. Family of line segments (FALLS), described by the pattern (3, 6, 7, 4).
Figure 2. (1) Full-resolution 3D image of the brain of a honey bee, (2) lower right corner, (3) same data at 1/4 resolution, and (4) at 1/16 resolution.
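
As an illustration of how a regular subset such as the corner in Fig. 2 (2) could be written as a nested FALLS, the sketch below builds a three-level pattern for a rectangular sub-block of a 3D array. It rests on assumptions not stated on this page: the array is stored in row-major order (x varying fastest), inner offsets are relative to the start of each outer line segment, and strides are measured start-to-start. The helpers `expand` and `subblock_pattern` are hypothetical names, and the element size is a free parameter.

```python
def expand(pattern, base=0):
    """Expand a (nested) FALLS tuple into absolute, inclusive (left, right) pairs."""
    l, r, s, n = pattern[:4]
    inner = pattern[4] if len(pattern) > 4 else None
    segments = []
    for i in range(n):
        left = base + l + i * s
        if inner is None:
            segments.append((left, base + r + i * s))
        else:
            # Assumption: inner offsets are relative to this segment's start.
            segments.extend(expand(inner, left))
    return segments


def subblock_pattern(dims, elem_size, start, stop):
    """Nested FALLS for the block start <= (z, y, x) < stop of a row-major array."""
    nz, ny, nx = dims
    (z0, y0, x0), (z1, y1, x1) = start, stop
    row = nx * elem_size   # bytes per row along x
    plane = ny * row       # bytes per z-plane
    # Innermost level: the wanted bytes of one row (a single line segment).
    x_falls = (x0 * elem_size, x1 * elem_size - 1, row, 1)
    # Middle level: one full row per selected y, refined by x_falls.
    y_falls = (y0 * row, (y0 + 1) * row - 1, row, y1 - y0, x_falls)
    # Outer level: one full plane per selected z, refined by y_falls.
    return (z0 * plane, (z0 + 1) * plane - 1, plane, z1 - z0, y_falls)


# A 2x2x2 corner of a 4x4x4 array with 1-byte elements.
corner = subblock_pattern((4, 4, 4), 1, (2, 2, 2), (4, 4, 4))
print(corner)
# (32, 47, 16, 2, (8, 11, 4, 2, (2, 3, 4, 1)))
print(expand(corner))
# [(42, 43), (46, 47), (58, 59), (62, 63)]
```

Under the same layout assumption, a 256x256x256 corner of a 512x512x512 volume consists of 256 * 256 = 65,536 separate line segments, all of which are covered by one three-level pattern of this form.
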