Agile Array Analytics: Why We Need Array Databases


Multidimensional arrays represent an information category which contributes substantially to today's "Big Data" challenge. 1D sensor timeseries, 2D satellite imagery, 3D x/y/t image timeseries and x/y/z exploration data, and 4D x/y/z/t climate and ocean data are just some examples of array data in the geosciences; further application domains include the life, space, and social sciences as well as business and engineering. Because conventional databases do not support large, multidimensional arrays, file-based ad-hoc implementations offering limited functionality have traditionally prevailed in the data archives and Web services holding such data. Array databases have set out to close this gap by extending standard databases with n-D array modeling and query support. Array DBMSs can contribute in particular to both scientific and industrial applications through their query flexibility, scalability, and information integration. In our talk we present the rasdaman ("raster data manager") Array DBMS, which we have been developing for several years. It is fully implemented and has been evaluated in a variety of relevant application fields. According to independent experts in the field, rasdaman is the most advanced Array DBMS today. We outline the conceptual model and architecture of rasdaman, highlight its query optimization potential, discuss the application domains investigated, and introduce a proposal for extending the ISO SQL standard with array capabilities in a natural way. We will present many real-life examples, which participants with online access can replay and vary.

Baumann, P.; Agile Array Analytics: Why We Need Array Databases; Summer Meeting 2013. ESIP Commons, April 2013