From Big to Meta Data


Summary of Abstract:

Data objects, up through entire research fields, appear to be exploding and scattering rather than converging and rising to the level of metadata, and hence of discovery.

Data processing and interdisciplinary human collaboration both need to become much more interactive.

We propose a global interdisciplinary experiment with predictive models of research-field behavior, following the 'Discinnet process', by:

- First discussing the conditions and limits of Big-to-Meta data transitions for interdisciplinary scientific discovery, from the 'General epistemology challenge' down to complexity classes and barriers;

- Then referring to proven models, from both the physics and computational-complexity domains, that show how these barriers may be overcome, and that suggest the type of automated-system / human-interaction organization presented afterwards;

- Finally giving a demonstration and offering the process for appropriation and experimentation across a statistically large enough network of research fields, to test the ability to predict their behavior, trajectories and outcomes, and hence to frame the uncertainty of research-to-technology transitions and therefore their cost.

- In summary, the experiment pursues several goals: 'downward', checking research-field trajectory models (necessarily interdisciplinary) against efficient organizations that frame and reduce uncertainty about research outcomes; and 'upward', testing a diversity of epistemological models (necessarily general), including interpretations of quantum measurement and of the nature of data.

Practically, the experiment starts with limited geographic areas and networks freely appropriating the Discinnet process and tool, but it is highly desirable that most content eventually be shared at global scale.

Abstract: For the primordial goals of disaster planning, response management and awareness, 'Big' (very large amounts of) data are increasingly available from systematic scrutiny, moving toward automated higher-level interpretation through ever more powerful survey and detection instruments, storage capacity and computational resources. This paper demonstrates which types of organization and process may achieve such interpretation in a sustainable and increasingly systematic manner, while others would instead consume resources at a rate incommensurate with the targeted results, and would thus ultimately fail.

Signals about these trends are contradictory, if only because the essential role of the human in the final interpretative step seems impossible to measure, although we will propose a quantification of this most obscure level. Human intervention will therefore remain compulsory, yet it can be minimized and focused, within the organization, on the otherwise irreducible gaps between systematic analysis and the highest semantic complexity of the desired metadata level, i.e. forecast decision problems.

In the first section, we observe that 'Big' data are often considered from a volumetric perspective rather than in terms of the Space and Time hardness classes (using their computational rather than SI definitions) delineated by complexity theory. This means that the ladder from Big up to Meta data, or equivalently the polynomial-to-exponential Space and Time complexity hierarchies, matters more than sheer 'Big' volumes themselves. We will then explain the proposed equivalence between the two, which is not a priori obvious, since the former involves a diversity of object types and related dimensionality, up to and including subjects, while the latter is wholly formal. Indeed, relating Big data to the most basic objects and signals, all the more numerous for being further reduced and split, hence more objective and thus bearers of valid conclusions, seems compelling. Yet the trend toward object-oriented simplicity and easy categorization runs opposite to the converse reconstruction of metadata. We will detail the main TIME and SPACE complexity classes, and in particular the k-SAT ones, where Big-to-Meta data construction may be modeled using results such as the behavior of the ratio M/N of Boolean clauses to variables as N → ∞.
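The k-SAT threshold behavior invoked above can be observed empirically even at small scale. The sketch below (an illustration added here, not material from the paper; all function names are hypothetical) generates random 3-SAT instances and brute-forces satisfiability for a few values of the clause-to-variable ratio M/N; as M/N crosses the known 3-SAT threshold near 4.27, the fraction of satisfiable instances drops sharply.

```python
import itertools
import random

def random_3sat(n_vars, n_clauses, rng):
    """Random 3-SAT instance: each clause picks 3 distinct variables,
    each literal negated with probability 1/2.
    A clause is a list of (variable_index, is_negated) pairs."""
    return [[(v, rng.random() < 0.5) for v in rng.sample(range(n_vars), 3)]
            for _ in range(n_clauses)]

def is_satisfiable(n_vars, clauses):
    """Brute-force check: exponential in n_vars, fine for small instances."""
    for assignment in itertools.product([False, True], repeat=n_vars):
        # A literal (v, neg) is true when assignment[v] differs from neg.
        if all(any(assignment[v] != neg for v, neg in clause)
               for clause in clauses):
            return True
    return False

def sat_fraction(n_vars, ratio, trials, seed=0):
    """Estimate P(satisfiable) for random instances with M/N = ratio."""
    rng = random.Random(seed)
    m = round(ratio * n_vars)
    hits = sum(is_satisfiable(n_vars, random_3sat(n_vars, m, rng))
               for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    # Below the ~4.27 threshold almost all instances are SAT; above, almost none.
    for ratio in (2.0, 4.27, 6.0):
        print(f"M/N = {ratio:>4}: P(SAT) ~ {sat_fraction(12, ratio, 40):.2f}")
```

The sharpening of this transition as N grows is the phenomenon the paper refers to; with only 12 variables the crossover is visible but smooth.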

In the second section we will propose another equivalence, between past- and future-oriented data and complexity, which depends on a collaborative semantic capacity to project into, and back from, the objective dimensionality mentioned above: from the project-driven, goal-oriented, future-facing and constructive levels, which carry the most semantic significance, down to the big objective data levels. We will use experimental results from physics to relate the computational experiments of the previous section to this second equivalence, and then to widen the clustering phenomena observed in the former, modeled in referenced papers at the crossroads of computational complexity and physics (k-SAT phenomenology), and conversely for the latter (quantum measurements).

From there we will derive and present the collaborative processes deemed optimal for such an ambition, proposed for instance to be widely tested through experimental platforms such as the 'Discinnet process', and through others still to be designed, in order to cover a diversity of problem types. We conclude, from these congruent theoretical and experimental results, that integrating such a series of collaborative platforms may yield the desired optimal large-scale and versatile organization.


Journeau, P.; From Big to Meta Data; ESIP Commons Summer Meeting 2013, June 2013.