Journal on Data Semantics 4(4): 231-267 (2015)
Abstract. Data-intensive ecosystems are conglomerations of data repositories surrounded by applications that depend on them for their operation. In this paper, we address the problem of performing what-if analysis for the evolution of the database part of a data-intensive ecosystem, in order to identify what other parts of an ecosystem are affected by a potential change in the database schema, and how will the ecosystem look like once the change has been performed, while, at the same time, retaining the ability to regulate the flow of events. We model the ecosystem as a graph, uniformly covering relations, views and queries as nodes and their internal structure and interdependencies as the edges of the graph. We provide a simple language to annotate the modules of the graph with policies for their response to evolutionary events in order to regulate the flow of events and their impact by (i) vetoing ("blocking") the change in parts that the developers want to retain unaffected and (ii) allowing ("propagating") the change in parts that we need to adapt to the new schema. Our method for the automatic adaptation of ecosystems is based on three algorithms that automatically (i) assess the impact of a change, (ii) compute the need of different variants of an ecosystem's components, depending on policy conflicts, and (iii) rewrite the modules to adapt to the change. We theoretically prove the coverage of the language, as well as the termination, consistency and confluence of our algorithms, and experimentally verify our methods effectiveness and efficiency.