Kavouras and Kokla (2007) state that
“interoperability is the ability of systems or products to operate effectively and efficiently in conjunction, on the exchange and reuse of available resources, services, procedures, and information, in order to fulfil the requirements of a specific task. […] It is not exhausted with integration, but also involves means of intelligent communication such as querying, extraction, transformation etc.”
The European Interoperability Framework (EU, 2017), intended to provide recommendations and guidance to support a shared and interoperable digital environment for communication and data exchange among public administrations in Europe, identifies four interoperability layers: legal, organisational, semantic and technical.
For the European Interoperability Framework, ‘Interoperability’ is “the ability of organisations to interact towards mutually beneficial goals, involving the sharing of information and knowledge between these organisations, through the business processes they support, by means of the exchange of data between their ICT systems” (EU, 2017).
Simplifying the concept, interoperability can be considered a characteristic of single datasets, determining their reusability across systems (e.g., for technical interoperability, their potential for being consistently imported and exported by software) (Noardo et al., 2021a,b).
Semantic interoperability, although mainly a technical issue, is closely related to and strongly influences the human side of interoperability as well, since it concerns data interpretation and description for reuse.
Several frameworks define guidance on how to manage data in order to support an enhanced concept of interoperability that also includes accessibility, findability and reuse. Some relevant examples are reported here: the European Interoperability Framework principles and recommendations (Table 1); the Data Management Principles provided by the Group on Earth Observations System of Systems (GEOSS), aimed at good management of Earth Observation data (Table 2); and the Findability, Accessibility, Interoperability and Reusability (FAIR) principles, intended to support good sharing of data in science (Wilkinson et al., 2016) (Table 3).
Integration is the combination or conflation of information from different datasets (Worboys & Duckham, 2004).
Figure 1 depicts what the two concepts of ‘interoperability’ and ‘integration’ entail and how they are related to each other.
Figure 1. Data interoperability vs. data integration. (Noardo, 2022)
Noardo (2022) summarizes a workflow for an effective methodology for data integration as in Figure 2. It starts with the definition of requirements, goes through data retrieval and possible processing to harmonise the different sources, data merging, and validation of the data, and reaches the final phase: the update of metadata after data merging.
Figure 2. A workflow for suitable data integration.
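The workflow above can be sketched in code. The following minimal Python sketch is illustrative only: the function names, the toy dataset structure, and the attribute-mapping requirements are assumptions made for the example, not part of Noardo (2022).

```python
# Illustrative sketch of the integration workflow: retrieval,
# harmonisation, merging, validation, metadata/provenance update.
# All names and the dataset structure are invented for this example.

def retrieve(source):
    # In practice: download or query the dataset; here it is passed in directly.
    return source

def harmonise(dataset, requirements):
    # Rename attributes so every source matches the required target schema.
    mapping = requirements["attribute_mapping"].get(dataset["id"], {})
    dataset["records"] = [
        {mapping.get(k, k): v for k, v in rec.items()}
        for rec in dataset["records"]
    ]
    return dataset

def merge(datasets):
    # Combine the harmonised sources into a single dataset.
    return {"records": [r for d in datasets for r in d["records"]],
            "metadata": {}}

def validate(dataset, requirements):
    # Check that every record carries the required attributes.
    required = set(requirements["required_attributes"])
    return all(required <= set(rec) for rec in dataset["records"])

def integrate(sources, requirements):
    harmonised = [harmonise(retrieve(s), requirements) for s in sources]
    merged = merge(harmonised)
    if not validate(merged, requirements):
        raise ValueError("integrated data do not meet the requirements")
    # Final step: record provenance in the metadata of the merged dataset.
    merged["metadata"]["provenance"] = [s["id"] for s in sources]
    return merged
```

For instance, a cadastral dataset using the attribute `bldg_height` and a survey dataset using `height` can be harmonised onto a common schema before merging, with the provenance of both sources recorded in the result.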
An effective data integration can only be planned and obtained when a specific use case is considered. Therefore, the essential starting point is the definition of the requirements for the data to be obtained after the integration.
After a positive assessment of data integrability, harmonization actions must be chosen and applied for each considered aspect, among which:
After proper harmonisation processing is applied to the origin datasets to make them fully consistent with the destination data requirements, data fusion operations allow obtaining the integrated dataset. Data validation and the update of metadata to keep track of the applied processing, through proper provenance tracking, are the final steps.
Defining data requirements is the process of identifying, prioritizing, precisely formulating and validating the data necessary to achieve specific business objectives.
The definition of data requirements is essential to enable any interoperable process: it supports transparency and allows proper and effective data retrieval, reuse and processing, including multisource data integration.
Several parameters have to be considered, related to the different aspects of data. Such data characteristics should always be explicitly prescribed by data requirements in the case of data acquisition, modelling, or harvesting, as well as described in metadata. Noardo (2022) summarises parameters that were defined in the literature (Doerr, 2004; Kavouras & Kokla, 2007; Worboys & Duckham, 2004) and that are recognised as increasingly important for enabling interoperability ecosystems, such as data spaces and digital twins (DSSC, 2024).
Figure 3 summarizes the parameters to be considered for data to be reciprocally integrated. In particular, the objective parameters in the picture, which are further specified in Figure 4, are the features of data to be considered as data requirements.
Figure 3. Parameters to be considered to assess and prepare effective data integration (Noardo, 2022)
Figure 4. Synthesis of parameters for data integration potential assessment
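Such parameters can be made machine-checkable by encoding them as an explicit specification against which a dataset's metadata is tested. The sketch below is hypothetical: the parameter names (`crs`, `max_positional_error_m`, etc.) and the metadata structure are invented for illustration and do not come from the cited figures.

```python
# Hypothetical sketch: a few data-requirement parameters expressed as a
# machine-checkable specification. All names are illustrative.

REQUIREMENTS = {
    "crs": "EPSG:4326",                             # coordinate reference system
    "format": "GeoJSON",                            # syntactical level
    "max_positional_error_m": 1.0,                  # geometric accuracy
    "required_attributes": {"height", "function"},  # semantic/structural level
}

def check_dataset(metadata, requirements):
    """Return a list of requirement violations found in a dataset's metadata."""
    issues = []
    if metadata.get("crs") != requirements["crs"]:
        issues.append("CRS mismatch")
    if metadata.get("format") != requirements["format"]:
        issues.append("unexpected format")
    if metadata.get("positional_error_m", float("inf")) > \
            requirements["max_positional_error_m"]:
        issues.append("positional accuracy insufficient")
    missing = requirements["required_attributes"] - set(metadata.get("attributes", []))
    if missing:
        issues.append(f"missing attributes: {sorted(missing)}")
    return issues
```

A dataset whose metadata satisfies every parameter yields an empty list of issues; otherwise, each violation is reported explicitly, which supports the transparency that data requirements are meant to provide.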
Besides the geometrical parameters, which are essential although often underestimated, resources on spatial data management and integration distinguish between the “semantic” level (i.e., differences in conceptualization and definition, including terms used, specific meaning and classifications); the “structural” or “schematic” level (i.e., the conceptual model or schema structuring the data, relations between entities and attributes, relationships, and hierarchies); and the “syntactical” level (i.e., the format of the data) (Noardo, 2022; Doerr, 2004; Kavouras & Kokla, 2007; Worboys & Duckham, 2004; Mohammadi et al., 2006).
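The three levels can be made concrete with a small example of harmonising a record onto a target schema: a term mapping addresses the semantic level, restructuring a nested schema addresses the structural level, and serialisation to a common format addresses the syntactical level. The vocabularies and field names below are invented for the example.

```python
# Illustrative sketch of harmonisation at the three levels named above.
# Vocabularies and field names are assumptions made for this example.
import json

# Semantic level: map source terms to the target classification.
TERM_MAP = {
    "dwelling": "residential_building",
    "office": "non_residential_building",
}

def harmonise_record(record):
    # Structural level: flatten the nested source schema into the target schema.
    return {
        "id": record["id"],
        "class": TERM_MAP[record["properties"]["type"]],  # semantic mapping
        "height": record["properties"]["height_m"],
    }

def to_target_format(records):
    # Syntactical level: serialise to the target exchange format (JSON here).
    return json.dumps({"features": records})
```

A source record classified as `dwelling` in a nested schema is thus emitted as a flat `residential_building` feature in the target JSON structure; in real settings the same separation of concerns applies to richer vocabularies, schemas and formats.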
Some standards propose guidelines to define data requirements. In the building and civil engineering works domain, the concept of Level of Information Need is defined by ISO 19650-1:2018 for information stored in BIM. buildingSMART has defined the Information Delivery Specification (IDS) standard to express exchange requirements in a computer-interpretable format, enabling the definition of Industry Foundation Classes-based data requirements and supporting data validation.
In the geospatial domain, data requirements are equally important and depend on the use case for which data are intended (Malinowski & Zimányi, 2006). ISO 19131 “Data product specification” provides a further reference for geographic data products in particular.
Some tools are available or currently under development to support the definition of data requirements in different fields and to help users refer to standards consistently and comprehensively when defining such requirements.
G-reqs (Geospatial in-situ requirements) is a tool developed within the European project InCase to define data requirements related to in-situ Earth Observation data.
The OGC Data Exchange Toolkit is being developed to facilitate the definition of data requirements for 3D city models and, more generally, for data structured according to standard data models and semantics. It leverages semantic technologies and enables automatic reference, through unique identifiers, to standards and standard components, as well as data validation against the defined profile (Noardo et al., 2023).