Eliciting Well-Formed Quality Indicators And Metadata In GEOSS Earth Observation Products
Masó Pau, Joan1; Sevillano, Eva2; Bastin, Lucy3; Blower, Jon4; Smeets, Joost5; Nust, Daniel6; Bigagli, Lorenzo7; Thum, Simon8; Guidetti, Veronica9; Evano, Pascal10; Alameh, Nadine11
1Grumets research group, CREAF, Universitat Autònoma de Barcelona, SPAIN; 2Grumets research group, Dep Geografia, Universitat Autònoma de Barcelona, SPAIN; 3Aston University, Birmingham, UNITED KINGDOM; 4Reading e-Science Centre, Environmental Systems Science Centre, University of Reading, UNITED KINGDOM; 5S&T Corporation, NETHERLANDS; 6Univ Munster, Inst Geoinformat, GERMANY; 7Institute of Atmospheric Pollution Research (CNR-IIA), ITALY; 8Fraunhofer Inst Graph Datenverarbeitung, GERMANY; 9ESA, ITALY; 10CEA CNRS, Lab Climate Sci & Environm, Joint Unit, FRANCE; 11Open Geospatial Consortium, UNITED STATES

With advances in satellite technologies and GIS software, creating and processing geospatial data has become simpler, and large volumes of data are being generated and accumulated. Dataset users therefore face a much greater choice when selecting a dataset for a given purpose, which lends increased importance to quality and to users' ability to compare the quality of candidate datasets (Delavar & Devillers, 2010). The Global Earth Observation System of Systems (GEOSS) is a distributed 'system of systems' being constructed by the Group on Earth Observations (GEO) to "provide decision support tools to a wide variety of users" (GEO, 2012). Given that GEOSS is estimated to contain more than 28 million dataset records (Diaz et al., 2012) and is constantly growing, quality assessment and comprehensive tools are needed to support data selection and decision-making. From the user perspective, the GEO Portal (www.geoportal.org) is currently the official entry point to the GEOSS Common Infrastructure (GCI): a set of geospatial catalogues aimed at aggregating all relevant Earth Observation (EO) data to serve its 9 Societal Benefit Areas (Agriculture, Biodiversity, Climate, Disaster response, Ecosystems, Energy, Health, Water, and Weather).
"QUAlity aware Visualisation for the Global Earth Observation system of systems" (GeoViQua) is an EU FP7-funded research project that is developing a quality framework, deploying architectural components and contributing to the design of quality-aware tools to be embedded in the GEO Portal. From an analysis of the quality elements contained in the GEOSS Clearinghouse (one of the unified metadata catalogues of the GCI) (Diaz et al., 2012), we concluded that quality measures are not rare in metadata documents, although they are not yet common enough. Moreover, the study corroborated that, where quality information is present, it is far from being well presented and disseminated to the user (e.g., for lack of tools).
GeoViQua has joined the efforts of the scientific community and the Quality Assurance Framework for Earth Observation (QA4EO) group to describe how to parameterize quality and uncertainty, by tackling spatialized quality indicators for continuous and categorical variables in the EO domain. To optimize software developers' interaction with data producers and to meet data user requirements, attention should be paid to fostering better comprehension, on the data producers' side, of a reduced list of quality indicators such as those provided in the forthcoming ISO 19157. Quantitative quality information describes the internal quality of datasets and is particularly important to expert users. In particular, beyond overall dataset indicators, per-pixel and object-level quality measures (i.e., uncertainty coverages associated with the data) are highly valued. However, qualitative quality descriptions are also relevant to fit-for-purpose quality assessment. Unfortunately, data quality information is at present mainly published as quality reports and scientific papers, which makes comparison difficult and spatial representation challenging. The GeoViQua Producer Quality Model (PQM) extends the ISO model with some additions, mainly to include publications, discovered issues and the traceability of quality parameters. The enhancements introduced in the PQM are complemented by a user feedback system that collects user opinions and experiences related to the usage of the corresponding EO data (Yang et al., 2013). Users of geospatial data have been shown to consider a wide variety of metadata elements and to stress the importance of complete metadata records for effective dataset selection.
GeoViQua is promoting the design of a GEO Label that, by providing an at-a-glance iconic representation of quality facets, could significantly improve users' recognition of the quality and trustworthiness of geospatial datasets, and their appreciation that quality is inherently a matter of fitness for use. Additionally, complementary quality-aware visualization techniques for uncertainty coverages are under development to better communicate quality measurements to the user, tailored to the quality parameterization of categorical or continuous variables. Nonetheless, enhanced tools are still required to foster the practical implementation of metadata documentation and the inclusion of quality indicators and provenance information in the metadata records that serve as a baseline for data discovery in GEOSS. Quality-enabled GeoViQua components would ideally be implemented in subsequent server interfaces and user-friendly portlets to be integrated in the GEO Portal. For example, the GEO Portal can access the quality-enabled Discovery and Access Broker (DAB) and incorporate visualization techniques that show data overlaid with uncertainty coverages.
By the end of the project, the benefits of GeoViQua are to be illustrated in scenarios that formalise demonstrative insights into quality and visualization within GEOSS. This communication presents two scenarios: the ecosystems/agriculture scenario, focused on local management and showing the fragile balance between ecosystems and agricultural exploitation; and the carbon scenario, focused on expert quality assessment of carbon flux models and budgets, which are relevant inputs for variable estimations and global figures of climate change.
In the Ebro delta ecosystem, many bird species rely on the flooded waters of the river banks. The fertile lands in the area are highly suitable for rice cultivation.
A European Commission regulation was enacted to protect the autochthonous fauna; it encourages farmers, in exchange for economic compensation, to flood their fields more than rice cultivation practices strictly require. In-situ controls have proved too expensive and have been replaced by an optimized remote sensing procedure based on Landsat imagery that detects the evolution of flooding practices at field level throughout the season. Objective quality control procedures are applied and several quality indicators are generated, so that the data can be used to identify farmers who are not following the regulation. The scenario aims to demonstrate what would happen if the Landsat constellation failed to provide data. Finding another remote sensing source that is equivalent to Landsat, with the same quality parameters, to replace or complement Landsat data is not easy for such a highly specific product. The current GEO Portal cannot provide a clear answer because quality indicators are not queryable. In this context, GeoViQua developments allow a query for remote sensing imagery with equivalent quality parameters, or at least the intercomparison of quality elements between different products. Hence, the user will easily find data with complete descriptive information, choose among the records with the right GEO Label, and compare individual metadata records to find the product that best fits the purpose.
In the second scenario, the diversity of carbon budget models and resulting products is compared on a data intercomparison website. Several scientific groups around the world produce global carbon budget maps using different source data and models. The GEO Portal will allow the discovery of products related to this topic. Using the GEO Label, the user can discard the records that have no provenance information (the label also shows that no user feedback comments have been provided so far).
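The two selection steps described in these scenarios (querying for imagery with equivalent quality parameters, and discarding records that lack provenance) can be illustrated with a minimal Python sketch. It is not the GeoViQua implementation or any GEOSS API: the `MetadataRecord` class, its field names, the `find_substitutes` function and all catalogue values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MetadataRecord:
    """Hypothetical, simplified stand-in for a catalogue metadata record."""
    title: str
    spatial_resolution_m: float         # pixel size, in metres
    revisit_days: float                 # revisit period, in days
    positional_rmse_m: Optional[float]  # None when no quality measure is documented
    has_provenance: bool = False        # lineage information present in the record


def find_substitutes(records: List[MetadataRecord],
                     max_resolution_m: float,
                     max_revisit_days: float,
                     max_rmse_m: float,
                     require_provenance: bool = True) -> List[MetadataRecord]:
    """Return records whose documented quality meets the thresholds, best RMSE first."""
    selected = []
    for rec in records:
        # Records without a documented quality measure cannot be compared.
        if rec.positional_rmse_m is None:
            continue
        # Optionally discard records that carry no provenance information.
        if require_provenance and not rec.has_provenance:
            continue
        if (rec.spatial_resolution_m <= max_resolution_m
                and rec.revisit_days <= max_revisit_days
                and rec.positional_rmse_m <= max_rmse_m):
            selected.append(rec)
    return sorted(selected, key=lambda r: r.positional_rmse_m)


# Made-up catalogue entries; the numbers do not describe any real sensor.
catalogue = [
    MetadataRecord("candidate-A", 30.0, 16.0, 12.0, has_provenance=True),
    MetadataRecord("candidate-B", 10.0, 5.0, None, has_provenance=True),
    MetadataRecord("candidate-C", 20.0, 10.0, 8.0, has_provenance=False),
    MetadataRecord("candidate-D", 250.0, 1.0, 50.0, has_provenance=True),
    MetadataRecord("candidate-E", 20.0, 8.0, 9.5, has_provenance=True),
]

# Thresholds assumed to describe the product being replaced (illustrative only).
for rec in find_substitutes(catalogue, max_resolution_m=30.0,
                            max_revisit_days=16.0, max_rmse_m=15.0):
    print(rec.title, rec.positional_rmse_m)
```

In this sketch a record with no documented quality measure is excluded however good its other characteristics are, which mirrors the point that quality indicators must be present and queryable before products can be compared.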
After carefully studying the metadata records in the results, the user can select products that have the right quality measures and come from trusted scientific teams. A new version of the ncWMS software is used to feed a side-by-side comparison map viewer called the Global Carbon Portal (GCP). Once the products are published there, scientists can carefully examine the model results and, e.g., report small contradictions between models using a "comments" button. Scientists only have to provide information about the discovered issue; the system assists them by supplying the region of interest and the names of the compared datasets from the portal context. Internally, these comments are stored not in the GCP but in a more general feedback catalogue in GEOSS, making them available to the whole GEOSS community. This is demonstrated by the fact that, after those feedback inputs, the GEO Label of those products changes to indicate the presence of user feedback. This scenario illustrates how a system of systems that integrates EO data can contribute to a transparent debate about data features, increasing the traceability of the process and improving public confidence in carbon budget estimates and in climate change studies in general. The final aim is to influence decision makers to take measures to mitigate the effects of climate change.
Remote sensing producers regularly and methodically assess and document the quality of their data and publish detailed reports (e.g., in human-readable PDF format). The previous scenarios highlight that much more structured quality information for geospatial datasets is needed for the system to perform correctly, as demanded by GEOSS Communities of Practice. Provided that well-formed and complete metadata records are produced, GeoViQua will demonstrate that a system of systems like GEOSS can really channel the discovery of and access to accurate fit-for-purpose information.
REFERENCES
Delavar, M. and Devillers, R. (2010). Spatial Data Quality: From Process to Decisions. Transactions in GIS, 14(4), pp. 379-386.
Diaz, P., Masó, J., Sevillano, E., Ninyerola, M., Zabala, A., Serral, I. and Pons, X. (2012). Data Quality Analysis in the GEOSS Clearinghouse. International Journal of Spatial Data Infrastructures Research (IJSDIR), 7, pp. 352-377. ISSN 1725-0463.
GEO (2012). What is GEOSS?: The Global Earth Observation System of Systems. [online] Available at: http://www.earthobservations.org/geoss.shtml [Accessed on 04 April 2013].
Yang, K., Blower, J., Bastin, L., Lush, V., Zabala, A., Maso, J., Cornford, D., Diaz, P. and Lumsden, J. (2013). An Integrated View of Data Quality in Earth Observation. Philosophical Transactions of the Royal Society A, 371(1983).