This use case develops a data-driven environmental hazards data space. The data space will provide an assessment of human exposure to environmental hazard factors, such as pollution or intense heat, at continental or global scale, both to quantify relations between exposures and health outcomes and to explore intervention policies.
The SAGE (Sustainable Green Europe Data Space) project targets the four strategic pillars of the European Green Deal (Zero Pollution, Climate Adaptation, Biodiversity, and the Circular Economy Action Plan) by implementing a rich portfolio of use cases in each of them.
The project demonstrates a total of 10 pilot use cases to foster data-driven sustainability solutions across biodiversity, climate, circular economy, and pollution monitoring. Below you can read about one of these pilots.
The overview of all use cases can be found here: Use-cases
Epidemiologists at universities or health research institutes need an assessment of human exposure to environmental hazard factors, such as pollution or intense heat, at continental or global scale, to quantify relations between exposures and health outcomes and to explore intervention policies. Currently, such data are not available for health analysis because of the heterogeneity, size and storage requirements of the input data and the compute demands of creating personal exposure estimates at one-hectare resolution.
In the pilot phase, we will work towards a minimum viable product that demonstrates the technical setup and workings of the Environmental Hazards Data Space (EHDS). For the end user, we will support a first use case (e.g. ‘obtain information about and download average exposure per country at province level’).
The pilot will use one environmental hazard variable (e.g. PM2.5 air pollution concentrations) at global scale and 100 m resolution.
Present Scenario Data Workflow
The SAGE GDDS will give the end user a single entry point with search facilities and access to environmental data for exposure assessment. The data will be based on high-resolution input data at continental or global scale. Each of the provided environmental factors will be generated by a consistent, reproducible workflow.
Benefits:
- Currently, no data products provide personal exposures at continental scale. The EHDS will be the first to provide a complete set of all relevant human environmental exposures, produced in a harmonised manner, so that data consumers can combine multiple exposures when assessing environmental effects on health outcomes.
- Participants in the EHDS will improve the credibility of their research or policy making because they rely on personal exposure data that is state of the art in spatial resolution, sophistication of data production, validation, and reproducibility.
Future Scenario Data Workflow
In the pilot phase, the software implementation of the processing workflow will migrate to the open-source LUE environmental modelling framework (https://github.com/computationalgeography/lue). This is a software framework tailored to the construction of HPC-ready environmental models. The migration allows us to process multiple, entire, contiguous global-scale datasets at high resolution (100 m) at once. The computational and administrative overhead of creating, processing and managing data chunks in pre- and postprocessing phases is therefore avoided.
The envisioned approach uses a single computational framework to estimate personal environmental exposures from the complete set of environmental factor data. This framework runs in the backend of the data space. It takes existing environmental hazard maps (often available in the public domain) as input and converts them to personal exposures. The conversion accounts for the fact that exposures are integrated over the spatio-temporal activity context of persons, which is implemented as spatial buffer operations.
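To illustrate what a spatial buffer operation does, the sketch below averages a hazard raster over a circular buffer around each cell. It is a minimal pure-Python illustration, not the LUE-based implementation: the function name `buffer_mean`, the toy PM2.5 grid, and the buffer radius of one cell are all assumptions made for this example.

```python
from math import hypot


def buffer_mean(grid, radius_cells):
    """Return a grid where each cell holds the mean of all valid cells
    whose centre lies within `radius_cells` of that cell (hypothetical helper)."""
    rows, cols = len(grid), len(grid[0])
    r = int(radius_cells)
    # Offsets of all cells inside the circular buffer, computed once.
    offsets = [
        (dr, dc)
        for dr in range(-r, r + 1)
        for dc in range(-r, r + 1)
        if hypot(dr, dc) <= radius_cells
    ]
    out = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            values = []
            for dr, dc in offsets:
                y, x = i + dr, j + dc
                # Skip cells outside the raster and missing values (None).
                if 0 <= y < rows and 0 <= x < cols and grid[y][x] is not None:
                    values.append(grid[y][x])
            out[i][j] = sum(values) / len(values) if values else None
    return out


# Toy PM2.5 raster (assumed values) with one polluted cell in the centre.
pm25 = [
    [10.0, 10.0, 10.0],
    [10.0, 40.0, 10.0],
    [10.0, 10.0, 10.0],
]
exposure = buffer_mean(pm25, radius_cells=1)
print(exposure[1][1])  # mean over the centre cell and its four neighbours
```

In this sketch, a person located in the polluted centre cell receives a lower exposure than the raw cell value, because the buffer also covers the cleaner neighbouring cells. A production workflow at 100 m global resolution would need a parallel focal operation rather than nested Python loops.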
Personal exposures are in this manner estimated for each geographical location (e.g. a home location) as well as statistical averages and ranges of values over multiple administrative areas (e.g. municipalities, cities, countries). These personal exposure data sets are then made available through the Environmental Hazards Data Space. In this setup, data consumers get access to a harmonised data set that includes all relevant environmental factors, processed in one single standardized way. It also allows data consumers to suggest new data sets to be included in the processing chain, such that user data can be integrated with exposure data already available in the data space.
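The aggregation over administrative areas described above can be sketched as a zonal statistics step. The example below is a simplified pure-Python illustration under stated assumptions: the function name `zonal_stats`, the zone labels "A" and "B", and the exposure values are hypothetical, and both rasters are assumed to share the same grid.

```python
from collections import defaultdict


def zonal_stats(exposure, zones):
    """Summarise exposure values per zone label (hypothetical helper).

    `exposure` and `zones` are equally shaped 2D lists; None marks missing data.
    """
    values = defaultdict(list)
    for exposure_row, zone_row in zip(exposure, zones):
        for value, zone in zip(exposure_row, zone_row):
            # Ignore cells without an exposure value or without a zone.
            if value is not None and zone is not None:
                values[zone].append(value)
    return {
        zone: {
            "mean": sum(v) / len(v),
            "min": min(v),
            "max": max(v),
            "count": len(v),
        }
        for zone, v in values.items()
    }


# Assumed per-cell exposure values and the administrative area of each cell.
exposure = [
    [10.0, 12.0, 30.0],
    [14.0, None, 34.0],
]
zones = [
    ["A", "A", "B"],
    ["A", "B", "B"],
]
stats = zonal_stats(exposure, zones)
print(stats["A"]["mean"], stats["B"]["mean"])
```

Each administrative area ends up with an average and a range of exposure values, which matches the kind of product the first pilot use case asks for (average exposure per country at province level).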
Expected results
- The objective is to harmonise the processing of environmental data into personal exposure data through a universal computational framework running at the back end of the data space.
- The objective is to offer personal exposure data through a searchable catalogue in which environmental data sets can be found; data can be downloaded for particular geographical locations as well as for particular geographical areas or time spans.
- For the EHDS we expect:
  - An automated and harmonised assessment of personal exposures at continental scale, replacing ad-hoc data processing, which yields non-standardised products and is less efficient in terms of personnel costs.
  - A user community of data providers and data consumers, leading to more efficient sharing of data, and the capability to integrate personal exposure data with other Green Deal data.
Coordinating institutes: Utrecht University and SURF

