Environmental Hazards Data Space (EHDS)

This use case develops a data-driven environmental hazards data space. The data space will provide assessments of human exposure to environmental hazard factors, such as pollution or intense heat, at continental or global scale. These assessments serve to quantify relations between exposures and health outcomes and to explore intervention policies.

The SAGE (Sustainable Green Europe Data Space) project targets the four strategic pillars of the European Green Deal: Zero Pollution, Climate Adaptation, Biodiversity, and the Circular Economy Action Plan. It implements a rich portfolio of use cases in each of them.

The project demonstrates a total of 10 pilot use cases to foster data-driven sustainability solutions across biodiversity, climate, circular economy, and pollution monitoring. Below you can read about one of these pilots.

The overview of all use cases can be found here: Use-cases

Epidemiologists at universities or health research institutes need an assessment of human exposures to environmental hazard factors, such as pollution or intense heat, at continental or global scale, both to quantify relations between exposures and health outcomes and to explore intervention policies. Currently, such data is not available for health analysis because of the heterogeneity, size and storage requirements of the input data and the compute demands of creating personal exposure estimates at a one-hectare resolution.

In the pilot phase, we will work towards a minimum viable product that demonstrates the technical setup and functioning of the Environmental Hazards Data Space (EHDS). For the end user, we will support a first use case (e.g. ‘obtain information about and download average exposure per country at province level’).
The pilot will use one environmental hazard variable (e.g. PM2.5 air pollution concentrations) at global scale and 100 m resolution.

Present Scenario Data Workflow

The SAGE GDDS will provide the end user with a single entry point offering search facilities and access to environmental data for exposure assessment. The data will be based on high-resolution input data at continental or global scale. Each of the provided environmental factors will be generated by a consistent, reproducible workflow.
Benefits:

Future Scenario Data workflow

In the pilot phase, the software implementation of the processing workflow will migrate to the open-source LUE environmental modelling framework (https://github.com/computationalgeography/lue). This is a software framework tailored to the construction of HPC-ready environmental models. The migration allows us to process multiple entire, contiguous global-scale datasets at high resolution (100 m) at once. The computational and administrative overhead of creating, processing and managing data chunks in the pre- and postprocessing phases is therefore no longer incurred.
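To give a feel for the data volumes involved, the following back-of-the-envelope sketch estimates the number of cells and the uncompressed size of one global raster at 100 m resolution. It is an illustration only: the surface area of about 510 million km², the 4-byte value per cell and the function names are assumptions made for this example, and an actual raster in a geographic projection or restricted to land will differ.

```python
def raster_cell_count(area_km2, cell_size_m):
    """Number of square cells needed to cover an area at a given cell size."""
    area_m2 = area_km2 * 1_000_000
    return area_m2 // (cell_size_m * cell_size_m)


def raster_size_gb(n_cells, bytes_per_cell):
    """Uncompressed raster size in decimal gigabytes."""
    return n_cells * bytes_per_cell / 1e9


# Assumptions for illustration: the whole globe (~510 million km2),
# one-hectare cells (100 m x 100 m), 32-bit floating point values.
EARTH_SURFACE_KM2 = 510_000_000
n_cells = raster_cell_count(EARTH_SURFACE_KM2, cell_size_m=100)
size_gb = raster_size_gb(n_cells, bytes_per_cell=4)

print(f"{n_cells:,} cells, about {size_gb:.0f} GB per variable")
```

Under these assumptions a single variable already occupies roughly 200 GB before compression, which indicates why chunking was needed in the earlier workflow.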


The envisioned approach is one where a single computational framework is used to estimate personal environmental exposures from the complete set of environmental factor data. This computational framework runs in the backend of the data space. It takes existing environmental hazard maps (often available in the public domain) as input and converts these to personal exposures. The conversion takes into account that exposures are integrated over the spatio-temporal activity context of persons, which is implemented as spatial buffer operations.
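The idea of a spatial buffer operation can be sketched in a few lines. The example below is a minimal, pure-Python illustration and not the LUE implementation: it assumes that the exposure at a location is the unweighted mean of the hazard values in all cells whose centres lie within a circular buffer around it. The function name, the buffer radius and the tiny 3 x 3 grid are hypothetical.

```python
import math


def buffer_mean_exposure(hazard, cell_size_m, radius_m):
    """Mean hazard value within a circular buffer around every cell.

    hazard is a list of rows; None marks cells without data.
    """
    n_rows = len(hazard)
    n_cols = len(hazard[0]) if n_rows else 0
    reach = int(radius_m // cell_size_m)  # buffer radius in whole cells

    # Offsets of all cells whose centre lies within the buffer radius.
    offsets = [
        (dr, dc)
        for dr in range(-reach, reach + 1)
        for dc in range(-reach, reach + 1)
        if math.hypot(dr, dc) * cell_size_m <= radius_m
    ]

    exposure = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            values = []
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                if inside and hazard[rr][cc] is not None:
                    values.append(hazard[rr][cc])
            if values:
                exposure[r][c] = sum(values) / len(values)
    return exposure


# Hypothetical PM2.5 concentrations on a 3 x 3 grid of 100 m cells.
pm25 = [
    [10.0, 10.0, 10.0],
    [10.0, 40.0, 10.0],
    [10.0, 10.0, 10.0],
]
exposure = buffer_mean_exposure(pm25, cell_size_m=100, radius_m=100)
print(exposure[1][1])  # centre cell averaged with its four direct neighbours
```

At the grid edge the buffer is simply clipped to the available cells. A production workflow would more likely use a framework's focal operations and could apply distance or time weighting, which this sketch leaves out.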


In this manner, personal exposures are estimated for each geographical location (e.g. a home location) and summarised as statistical averages and ranges of values over multiple administrative areas (e.g. municipalities, cities, countries). These personal exposure data sets are then made available through the Environmental Hazards Data Space. In this setup, data consumers get access to a harmonised data set that includes all relevant environmental factors, processed in one single standardised way. It also allows data consumers to suggest new data sets to be included in the processing chain, such that user data can be integrated with exposure data already available in the data space.
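The aggregation over administrative areas can be illustrated with a small zonal statistics sketch. It assumes that the exposure raster and a raster of administrative area identifiers share the same grid. The function name, the zone labels "A" and "B" and the values are hypothetical and only serve the example.

```python
from collections import defaultdict


def zonal_statistics(exposure, zones):
    """Mean, minimum and maximum exposure per administrative zone.

    exposure and zones are equally shaped lists of rows; cells where
    either value is None are skipped.
    """
    values_per_zone = defaultdict(list)
    for exposure_row, zone_row in zip(exposure, zones):
        for value, zone_id in zip(exposure_row, zone_row):
            if value is not None and zone_id is not None:
                values_per_zone[zone_id].append(value)

    return {
        zone_id: {
            "mean": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
            "count": len(values),
        }
        for zone_id, values in values_per_zone.items()
    }


# Hypothetical exposure values and two administrative areas, "A" and "B".
exposure = [
    [12.0, 14.0, 30.0],
    [10.0, None, 26.0],
]
zones = [
    ["A", "A", "B"],
    ["A", "A", "B"],
]
stats = zonal_statistics(exposure, zones)
print(stats["A"]["mean"], stats["B"]["min"], stats["B"]["max"])
```

Each cell counts equally here. Weighting by population or by cell area would be a natural refinement, but it is not something the description above specifies.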

Expected results

Coordinating institutes: Utrecht University and SURF