S2Coast-2023: 10m Global Coastline Data From Sentinel-2

by Alex Johnson

Ever wondered about the exact shape of our planet's ever-changing coastlines? S2Coast-2023 answers that question with remarkable precision. It is the first global coastline dataset produced at 10-meter resolution, derived entirely from Sentinel-2 satellite imagery. It offers a unified and consistent view of the land-sea boundary, a crucial element for anyone involved in global coastal research and monitoring.

Coastlines are, by their very nature, dynamic interfaces. They are constantly reshaped by tides, waves, gradual sea-level rise, and human activities. Despite great strides in satellite remote sensing, a comprehensive global product that directly leverages the high-resolution, frequent observations from Sentinel-2 had been missing. The S2Coast project was created to bridge this gap.

The S2Coast team developed a knowledge-based, fully automated extraction framework on the Google Earth Engine (GEE) platform. The framework detects a standardized coastline indicator, the High Water Line (HWL_{Sentinel-2}), from annual composites of Sentinel-2 imagery built from cloud-free observations. The method integrates multi-temporal information (data from different times of the year), spectral signatures (the distinct ways different surfaces reflect light), and spatial features (how things are arranged geographically). Together these delineate the stable extent of seawater submergence over the course of a year. The result is a land-water binary image whose boundary defines the HWL_{Sentinel-2}.

After raster-to-vector conversion and optimization, the workflow was executed through 12,275 coordinated processing tasks. The resulting S2Coast-2023 dataset maps approximately 2.17 million kilometers of coastline across all continents and the vast majority of islands larger than 100 square meters. Antarctica and extremely remote polar islands were excluded.

Global validation with 1,146 coastline samples confirmed the robustness of the approach and showed strong temporal consistency: 88% of repeatedly generated coastline segments (2021 to 2023) fell within a 10-meter buffer. A positional accuracy assessment using 532 reference coastline samples derived from very-high-resolution, image-supported OpenStreetMap data found an average positional deviation of −1.10 meters (95% confidence interval: −2.06 to −0.15 meters) and an average Root Mean Square Error (RMSE) of 17.40 meters (95% CI: 16.23 to 18.65 meters). This dataset is a game-changer for understanding our dynamic coastlines.

The Power of Sentinel-2 for Coastal Mapping

The Sentinel-2 mission, a cornerstone of the European Space Agency's Copernicus program, is an invaluable resource for global environmental monitoring, and the S2Coast-2023 dataset shows its potential for coastline analysis. Sentinel-2's key strengths are its multispectral imaging and its frequent revisit time. It carries a 13-band multispectral optical instrument covering the visible, near-infrared, and shortwave infrared spectral regions. This spectral information is crucial for distinguishing land, water, and coastal features such as vegetation and sediment plumes.

The 10-meter spatial resolution of its finest bands (blue, green, red, and near-infrared) is particularly significant for coastline mapping. It is fine enough to capture the details of the land-sea interface and to delineate the High Water Line (HWL), which is vital for accurate erosion and accretion studies. Before S2Coast-2023, achieving this level of detail globally and consistently was a significant challenge. Previous datasets often relied on coarser data, limiting their utility for fine-scale coastal process studies, or were based on ad hoc methodologies that lacked global standardization.

With two satellites in operation, the Sentinel-2 constellation revisits the equator every 5 days and higher latitudes more often. This frequent revisit is critical for handling the dynamic nature of coastlines and for generating cloud-free composites. Clouds are a perennial problem in optical remote sensing, but by stacking and analyzing the many images acquired over a year, S2Coast-2023 can work around them and capture the most representative surface conditions. The S2Coast framework uses this temporal dimension deliberately: it is not about a single snapshot but about the consistent presence of water over time. The method integrates multi-temporal information to identify areas that are reliably inundated by seawater during high tide cycles, thus defining the stable HWL. This temporal consistency filtering is a key innovation that improves reliability and reduces noise in the extracted coastline.

The Google Earth Engine (GEE) platform is what makes the global scale of this project possible. GEE provides a planetary-scale catalog of satellite imagery and geospatial data, along with a cloud-based computational engine, so researchers can process Sentinel-2 data without downloading and managing petabytes of imagery locally. The S2Coast team used GEE to develop and run their automated workflow, coordinating thousands of processing tasks to cover the globe. In essence, S2Coast-2023 shows how the capabilities of Sentinel-2, combined with innovative processing on platforms like GEE, can unlock new insights into our planet's coastal zones and provide essential data for research and policy.
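To make the compositing and spectral ideas in this section concrete, here is a minimal plain-JavaScript sketch. It is not the S2Coast code. It assumes a hypothetical list of observations for a single pixel, each with green and shortwave-infrared reflectance and a cloud flag, and all function names are illustrative. It takes the per-band median of the cloud-free observations and then computes the Modified Normalized Difference Water Index (MNDWI), a widely used water index; the article does not say which index, if any, S2Coast relies on.

```javascript
// Median of a numeric array (returns NaN for an empty array).
function median(values) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Per-band median of the cloud-free observations for one pixel.
// The {green, swir, cloudy} record shape is an assumption for illustration.
function medianComposite(observations) {
  const clear = observations.filter((o) => !o.cloudy);
  return {
    green: median(clear.map((o) => o.green)),
    swir: median(clear.map((o) => o.swir)),
  };
}

// MNDWI = (green - swir) / (green + swir).
// Water absorbs strongly in the SWIR, so water pixels tend to be positive.
function mndwi(green, swir) {
  const sum = green + swir;
  return sum === 0 ? 0 : (green - swir) / sum;
}

// Hypothetical reflectances for one coastal pixel over a year.
const pixelObservations = [
  { green: 0.08, swir: 0.02, cloudy: false },
  { green: 0.45, swir: 0.40, cloudy: true }, // cloud: bright in both bands
  { green: 0.07, swir: 0.01, cloudy: false },
  { green: 0.09, swir: 0.03, cloudy: false },
];

const composite = medianComposite(pixelObservations);
console.log(composite, mndwi(composite.green, composite.swir));
```

The median is used here because it is robust to the occasional unmasked cloud or shadow; other reducers, such as a percentile, would fit the same structure.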

The S2Coast-2023 Methodology: Automating Global Coastline Extraction

Creating a global coastline dataset at 10-meter resolution is no small feat, and the S2Coast-2023 project achieved it through an automated methodology built on the Google Earth Engine (GEE) platform. The core of the approach is the reliable extraction of the High Water Line (HWL_{Sentinel-2}), which represents the stable extent of seawater inundation over a year. This is a crucial metric for understanding coastal dynamics, including erosion and accretion.

The process begins with annual, cloud-free Sentinel-2 composites. Cloud cover in tropical and temperate regions makes it hard to obtain a truly representative image of the coast, so S2Coast relies on the temporal density of Sentinel-2 data: by analyzing the many images acquired throughout a year, the system mitigates the impact of clouds and atmospheric disturbances.

The framework takes a knowledge-based approach that combines three kinds of information. First, it uses the spectral information in Sentinel-2's 13 bands. Different materials reflect and absorb sunlight differently across these bands, creating distinct spectral signatures. Water, for instance, strongly absorbs light in the near-infrared and shortwave infrared regions. Analyzing these characteristics together with the visible bands lets the system separate water bodies from land.

Second, the methodology incorporates multi-temporal information, perhaps the most critical aspect for defining the HWL. Instead of relying on a single image, S2Coast analyzes how each pixel behaves over the year and identifies pixels that are consistently covered by water during high tide periods. This temporal filtering distinguishes the permanent or semi-permanent high water mark from temporary inundation, such as storm surges or seasonal variations, that does not represent the stable coastline.

Third, spatial features are considered. The algorithm examines each pixel's neighborhood and the surrounding patterns, which refines the coastline boundary and reduces spurious detections caused by isolated features or spectral misclassifications.

Together these elements produce a land-water binary image, and the boundary of that image, where land meets water, is defined as the HWL_{Sentinel-2}. The raster boundary is then converted into vector format (lines or polygons) and optimized to clean up the data, remove small isolated artifacts, and ensure the geometric integrity of the coastline.

The entire workflow was designed to be fully automated and scalable. To achieve global coverage, processing was distributed across 12,275 coordinated tasks within GEE. The resulting S2Coast-2023 dataset provides 2.17 million kilometers of mapped coastline, a consistent foundation for a wide array of coastal studies. The validation statistics, which show high temporal consistency and strong positional accuracy, further attest to the robustness of this automated extraction methodology.
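The multi-temporal and boundary steps described above can be sketched in a few lines of plain JavaScript. This is an illustration under stated assumptions, not the S2Coast implementation: the input is a hypothetical stack of small per-date water masks (1 = water, 0 = land, null = no clear observation), the 0.5 frequency threshold is an arbitrary choice for the example, and the boundary is approximated as the land cells that touch water.

```javascript
// Fraction of clear observations in which each pixel was water.
// masks: array of 2D grids; 1 = water, 0 = land, null = no clear view.
function waterFrequency(masks) {
  const rows = masks[0].length;
  const cols = masks[0][0].length;
  const freq = [];
  for (let r = 0; r < rows; r++) {
    const row = [];
    for (let c = 0; c < cols; c++) {
      let clear = 0;
      let wet = 0;
      for (const mask of masks) {
        const v = mask[r][c];
        if (v === null) continue; // skip cloudy or missing observations
        clear += 1;
        wet += v;
      }
      row.push(clear === 0 ? NaN : wet / clear);
    }
    freq.push(row);
  }
  return freq;
}

// Binary land-water image: 1 where water frequency >= threshold.
// Pixels with no clear observation (NaN) fall through to 0 (land).
function toBinary(freq, threshold) {
  return freq.map((row) => row.map((f) => (f >= threshold ? 1 : 0)));
}

// Land cells with at least one 4-connected water neighbour.
// Returns [row, col] pairs in row-major order.
function boundaryCells(binary) {
  const steps = [[1, 0], [-1, 0], [0, 1], [0, -1]];
  const cells = [];
  for (let r = 0; r < binary.length; r++) {
    for (let c = 0; c < binary[r].length; c++) {
      if (binary[r][c] !== 0) continue;
      const touchesWater = steps.some(([dr, dc]) => {
        const rr = r + dr;
        const cc = c + dc;
        return rr >= 0 && rr < binary.length &&
          cc >= 0 && cc < binary[rr].length &&
          binary[rr][cc] === 1;
      });
      if (touchesWater) cells.push([r, c]);
    }
  }
  return cells;
}

// Three hypothetical acquisition dates over a 2 x 3 pixel window.
const maskStack = [
  [[0, 0, 1], [0, 1, 1]],
  [[0, 1, 1], [0, 1, 1]],
  [[0, 0, 1], [0, null, 1]],
];
const binaryImage = toBinary(waterFrequency(maskStack), 0.5);
console.log(binaryImage, boundaryCells(binaryImage));
```

In this toy stack, the pixel that is wet on only one of three dates is classed as land, which is the kind of temporary inundation the temporal filter is meant to discard.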

Validating S2Coast-2023: Ensuring Accuracy and Reliability

The creation of the S2Coast-2023 dataset represents a significant leap forward in global coastline mapping, but its true value lies in its accuracy and reliability. The research team implemented rigorous validation procedures to ensure that the derived 10-meter resolution coastline data, generated from Sentinel-2 imagery, meets high standards for both temporal consistency and positional accuracy. This meticulous validation process is crucial for building trust in the dataset and for its effective use in scientific research and practical applications, particularly in areas prone to erosion.

Temporal Consistency: A Measure of Stability

Coastlines are dynamic, but the HWL_{Sentinel-2} aims to capture a stable annual inundation boundary. To assess the temporal consistency of the extracted coastlines, the researchers generated coastline segments repeatedly over a period of three years (2021–2023). This allowed them to evaluate how stable the detected coastline was from year to year. The results were highly encouraging: 88% of the repeatedly generated coastline segments fell within a 10-meter buffer. This remarkable consistency indicates that the S2Coast methodology is robust and captures a stable representation of the high water line, minimizing year-to-year variations that are not related to significant geomorphic changes. Such temporal stability is vital for long-term monitoring of coastal changes, allowing researchers to confidently track subtle shifts due to sea-level rise or sediment dynamics rather than variations introduced by the data processing itself.
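A buffer test of this kind can be illustrated with a short plain-JavaScript sketch. It is a simplified stand-in, not the published procedure: it assumes coordinates are already in a projected system measured in meters, it samples one year's coastline as points rather than measuring segment length, and the function names and coordinates are hypothetical.

```javascript
// Distance from point p to segment a-b; all points are [x, y] in meters.
function pointToSegment(p, a, b) {
  const dx = b[0] - a[0];
  const dy = b[1] - a[1];
  const len2 = dx * dx + dy * dy;
  // Projection parameter of p onto the segment, clamped to [0, 1].
  let t = len2 === 0 ? 0 : ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / len2;
  t = Math.max(0, Math.min(1, t));
  return Math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy));
}

// Shortest distance from a point to a polyline (array of [x, y] vertices).
function pointToLine(p, line) {
  let best = Infinity;
  for (let i = 0; i < line.length - 1; i++) {
    best = Math.min(best, pointToSegment(p, line[i], line[i + 1]));
  }
  return best;
}

// Share of sample points lying within bufferMeters of the reference line.
function fractionWithinBuffer(samplePoints, referenceLine, bufferMeters) {
  const inside = samplePoints.filter(
    (p) => pointToLine(p, referenceLine) <= bufferMeters
  ).length;
  return inside / samplePoints.length;
}

// Hypothetical coastlines: a 2021 line and points sampled along a 2023 line.
const line2021 = [[0, 0], [100, 0], [200, 0]];
const points2023 = [[0, 3], [50, 8], [100, 12], [150, 4], [200, 25]];
console.log(fractionWithinBuffer(points2023, line2021, 10)); // 0.6
```

Three of the five sample points lie within 10 meters of the 2021 line, so the sketch reports 0.6; the 88% figure above is the analogous share for the real dataset.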

Positional Accuracy: Aligning with Ground Truth

To evaluate the positional accuracy of the S2Coast-2023 coastline, a comparison was made against a high-quality reference dataset. Specifically, 532 reference coastline samples were sourced from OpenStreetMap (OSM) data that had been supported by very high-resolution imagery. OSM data, when carefully curated and verified with aerial or satellite imagery, provides a reliable benchmark for assessing the accuracy of geospatial products. The comparison revealed highly favorable results:

  • Average Positional Deviation: The dataset showed an average positional deviation of −1.10 meters. This negative value suggests a slight tendency for the S2Coast line to be landward of the reference line, which can be an acceptable characteristic depending on the application. Crucially, the 95% confidence interval (CI) for this deviation was −2.06 to −0.15 meters, indicating a high degree of certainty that the true average deviation lies within this narrow range.
  • Average Root Mean Square Error (RMSE): The RMSE, a measure of the overall magnitude of error, was 17.40 meters, with a 95% CI of 16.23 to 18.65 meters. An RMSE of 17.40 meters might seem substantial at first glance, but it is less than two Sentinel-2 pixels at 10-meter resolution, and it reflects the overall discrepancy between the S2Coast line and the OSM reference rather than error in S2Coast alone. Given the inherent difficulty of defining a precise coastline (e.g., soft shorelines, vegetation boundaries, intertidal zones) and the likely differences in how S2Coast and OSM delineate features, this level of accuracy, together with the narrow confidence intervals, indicates a high-fidelity dataset. The 10-meter resolution of the Sentinel-2 data is a key enabler, allowing much finer detail than was previously available at a global scale.
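For readers who want to see how the two metrics above relate, here is a minimal plain-JavaScript sketch. It assumes a hypothetical flat list of signed offsets in meters between the extracted line and the reference line, with negative values meaning landward, following the interpretation given above. The percentile bootstrap used for the 95% interval is an assumption chosen for illustration; the article does not state how the published intervals were computed, and the offsets below are made up.

```javascript
// Mean signed deviation: the bias of the extracted line.
function meanDeviation(offsets) {
  return offsets.reduce((sum, x) => sum + x, 0) / offsets.length;
}

// Root mean square error: overall error magnitude, ignoring sign.
function rmse(offsets) {
  return Math.sqrt(offsets.reduce((sum, x) => sum + x * x, 0) / offsets.length);
}

// Small seeded pseudo-random generator so the bootstrap is reproducible.
function seededRandom(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Percentile bootstrap 95% interval for any statistic of the offsets.
function bootstrapCI(values, statistic, resamples, random) {
  const stats = [];
  for (let i = 0; i < resamples; i++) {
    // Resample with replacement, keeping the original sample size.
    const sample = values.map(
      () => values[Math.floor(random() * values.length)]
    );
    stats.push(statistic(sample));
  }
  stats.sort((x, y) => x - y);
  const lower = stats[Math.floor(0.025 * (resamples - 1))];
  const upper = stats[Math.ceil(0.975 * (resamples - 1))];
  return [lower, upper];
}

// Hypothetical signed offsets in meters (negative = landward).
const offsets = [3, -4, 5, -12, -7, 9, -15, 2, -1, 6];
console.log('mean deviation:', meanDeviation(offsets));
console.log('RMSE:', rmse(offsets));
console.log('95% CI of mean:', bootstrapCI(offsets, meanDeviation, 1000, seededRandom(42)));
```

The sketch also shows why a small mean deviation can coexist with a much larger RMSE: offsets of opposite sign cancel in the mean but not in the squared error.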

These validation metrics collectively confirm that S2Coast-2023 is not only a comprehensive global dataset but also a scientifically sound and reliable product. The strong temporal consistency and the favorable positional accuracy assessments make it an invaluable tool for researchers studying coastal processes, monitoring erosion, and understanding the impacts of climate change on our shorelines.

Citation and Data Access

To properly acknowledge the effort and resources invested in creating this valuable resource, please cite the S2Coast-2023 dataset using the following reference:

Duan, Y., Sanchez-Azofeifa, A., Chen, C., Tian, B., Li, X., Sengupta, D., Zhou, Y., 2026. S2Coast-2023: The first global 10-meter resolution coastline dataset derived from enhanced Sentinel-2 composite imagery using Google Earth Engine. Remote Sensing of Environment, 334, 115186. https://doi.org/10.1016/j.rse.2025.115186

Access to the dataset is facilitated through Google Earth Engine (GEE), making it readily available for analysis by the global research community. An example snippet for accessing the data within GEE is provided below:

// Load the S2Coast-2023 coastline feature collection
var s2_coast = ee.FeatureCollection("projects/ee-dhritirajsen/assets/S2Coast-2023_Polyline_diss");

// Set the map to a satellite view for better visualization
Map.setOptions('SATELLITE');

// Add the S2Coast coastline layer to the map with a distinct color
Map.addLayer(s2_coast.style({
  color: 'cyan',
  width: 2
}), {}, 'S2 Coastline 2023');

// You can further filter and analyze the 's2_coast' feature collection as needed.
// For example, to view the coastline of a specific region:
// Map.centerObject(s2_coast.filterBounds(ee.Geometry.Point(lon, lat)), zoom);

This snippet allows users to easily visualize the S2Coast-2023 coastline directly within the GEE Code Editor, providing a starting point for further exploration and analysis. The dataset is released under the Creative Commons Attribution 4.0 International license, which permits users to share and adapt the data as long as they give appropriate credit, provide a link to the license, and indicate if changes were made. This ensures that the dataset remains open and accessible for scientific advancement.

Conclusion: A New Era for Coastal Monitoring

The S2Coast-2023 dataset represents a major achievement in remote sensing and coastal science. By combining Sentinel-2 satellite imagery with the scalable computation of Google Earth Engine, researchers have delivered the first global coastline dataset at 10-meter resolution. It addresses a critical need for standardized, high-fidelity data on the world's land-sea boundaries. The automated methodology, which integrates spectral, temporal, and spatial information, provides a robust and consistent delineation of the High Water Line (HWL_{Sentinel-2}). Rigorous validation, including assessments of temporal consistency and of positional accuracy against reference data, confirms the quality and reliability of S2Coast-2023.

Its applications are broad, ranging from monitoring the impacts of sea-level rise and quantifying erosion rates to supporting coastal management strategies and informing infrastructure development. Availability through Google Earth Engine democratizes access, empowering researchers worldwide to conduct more detailed and accurate coastal analyses. As climate change and human pressures on coastal environments intensify, datasets like S2Coast-2023 are not just informative; they are essential tools for building a more resilient future. We encourage researchers and practitioners to explore this dataset and contribute to our collective understanding of these vital and dynamic regions. For further insights into coastal processes and remote sensing applications, you might find the resources at the United Nations Committee of Experts on Global Geospatial Information Management (UN-GGIM) and the European Space Agency (ESA) websites particularly valuable.