cigChannel: a large-scale 3D seismic dataset with labeled paleochannels for advancing deep learning in seismic interpretation
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Copernicus Publications, 2025-07-01 |
| Series: | Earth System Science Data |
| Online Access: | https://essd.copernicus.org/articles/17/3447/2025/essd-17-3447-2025.pdf |
| Summary: | Identifying paleochannels in 3D seismic volumes (seismic paleochannel interpretation) is essential for georesource development and offers insights into paleoclimate conditions. However, it remains a labor-intensive and time-consuming task. Deep learning has shown great promise in automating seismic paleochannel interpretation with high efficiency and accuracy, as demonstrated in similar image segmentation tasks in computer vision (CV). Yet, unlike the CV domain, seismic exploration lacks a comprehensive labeled dataset for paleochannels, which significantly hinders the development, application, and evaluation of deep learning methods in this field. Manual labeling of paleochannels in 3D seismic volumes is tedious and subjective, potentially leading to mislabeling that degrades the performance of deep learning models. To address this, we propose a workflow to generate a synthetic seismic dataset, cigChannel, consisting of 1600 seismic volumes with over 10 000 labeled paleochannels. Each volume has a size of 256 × 256 × 256 samples. This is the largest dataset to date for seismic paleochannel interpretation, featuring geologically reasonable seismic volumes with accurately labeled meandering channels, tributary channel networks, and submarine canyons. A convolutional neural network (simplified from U-Net) trained on this dataset achieves F1 scores of 0.52, 0.73, and 0.63 in identifying meandering channels, tributary channel networks, and submarine canyons in three field seismic volumes, respectively. However, the synthetic seismic volumes in the cigChannel dataset still lack the variability and realism of field seismic data, potentially affecting the generalizability of the deep learning model. To facilitate further research, we publicly release the dataset (Wang et al., 2024, https://doi.org/10.5281/zenodo.10791151), data generation codes, and trained U-Net model (Wang, 2024, https://github.com/wanggy-1/cigChannel, last access: 5 July 2025), aiming to advance deep learning approaches for seismic paleochannel interpretation. |
|---|---|
| ISSN: | 1866-3508 1866-3516 |
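
The F1 scores quoted in the summary compare a predicted channel mask with a labeled one. As a minimal sketch of how such a score can be computed for binary 3D volumes, assuming voxel-wise masks where 1 marks a channel and 0 marks background: the function name `f1_score_3d`, the toy 4 × 4 × 4 volume, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not taken from the cigChannel code.

```python
import numpy as np


def f1_score_3d(pred, label):
    """Voxel-wise F1 between two binary masks of identical shape.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    pred = np.asarray(pred, dtype=bool)
    label = np.asarray(label, dtype=bool)
    if pred.shape != label.shape:
        raise ValueError("pred and label must have the same shape")

    tp = np.count_nonzero(pred & label)    # channel voxels correctly predicted
    fp = np.count_nonzero(pred & ~label)   # background predicted as channel
    fn = np.count_nonzero(~pred & label)   # channel voxels that were missed

    denom = 2 * tp + fp + fn
    # Assumption: two empty masks count as perfect agreement.
    return 2 * tp / denom if denom else 1.0


# Toy volume (4 x 4 x 4) standing in for a 256 x 256 x 256 seismic cube.
label = np.zeros((4, 4, 4), dtype=bool)
label[1:3, :, :] = True          # 32 labeled channel voxels

pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, :, :2] = True          # recovers 16 of the 32 channel voxels
pred[0, 0, 0] = True             # one false positive

score = f1_score_3d(pred, label)
print(f"F1 = {score:.3f}")
```

Here F1 = 2·TP / (2·TP + FP + FN), the harmonic mean of precision and recall. A network output would typically be thresholded into a binary mask before being passed to a function like this.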