ODFormer: Semantic Fundus Image Segmentation Using Transformer for Optic Nerve Head Detection

Wang, Jiayi; Mao, Yi-An; Ma, Xiaoyu; Guo, Sicen; Shao, Yuting; Lv, Xiao; Han, Wenting; Christopher, Mark; Zangwill, Linda M.; Bi, Yanlong; Fan, Rui

arXiv:2405.09552v2 (eess)

[Submitted on 15 Apr 2024 (v1), last revised 2 Jun 2024 (this version, v2)]

Abstract: Optic nerve head (ONH) detection has been a crucial area of study in ophthalmology for years. However, the significant discrepancy between fundus image datasets, each generated using a single type of fundus camera, poses challenges to the generalizability of ONH detection approaches developed based on semantic segmentation networks. Despite numerous recent advances in general-purpose semantic segmentation using convolutional neural networks (CNNs) and Transformers, there is currently a lack of benchmarks for these state-of-the-art (SoTA) networks trained specifically for ONH detection. Therefore, in this article, we make contributions in three key aspects: network design, the publication of a dataset, and the establishment of a comprehensive benchmark. Our newly developed ONH detection network, referred to as ODFormer, is based upon the Swin Transformer architecture and incorporates two novel components: a multi-scale context aggregator and a lightweight bidirectional feature recalibrator. Our published large-scale dataset, known as TongjiU-DROD, provides multi-resolution fundus images for each participant, captured using two distinct types of cameras. Our established benchmark involves three datasets: DRIONS-DB, DRISHTI-GS1, and TongjiU-DROD, created by researchers from different countries and containing fundus images captured from participants of diverse races and ages. Extensive experimental results demonstrate that our proposed ODFormer surpasses other SoTA networks in terms of performance and generalizability. Our dataset and source code are publicly available at this http URL.

Subjects:	Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2405.09552 [eess.IV]
	(or arXiv:2405.09552v2 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2405.09552

Submission history

From: Rui Fan
[v1] Mon, 15 Apr 2024 11:49:37 UTC (1,358 KB)
[v2] Sun, 2 Jun 2024 10:49:47 UTC (1,445 KB)
