NLSTseg: A Pixel-level Lung Cancer Dataset Based on NLST LDCT Images

Bibliographic Details
Main Authors: Kun-Hui Chen, Yi-Hui Lin, Shawn Wu, Nai-Wen Shih, Hsing-Chen Meng, Yen-Yu Lin, Chun-Rong Huang, Jing-Wen Huang
Format: Article
Language: English
Published: Nature Portfolio 2025-08-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-05742-x
Description
Summary: Low-dose computed tomography (LDCT) is the most effective tool for early detection of lung cancer. With advances in artificial intelligence, various Computer-Aided Diagnosis (CAD) systems are now used in clinical practice. For radiologists dealing with a huge volume of CT scans, CAD systems are helpful. However, the development of these systems depends on precisely annotated datasets, which are currently limited. Although several lung imaging datasets exist, only a few publicly available datasets provide segmentation annotations on LDCT images. To address this problem, we developed a dataset based on NLST LDCT images with pixel-level annotations of lung lesions. The dataset includes LDCT scans from 605 patients and 715 annotated lesions, comprising 662 lung tumors and 53 lung nodules. Lesion volumes range from 0.03 cm³ to 372.21 cm³, with 500 lesions smaller than 5 cm³, mostly located in the right upper lung. A 2D U-Net model trained on the dataset achieved an IoU of 0.95 on the training set. This dataset enhances the diversity and usability of lung cancer annotation resources.
ISSN:2052-4463
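
The summary reports two quantities that are easy to misread without a definition: the IoU (intersection over union) of a predicted segmentation mask against the annotation, and lesion volume in cm³. The following is a minimal sketch of how such values are commonly computed, not the authors' evaluation code. The function names `iou` and `lesion_volume_cm3`, the nested-list mask representation, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from itertools import chain


def iou(pred, target):
    """Intersection over union of two same-shaped binary 2D masks.

    Masks are given as lists of rows containing 0/1 values
    (an illustrative representation, not the dataset's file format).
    """
    p = list(chain.from_iterable(pred))
    t = list(chain.from_iterable(target))
    if len(p) != len(t):
        raise ValueError("masks must have the same shape")
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    union = sum(1 for a, b in zip(p, t) if a or b)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if union == 0 else intersection / union


def lesion_volume_cm3(voxel_count, spacing_mm):
    """Volume in cm^3 from a labelled-voxel count and (x, y, z) spacing in mm."""
    sx, sy, sz = spacing_mm
    # 1 cm^3 = 1000 mm^3
    return voxel_count * sx * sy * sz / 1000.0


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    target = [[1, 0, 0],
              [0, 1, 1]]
    print(iou(pred, target))  # 2 overlapping pixels / 4 in the union = 0.5
    print(lesion_volume_cm3(1000, (0.5, 0.5, 2.0)))  # 0.5
```

Under this definition, an IoU of 0.95 means the overlap between prediction and annotation covers 95% of their combined area. Because the reported figure is measured on the training data, it describes how well the model fits the annotations rather than how it generalizes.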