NLSTseg: A Pixel-level Lung Cancer Dataset Based on NLST LDCT Images
Abstract Low-dose computed tomography (LDCT) is the most effective tool for early detection of lung cancer. With advancements in artificial intelligence, various Computer-Aided Diagnosis (CAD) systems are now supported in clinical practice. For radiologists dealing with a huge volume of CT scans, C...
| Main Authors: | Kun-Hui Chen, Yi-Hui Lin, Shawn Wu, Nai-Wen Shih, Hsing-Chen Meng, Yen-Yu Lin, Chun-Rong Huang, Jing-Wen Huang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-08-01 |
| Series: | Scientific Data |
| Online Access: | https://doi.org/10.1038/s41597-025-05742-x |
| _version_ | 1849226631862812672 |
|---|---|
| author | Kun-Hui Chen, Yi-Hui Lin, Shawn Wu, Nai-Wen Shih, Hsing-Chen Meng, Yen-Yu Lin, Chun-Rong Huang, Jing-Wen Huang |
| author_sort | Kun-Hui Chen |
| collection | DOAJ |
| description | Abstract Low-dose computed tomography (LDCT) is the most effective tool for early detection of lung cancer. With advancements in artificial intelligence, various Computer-Aided Diagnosis (CAD) systems are now supported in clinical practice. For radiologists dealing with a huge volume of CT scans, CAD systems are helpful. However, the development of these systems depends on precisely annotated datasets, which are currently limited. Although several lung imaging datasets exist, only a few publicly available datasets provide segmentation annotations on LDCT images. To address this problem, we developed a dataset based on NLST LDCT images with pixel-level annotations of lung lesions. The dataset includes LDCT scans from 605 patients and 715 annotated lesions, comprising 662 lung tumors and 53 lung nodules. Lesion volumes range from 0.03 cm³ to 372.21 cm³, with 500 lesions smaller than 5 cm³, mostly located in the right upper lung. A 2D U-Net model trained on the dataset achieved an IoU of 0.95 on the training dataset. This dataset enhances the diversity and usability of lung cancer annotation resources. |
| format | Article |
| id | doaj-art-2e1d2e8800164be79ffcbf3885b83164 |
| institution | Kabale University |
| issn | 2052-4463 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Data |
| spelling | doaj-art-2e1d2e8800164be79ffcbf3885b83164; 2025-08-24T11:07:19Z; eng; Nature Portfolio; Scientific Data; 2052-4463; 2025-08-01; 12; 1; 1; 12; 10.1038/s41597-025-05742-x; NLSTseg: A Pixel-level Lung Cancer Dataset Based on NLST LDCT Images; Kun-Hui Chen (Department of Orthopedic Surgery, Taichung Veterans General Hospital); Yi-Hui Lin (Department of Radiation Oncology, Pingtung Veterans General Hospital); Shawn Wu (Department of Diagnostic Imaging, SY Research Institute); Nai-Wen Shih (Department of Radiation Oncology, Pingtung Veterans General Hospital); Hsing-Chen Meng (Graduate Degree Program of AI, National Yang Ming Chiao Tung University); Yen-Yu Lin (Department of Computer Science, National Yang Ming Chiao Tung University); Chun-Rong Huang (Department of Computer Science, National Yang Ming Chiao Tung University); Jing-Wen Huang (Department of Radiation Oncology, Taichung Veterans General Hospital); https://doi.org/10.1038/s41597-025-05742-x |
| title | NLSTseg: A Pixel-level Lung Cancer Dataset Based on NLST LDCT Images |
| url | https://doi.org/10.1038/s41597-025-05742-x |