A large-scale prospective nested case-control study: developing a comprehensive risk prediction model for early detection of pancreatic cancer in the community-based ESPRIT-AI cohort
Background: Pancreatic cancer (PC) remains a significant public health concern due to its late diagnosis and limited effective screening methods. This study aimed to develop a robust risk prediction model for early detection, utilizing a large prospective cohort to ensure generalizability. Method: W...
| Main Authors: | Chaoliang Zhong, Penghao Li, Jia Zhao, Xue Han, Beilei Wang, Gang Jin |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-02-01 |
| Series: | The Lancet Regional Health. Western Pacific |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2666606524003043 |
| _version_ | 1849721043598442496 |
|---|---|
| author | Chaoliang Zhong, Penghao Li, Jia Zhao, Xue Han, Beilei Wang, Gang Jin |
| collection | DOAJ |
| description | Background: Pancreatic cancer (PC) remains a significant public health concern due to its late diagnosis and limited effective screening methods. This study aimed to develop a robust risk prediction model for early detection, utilizing a large prospective cohort to ensure generalizability. Method: We established a large-scale, continuous, real-world cohort, termed the Artificial Intelligence-based Early Screening of Pancreatic Cancer and High-Risk Tracing (ESPRIT-AI). This cohort encompasses 12 community health centers in Yangpu District, Shanghai, China. Based on this comprehensive dataset, we conducted a prospective, nested case-control study. Nine centers served as the training cohort, while three centers served as the test cohort. A total of 51,490 participants aged 50-75 years underwent annual health examinations from January 2021 to December 2023. Risk-related information and informed consent were collected from all participants. PC diagnosis was obtained from the Center for Disease Control and Prevention's cancer registry. Model training utilized a 1:20 case-control ratio, employing LASSO regression and expert opinion to select features. Multiple machine learning algorithms were compared, with the best-performing algorithm selected for the final predictive model, subsequently validated using a real-world external test cohort. The study was registered with ClinicalTrials.gov (NCT04743479). Findings: The cohort was divided into training (n=39,929, including 45 cases and 900 nested controls) and test (n=11,561, including 15 cases and 11,546 controls) sets. Following variable selection, four optimal variables were identified: Body Mass Index (BMI), Fasting Blood Glucose (FBG), Symptom, and Age. Multiple machine learning algorithms were evaluated, with Random Forest demonstrating superior performance and being selected as the final model. In a large-scale, independent real-world test cohort, the model demonstrated a specificity of 97.21% and sensitivity of 33.33%. The model effectively stratified the population, identifying 316 high-risk individuals (2.73% of the test set), among whom 5 were diagnosed with PC. This resulted in a PC prevalence of 1.58% within the high-risk group, representing a 1.93-fold increase compared to the 0.82% prevalence in newly diagnosed diabetes. Interpretation: These findings demonstrated our established model’s capacity to effectively identify a subpopulation with significantly elevated PC risk, potentially facilitating targeted imaging-based early detection strategies, balancing screening benefits and burdens. Funding: This work was funded by the Shanghai Science and Technology Committee Program (grant number 20511101200). |
| format | Article |
| id | doaj-art-4f62d4d0e35c45659635e9e4a8a03b02 |
| institution | DOAJ |
| issn | 2666-6065 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | Elsevier |
| record_format | Article |
| series | The Lancet Regional Health. Western Pacific |
| spelling | doaj-art-4f62d4d0e35c45659635e9e4a8a03b02; 2025-08-20T03:11:48Z; eng; Elsevier; The Lancet Regional Health. Western Pacific; 2666-6065; 2025-02-01; 55; 101310; 10.1016/j.lanwpc.2024.101310; A large-scale prospective nested case-control study: developing a comprehensive risk prediction model for early detection of pancreatic cancer in the community-based ESPRIT-AI cohort; Chaoliang Zhong [0]; Penghao Li [1]; Jia Zhao [2]; Xue Han [3]; Beilei Wang [4]; Gang Jin [5]; [0] Department of Hepatobiliary and Pancreatic Surgery, Changhai Hospital affiliated to Naval Medical University, China; [1] Department of Hepatobiliary and Pancreatic Surgery, Changhai Hospital affiliated to Naval Medical University, China; [2] Department of Infectious Disease Control, Center for Disease Control and Prevention of Yangpu District, China; [3] Department of Infectious Disease Control, Center for Disease Control and Prevention of Yangpu District, China; [4] Department of Hepatobiliary and Pancreatic Surgery, Changhai Hospital affiliated to Naval Medical University, China; [5] Department of Hepatobiliary and Pancreatic Surgery, Changhai Hospital affiliated to Naval Medical University, China; http://www.sciencedirect.com/science/article/pii/S2666606524003043 |
| title | A large-scale prospective nested case-control study: developing a comprehensive risk prediction model for early detection of pancreatic cancer in the community-based ESPRIT-AI cohort |
| url | http://www.sciencedirect.com/science/article/pii/S2666606524003043 |
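
The percentages quoted in the abstract can be re-derived from the counts it reports. The short Python sketch below is only an arithmetic check, not the authors' analysis code; the variable names are illustrative, and it assumes that the 5 PC diagnoses in the high-risk group are the model's true positives among the 15 test-set cases.

```python
# Counts copied from the abstract; all variable names are illustrative.
TRAIN_N, TRAIN_CASES, TRAIN_CONTROLS = 39_929, 45, 900
TEST_N, TEST_CASES = 11_561, 15
HIGH_RISK_N, HIGH_RISK_CASES = 316, 5
DIABETES_PREVALENCE = 0.0082  # 0.82% comparison figure quoted in the abstract

# Nested case-control ratio used for training (controls per case).
controls_per_case = TRAIN_CONTROLS / TRAIN_CASES

# Training and test sets together should give the full cohort size.
cohort_total = TRAIN_N + TEST_N

# Assumption: the 5 high-risk cases are the true positives in the test set.
sensitivity = HIGH_RISK_CASES / TEST_CASES

# Share of the test set flagged as high risk.
high_risk_share = HIGH_RISK_N / TEST_N

# PC prevalence inside the high-risk group and its ratio to the comparator.
prevalence_high_risk = HIGH_RISK_CASES / HIGH_RISK_N
fold_increase = prevalence_high_risk / DIABETES_PREVALENCE

print(f"controls per case:     {controls_per_case:.0f}")
print(f"cohort total:          {cohort_total:,}")
print(f"sensitivity:           {sensitivity:.2%}")
print(f"high-risk share:       {high_risk_share:.2%}")
print(f"high-risk prevalence:  {prevalence_high_risk:.2%}")
print(f"fold increase:         {fold_increase:.2f}")
```

Under these assumptions the results agree with the abstract's 1:20 ratio, 51,490 participants, 33.33% sensitivity, 2.73% high-risk share, 1.58% prevalence, and 1.93-fold increase.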