Using self-generated identification codes to match anonymous longitudinal data in a sexual health study of secondary school students: a cohort study
Abstract Objective This study aimed to (i) describe the procedures for generating self-generated identification codes (SGICs) in a prospective longitudinal evaluation of a sexual health program for secondary school students in Hong Kong; (ii) outline the matching strategies and processes; (iii) exam...
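| Main Authors: | Edmond Pui Hang Choi, Ellie Bostwick Andres, Heidi Sze Lok Fan, Lai Ming Ho, Alice Wai Chi Fung, Kevin Wing Chung Lau, Neda Hei Tung Ng, Monique Yeung, Janice Mary Johnston |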
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-06-01 |
| Series: | BMC Medical Informatics and Decision Making |
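| Subjects: | Adolescents; Longitudinal study; Cohort studies; Anonymity; Self-generated identification codes; Sexual health |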
| Online Access: | https://doi.org/10.1186/s12911-025-03028-1 |
| _version_ | 1850224240726376448 |
|---|---|
| author | Edmond Pui Hang Choi; Ellie Bostwick Andres; Heidi Sze Lok Fan; Lai Ming Ho; Alice Wai Chi Fung; Kevin Wing Chung Lau; Neda Hei Tung Ng; Monique Yeung; Janice Mary Johnston |
| author_facet | Edmond Pui Hang Choi; Ellie Bostwick Andres; Heidi Sze Lok Fan; Lai Ming Ho; Alice Wai Chi Fung; Kevin Wing Chung Lau; Neda Hei Tung Ng; Monique Yeung; Janice Mary Johnston |
| author_sort | Edmond Pui Hang Choi |
| collection | DOAJ |
| description | Abstract Objective This study aimed to (i) describe the procedures for generating self-generated identification codes (SGICs) in a prospective longitudinal evaluation of a sexual health program for secondary school students in Hong Kong; (ii) outline the matching strategies and processes; (iii) examine rates of successful matching and associated factors; and (iv) compare the responses of participants whose data could be matched to those whose data could not. Methods A prospective longitudinal cohort study was conducted. The SGIC comprised a 5-element code with 4 digits and 3 letters. A matching algorithm was developed to link baseline and follow-up data collected from students in Years 1 to 3 (n = 1,064) during the 2019–2020 school year. Matching success and associated factors were analyzed, and responses from matched and unmatched participants were compared. Results The rate of perfectly matched cases was 49.06%, while 23.59% were partially matched, and 27.35% were unmatched. Logistic regression analysis revealed that male students (adjusted odds ratio [aOR]: 0.63) and Year 1 students (vs. Year 3; aOR: 0.56) were less likely to be perfectly matched. Compared to unmatched cases, perfectly and partially matched cases were less likely to have missing values and more likely to exhibit positive attitudes toward the sexual health program and related topics, such as the importance of sexual health, equal relationships, and condom use. Conclusion The use of SGICs successfully matched approximately 72.65% of the study sample over a one-year period. These findings highlight the potential of SGICs as a tool for longitudinal data matching while underscoring the need for further refinement of code generation processes and matching algorithms to minimize data wastage and improve effectiveness. |
| format | Article |
| id | doaj-art-39715bf4710140bbbb20c97b3594e757 |
| institution | OA Journals |
| issn | 1472-6947 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | BMC |
| record_format | Article |
| series | BMC Medical Informatics and Decision Making |
| spelling | doaj-art-39715bf4710140bbbb20c97b3594e757; 2025-08-20T02:05:41Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2025-06-01; 25; 1; 1; 12; 10.1186/s12911-025-03028-1; Using self-generated identification codes to match anonymous longitudinal data in a sexual health study of secondary school students: a cohort study; Edmond Pui Hang Choi [0]; Ellie Bostwick Andres [1]; Heidi Sze Lok Fan [2]; Lai Ming Ho [3]; Alice Wai Chi Fung [4]; Kevin Wing Chung Lau [5]; Neda Hei Tung Ng [6]; Monique Yeung [7]; Janice Mary Johnston [8]; School of Nursing, University of Hong Kong; Duke-NUS Medical School; The University of British Columbia - Okanagan Campus; School of Public Health, University of Hong Kong; Mother’s Choice; Mother’s Choice; Mother’s Choice; Mother’s Choice; School of Public Health, University of Hong Kong; Abstract Objective This study aimed to (i) describe the procedures for generating self-generated identification codes (SGICs) in a prospective longitudinal evaluation of a sexual health program for secondary school students in Hong Kong; (ii) outline the matching strategies and processes; (iii) examine rates of successful matching and associated factors; and (iv) compare the responses of participants whose data could be matched to those whose data could not. Methods A prospective longitudinal cohort study was conducted. The SGIC comprised a 5-element code with 4 digits and 3 letters. A matching algorithm was developed to link baseline and follow-up data collected from students in Years 1 to 3 (n = 1,064) during the 2019–2020 school year. Matching success and associated factors were analyzed, and responses from matched and unmatched participants were compared. Results The rate of perfectly matched cases was 49.06%, while 23.59% were partially matched, and 27.35% were unmatched. Logistic regression analysis revealed that male students (adjusted odds ratio [aOR]: 0.63) and Year 1 students (vs. Year 3; aOR: 0.56) were less likely to be perfectly matched. Compared to unmatched cases, perfectly and partially matched cases were less likely to have missing values and more likely to exhibit positive attitudes toward the sexual health program and related topics, such as the importance of sexual health, equal relationships, and condom use. Conclusion The use of SGICs successfully matched approximately 72.65% of the study sample over a one-year period. These findings highlight the potential of SGICs as a tool for longitudinal data matching while underscoring the need for further refinement of code generation processes and matching algorithms to minimize data wastage and improve effectiveness.; https://doi.org/10.1186/s12911-025-03028-1; Adolescents; Longitudinal study; Cohort studies; Anonymity; Self-generated identification codes; Sexual health |
| spellingShingle | Edmond Pui Hang Choi; Ellie Bostwick Andres; Heidi Sze Lok Fan; Lai Ming Ho; Alice Wai Chi Fung; Kevin Wing Chung Lau; Neda Hei Tung Ng; Monique Yeung; Janice Mary Johnston; Using self-generated identification codes to match anonymous longitudinal data in a sexual health study of secondary school students: a cohort study; BMC Medical Informatics and Decision Making; Adolescents; Longitudinal study; Cohort studies; Anonymity; Self-generated identification codes; Sexual health |
| title | Using self-generated identification codes to match anonymous longitudinal data in a sexual health study of secondary school students: a cohort study |
| topic | Adolescents; Longitudinal study; Cohort studies; Anonymity; Self-generated identification codes; Sexual health |
| url | https://doi.org/10.1186/s12911-025-03028-1 |