Assessing and improving reliability of neighbor embedding methods: a map-continuity perspective
Abstract Visualizing high-dimensional data is essential for understanding biomedical data and deep learning models. Neighbor embedding methods, such as t-SNE and UMAP, are widely used but can introduce misleading visual artifacts. We find that the manifold learning interpretations from many prior wo...
| Main Authors: | Zhexuan Liu, Rong Ma, Yiqiao Zhong |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-05-01 |
| Series: | Nature Communications |
| Online Access: | https://doi.org/10.1038/s41467-025-60434-9 |
| _version_ | 1850231495267975168 |
|---|---|
| author | Zhexuan Liu; Rong Ma; Yiqiao Zhong |
| author_facet | Zhexuan Liu; Rong Ma; Yiqiao Zhong |
| author_sort | Zhexuan Liu |
| collection | DOAJ |
| description | Abstract Visualizing high-dimensional data is essential for understanding biomedical data and deep learning models. Neighbor embedding methods, such as t-SNE and UMAP, are widely used but can introduce misleading visual artifacts. We find that the manifold learning interpretations from many prior works are inaccurate and that the misuse stems from a lack of data-independent notions of embedding maps, which project high-dimensional data into a lower-dimensional space. Leveraging the leave-one-out principle, we introduce LOO-map, a framework that extends embedding maps beyond discrete points to the entire input space. We identify two forms of map discontinuity that distort visualizations: one exaggerates cluster separation and the other creates spurious local structures. As a remedy, we develop two types of point-wise diagnostic scores to detect unreliable embedding points and improve hyperparameter selection, which are validated on datasets from computer vision and single-cell omics. |
| format | Article |
| id | doaj-art-802b8bc55aff4658b324ec231c3f9a8e |
| institution | OA Journals |
| issn | 2041-1723 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Nature Communications |
| spelling | doaj-art-802b8bc55aff4658b324ec231c3f9a8e; 2025-08-20T02:03:31Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2025-05-01; 16; 1; 1; 16; 10.1038/s41467-025-60434-9; Assessing and improving reliability of neighbor embedding methods: a map-continuity perspective; Zhexuan Liu (0); Rong Ma (1); Yiqiao Zhong (2); (0) Department of Statistics, University of Wisconsin-Madison; (1) Department of Biostatistics, T.H. Chan School of Public Health, Harvard University; (2) Department of Statistics, University of Wisconsin-Madison; Abstract: Visualizing high-dimensional data is essential for understanding biomedical data and deep learning models. Neighbor embedding methods, such as t-SNE and UMAP, are widely used but can introduce misleading visual artifacts. We find that the manifold learning interpretations from many prior works are inaccurate and that the misuse stems from a lack of data-independent notions of embedding maps, which project high-dimensional data into a lower-dimensional space. Leveraging the leave-one-out principle, we introduce LOO-map, a framework that extends embedding maps beyond discrete points to the entire input space. We identify two forms of map discontinuity that distort visualizations: one exaggerates cluster separation and the other creates spurious local structures. As a remedy, we develop two types of point-wise diagnostic scores to detect unreliable embedding points and improve hyperparameter selection, which are validated on datasets from computer vision and single-cell omics.; https://doi.org/10.1038/s41467-025-60434-9 |
| title | Assessing and improving reliability of neighbor embedding methods: a map-continuity perspective |
| title_sort | assessing and improving reliability of neighbor embedding methods a map continuity perspective |
| url | https://doi.org/10.1038/s41467-025-60434-9 |