Efficient Searches for Small Signals across Two-channel Noisy Data: A Challenge in Big-data Observations
Searching for novel small signals in noisy data is preferably pursued by correlation in two or more independently operating channels. Signals of potential interest exist in tails beyond κσ, where κ denotes a multiple of the standard deviation σ of the data. Since moving data is a major cost factor...
| Main Authors: | Maryam Aghaei Abchouyeh, Maurice H. P. M. van Putten, Seyong Kim |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IOP Publishing, 2025-01-01 |
| Series: | The Astrophysical Journal Supplement Series |
| Subjects: | Astronomy data analysis |
| Online Access: | https://doi.org/10.3847/1538-4365/adec9d |
| _version_ | 1849770816159350784 |
|---|---|
| author | Maryam Aghaei Abchouyeh; Maurice H. P. M. van Putten; Seyong Kim |
| author_facet | Maryam Aghaei Abchouyeh; Maurice H. P. M. van Putten; Seyong Kim |
| author_sort | Maryam Aghaei Abchouyeh |
| collection | DOAJ |
| description | Searching for novel small signals in noisy data is preferably pursued by correlation in two or more independently operating channels. Signals of potential interest exist in tails beyond κσ, where κ denotes a multiple of the standard deviation σ of the data. Since moving data is a major cost factor in big-data analysis and heterogeneous computing more generally, efficiency may be optimized by restricting the computation of correlations to the tails of two-channel data exceeding κσ. Already, a moderate value κ ≳ 2 realizes a data reduction by at least an order of magnitude. Here, we study this approach using a novel excess probability ratio (EPR), correlating Boolean data resulting from tails beyond a cutoff κσ. We compare and rank EPR performance against conventional direct cross correlation and the Pearson coefficient (PC), applicable to the original data with no cutoff. This benchmark is performed over different combinations of background noise (Gaussian, Poisson and uniform) and signals (Gaussian, Poisson, uniform, chirps and sine waves). Results show the performance of EPR to be comparable to that of the PC, providing a new approach for significant improvements in efficiency with essentially no loss of sensitivity, relevant to the present era of big-data observatories. |
| format | Article |
| id | doaj-art-5cea910bb7b34b20bf43a9d465c73594 |
| institution | DOAJ |
| issn | 0067-0049 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IOP Publishing |
| record_format | Article |
| series | The Astrophysical Journal Supplement Series |
| spelling | doaj-art-5cea910bb7b34b20bf43a9d465c73594 · 2025-08-20T03:02:52Z · eng · IOP Publishing · The Astrophysical Journal Supplement Series · 0067-0049 · 2025-01-01 · 280 · 1 · 7 · 10.3847/1538-4365/adec9d · Efficient Searches for Small Signals across Two-channel Noisy Data: A Challenge in Big-data Observations · Maryam Aghaei Abchouyeh · 0 · https://orcid.org/0000-0002-1518-1946 · Maurice H. P. M. van Putten · 1 · https://orcid.org/0000-0002-9212-411X · Seyong Kim · 2 · https://orcid.org/0000-0002-2102-7398 · Department of Physics and Astronomy, Sejong University, 98 Gunja-Dong, Gwangjin-gu, Seoul 143-747, Republic of Korea; mvp@sejong.ac.kr · Department of Physics and Astronomy, Sejong University, 98 Gunja-Dong, Gwangjin-gu, Seoul 143-747, Republic of Korea; mvp@sejong.ac.kr; INAF-OAS Bologna, via P. Gobetti, 101, I-40129 Bologna, Italy · Department of Physics and Astronomy, Sejong University, 98 Gunja-Dong, Gwangjin-gu, Seoul 143-747, Republic of Korea; mvp@sejong.ac.kr · Searching for novel small signals in noisy data is preferably pursued by correlation in two or more independently operating channels. Signals of potential interest exist in tails beyond κσ, where κ denotes a multiple of the standard deviation σ of the data. Since moving data is a major cost factor in big-data analysis and heterogeneous computing more generally, efficiency may be optimized by restricting the computation of correlations to the tails of two-channel data exceeding κσ. Already, a moderate value κ ≳ 2 realizes a data reduction by at least an order of magnitude. Here, we study this approach using a novel excess probability ratio (EPR), correlating Boolean data resulting from tails beyond a cutoff κσ. We compare and rank EPR performance against conventional direct cross correlation and the Pearson coefficient (PC), applicable to the original data with no cutoff. This benchmark is performed over different combinations of background noise (Gaussian, Poisson and uniform) and signals (Gaussian, Poisson, uniform, chirps and sine waves). Results show the performance of EPR to be comparable to that of the PC, providing a new approach for significant improvements in efficiency with essentially no loss of sensitivity, relevant to the present era of big-data observatories. · https://doi.org/10.3847/1538-4365/adec9d · Astronomy data analysis |
| spellingShingle | Maryam Aghaei Abchouyeh; Maurice H. P. M. van Putten; Seyong Kim; Efficient Searches for Small Signals across Two-channel Noisy Data: A Challenge in Big-data Observations; The Astrophysical Journal Supplement Series; Astronomy data analysis |
| title | Efficient Searches for Small Signals across Two-channel Noisy Data: A Challenge in Big-data Observations |
| title_full | Efficient Searches for Small Signals across Two-channel Noisy Data: A Challenge in Big-data Observations |
| title_fullStr | Efficient Searches for Small Signals across Two-channel Noisy Data: A Challenge in Big-data Observations |
| title_full_unstemmed | Efficient Searches for Small Signals across Two-channel Noisy Data: A Challenge in Big-data Observations |
| title_short | Efficient Searches for Small Signals across Two-channel Noisy Data: A Challenge in Big-data Observations |
| title_sort | efficient searches for small signals across two channel noisy data a challenge in big data observations |
| topic | Astronomy data analysis |
| url | https://doi.org/10.3847/1538-4365/adec9d |
| work_keys_str_mv | AT maryamaghaeiabchouyeh efficientsearchesforsmallsignalsacrosstwochannelnoisydataachallengeinbigdataobservations AT mauricehpmvanputten efficientsearchesforsmallsignalsacrosstwochannelnoisydataachallengeinbigdataobservations AT seyongkim efficientsearchesforsmallsignalsacrosstwochannelnoisydataachallengeinbigdataobservations |
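
The description's statement that a cutoff of κ ≳ 2 reduces the data by at least an order of magnitude can be checked for Gaussian noise, where the two-sided tail fraction is erfc(κ/√2). The Python sketch below does that, and also illustrates the general idea of correlating Boolean tail masks from two channels. This record does not give the definition of the excess probability ratio, so `coincidence_ratio` is a hypothetical stand-in (observed joint tail fraction over the product of the marginal fractions) and should not be read as the authors' statistic. The function names, sample size, seed, and the shared Gaussian signal are likewise illustrative assumptions.

```python
import math
import numpy as np


def gaussian_tail_fraction(kappa: float) -> float:
    """Two-sided fraction of Gaussian samples with |x - mean| > kappa * sigma."""
    return math.erfc(kappa / math.sqrt(2.0))


def tail_mask(x: np.ndarray, kappa: float) -> np.ndarray:
    """Boolean mask marking samples beyond kappa standard deviations."""
    return np.abs(x - x.mean()) > kappa * x.std()


def coincidence_ratio(a: np.ndarray, b: np.ndarray, kappa: float) -> float:
    """Hypothetical stand-in, NOT the paper's EPR definition.

    Ratio of the observed joint tail fraction to the product of the
    marginal tail fractions; close to 1 for independent channels.
    """
    mask_a, mask_b = tail_mask(a, kappa), tail_mask(b, kappa)
    expected = mask_a.mean() * mask_b.mean()
    if expected == 0:
        return float("nan")
    return float((mask_a & mask_b).mean() / expected)


rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 200_000
noise_a = rng.standard_normal(n)
noise_b = rng.standard_normal(n)
common = 0.5 * rng.standard_normal(n)  # assumed weak signal shared by both channels

kappa = 2.0
frac = gaussian_tail_fraction(kappa)
ratio_noise = coincidence_ratio(noise_a, noise_b, kappa)
ratio_signal = coincidence_ratio(noise_a + common, noise_b + common, kappa)

print(f"tail fraction at kappa={kappa}: {frac:.4f} (reduction ~{1 / frac:.0f}x)")
print(f"coincidence ratio, noise only:  {ratio_noise:.2f}")
print(f"coincidence ratio, with signal: {ratio_signal:.2f}")
```

At κ = 2 the Gaussian tail fraction is about 4.6%, so only roughly one sample in 22 per channel needs to be moved or compared, consistent with the order-of-magnitude reduction quoted in the description. For Poisson or uniform backgrounds the tail fraction differs and would need to be measured rather than taken from `erfc`.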