Efficient Searches for Small Signals across Two-channel Noisy Data: A Challenge in Big-data Observations

Bibliographic Details
Main Authors: Maryam Aghaei Abchouyeh, Maurice H. P. M. van Putten, Seyong Kim
Format: Article
Language: English
Published: IOP Publishing 2025-01-01
Series: The Astrophysical Journal Supplement Series
Online Access: https://doi.org/10.3847/1538-4365/adec9d
Description
Summary: Searching for novel small signals in noisy data is preferably pursued by correlation in two or more independently operating channels. Signals of potential interest exist in tails beyond κσ, where κ denotes a multiple of the standard deviation σ of the data. Since moving data is a major cost factor in big-data analysis and heterogeneous computing more generally, efficiency may be optimized by restricting the computation of correlations to the tails of two-channel data exceeding κσ. Already, a moderate value κ ≳ 2 realizes a data reduction by at least an order of magnitude. Here, we study this approach using a novel excess probability ratio (EPR), correlating Boolean data resulting from tails beyond a cutoff κσ. We compare and rank EPR performance against conventional direct cross correlation and the Pearson coefficient (PC), applicable to the original data with no cutoff. This benchmark is performed over different combinations of background noise (Gaussian, Poisson and uniform) and signals (Gaussian, Poisson, uniform, chirps and sine waves). Results show the performance of EPR to be comparable to that of the PC, providing a new approach for significant improvements in efficiency with essentially no loss of sensitivity, relevant to the present era of big-data observatories.
ISSN: 0067-0049
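
Illustrative sketch: The summary describes thresholding two channels at κσ and correlating the resulting Boolean data. The Python sketch below is one way to picture that idea, not the authors' method. This record does not give the formula for the EPR, so the definition used here (joint tail probability divided by the product of the single-channel tail probabilities) is an assumption, as are the function names, the one-sided cutoff, the sample size and the injected signal. The first function only evaluates the standard Gaussian tail fraction, which for κ = 2 is about 0.0455 two-sided and is consistent with the stated order-of-magnitude data reduction.

```python
import math
import numpy as np


def gaussian_tail_fraction(kappa, two_sided=True):
    """Fraction of Gaussian samples lying beyond kappa standard deviations."""
    one_sided = 0.5 * math.erfc(kappa / math.sqrt(2.0))
    return 2.0 * one_sided if two_sided else one_sided


def excess_probability_ratio(x, y, kappa=2.0):
    """Hypothetical EPR (assumed definition, not taken from the paper):
    P(both channels in the upper tail) / (P(x in tail) * P(y in tail)).
    A value near 1 is expected for independent channels."""
    # Boolean data: True where a sample exceeds mean + kappa * sigma.
    bx = x > x.mean() + kappa * x.std()
    by = y > y.mean() + kappa * y.std()
    p_x, p_y = bx.mean(), by.mean()
    if p_x == 0 or p_y == 0:
        return float("nan")
    return (bx & by).mean() / (p_x * p_y)


rng = np.random.default_rng(0)
n = 200_000

# Two independent Gaussian noise channels.
noise_a = rng.standard_normal(n)
noise_b = rng.standard_normal(n)

# Assumed toy signal: a common offset in 1% of the samples of both channels.
signal = np.zeros(n)
signal[rng.choice(n, size=n // 100, replace=False)] = 2.0
chan_a = noise_a + signal
chan_b = noise_b + signal

tail_fraction = gaussian_tail_fraction(2.0)          # about 0.0455
epr_null = excess_probability_ratio(noise_a, noise_b)
epr_signal = excess_probability_ratio(chan_a, chan_b)
pc_null = np.corrcoef(noise_a, noise_b)[0, 1]
pc_signal = np.corrcoef(chan_a, chan_b)[0, 1]

print(f"tail fraction at kappa=2: {tail_fraction:.4f}")
print(f"EPR  noise only: {epr_null:.2f}   with signal: {epr_signal:.2f}")
print(f"PC   noise only: {pc_null:.4f}   with signal: {pc_signal:.4f}")
```

In this toy setup both statistics respond to the common signal, but the EPR only needs the Boolean tail samples, a few percent of the data, whereas the Pearson coefficient uses every sample.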