Bounds on the Excess Minimum Risk via Generalized Information Divergence Measures

Bibliographic Details
Main Authors: Ananya Omanwar, Fady Alajaji, Tamás Linder
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/27/7/727
Description
Summary: Given finite-dimensional random vectors Y, X, and Z that form a Markov chain in that order (Y → X → Z), we derive upper bounds on the excess minimum risk using generalized information divergence measures. Here, Y is a target vector to be estimated from an observed feature vector X or from its stochastically degraded version Z. The excess minimum risk is defined as the difference between the minimum expected loss in estimating Y from X and that in estimating Y from Z. We present a family of bounds that generalize a prior bound based on mutual information, using the Rényi and α-Jensen–Shannon divergences as well as Sibson’s mutual information. Our bounds are similar to recently developed bounds for the generalization error of learning algorithms; however, unlike those works, our bounds do not require the sub-Gaussian parameter to be constant and therefore apply to a broader class of joint distributions over Y, X, and Z. We also provide numerical examples under both constant and non-constant sub-Gaussianity assumptions, illustrating that our generalized divergence-based bounds can be tighter than the mutual-information-based ones for certain regimes of the parameter α.
ISSN: 1099-4300
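
A minimal sketch of the quantity described in the summary above, with notation assumed for illustration rather than taken from the record (a loss function ℓ, estimators f and g, and the symbol Δ): for the Markov chain Y → X → Z, the excess minimum risk can be written as

\[
\Delta(Y \mid X, Z) \;=\; \inf_{g}\, \mathbb{E}\!\left[\ell\big(Y, g(Z)\big)\right] \;-\; \inf_{f}\, \mathbb{E}\!\left[\ell\big(Y, f(X)\big)\right],
\]

where the infima are taken over measurable estimators. Because the Markov chain makes Z a stochastically degraded observation of X, this difference is non-negative, and the summarized results bound it from above using Rényi and α-Jensen–Shannon divergences and Sibson’s mutual information.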