U-Turn Diffusion

Bibliographic Details
Main Authors: Hamidreza Behjoo, Michael Chertkov
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/27/4/343
Description
Summary: We investigate diffusion models generating synthetic samples from the probability distribution represented by the ground truth (GT) samples. We focus on how GT sample information is encoded in the score function (SF), computed (not simulated) from the Wiener–Ito linear forward process in the artificial time $t \in [0 \to \infty]$, and then used as a nonlinear drift in the simulated Wiener–Ito reverse process with $t \in [\infty \to 0]$. We propose U-Turn diffusion, an augmentation of a pre-trained diffusion model, which shortens the forward and reverse processes to $t \in [0 \to T_u]$ and $t \in [T_u \to 0]$.
The U-Turn reverse process is initialized at $T_u$ with a sample from the probability distribution of the forward process (initialized at $t = 0$ with a GT sample), ensuring a detailed balance relation between the shortened forward and reverse processes. Our experiments on the class-conditioned SF of the ImageNet dataset and the multi-class, single SF of the CIFAR-10 dataset reveal a critical Memorization Time $T_m$, beyond which generated samples diverge from the GT sample used to initialize the U-Turn scheme, and a Speciation Time $T_s$, where for $T_u > T_s > T_m$, samples begin representing different classes.
We further examine the role of SF nonlinearity through a Gaussian Test, comparing empirical and Gaussian-approximated U-Turn auto-correlation functions and showing that the SF becomes effectively affine for $t > T_s$ and approximately affine for $t \in [T_m, T_s]$.
ISSN: 1099-4300
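
The U-Turn scheme described in the summary can be illustrated with a minimal one-dimensional sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the forward process is taken to be an Ornstein-Uhlenbeck process dx = -x dt + sqrt(2) dW, the score function is the closed-form score of a small set of GT points rather than a pre-trained network, the reverse process is integrated with a plain Euler-Maruyama step, and the names `empirical_score`, `u_turn`, `dt`, and `t_min` are hypothetical.

```python
import numpy as np


def empirical_score(x, t, gt):
    """Closed-form score of the forward marginal p_t for a finite GT set.

    Assumes the forward process dx = -x dt + sqrt(2) dW, for which
    x_t | x_0 ~ N(x_0 * exp(-t), 1 - exp(-2 t)).
    """
    a = np.exp(-t)
    var = 1.0 - np.exp(-2.0 * t)
    diff = a * gt[None, :] - x[:, None]          # shape (batch, n_gt)
    logw = -diff**2 / (2.0 * var)
    logw -= logw.max(axis=1, keepdims=True)      # stabilise the softmax
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    return (w * diff).sum(axis=1) / var


def u_turn(x0, gt, T_u, rng, dt=1e-3, t_min=1e-2):
    """Forward 0 -> T_u (sampled exactly), then reverse T_u -> t_min."""
    if T_u <= t_min:
        raise ValueError("T_u must exceed t_min")
    x0 = np.asarray(x0, dtype=float)
    gt = np.asarray(gt, dtype=float)

    # Forward leg: draw x_{T_u} from the exact transition kernel.
    a = np.exp(-T_u)
    sigma = np.sqrt(1.0 - np.exp(-2.0 * T_u))
    x = a * x0 + sigma * rng.standard_normal(x0.shape)

    # Reverse leg: Euler-Maruyama on the reverse-time SDE.
    # Stepping back by h: x <- x + (x + 2 * score) * h + sqrt(2 h) * z.
    n_steps = max(1, int(np.ceil((T_u - t_min) / dt)))
    ts = np.linspace(T_u, t_min, n_steps + 1)
    for t_hi, t_lo in zip(ts[:-1], ts[1:]):
        h = t_hi - t_lo
        drift = x + 2.0 * empirical_score(x, t_hi, gt)
        x = x + drift * h + np.sqrt(2.0 * h) * rng.standard_normal(x.shape)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = np.array([-3.0, -1.0, 1.0, 3.0])        # toy GT samples
    x0 = np.full(256, 1.0)                       # start every run at GT = 1.0
    for T_u in (0.05, 3.0):
        out = u_turn(x0, gt, T_u, rng)
        frac = np.mean(np.abs(out - 1.0) < 0.7)
        print(f"T_u={T_u}: fraction returning near x0 = {frac:.2f}")
```

In this toy setting a short U-Turn time tends to bring the reverse process back to the GT point it started from, while a long one lets it land on other GT points. That only mirrors the qualitative behaviour around the Memorization Time; it does not reproduce the ImageNet or CIFAR-10 experiments, and the reverse leg stops at `t_min` rather than 0 because the empirical score diverges as t approaches 0.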