Quasi-Analytical Least-Squares Generative Adversarial Networks: Further 1-D Results and Extension to Two Data Dimensions

Bibliographic Details
Main Author: Graham W. Pulford
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/11030454/
Description
Summary: Generative adversarial networks (GANs) are notoriously difficult to analyse, necessitating empirical studies in high-dimensional spaces that suffer from stochastic sampling noise. Quasi-analytical, low-dimensional GANs can be developed in various special cases to elucidate aspects of GAN training in a manageable, precise setting where variables of interest can be easily visualised. A previously developed 1-D Rayleigh/Square/Exponential/Erf (R/S/E/E) least squares GAN (LSGAN), with 1-D latent variable z and 1-D data x, is extended to the case of 2-D exponentially distributed data. The 2-D R/S/E/E LSGAN has 8 parameters, and its dynamics under gradient descent ascent (GDA) are analysable to high accuracy via two 1-D numerical integrals. Visualisation strategies are given for the cost function and parameter trajectories during training. Numerical performance is compared with the equivalent stochastic GDA algorithm, obtaining precise agreement. It is shown that the 2-D R/S/E/E LSGAN, which satisfies $\dim(z) < \dim(x)$, has an optimal discriminator that is not differentiable, does not depend on the data PDF, and is nowhere equal to 1/2, contradicting conventional GAN theory. For numerical simulations in the 1-D case, when the functional form of the optimal discriminator (a scaled logistic function) is fixed but its parameters are not matched to the optimal generator and can vary, convergence to the optimal settings does not occur, and, for certain initial settings, severe error propagation results. It is proven that the optimal generator setting cannot be a stable point of the GDA recursion. For a specific 1-D case, we also characterise the range of initial conditions for which convergence to the neighbourhood of the optimal generator occurs in a given number of steps. Finally, the extension to an exponential mixture data PDF is considered. A 2-D mixture R/S/E/E LSGAN with bifurcating (chaotic) parameter trajectories is exhibited. Empirical evidence is provided of long-term oscillatory behaviour in the parameters and cost function when both the step size (learning rate) and the support of the data distribution are large. In this instance, the oscillations are not due to mode collapse.
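
For orientation, the least-squares cost referred to in the title is, in its standard form (Mao et al., 2017), the pair of objectives below. The target coding a = 0, b = 1, c = 1 is the usual choice and is an assumption here; the record does not reproduce the paper's exact R/S/E/E cost.

\min_D \; \tfrac{1}{2}\,\mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[(D(x)-b)^2\right] + \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z}\!\left[(D(G(z))-a)^2\right], \qquad \min_G \; \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z}\!\left[(D(G(z))-c)^2\right]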
ISSN:2169-3536
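
As an illustration of the stochastic GDA recursion the abstract compares against, here is a minimal 1-D sketch: exponential data, a Rayleigh latent variable squared by the generator (so the fake samples are exponential, echoing the R/S/E/E construction), and a logistic discriminator. The parameterisations D(x) = sigmoid(w*x + b) and G(z) = theta*z**2, and all numerical settings, are illustrative assumptions, not the paper's; as the abstract notes for the 1-D case, this recursion need not converge to the optimal generator.

import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# Data: x ~ Exponential(rate lam).  Latent: z ~ Rayleigh(sigma), so that
# G(z) = theta * z**2 is exponentially distributed (the square of a Rayleigh
# variate is exponential); with sigma = sqrt(1/(2*lam)), the optimum is theta = 1.
lam, sigma = 1.0, np.sqrt(0.5)
theta = 0.3                      # generator parameter (illustrative initial value)
w, b = 1.0, 0.0                  # discriminator D(x) = sigmoid(w*x + b) (assumed form)
lr, n, steps = 0.05, 4096, 2000  # learning rate, batch size, GDA iterations

for _ in range(steps):
    xr = rng.exponential(1.0 / lam, n)          # real samples
    z = rng.rayleigh(sigma, n)
    xf = theta * z**2                           # fake samples
    Dr, Df = sigmoid(w * xr + b), sigmoid(w * xf + b)
    dDr, dDf = Dr * (1 - Dr), Df * (1 - Df)     # logistic derivative wrt its argument
    # Discriminator loss 0.5*E[(D(x)-1)^2] + 0.5*E[D(G(z))^2]: gradients wrt w, b.
    gw = np.mean((Dr - 1) * dDr * xr) + np.mean(Df * dDf * xf)
    gb = np.mean((Dr - 1) * dDr) + np.mean(Df * dDf)
    # Generator loss 0.5*E[(D(G(z))-1)^2]: gradient wrt theta.
    gt = np.mean((Df - 1) * dDf * w * z**2)
    # Simultaneous GDA step: each player descends its own least-squares loss.
    w, b, theta = w - lr * gw, b - lr * gb, theta - lr * gt

print(f"final theta = {theta:.3f} (optimal theta = 1 under these assumptions)")

Because the logistic discriminator cannot simultaneously drive D toward 1 on real samples and 0 on overlapping fake samples, the final theta may drift or oscillate rather than settle at 1, consistent with the non-convergence and oscillatory behaviour the paper analyses.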