The architecture of the genome integrates scale independence with inverse symmetry

The simplest building blocks of the genome, the k-mers, show two properties that are widely observed. Their frequency distribution is scale-free (a variant Zipfian distribution), and the inverse symmetry of k-mers is observable on the same strand. These phenomena are linked; Watson–Crick...

Bibliographic Details
Main Authors: Greg Warr, Les Hatton
Format: Article
Language: English
Published: Academia.edu Journals 2025-04-01
Series: Academia Molecular Biology and Genomics
Online Access: https://www.academia.edu/128851197/The_architecture_of_the_genome_integrates_scale_independence_with_inverse_symmetry
_version_ 1849469739206705152
author Greg Warr
Les Hatton
collection DOAJ
description The simplest building blocks of the genome, the k-mers, show two properties that are widely observed. Their frequency distribution is scale-free (a variant Zipfian distribution), and the inverse symmetry of k-mers is observable on the same strand. These phenomena are linked: Watson–Crick base pairing generates inverse symmetry (IS) under the condition that the same frequency distribution of k-mers is present on both strands of the genome. A stable scale-free equilibrium distribution of k-mer frequency in all genomes is predicted by a purely probabilistic theory, the Conservation of Hartley–Shannon Information (CoHSI). This does not replace the diverse mechanism-based explanations of IS that have been advanced, but in principle it aggregates all operative mechanisms. CoHSI predicts that both the scale-free distribution of k-mers and the IS that follows from it should decay gradually and stochastically as the genome size decreases and the length of the k-mers increases. These predictions were tested in 178 genomes from all domains of life and viruses. The precision of both the Zipfian distribution of k-mer frequency and of IS decayed progressively as the genome size decreased and k-mer length increased, regardless of the structure of the genome: DNA or RNA, nuclear or plastid, double- or single-stranded. No clear partition into IS-compliant and non-compliant genomes could be inferred. These results suggest that IS and scale-free distributions of k-mer frequency in genomes are linked properties that emerge probabilistically and in a mechanism-agnostic manner across the three domains of life and viruses.
format Article
id doaj-art-2d2ec515b28c406d93a18fbfb5a20536
institution Kabale University
issn 3064-9765
language English
publishDate 2025-04-01
publisher Academia.edu Journals
record_format Article
series Academia Molecular Biology and Genomics
spelling doaj-art-2d2ec515b28c406d93a18fbfb5a20536 | 2025-08-20T03:25:22Z | eng | Academia.edu Journals | Academia Molecular Biology and Genomics | 3064-9765 | 2025-04-01 | 22 | 10.20935/AcadMolBioGen7650 | The architecture of the genome integrates scale independence with inverse symmetry | Greg Warr (Department of Biochemistry and Molecular Biology, Medical University of South Carolina, Charleston, SC 29403, USA) | Les Hatton (School of Computer Science and Mathematics, Kingston University, London KT1 1LQ, UK) | https://www.academia.edu/128851197/The_architecture_of_the_genome_integrates_scale_independence_with_inverse_symmetry
title The architecture of the genome integrates scale independence with inverse symmetry
url https://www.academia.edu/128851197/The_architecture_of_the_genome_integrates_scale_independence_with_inverse_symmetry
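
The description above links two measurable properties of a single strand: the rank-ordered frequency of its k-mers and inverse symmetry, where a k-mer and its reverse complement occur about equally often. A minimal Python sketch of how both could be measured follows. It is not the authors' method. The function names and the deviation score are hypothetical and chosen only for illustration, and the sketch assumes a DNA alphabet (A, C, G, T), skipping any window that contains another symbol.

```python
from collections import Counter

# Watson-Crick complement table for a DNA alphabet (assumption: no RNA, no IUPAC codes).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of a DNA k-mer."""
    return kmer.translate(COMPLEMENT)[::-1]


def count_kmers(seq: str, k: int) -> Counter:
    """Count overlapping k-mers on one strand, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def rank_frequency(counts: Counter) -> list:
    """Return k-mer counts sorted from most to least frequent (rank 1 first)."""
    return sorted(counts.values(), reverse=True)


def inverse_symmetry_deviation(counts: Counter) -> float:
    """Hypothetical score in [0, 1]; 0.0 means every k-mer count equals
    the count of its reverse complement on the same strand."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Include reverse complements that never occur, so each pair is seen from both sides.
    keys = set(counts) | {reverse_complement(kmer) for kmer in counts}
    diff = sum(
        abs(counts.get(kmer, 0) - counts.get(reverse_complement(kmer), 0))
        for kmer in keys
    )
    # Every non-palindromic pair contributes twice, hence the factor of 2.
    return diff / (2 * total)


if __name__ == "__main__":
    toy = "ACGTTGCAACGT"  # toy sequence, not real genomic data
    kmers = count_kmers(toy, 2)
    print(rank_frequency(kmers))
    print(inverse_symmetry_deviation(kmers))
```

Plotting the output of `rank_frequency` on log-log axes is one common way to inspect a Zipf-like distribution. Computing the deviation score for increasing `k` or for shorter sequences would give a rough view of the gradual decay the description mentions.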