Formation, persistence, and breakdown of carrier-set topology in linkage disequilibrium: empirical structure in 1000 Genomes and a two locus Wright Fisher model

This paper is a preprint and has not been certified by peer review.

Authors

Ichikawa, Y.

Abstract

Linkage disequilibrium (LD) between two biallelic loci is usually summarized by scalar association measures such as r² and D'. These measures quantify how visible an allelic association is to a symmetric LD scan, but they do not directly represent the topology of carrier sets: whether the carriers of one variant are contained within, partially overlap with, or are disjoint from the carriers of the other. This distinction is structural. On the haplotype-frequency simplex, carrier-set inclusion corresponds to a boundary face where one haplotype class is absent. In the rare-common regime, a nested rare variant is further constrained by the ceiling r² ≤ p_A/p_B, so that complete carrier-set inclusion can remain nearly invisible to r². Here, as a companion to the Fisher-geometry preprint [1], we examine the empirical and dynamic behavior of this carrier-set topology. In 1000 Genomes Phase 3, across 156,604,320 SNP pairs from the MHC and NEGR1 regions, pairs on the |D'| = 1 boundary span a wide range of r² and |C|. Within fixed r² strata, r² poorly distinguishes nested from non-nested carrier-set configurations, with AUROC values of approximately 0.54 to 0.62, whereas the boundary-sensitive normalization D' separates them much more effectively, with AUROC values of approximately 0.90 to 0.92. The empirical data also obey the predicted r² ≤ p_A/p_B ceiling. We then introduce a temporal axis using a two-locus Wright-Fisher model on the same simplex. Carrier-set topology evolves through three motions relative to the |D'| = 1 boundary: formation or persistence, in which recombination suppression establishes and maintains inclusion without requiring selection; visibility change, in which selection or drift moves r² along the boundary while preserving the inclusion relation; and breaking, in which a recombination pulse introduces the previously absent haplotype and dissolves inclusion.
A fourth mode, specificity erosion, expands the partner carrier set while preserving inclusion, thereby lowering P(A|B) while keeping P(B|A) and |D'| equal to one. This mode shows that asymmetric conditional probabilities are best understood as diagnostic coordinates for carrier-set topology, not as the primary object itself. Together, these results show that topology and visibility are separable axes of LD structure. Conventional r²-based scans and carrier-set topology scans therefore answer complementary, not interchangeable, questions.
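
The quantities named in the abstract can be illustrated with a small numerical sketch. The code below is not taken from the paper: the function names `ld_summary` and `recombine`, the topology labels, and the example frequencies are all hypothetical, chosen only to show how a nested rare variant can sit on the |D'| = 1 boundary while r² stays under the p_A/p_B ceiling. The `recombine` helper uses the standard deterministic two-locus recombination recursion (no drift, no selection), which is a simplification and should not be read as the Wright-Fisher model used in the paper.

```python
import math


def ld_summary(x_AB, x_Ab, x_aB, x_ab, tol=1e-12):
    """Summarize a two-locus haplotype frequency vector.

    Naming is an assumption of this sketch: uppercase A and B mark the
    variant (carrier) alleles, so x_Ab is the frequency of haplotypes that
    carry variant A but not variant B.
    """
    total = x_AB + x_Ab + x_aB + x_ab
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = x_AB + x_Ab
    p_B = x_AB + x_aB
    if not (0 < p_A < 1 and 0 < p_B < 1):
        raise ValueError("both loci must be polymorphic")

    D = x_AB - p_A * p_B
    # Lewontin's normalization: the largest |D| allowed by the allele frequencies.
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = D / d_max
    r2 = D * D / (p_A * (1 - p_A) * p_B * (1 - p_B))

    # Carrier-set topology is read off from which haplotype class is absent.
    if x_Ab <= tol and x_aB <= tol:
        topology = "identical"
    elif x_Ab <= tol:
        topology = "A_within_B"
    elif x_aB <= tol:
        topology = "B_within_A"
    elif x_AB <= tol:
        topology = "disjoint"
    else:
        topology = "partial_overlap"

    return {
        "p_A": p_A, "p_B": p_B, "D": D, "d_prime": d_prime, "r2": r2,
        "P_B_given_A": x_AB / p_A, "P_A_given_B": x_AB / p_B,
        "topology": topology,
    }


def recombine(x_AB, x_Ab, x_aB, x_ab, c):
    """One generation of deterministic recombination at rate c.

    Infinite-population expectation only: no drift and no selection.
    """
    D = x_AB * x_ab - x_Ab * x_aB
    return (x_AB - c * D, x_Ab + c * D, x_aB + c * D, x_ab - c * D)


# Hypothetical nested rare variant: every carrier of A also carries B.
nested = ld_summary(0.01, 0.0, 0.39, 0.60)
ceiling = nested["p_A"] / nested["p_B"]

# Specificity erosion: B's carrier set grows, inclusion is preserved.
eroded = ld_summary(0.01, 0.0, 0.79, 0.20)

# Persistence (c = 0) versus breaking (a recombination pulse, c = 0.5).
persisted = ld_summary(*recombine(0.01, 0.0, 0.39, 0.60, c=0.0))
broken = ld_summary(*recombine(0.01, 0.0, 0.39, 0.60, c=0.5))

for name, s in [("nested", nested), ("eroded", eroded),
                ("persisted", persisted), ("broken", broken)]:
    print(name, s["topology"], round(s["d_prime"], 3), round(s["r2"], 4),
          round(s["P_B_given_A"], 3), round(s["P_A_given_B"], 4))
```

In this toy setting the nested and eroded configurations keep |D'| and P(B|A) at one while P(A|B) falls as the partner carrier set expands, and a single recombination step with c > 0 reintroduces the absent haplotype class so that the pair leaves the boundary. The tolerance `tol` used to declare a haplotype class absent is likewise an assumption; with sampled data a count-based criterion would be more appropriate.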
