Genome Assembly of the Iconic Samba Mahsuri Delineates Locus-specific Population Structure within Indica Rice
Rao, D.; K, S. K.; T, N. S.; Khan, E.; Sonti, R. V.; Tiwari, S.; Patel, H. K.
Abstract
High-quality reference genomes enable detailed analysis of structural variation and its consequences for genome organization in crops. Here, we present a chromosome-scale genome assembly of Oryza sativa cv. Samba Mahsuri (SM), an elite Indian mega rice variety cultivated for its grain and cooking quality. Using PacBio HiFi sequencing in combination with Illumina reads and Bionano optical mapping, we generated a ~395 Mb assembly (SMv1.0) with 97.7% BUSCO completeness. A robust annotation framework identified 31,138 evidence-guided protein-coding gene models alongside 59,152 ab initio predictions. Comparative genomic analyses revealed extensive macrosynteny with established rice reference genomes, while uncovering pronounced locus-specific sequence and structural polymorphisms. Notably, a complex inversion-match-inversion (IMI) configuration on chromosome 6 differentiates SM from the japonica reference Nipponbare, but not from the indica reference R498. Population-scale analyses of 533 cultivated and 4 wild rice accessions demonstrate that genetic variation within the IMI region produces a markedly sharper and more coherent population structure than is observed in flanking regions or genome-wide, including tight subpopulation-based clustering and segregation of alternative IMI configurations within indica rice. Together, these results establish SMv1.0 as a robust chromosome-scale reference genome sequence for rice and demonstrate how large structural polymorphisms can shape locus-specific patterns of relatedness that diverge from genome-wide ancestry.