Populations and Evolution (q-bio.PE)
Mon, 24 Apr 2023
1. The Theory of Gene Family Histories
Authors: Marc Hellmuth, Peter F. Stadler
Abstract: Most genes are part of larger families of evolutionarily related genes. The history of gene families typically involves duplications and losses of genes as well as horizontal transfers into other organisms. The reconstruction of detailed gene family histories, i.e., the precise dating of evolutionary events relative to the phylogenetic tree of the underlying species, has remained a challenging topic despite their importance as a basis for detailed investigations into adaptation and functional evolution of individual members of the gene family. The identification of orthologs, moreover, is a particularly important subproblem of the more general setting considered here. In the last few years, an extensive body of mathematical results has appeared that tightly links orthology, a formal notion of best matches among genes, and horizontal gene transfer. The purpose of this chapter is to broadly outline some of the key mathematical insights and to discuss their implications for practical applications. In particular, we focus on tree-free methods, i.e., methods to infer orthology or horizontal gene transfer as well as gene trees, species trees and reconciliations between them without using a priori knowledge of the underlying trees or statistical models for the inference of phylogenetic trees. Instead, the initial step aims to extract binary relations among genes.
2. Reporting delays: a widely neglected impact factor in COVID-19 forecasts
Authors: Long MA, Piet Van Mieghem, Maksim Kitsak
Abstract: Epidemic forecasts are only as good as the accuracy of epidemic measurements. Is epidemic data, particularly COVID-19 epidemic data, clean and devoid of noise? Common sense suggests that it is not. While we cannot evaluate the cleanliness of the COVID-19 epidemic data in a holistic fashion, we can assess the data for the presence of reporting delays. In our work, through the analysis of the first COVID-19 wave, we find substantial reporting delays in the published epidemic data. Motivated by the desire to enhance epidemic forecasts, we develop a statistical framework to detect, uncover, and remove reporting delays in the infectious, recovered, and deceased epidemic time series. Our framework can uncover and analyze reporting delays in 8 regions significantly affected by the first COVID-19 wave. Further, we demonstrate that removing reporting delays from epidemic data using our statistical framework may decrease the error in epidemic forecasts. While our statistical framework can be used in combination with any epidemic forecast method that takes infectious, recovered, and deceased data as input, to make a basic assessment, we employed the classical SIRD epidemic model. Our results indicate that the removal of reporting delays from the epidemic data may decrease the forecast error by up to 50%. We anticipate that our framework will be indispensable in the analysis of novel COVID-19 strains and other existing or novel infectious diseases.