Diverse tendencies in codon usage evolution of SARS-CoV-2 genes
Blazej, P.; Mackiewicz, D.; Mackiewicz, P.
Abstract
The evolution of the SARS-CoV-2 virus has raised questions about evolutionary trends in its protein-coding sequences and their adaptation to the human host. We therefore studied 94,571 viral genomes from January 2020 to October 2024 using a novel representation of codon usage, which recodes gene sequences into labels reflecting human codon usage. Our analysis reveals that the genes coding for structural proteins tend to exhibit less optimal adaptation to human codon usage, whereas the open reading frames ORF1a and ORF1ab, which encode non-structural proteins, show the opposite trend. The sequences for the accessory proteins showed variable tendencies in how their codon preferences changed. The evolution of more optimal codon usage in the ORF1a and ORF1ab sequences can be associated with faster and more efficient translation of the encoded polyproteins. Following cleavage, the products of these polyproteins play important roles in viral replication and transcription, so the adaptation of their codons can increase virus proliferation. In contrast, alterations in codon usage within structural protein-coding sequences may be associated with less accurate translation and changes in folding during synthesis, which can provide an advantage in evading the host immune response. The results show that codon usage adaptation to the human host differs by gene type and function, reflecting a balance between conflicting evolutionary pressures. Our findings on variation in codon usage among coronavirus genes provide insights that can aid in developing new strategies for codon optimization in mRNA and DNA vaccines against emerging strains.