A minimalist binary/digital approach to large-scale single molecule protein identification with optically labeled tRNAs and multiple carboxypeptidases and its extension to peptide sequencing

This paper is a preprint and has not been certified by peer review.

Authors

Sampath, G.

Abstract

Recently a binary/digital scheme based on the superspecificity property of transfer RNAs (tRNAs) was proposed for the identification of single amino acids (AAs) from binary-valued measurements (Eur. Phys. J. E 45, 94, 2022). The scheme has two formulations, both of which can be used to sequence short peptides and/or identify their parent proteins. In one of them an array of peptides is sequenced in 20 cycles by adding 20 different tRNAs carrying a fluorescent tag, optically recognizing the C-terminal residues, and cleaving the latter with a carboxypeptidase; the process is repeated over the peptides in parallel. Here this scheme is used to develop in theory a minimalist approach to protein identification that uses only two tRNAs and the carboxypeptidases A, B, and C. The latter form a complete and mutually exclusive set capable of cleaving all 20 AA types; this divides the 20 AAs into three classes. The sequences obtained are partial sequences in the reduced three-class alphabet; their parent proteins can be obtained by searching a proteome database. The AA class of the terminal residue of every peptide in the array can be identified in a single cycle by using the three carboxypeptidases in the order C-B-A. With peptide lengths of ~20 and a cycle time of ~1 hour, the parent proteins of K peptides can be obtained in about 20 hours. This time is independent of K (within the limits imposed by the imaging method used) and of the dynamic range of a proteome; thus in theory a whole proteome can be processed in less than a day. Computational results suggest that the parent proteins of over 92% of peptides from the human proteome (UniProt ID UP000005640_9606) can be identified. When residues are skipped because a carboxypeptidase cleaves the second and later residues in delayed reactions, the identification rate is about 90% with 1 or 2 skips.
Full sequencing without skipped residues can be done by using all 20 tRNA types over 20 cycles in increasing order of cleavage time of the 20 AA types; a recursive procedure is given.
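
The database search over the reduced alphabet described above can be illustrated with a minimal Python sketch. It is not the procedure from the paper: the partition of the 20 AAs into classes A, B, and C shown here is a placeholder chosen for illustration (the abstract does not list the actual class membership), the function names `reduce_sequence` and `candidate_proteins` are hypothetical, and the two-entry proteome is toy data. The sketch assumes that residues are read from the C-terminus inward, one class label per cycle, and that a peptide may come from anywhere in its parent protein.

```python
# Placeholder partition of the 20 amino acids into three classes, one per
# carboxypeptidase. ASSUMPTION: the real class membership is defined in the
# paper and may differ from this illustrative split.
CLASS_MEMBERS = {
    "A": "FWYLIVMA",
    "B": "KR",
    "C": "DEGHNPQSTC",
}
CLASS_OF = {aa: label for label, aas in CLASS_MEMBERS.items() for aa in aas}


def reduce_sequence(seq):
    """Map a sequence over 20 amino acids to the three-letter class alphabet."""
    return "".join(CLASS_OF[aa] for aa in seq)


def candidate_proteins(read_c_to_n, proteome):
    """Return the names of proteins whose reduced sequence contains the read.

    read_c_to_n: class labels in the order they are observed, i.e. starting
                 at the C-terminal residue and moving inward.
    proteome:    dict mapping protein name to its full amino acid sequence.
    """
    # Reverse the read so it runs N-terminus to C-terminus like the database.
    target = read_c_to_n[::-1]
    return {
        name
        for name, seq in proteome.items()
        if target in reduce_sequence(seq)
    }


# Toy data, not taken from any real proteome.
toy_proteome = {
    "P1": "MKTAYIAKQR",
    "P2": "GSHMDEEPLK",
}

# A peptide AKQR read from its C-terminus gives R, Q, K, A -> B, C, B, A.
hits = candidate_proteins("BCBA", toy_proteome)
print(hits)
```

A peptide counts as identified when exactly one protein remains in the returned set. A search that tolerates skipped residues would need approximate matching in place of the exact substring test used here.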
