Zipf's Law Probably Exists in Protein Sequences
Graphical Abstract
To test whether Zipf's law, known from linguistics, also holds in protein sequences, this paper uses 1.7357 × 10^4 protein sequences labeled with secondary structures, selected from the DSSP database. Segments of successive amino acid residues that share the same secondary-structure code are defined as words. The results show that the frequency distribution of these words follows Zipf's law with an exponent of 0.981.
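The word definition and the exponent estimate can be illustrated with a minimal Python sketch. It is not the paper's actual pipeline: the function names `split_into_words` and `zipf_exponent`, the toy input strings, and the use of an ordinary least-squares fit on the log-log rank-frequency curve are all assumptions made for illustration, and DSSP file parsing is omitted.

```python
from collections import Counter
from itertools import groupby
import math


def split_into_words(sequence, structure):
    """Cut an amino acid sequence into words: maximal runs of residues
    that carry the same secondary-structure code (hypothetical helper)."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    words = []
    pos = 0
    for _, run in groupby(structure):
        run_length = len(list(run))
        words.append(sequence[pos:pos + run_length])
        pos += run_length
    return words


def zipf_exponent(counts):
    """Estimate the Zipf exponent as the negated slope of a least-squares
    line through (log rank, log frequency). Assumed fitting method."""
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Toy input (invented, not taken from DSSP): one sequence with its
# per-residue secondary-structure codes.
words = split_into_words("MKVLAAGG", "HHHEECCC")
word_counts = Counter(words)
```

With real data, `word_counts` would be accumulated over all labeled sequences before calling `zipf_exponent(word_counts)`; an exponent near 1 would correspond to the classical Zipf distribution.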