A set of bitopic proteins from six species was prepared from protein entries (main isoforms) in the UniProt Swiss-Prot and TrEMBL databases (release 2015-07) (Figure 1).
The FMAP method was used to identify TM α-helices in amino acid sequences and to generate their 3D models in membranes (see METHODS). Predicted bitopic proteins were filtered to remove polytopic proteins and proteins with signal sequences or cleavable C-terminal helices, based on UniProt and Pfam annotations. We also excluded proteins with hydrophobic helices that overlapped with globular domains indicated in Pfam, InterPro or PDB. Manual curation and analysis of related publications were required to resolve some ambiguous cases.
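The filtering steps described above can be illustrated with a minimal sketch. It is not the pipeline actually used to build the database: the `Candidate` record, its field names, and the helper functions are hypothetical, and the annotations are assumed to have been collected beforehand from UniProt, Pfam, InterPro and PDB. The sketch applies the exclusion criteria literally as stated, and does not cover the ambiguous cases that required manual curation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Residue range, 1-based and inclusive (an assumed convention).
Interval = Tuple[int, int]


@dataclass
class Candidate:
    """Hypothetical record for one protein with pre-collected annotations."""
    accession: str
    tm_helices: List[Interval]                 # predicted TM alpha-helices
    has_signal_sequence: bool = False          # from database annotations
    has_cleavable_c_terminal_helix: bool = False
    globular_domains: List[Interval] = field(default_factory=list)


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two inclusive residue ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def is_bitopic(c: Candidate) -> bool:
    """Apply the exclusion criteria in the order given in the text."""
    # Polytopic proteins (more than one TM helix) and proteins without
    # any predicted TM helix are not bitopic.
    if len(c.tm_helices) != 1:
        return False
    # Proteins annotated with signal sequences or cleavable C-terminal helices.
    if c.has_signal_sequence or c.has_cleavable_c_terminal_helix:
        return False
    # Hydrophobic helix that overlaps an annotated globular domain.
    helix = c.tm_helices[0]
    return not any(overlaps(helix, d) for d in c.globular_domains)


def filter_bitopic(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that pass every automated criterion."""
    return [c for c in candidates if is_bitopic(c)]
```

In this sketch a candidate with a single helix at residues 20-42 and a globular domain at residues 60-150 would be retained, whereas a candidate whose only helix lies inside an annotated domain would be dropped.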
Figure 1. Numbers of bitopic proteins from different organisms in the Membranome database (2016-02).
The sets of bitopic proteins included in the Membranome database are similar to those in UniProt for the fully annotated proteomes (human, yeast and prokaryotic). However, they are almost 2-fold larger than the UniProt sets for the A. thaliana and D. discoideum proteomes because numerous TrEMBL entries were included (Figure 2).
Figure 2. Comparison of bitopic protein sets in the Membranome and UniProt databases.