1 http://www.uniprot.org/locations/?query=*.

2 http://www.proteinatlas.org/.

3 http://www.uniprot.org/.

4 Note that a real protein sequence should have more than six amino acid residues. For ease of presentation, we truncate the length to six.

5 To distinguish the PairAA-based vector from the AA-based vector, we use $[f_{i,21}, f_{i,22}, \ldots]$ instead of $[f_{i,1}, f_{i,2}, \ldots]$ to represent the elements of a PairAA-based vector. Similar notation applies to GapAA.

6 When m = 1, the feature vectors represent the sequence-order correlation factor of dipeptides.

7 http://www.geneontology.org.

8 ftp://ftp.ebi.ac.uk/pub/databases/interpro/interpro2go.

9 http://www.ebi.ac.uk/GOA.

10 http://www.ebi.ac.uk/GOA.

11 http://www.geneontology.org.

12 ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/UNIPROT/ ...
