In silico re-identification of properties of drug target proteins


2017/03/17: Drug target entity data sets were uploaded.
Protein list
AC: Swiss-Prot accession (AC) number
Gene: Ensembl gene ID
Trans: Ensembl transcript ID
Protein: Ensembl protein ID
Canonical: Ensembl canonical gene ID
ID: Swiss-Prot ID
Type: 0 = non-target protein, 1 = drug target protein
Set A: 1 if the protein is in set A, 0 otherwise
Set B: 1 if the protein is in set B, 0 otherwise
Set C: 1 if the protein is in set C, 0 otherwise
Set D: 1 if the protein is in set D, 0 otherwise
# paralogues: number of paralogues
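
The field layout above can be read with a few lines of Python. This is only a minimal sketch: it assumes the protein list is a tab-delimited text file whose header uses the field names listed above, and the two sample rows (X00001, X00002) are made-up placeholders, not entries from the data set.

```python
import csv
import io

# Hypothetical tab-delimited excerpt. The header names follow the field list
# above; the two rows are illustrative placeholders, not real entries.
SAMPLE = (
    "AC\tID\tType\tSet A\tSet B\tSet C\tSet D\n"
    "X00001\tEXAMPLE1_HUMAN\t1\t1\t0\t0\t1\n"
    "X00002\tEXAMPLE2_HUMAN\t0\t0\t1\t0\t0\n"
)

FLAG_COLUMNS = ("Type", "Set A", "Set B", "Set C", "Set D")


def load_proteins(handle):
    """Read the protein list into a list of dicts, casting 0/1 flags to int."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for key in FLAG_COLUMNS:
            row[key] = int(row[key])
        rows.append(row)
    return rows


def targets_in_set(rows, set_name):
    """Return AC numbers of drug targets (Type == 1) that belong to set_name."""
    return [r["AC"] for r in rows if r["Type"] == 1 and r[set_name] == 1]


proteins = load_proteins(io.StringIO(SAMPLE))
print(targets_in_set(proteins, "Set A"))
```

To use it on a real file, replace `io.StringIO(SAMPLE)` with an opened file handle and adjust the delimiter and header names to whatever the uploaded file actually uses.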

Simple sequence properties
Tool: pepstats

Enzyme commission number
DB: Swiss-Prot
Row: AC, ID, EC1, EC2, EC3, EC4, EC5, EC6, total
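
One plausible reading of this row is that EC1 to EC6 flag the six top-level enzyme classes and total is their sum. That interpretation is an assumption, not something stated here, and the helper name `ec_class_flags` is hypothetical. Under that assumption, the encoding could be sketched as:

```python
def ec_class_flags(ec_numbers):
    """Encode EC numbers as [EC1, ..., EC6, total].

    ASSUMPTION: EC1..EC6 are 0/1 flags for the top-level EC class and
    total is the number of distinct classes. This is an illustrative
    reading of the row layout, not the data set's documented definition.
    """
    flags = [0] * 6
    for ec in ec_numbers:
        top = ec.strip().split(".")[0]
        # Ignore malformed entries and classes outside 1..6.
        if top.isdigit() and 1 <= int(top) <= 6:
            flags[int(top) - 1] = 1
    return flags + [sum(flags)]


# A protein annotated with a transferase (class 2) and a hydrolase (class 3).
print(ec_class_flags(["2.7.11.1", "3.1.3.16"]))
```

A protein without any EC annotation yields all zeros, which keeps non-enzymes in the table without a special case.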

Subcellular localization
DB: Swiss-Prot, LOCATE, and Cell-PLoc
Tools: CELLO, Proteome Analyst, pTarget, WoLF PSORT, and MultiLoc

PEST region
Tool: epestfind
Row: AC, ID, Poor

Secondary structure
Tool: garnier
Row: AC, alpha, beta, turn, coil
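
As an illustration of how the four columns might be derived, the sketch below turns a per-residue state string into fractions. It assumes the prediction is available as a string using the codes H (helix), E (sheet), T (turn), and C (coil), and that the columns hold fractions of the sequence length; whether the data set stores fractions, percentages, or raw counts is not stated here. The function name `state_fractions` is hypothetical.

```python
from collections import Counter

# Map per-residue state codes to the column names used in the row above.
STATE_NAMES = {"H": "alpha", "E": "beta", "T": "turn", "C": "coil"}


def state_fractions(states):
    """Return the fraction of residues in each secondary structure state.

    ASSUMPTION: `states` is a string with one code per residue (H, E, T, C).
    Codes outside this alphabet count toward the length but no column.
    """
    counts = Counter(states)
    n = len(states)
    return {
        name: (counts[code] / n if n else 0.0)
        for code, name in STATE_NAMES.items()
    }


# Made-up 8-residue prediction: 4 helix, 2 sheet, 1 turn, 1 coil.
print(state_fractions("HHHHEETC"))
```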

Signal peptide cleavage
Tool: SignalP
Row: AC, motif, signalP

Transmembrane helices
Row: AC, # of TM

Download via PhosphoSitePlus

Protein essentiality
DB: Georgi et al.
Row: AC, essential, non-essential, unknown

Gene expression level and tissue specificity
DB: Su et al.
Row: AC, expression level, tissue specificity

Solvent accessibility
Row: AC, accessibility score

GO term

R (random forest)
hyunjulee at, kbs at