Proteins are the main building blocks of the cell and are responsible for its biological processes. Precise knowledge of protein function is therefore of great significance. Many methods exist for comparing proteins and determining their function: some use structure alignment, others use sequence alignment, and others use protein descriptors. Here, we use two protein descriptors, the Voxel and the Ray-based descriptor, to encode the structural and biological features of proteins. In biology there is a trend to organize things hierarchically, such as protein functions, cell components and the whole living world. Many classification systems organize proteins in a tree structure. However, because one protein often has more than one parent, a Directed Acyclic Graph (DAG) hierarchy is used instead. Gene Ontology (GO) is a system for the structural and hierarchical representation of proteins and gene products that supports a DAG hierarchy. CLUS is a system that handles hierarchical data. In this paper, we compare the two protein descriptors mentioned above for predicting protein function. First, the protein descriptors are extracted from the structural coordinates found in the Protein Data Bank (PDB) and from the protein backbone, respectively. Afterwards, the GO class hierarchy is added to each protein that has descriptor data. The resulting file is used as input to the CLUS system. CLUS generates a decision tree model trained on these protein structure data. The output of the system is the set of GO classes to which a protein belongs. The results show that predicting protein function with the Voxel descriptor gives better results than predicting it with the Ray descriptor.
Gene Ontology; CLUS; Voxel protein descriptor; Ray-based protein descriptor; Predicting protein function
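To illustrate why a DAG rather than a tree is needed for the class hierarchy, the following minimal sketch expands a protein's annotated classes to include all of their ancestors, which is the usual convention when labels are hierarchical. The term names and the `PARENTS` table are hypothetical toy data, not real GO identifiers, and the function is not part of CLUS or of the method described here.

```python
from typing import Dict, Iterable, List, Set

# Toy hierarchy: each term maps to its parents.
# These names are hypothetical placeholders, not real GO terms.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "binding": ["root"],
    "catalytic": ["root"],
    "atp_binding": ["binding"],
    "kinase": ["catalytic", "atp_binding"],  # two parents: a DAG, not a tree
}


def with_ancestors(terms: Iterable[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Return the given terms together with every ancestor reachable in the DAG."""
    closed: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term in closed:
            continue  # already visited via another parent path
        closed.add(term)
        stack.extend(parents[term])
    return closed


if __name__ == "__main__":
    # A protein annotated only with "kinase" implicitly belongs to all its ancestors.
    print(sorted(with_ancestors(["kinase"], PARENTS)))
```

Because `kinase` has two parents, its ancestors are reached along two different paths. A tree could not represent this without duplicating the node, which is the motivation for the DAG hierarchy.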