Improving Function Prediction Using Patterns of Native Disorder in Proteins

Anna Lobley

Abstract

Intrinsically unstructured (disordered) proteins adopt little or no stable secondary structure in their native state. Proteins containing long disordered regions are abundant within eukaryotic genomes and can be predicted successfully from amino acid sequence. Disordered regions have been shown to be important for functional specificity; they frequently contain binding motifs or are located at sites of covalent modification, often implying a regulatory role for the protein. Computational methods that predict protein function from sequence rely on homology information to transfer annotations between proteins. These methods are not applicable to orphan proteins or to cases where whole families of protein sequences are unannotated. To address the need for function prediction methods that are independent of sequence homology, and to explore the use of information describing protein disorder, we have implemented a machine learning method for predicting protein function from sequence. A set of features for encoding disorder information was designed, and their importance in predicting Gene Ontology (GO) categories was demonstrated. The addition of disorder features significantly improved prediction of many GO categories. The method has been benchmarked against a competing method, and the practical use of the classifiers is demonstrated through the annotation of a set of orphan and unknown human proteins.
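
To make the idea of "features for encoding disorder information" concrete, the following is a minimal sketch of how per-residue disorder predictions might be summarised as a fixed-length feature set. It is not the feature set designed in this work: the function name `disorder_features`, the 0.5 score threshold, the 30-residue cut-off for a "long" region, and the choice of features are all assumptions made for illustration.

```python
from itertools import groupby


def disorder_features(scores, threshold=0.5, long_region=30):
    """Summarise per-residue disorder scores as a fixed-length feature dict.

    Hypothetical helper: the threshold, the long-region cut-off and the
    features themselves are illustrative assumptions, not the author's.
    """
    if not scores:
        raise ValueError("scores must be non-empty")

    # A residue counts as disordered when its score reaches the threshold.
    flags = [s >= threshold for s in scores]

    # Lengths of consecutive runs of disordered residues.
    runs = [sum(1 for _ in grp) for is_dis, grp in groupby(flags) if is_dis]

    n = len(flags)
    third = max(1, n // 3)  # size of the N- and C-terminal windows

    return {
        "fraction_disordered": sum(flags) / n,
        "num_regions": len(runs),
        "num_long_regions": sum(1 for r in runs if r >= long_region),
        "longest_region": max(runs, default=0),
        "n_term_fraction": sum(flags[:third]) / third,
        "c_term_fraction": sum(flags[-third:]) / third,
    }


if __name__ == "__main__":
    # Toy scores for a 12-residue sequence (made-up values).
    toy = [0.9] * 4 + [0.1] * 4 + [0.8] * 2 + [0.2] * 2
    print(disorder_features(toy, long_region=3))
```

A dictionary of this kind could be turned into a numeric vector and supplied, alongside other sequence-derived features, to a classifier trained for each GO category; which classifier and which additional features are used is left open here.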