60-499 Project 1 – Winter 2017
Contact: Prof. Luis Rueda, 8107 LT, x 3002, [email protected]
Title: Data mining and Web interface for finding biomarkers in cancer
Group or individual work
This project consists of developing data mining and Web tools/interfaces for the prediction and
identification of transcriptomics biomarkers. One of the specific problems is to identify potential
protein isoforms and transcripts that are associated with the progression of prostate cancer. Once
identified, the biomarkers have to be shown to the user in various ways, including visualization of
tables and protein interaction networks via plug-ins such as Cytoscape.
The participants of this project are expected to work on the following tasks (depending on the
depth required and the number of participants). One of the main tasks of this project is to develop
tools to read next-generation sequencing reads and assemble transcripts. The next step is to
integrate machine learning tools and transcriptomics data to obtain meaningful biomarkers in the
form of transcripts and protein isoforms. The final step is to deploy the tools on a Web server or
integrate them with the well-known Galaxy project.
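As an illustration of the first task only, reading sequencing reads could start from something
like the minimal Python sketch below. It assumes the reads arrive in the common four-line FASTQ
layout with Phred+33 quality encoding; the project description does not specify an input format,
and the names Read, parse_fastq and mean_phred are hypothetical, chosen for this example.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Read(NamedTuple):
    """One sequencing read (hypothetical container for this sketch)."""
    name: str
    sequence: str
    quality: str


def parse_fastq(handle: TextIO) -> Iterator[Read]:
    """Yield reads from a four-line-per-record FASTQ stream.

    Assumption: each record is exactly four lines (header, sequence,
    '+' separator, quality); multi-line records are not handled.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        header = header.rstrip("\r\n")
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header)
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch in: " + header)
        yield Read(header[1:], sequence, quality)


def mean_phred(read: Read, offset: int = 33) -> float:
    """Average Phred score of a read, assuming Phred+33 encoding."""
    if not read.quality:
        return 0.0
    return sum(ord(ch) - offset for ch in read.quality) / len(read.quality)


if __name__ == "__main__":
    # Tiny in-memory example instead of a real sequencing file.
    sample = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!!!\n")
    for read in parse_fastq(sample):
        print(read.name, read.sequence, mean_phred(read))
```

A generator is used here so that large read files can be streamed rather than loaded into memory;
transcript assembly and the later machine learning steps are outside the scope of this sketch.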
Skills required: Algorithm design techniques, Java SE 8, Java Regex, Data/Text/Web mining,
HTML/PHP/Plugins, Python