Introduction to Computational Thinking
Vicky Chen

Fundamental Theorem of Informatics
Friedman CP. J Am Med Inform Assoc 2009;16:169-170

What Informatics Is Not
Friedman CP. J Am Med Inform Assoc 2009;16:169-170

Computational Thinking
Computational thinking is a way of solving problems, designing systems, and understanding human behavior that draws on concepts fundamental to computer science. To flourish in today's world, computational thinking has to be a fundamental part of the way people think and understand the world.
http://www.cs.cmu.edu/~CompThink/

Computational Thinking
• Analyzing and logically organizing data
• Data modeling, data abstractions, and simulations
• Formulating problems so computers may assist
• Identifying, testing, and implementing possible solutions
• Automating solutions via algorithmic thinking
• Generalizing and applying this process to other problems

Algorithm
• A finite list of instructions that describe all required steps to perform a computation, written in general language

Programming Steps
• Specification – What the code should do
• Design – Pseudocode
• Implement – Programming
• Test – Debugging

Data Type / Data Structure
• Integer
• Floating point
• Boolean
• Character
• String
• List
• Dictionary
• Hash Table

Data Types

List

Dictionary / Hash Table

Exercise 1
We have a matrix with mutation information for different tumor samples. How can this data be represented?

List of Lists
• Data is a sparse matrix
• Stores a lot of extra uninformative information

Dictionary

Opening Files
• Mutation matrix contains data on 2337 genes and 779 samples
• Inputting data by hand is not feasible
• Data usually read in and processed from files

Input and print

For Loops

While Loops

Conditional Statements
• If, else if, else
• and
• or
• not

Exercise 2
We have a dictionary that contains tumor sample mutation information.
We want to print out a list of tumor samples after receiving a mutated gene of interest from the user.

Opening Files Revisited

Data Extraction from Files
• Many files will contain extra information
• Focus on extracting only pertinent data
• Applicable to many types of data
  – Natural language documents (e.g. articles)
  – Sequence data (e.g. FASTA files)
  – Files from databases (e.g. NCBI Gene, TCGA)
  – Etc.

Regular Expressions

Reusing Code
• Some code can be useful in multiple situations
• It is possible to just rewrite (or copy) the code each time
  – Less efficient
  – Multiple locations to fix when debugging

Functions

Exercise 3
We have a document containing human gene information downloaded from NCBI. We want to extract and store the Ensembl ID of each gene with its corresponding gene symbol.
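
One possible approach to Exercise 3 is sketched below in Python, combining a regular expression with a reusable function. The slides do not show the file layout, so this sketch assumes a tab-separated file in which the third column is the gene symbol and some later column holds cross-references of the form `Ensembl:ENSG...`. The function name `extract_ensembl_ids`, the pattern, and the sample lines are all hypothetical and only illustrate the idea.

```python
import re

# Assumption: cross-references look like "Ensembl:ENSG00000000001".
# The real file may use a different layout; adjust the pattern to match it.
ENSEMBL_PATTERN = re.compile(r"Ensembl:(ENSG\d+)")


def extract_ensembl_ids(lines):
    """Return a dictionary mapping each Ensembl ID to its gene symbol."""
    ensembl_to_symbol = {}
    for line in lines:
        # Skip header or comment lines (assumed to start with "#").
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Skip lines too short to contain a symbol column.
        if len(fields) < 3:
            continue
        symbol = fields[2]  # assumed position of the gene symbol
        # A gene may list zero, one, or several Ensembl IDs.
        for ensembl_id in ENSEMBL_PATTERN.findall(line):
            ensembl_to_symbol[ensembl_id] = symbol
    return ensembl_to_symbol


# Made-up sample lines, for illustration only (not real gene records).
sample_lines = [
    "#tax_id\tGeneID\tSymbol\tLocusTag\tSynonyms\tdbXrefs\n",
    "9606\t101\tGENE_A\t-\t-\tHGNC:HGNC:1|Ensembl:ENSG00000000001\n",
    "9606\t102\tGENE_B\t-\t-\tHGNC:HGNC:2\n",
]
result = extract_ensembl_ids(sample_lines)
print(result)
```

Because the function accepts any iterable of lines, the same code should work on a real file by passing an open file handle instead of `sample_lines`. Genes without an Ensembl cross-reference, such as the second sample record, are simply left out of the dictionary.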