Introduction to Computational Thinking
Vicky Chen

Fundamental Theorem of Informatics
Friedman C P, J Am Med Inform Assoc 2009;16:169-170

What Informatics Is Not
Friedman C P, J Am Med Inform Assoc 2009;16:169-170

Computational Thinking
Computational thinking is a way of solving problems, designing systems, and understanding human behavior that draws on concepts fundamental to computer science. To flourish in today's world, computational thinking has to be a fundamental part of the way people think and understand the world.
http://www.cs.cmu.edu/~CompThink/

Computational Thinking
• Analyzing and logically organizing data
• Data modeling, data abstractions, and simulations
• Formulating problems so computers may assist
• Identifying, testing, and implementing possible solutions
• Automating solutions via algorithmic thinking
• Generalizing and applying this process to other problems

Algorithm
• A finite list of instructions that describe all required steps to perform a computation, written in general language

Programming Steps
• Specification – What the code should do
• Design – Pseudocode
• Implement – Programming
• Test – Debugging

Data Type / Data Structure
• Integer
• Floating point
• Boolean
• Character
• String
• List
• Dictionary
• Hash Table

Data Types

List

Dictionary / Hash Table

Exercise 1
We have a matrix with mutation information for different tumor samples. How can this data be represented?

List of Lists
• Data is a sparse matrix
• Stores a lot of extra uninformative information

Dictionary

Opening Files
• Mutation matrix contains data on 2337 genes and 779 samples
• Inputting data by hand is not feasible
• Data usually read in and processed from files

Input and print

For Loops

While Loops

Conditional Statements
• If, else if, else
• and
• or
• not

Exercise 2
We have a dictionary that contains tumor sample mutation information. We want to print out a list of tumor samples after receiving a mutated gene of interest from the user.

Opening Files Revisited

Data Extraction from Files
• Many files will contain extra information
• Focus on extracting only pertinent data
• Applicable to many types of data
  – Natural language documents (e.g. articles)
  – Sequence data (e.g. FASTA files)
  – Files from databases (e.g. NCBI Gene, TCGA)
  – Etc.

Regular Expressions

Reusing Code
• Some code can be useful in multiple situations
• It is possible to just rewrite (or copy) the code each time
  – Less efficient
  – Multiple locations to fix when debugging

Functions

Exercise 3
We have a document containing human gene information downloaded from NCBI. We want to extract and store the Ensembl ID of each gene with its corresponding gene symbol.
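
One possible starting point for Exercise 3 is sketched below in Python. It combines a regular expression with a reusable function and stores the result in a dictionary. The slides do not specify the file layout, so this sketch assumes a tab-separated file in which the third column holds the gene symbol and some later column holds cross-references such as `Ensembl:ENSG...`. The function name `extract_ensembl_ids`, the column position, and the sample rows (including the made-up `FAKE1` row) are illustrative assumptions, not part of the original exercise.

```python
import re

# Assumed cross-reference format: "Ensembl:" followed by an ENSG identifier.
ENSEMBL_PATTERN = re.compile(r"Ensembl:(ENSG\d+)")


def extract_ensembl_ids(lines):
    """Return a dictionary mapping Ensembl gene ID -> gene symbol."""
    ensembl_to_symbol = {}
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not enough columns to contain a symbol
        symbol = fields[2]  # assumption: third column is the gene symbol
        match = ENSEMBL_PATTERN.search(line)
        if match:
            # Genes without an Ensembl cross-reference are left out.
            ensembl_to_symbol[match.group(1)] = symbol
    return ensembl_to_symbol


# Small in-memory sample standing in for the downloaded file.
# The last row is made up to show a gene with no Ensembl cross-reference.
sample_lines = [
    "#tax_id\tGeneID\tSymbol\tLocusTag\tSynonyms\tdbXrefs\n",
    "9606\t1\tA1BG\t-\t-\tMIM:138670|HGNC:HGNC:5|Ensembl:ENSG00000121410\n",
    "9606\t2\tA2M\t-\t-\tMIM:103950|HGNC:HGNC:7|Ensembl:ENSG00000175899\n",
    "9606\t0\tFAKE1\t-\t-\t-\n",
]

gene_ids = extract_ensembl_ids(sample_lines)
for ensembl_id, symbol in gene_ids.items():
    print(ensembl_id, symbol)
```

Because the function accepts any iterable of lines, the same call should work on a file opened with `open(...)` in place of `sample_lines`. Keeping the extraction in one function means there is a single place to fix if the file layout turns out to differ from the assumptions above.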