Today’s Agenda
• Exam post-mortem (15-25 min)
• Grades & Status (5 min)
• Derek’s presentation (15-25 min)
• Exam #2: Question #1 (time permitting)


Exam post-mortem
1. Edit distance
• -3 if you made a minor (cascading) error
• -3 if you mis-initialized the matrix
• For the Final: Check your answer by hand.
• And remember…
      G  T  C  A
   0  1  2  3  4
A  1
T  2
G  3
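The initialization above (row 0 counts up along GTCA, column 0 counts down along ATG) can be checked against a short dynamic-programming sketch. The cost model is an assumption, since the slides do not state it: insertions and deletions cost 1, and the substitution cost is passed in as `sub_cost` (1 gives classic Levenshtein distance, 2 behaves as if only inserts and deletes were allowed). The names `edit_distance` and `min3` are illustrative.

```c
#include <stdlib.h>
#include <string.h>

/* Smallest of three ints. */
static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/*
 * Edit distance between s (rows) and t (columns).
 * ASSUMPTION: insert and delete cost 1; a substitution costs sub_cost.
 * Returns -1 if memory allocation fails.
 */
int edit_distance(const char *s, const char *t, int sub_cost) {
    size_t n = strlen(s), m = strlen(t);
    size_t i, j;
    int result;
    int *d = malloc((n + 1) * (m + 1) * sizeof *d);
    if (d == NULL) return -1;

    /* Initialization: row 0 is 0..m and column 0 is 0..n. */
    for (j = 0; j <= m; j++) d[j] = (int)j;
    for (i = 0; i <= n; i++) d[i * (m + 1)] = (int)i;

    for (i = 1; i <= n; i++) {
        for (j = 1; j <= m; j++) {
            int diag = d[(i - 1) * (m + 1) + (j - 1)]
                     + (s[i - 1] == t[j - 1] ? 0 : sub_cost);
            int up   = d[(i - 1) * (m + 1) + j] + 1;   /* delete s[i-1] */
            int left = d[i * (m + 1) + (j - 1)] + 1;   /* insert t[j-1] */
            d[i * (m + 1) + j] = min3(diag, up, left);
        }
    }
    result = d[n * (m + 1) + m];   /* bottom-right cell */
    free(d);
    return result;
}
```

Under these assumptions, `edit_distance("ATG", "GTCA", 1)` returns 3 and `edit_distance("ATG", "GTCA", 2)` returns 5, so the answer to check by hand depends on which cost model the exam used.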
Exam post-mortem
2. Naïve Approach
A G C A T C T
C A T G C T A   (6 edits)
A G C A T C T
C A T G C T A   (4 edits)
• No approach can ever produce a solution that is better than optimal.
• No edit distance can be smaller than the optimal edit distance.
• Full credit given if you noticed that matching the first A led to a sub-optimal answer
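The two counts on this slide can be reproduced with a small sketch. It assumes only insertions and deletions are counted (cost 1 each, no substitutions), because that model gives exactly 6 and 4 for these sequences; the slides themselves do not state the cost model. It also takes "matching the first A" to mean pairing the first A of each sequence and aligning the rest optimally. `indel_distance` and `forced_first_match_cost` are hypothetical helper names.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Insert/delete-only edit distance between a[0..n) and b[0..m).
 * ASSUMPTION: each insertion or deletion costs 1; no substitutions.
 * Aborts if memory allocation fails.
 */
int indel_distance(const char *a, size_t n, const char *b, size_t m) {
    size_t i, j;
    int result;
    int *d = malloc((n + 1) * (m + 1) * sizeof *d);
    if (d == NULL) abort();
    for (j = 0; j <= m; j++) d[j] = (int)j;          /* row 0 */
    for (i = 1; i <= n; i++) {
        d[i * (m + 1)] = (int)i;                     /* column 0 */
        for (j = 1; j <= m; j++) {
            int up   = d[(i - 1) * (m + 1) + j] + 1;
            int left = d[i * (m + 1) + j - 1] + 1;
            int best = up < left ? up : left;
            /* A diagonal move is free, but only when the symbols match. */
            if (a[i - 1] == b[j - 1] && d[(i - 1) * (m + 1) + j - 1] < best)
                best = d[(i - 1) * (m + 1) + j - 1];
            d[i * (m + 1) + j] = best;
        }
    }
    result = d[n * (m + 1) + m];
    free(d);
    return result;
}

/*
 * Cost of the best alignment that is forced to pair the first occurrence
 * of symbol c in a with the first occurrence of c in b.
 * Returns -1 if c is missing from either sequence.
 */
int forced_first_match_cost(const char *a, const char *b, char c) {
    const char *pa = strchr(a, c), *pb = strchr(b, c);
    if (pa == NULL || pb == NULL) return -1;
    return indel_distance(a, (size_t)(pa - a), b, (size_t)(pb - b))
         + indel_distance(pa + 1, strlen(pa + 1), pb + 1, strlen(pb + 1));
}
```

With these assumptions, `forced_first_match_cost("AGCATCT", "CATGCTA", 'A')` returns 6 while `indel_distance("AGCATCT", 7, "CATGCTA", 7)` returns 4. Forcing any particular match can only keep the cost the same or raise it, never push it below the optimum.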
Exam post-mortem
4. bits, binary code, 1’s and 0’s, etc.
5. Deoxyribonucleic Acid
6. RNA -> Uracil, Guanine, Cytosine, Adenine
8. A gene is a segment of DNA that encodes a protein or regulates a gene that does.
Exam post-mortem
9. Draw picture
[Figure: DNA drawn as a ladder. Each side is a backbone of alternating Sugar and Acid units; the rungs are the base pairs A-T, T-A, G-C, A-T, with each base attached to a Sugar.]
Exam post-mortem
10. 3 billion
11. Transcription -> Translation
12. Intron regions are spliced out (removed)
13. In reality, there are 20 amino acids in protein sequences
14. In theory, 3 RNA bases can encode 4^3 = 64 different combinations (4 different RNA bases: A, C, G, U)
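The count in answer 14 can be confirmed by listing every three-base word over A, C, G, U. This is only an illustrative sketch; the function names `count_words` and `enumerate_codons` are made up for this example.

```c
#include <stddef.h>

/* Number of length-k words over an alphabet of `bases` symbols: bases^k. */
unsigned long count_words(unsigned bases, unsigned k) {
    unsigned long total = 1;
    while (k-- > 0) total *= bases;
    return total;
}

/*
 * Writes every 3-base RNA word into out (each entry NUL-terminated) and
 * returns how many were written. out must have room for 64 entries.
 */
size_t enumerate_codons(char out[][4]) {
    static const char bases[] = "ACGU";
    size_t n = 0;
    int i, j, k;
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
            for (k = 0; k < 4; k++) {
                out[n][0] = bases[i];
                out[n][1] = bases[j];
                out[n][2] = bases[k];
                out[n][3] = '\0';
                n++;
            }
    return n;
}
```

Both `count_words(4, 3)` and `enumerate_codons` give 64, comfortably more than the 20 amino acids in answer 13. Two bases would give only 4^2 = 16 combinations, which is too few.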
Exam post-mortem
15. 40-50. Several times I mentioned that 10-20 was not accurate.
16. Several hundred. Technically 100’s, but 1000’s is OK.
17. 3000
Exam post-mortem
18. Global alignment -> Whole genome comparison
19. Local alignment -> Searching for genes
20. Multiple alignment -> Shared pattern discovery -> gene discovery
Exam post-mortem
21. Finding the first two symbols that match requires:
a. Finding two symbols that match such that
b. The number of edits to match the two symbols is minimized
T A A A C …
C A C T A …
Matching the pair of A’s is the best option (so far)
C G C T G G C C …
A A A T A T T C …
Sometimes the best match is deep in the sequences
Exam post-mortem
21. Finding the first two symbols that match:
a. Is as hard (computationally) as solving the whole problem.
b. What if there are no symbols that match?
c. What if the sequences have 100’s or 1000’s of different types of symbols?
Exam post-mortem
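/* seq1 and seq2 both have length n. Find the match that minimizes i + j. */
min = 2 * n;        /* larger than any reachable i + j (at most 2n - 2) */
mini = minj = -1;   /* stay -1 if no symbols match */
for (i = 0; i < n && i < min; i++)
    for (j = 0; j < n && i + j < min; j++)
        if (seq1[i] == seq2[j]) {
            min = i + j;
            mini = i;
            minj = j;
        }
The first match occurs at seq1[mini] and seq2[minj] (there is no match if mini is still -1).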
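The search can also be wrapped in a function and tried on the example sequences from question 21. This is a sketch, not the exam's reference answer: the name `first_match`, the output parameters, and the choice to allow sequences of different lengths are all assumptions added here.

```c
#include <string.h>

/*
 * Finds the matching pair (seq1[*mini], seq2[*minj]) with the smallest
 * i + j, i.e. the match that skips the fewest symbols.
 * Returns 1 if a match exists, 0 otherwise (both outputs are then -1).
 */
int first_match(const char *seq1, const char *seq2, int *mini, int *minj) {
    int n1 = (int)strlen(seq1), n2 = (int)strlen(seq2);
    int min = n1 + n2;   /* larger than any reachable i + j */
    int i, j;
    *mini = *minj = -1;
    for (i = 0; i < n1 && i < min; i++)
        for (j = 0; j < n2 && i + j < min; j++)
            if (seq1[i] == seq2[j]) {
                /* The loop bound already guarantees i + j < min here. */
                min = i + j;
                *mini = i;
                *minj = j;
            }
    return *mini >= 0;
}
```

For "TAAAC" and "CACTA" this reports the pair of A’s at positions (1, 1). For "CGCTGGCC" and "AAATATTC" it reports the T’s at (3, 3), a match deep in the sequences. Even in this small form the search is quadratic in the worst case, which is part of why finding the first match is not a shortcut.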
Grades & Status
Exam #1
97
96
94
93
93
90 (median)
89
83
78
75
74
average: 87.5
Grades & Status
Current Ave.
97.6 A
96.4 A
95.2 A
94.4 A
93.0 A
92.8 A (median)
90.4 A
89.8 A
84.6 B
78.4 C+
74.0 C
average: 89.7
Grades & Status
• Things will get harder
• Right now, there is no workload…
• Soon, project #2 will be out
• The remaining material involves
– Math (probability)
– Algorithms
– Code
Grades & Status
• Advice: Get moving with your paper and presentation. Get it out of the way.
– Project #2 will be challenging…
– Project #2 will be given out on Tuesday.
Exam #2: Question #1

Given two sequences, compute the optimal local alignment.
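One common way to approach this question is the Smith-Waterman recurrence, sketched below for the score only (no traceback). The scoring values are assumptions passed in by the caller, for example match = +1, mismatch = -1, gap = -1; the slides do not give the exam's scheme. The function name `local_alignment_score` is illustrative.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Best local alignment score between s and t (Smith-Waterman recurrence).
 * ASSUMPTION: match, mismatch and gap scores are supplied by the caller.
 * Returns -1 if memory allocation fails.
 */
int local_alignment_score(const char *s, const char *t,
                          int match, int mismatch, int gap) {
    size_t n = strlen(s), m = strlen(t), i, j;
    int best = 0;
    /* calloc zeroes the table, which is the local-alignment
       initialization: row 0 and column 0 are all 0. */
    int *h = calloc((n + 1) * (m + 1), sizeof *h);
    if (h == NULL) return -1;

    for (i = 1; i <= n; i++) {
        for (j = 1; j <= m; j++) {
            int diag = h[(i - 1) * (m + 1) + (j - 1)]
                     + (s[i - 1] == t[j - 1] ? match : mismatch);
            int up   = h[(i - 1) * (m + 1) + j] + gap;
            int left = h[i * (m + 1) + (j - 1)] + gap;
            int v = 0;   /* a local alignment may restart at any cell */
            if (diag > v) v = diag;
            if (up > v)   v = up;
            if (left > v) v = left;
            h[i * (m + 1) + j] = v;
            if (v > best) best = v;   /* the answer is the largest cell */
        }
    }
    free(h);
    return best;
}
```

The two differences from the edit-distance matrix are the all-zero first row and column and the floor of 0 in every cell. The answer is read from the largest cell rather than the bottom-right corner.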