Comp 411 Computer Organization, Fall 2007
Problem Set #1 Solutions

Problem 1

a) If all nucleic acids are equally likely, a 3-nucleic-acid sequence can be represented with 6 bits:

$\log_2\left(\frac{4 \cdot 4 \cdot 4}{1}\right) = \log_2\left(\frac{64}{1}\right) = 6$

Given that there are 20 amino acids and 1 stop code, the minimal number of bits to encode a single item in a protein chain is:

$\log_2\left(\frac{21}{1}\right) \approx 4.39$

The numbers do not agree. Explanations should say something along the lines of less information being stored in the amino acid representation, or the smaller set size of the amino acids.

b) There are 64 codons, and 61 of them encode amino acids. Since the only information given is that the codon represents an amino acid, the bits conveyed are:

$\log_2\left(\frac{64}{61}\right) \approx 0.07$

Similarly, for the 3 stop codes:

$\log_2\left(\frac{64}{3}\right) \approx 4.42$

Since there are 6 possible codons for Serine:

$\log_2\left(\frac{64}{6}\right) \approx 3.42$

c) There are 37 codons that contain the T nucleotide, out of 64 possible codons. Thus, the bits conveyed are:

$\log_2\left(\frac{64}{37}\right) \approx 0.79$

To find how many bits of information are added by knowing that the codon is a stop code, subtract the original amount of information conveyed from the new amount of information conveyed:

$\log_2\left(\frac{64}{3}\right) - \log_2\left(\frac{64}{37}\right) \approx 4.41 - 0.79 \approx 3.62$

You could also simply use $\log_2\left(\frac{37}{3}\right)$ to find the amount of additional information.

d) There are 20 amino acids, and 3 of them are bases:

$\log_2\left(\frac{20}{3}\right) \approx 2.74$

There are 64 possible codons, and 10 of them encode bases:

$\log_2\left(\frac{64}{10}\right) \approx 2.68$

There are 2 ways to encode Lysine, so:

$\log_2\left(\frac{64}{2}\right) - \log_2\left(\frac{64}{10}\right) \approx 5 - 2.68 \approx 2.32$

Again, you could also simply evaluate $\log_2\left(\frac{10}{2}\right)$ to find the amount of additional information.

e) Across the set of 64 codons, there are 32 available transitions for each codon position. Consider each position in a codon and its influence on the final value. For the right-most position, there is only one transition that changes the resulting amino acid: Isoleucine ↔ Methionine. For the middle position, all 32 of the possible 32 transitions change the protein. For the left-most position, 30 of the 32 transitions change the protein (TTA ↔ CTA and TTG ↔ CTG do not). Thus, the number of bits conveyed is:

$\log_2\left(\frac{32 + 32 + 32}{1 + 32 + 30}\right) = \log_2\left(\frac{96}{63}\right) \approx 0.61$

f) In the case of Glycine, the first 2 nucleic acids are all that is needed to identify the resulting amino acid, so no bits of information are conveyed in the last nucleic acid.

g) Using the entropy formula given in the lecture, $-\sum_i p_i \log_2(p_i)$, the entropy of the codes is:

$-(0.24 \log_2(0.24) + 0.14 \log_2(0.14) + 0.12 \log_2(0.12) + 0.5 \log_2(0.5))$
$\approx -(0.24(-2.06) + 0.14(-2.84) + 0.12(-3.06) + 0.5(-1))$
$\approx -(-0.49 - 0.40 - 0.37 - 0.50)$
$\approx 1.76$

The bits wasted using a fixed-length scheme are:

$2 - 1.76 \approx 0.24$

h) The string 0011011101010 can be decoded as follows (only the last acid in the codon is shown):

Bits:  0   0   110   111   0   10   10   10
Acid:  C   C   A     T     C   G    G    G

i) The expected length is:

$1000(0.5 \cdot 1 + 0.24 \cdot 2 + 0.14 \cdot 3 + 0.12 \cdot 3) = 1760$

Since GGT and GGA have an encoded length of 3, the worst case is:

$1000(3) = 3000$

Answers will vary, but generally, when compared to the fixed-length cost of $1000 \cdot \log_2\left(\frac{4}{1}\right) = 2000$, the worst case of 3000 (a 50% increase) seems very poor.

Problem 2

a) There are 32 positions in a 32-bit word and each position can hold 1 of 2 possible values, thus:

$2^{32} = 4294967296$

b) 0 is represented as:

0000 0000 0000 0000 0000 0000 0000 0000_2

The most positive integer, 2147483647 (from $\frac{2^{32}}{2} - 1$), is represented as:

0111 1111 1111 1111 1111 1111 1111 1111_2

The most negative value, -2147483648 (from $-\frac{2^{32}}{2}$), is represented as:

1000 0000 0000 0000 0000 0000 0000 0000_2

Negating the most negative number proceeds as follows and yields the same number again:

initially  → 1000 0000 0000 0000 0000 0000 0000 0000_2 (-2147483648)
complement → 0111 1111 1111 1111 1111 1111 1111 1111_2 (2147483647)
add 1      → 1000 0000 0000 0000 0000 0000 0000 0000_2 (-2147483648)

c) See d).

d)
1. 0000019B_16
2. FFFF0000_16
3. BADC5411_16
4. ABADFACE_16
5. FFFFFFFF_16

e) All numbers are in their base 2 form, followed by their base 10 form.

1.   010100  (20)
   + 001010  (10)
   = 011110  (30)

2.   010011  (19)
   + 110001  (-15)
   = 000100  (4)

3.   001111  (15)
   + 101101  (-19)
   = 111100  (-4)

4.   011011  (27)
   + 110111  (-9)
   = 010010  (18)

5.   110111  (-9)
   + 101111  (-17)
   = 100110  (-26)

6.   010101  (21)
   + 101011  (-21)
   = 000000  (0)

7.   011111  (31)
   + 001100  (12)
   = 101011  (-21)

For the last sum, answers will vary, but they should mention that a 1 was carried into the most significant bit, causing the number to 'become' negative: the true sum, 43, does not fit in a 6-bit two's complement value.

f) Answers will vary. It would be nice to have something that describes how a number added to its complement is all 1s (a very large number), and adding 1 causes it to overflow to 0.

Problem 3

a)
1. 1+1+1+0+1+1+0+1+1+1+1+0+1+1+0+1+1 = 13 (has error)
2. 1+1+0+1+1+1+1+0+1+0+1+0+1+1+0+1+1 = 12 (no errors)
3. 1+0+1+1+1+1+1+0+1+1+1+0+1+1+1+1+0 = 13 (has error)
4. 0+0+0+0+0+0+0+0+0+0+0+0+0+0+0+0+0 = 0 (no errors)

b) By summing the '1's in each row and column, it can be determined whether there are any errors in the data block.

1. Error at 1,1. Corrected matrix:
   0011
   0110
   0011
   011

2. No errors.

3. Error at 2,2. Corrected matrix:
   000
   101
   10

4. Error in the row 1 parity bit. Corrected matrix:
   0110
   1001
   0110
   100

c) Answers will vary. They should mention 2 errors in a single row or column.

d) 13 is the only index that is present in p0, p2, and p3. Thus the bit at index 13 has the error. Since p0, p2, and p3 marked the error, e0 = 1, e1 = 0, e2 = 1, e3 = 1, and e4 = 0. Written from e4 down to e0, the binary representation is 01101. Note that e0 represents the least significant bit.

e) The index with the error is 13, and the binary representation of the checked error bits is 01101. The error bits (01101) encode the index of the error (13) in binary form.

f) Answers will vary, but they should mention that certain combinations of double-bit errors are not detectable. Solutions may include such things as a parity bit that checks the other parity bits.

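The checks in Problem 3 a), d), and e) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `has_parity_error` and `error_index` are hypothetical, the parity convention is even parity (as implied by the sums in part a), and the error bits are passed with e0 first because e0 is the least significant bit.

```python
def has_parity_error(bits):
    """Even parity: an odd count of 1s means at least one bit is wrong."""
    return sum(bits) % 2 == 1


def error_index(e_bits):
    """Read error bits [e0, e1, ...] as a binary number, e0 least significant."""
    return sum(bit << position for position, bit in enumerate(e_bits))


# Words 1 and 2 from part a), copied bit by bit.
word1 = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1]
word2 = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1]

# Error bits from part d): e0 = 1, e1 = 0, e2 = 1, e3 = 1, e4 = 0.
syndrome = [1, 0, 1, 1, 0]

print(has_parity_error(word1))  # sum is 13, odd
print(has_parity_error(word2))  # sum is 12, even
print(error_index(syndrome))    # index of the flipped bit
```

A single parity bit only reports that an odd number of bits flipped; it is the set of overlapping checks in part d) that lets the error bits spell out the position.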
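Problem 4

a) Since there are 10 possibilities:

$\log_2\left(\frac{10}{1}\right) \approx 3.32$

b) f(d) encodes as follows:

$f(d) = \begin{cases} 1 & d = 0 \\ 2 & 1 \le d \le 2 \\ 4 & 3 \le d \le 5 \\ 8 & d \ge 6 \end{cases}$

The probabilities of f(d) are:

$p(f(d)) = \begin{cases} \frac{1}{10} & f(d) = 1 \\ \frac{2}{10} & f(d) = 2 \\ \frac{3}{10} & f(d) = 4 \\ \frac{4}{10} & f(d) = 8 \end{cases}$

In the case of 1, the information about d is:

$\log_2\left(\frac{10}{1}\right) \approx 3.32$

For 8, the information conveyed is:

$\log_2\left(\frac{10}{4}\right) \approx 1.32$

c) The average, using the formula $\sum_i p_i \log_2\left(\frac{1}{p_i}\right)$, is:

$\frac{1}{10} \log_2\left(\frac{10}{1}\right) + \frac{1}{5} \log_2\left(\frac{5}{1}\right) + \frac{3}{10} \log_2\left(\frac{10}{3}\right) + \frac{4}{10} \log_2\left(\frac{10}{4}\right)$
$\approx \frac{1}{10}(3.32) + \frac{1}{5}(2.32) + \frac{3}{10}(1.74) + \frac{4}{10}(1.32)$
$\approx 1.85$

d) Since step 2 always operates on a pair of elements and replaces them with one, it repeats until there is a single element left in S. Thus, it iterates n − 1 times.

e) Answers will vary. They should be similar to this:

Element   Encoding   Probability
1         011        1/10
2         010        2/10
4         00         3/10
8         1          4/10

The corresponding tree, with each leaf's element in parentheses:

1
├─ 6/10
│  ├─ 3/10
│  │  ├─ 2/10 (2)
│  │  └─ 1/10 (1)
│  └─ 3/10 (4)
└─ 4/10 (8)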
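The pairing procedure in parts d) and e) can be checked with a short Python sketch. It assumes the procedure is the standard Huffman construction (repeatedly merge the two least likely elements), which the solution does not state outright; the function name `huffman_code_lengths` and the integer weights out of 10 are illustrative choices, not part of the problem set.

```python
import heapq
from itertools import count
from math import log2


def huffman_code_lengths(weights):
    """Return ({symbol: code length}, number of merge steps performed)."""
    tiebreak = count()  # unique second field so the dicts are never compared
    heap = [(w, next(tiebreak), {sym: 0}) for sym, w in weights.items()]
    heapq.heapify(heap)
    merges = 0
    while len(heap) > 1:
        w1, _, depths1 = heapq.heappop(heap)
        w2, _, depths2 = heapq.heappop(heap)
        # Every symbol under the new parent node sits one level deeper.
        merged = {s: d + 1 for s, d in {**depths1, **depths2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
        merges += 1
    return heap[0][2], merges


# f(d) values with their probabilities expressed in tenths.
weights = {1: 1, 2: 2, 4: 3, 8: 4}
total = sum(weights.values())

lengths, merges = huffman_code_lengths(weights)
average_bits = sum(weights[s] * lengths[s] for s in weights) / total
entropy = sum((w / total) * log2(total / w) for w in weights.values())

print(lengths)       # code length per element
print(merges)        # number of pairings, n - 1 for n elements
print(average_bits)  # expected code length in bits
print(entropy)       # lower bound from part c)
```

Under these assumptions the code lengths match the encoding table in part e) (3, 3, 2, and 1 bits), and the average length of 1.9 bits sits just above the 1.85 bits computed in part c).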
