Mathematical preliminaries
1 What is numerical analysis and scientific computing?
Numerical analysis is the study of algorithms that use numerical approximation (as opposed to general
symbolic manipulations) for the problems of mathematical analysis (from Wikipedia).
Scientific computing (computational mathematics) involves
• developing and analyzing numerical techniques for solving algebraic or differential equations;
• using computers to analyze and solve scientific and engineering problems;
• building on mathematical models of practical problems.
2 Calculus
KEY WORDS. Limit, convergence, continuous function, derivative, Intermediate/Mean Value Theorem,
Taylor expansion, big O notation.
GOAL. Understand the Intermediate Value Theorem and the Mean Value Theorem; understand big O notation;
be able to perform basic Taylor expansions.
Definition 2.1. (Limit) A function $f$ defined on a set $X$ of real numbers has the limit $L$ at $x_0$, written
\[ \lim_{x \to x_0} f(x) = L, \]
if for any given $\varepsilon > 0$ there exists a real number $\delta > 0$ such that
\[ |f(x) - L| < \varepsilon \quad \text{whenever} \quad x \in X \ \text{and} \ 0 < |x - x_0| < \delta. \]
Definition 2.2. (Continuous) Let $f$ be a function defined on a set $X$ of real numbers and $x_0 \in X$. Then $f$ is
continuous at $x_0$ if
\[ \lim_{x \to x_0} f(x) = f(x_0). \]
The function $f$ is continuous on the set $X$ if it is continuous at each number in $X$.
Definition 2.3. (Convergence of a sequence) Let $\{x_n\}_{n=1}^{\infty}$ be an infinite sequence of real or complex numbers.
The sequence $\{x_n\}_{n=1}^{\infty}$ has the limit $x$ (converges to $x$) if for any $\varepsilon > 0$ there exists a positive integer
$N(\varepsilon)$ such that $|x - x_n| < \varepsilon$ whenever $n > N(\varepsilon)$. The notation
\[ \lim_{n \to \infty} x_n = x, \quad \text{or} \quad x_n \to x \ \text{as} \ n \to \infty, \]
means that the sequence $\{x_n\}_{n=1}^{\infty}$ converges to $x$.
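The definition can be checked numerically on a simple case. The sketch below assumes the illustrative sequence $x_n = 1/n$ with limit $x = 0$ (this sequence and the helper names are not part of the notes) and picks one valid $N(\varepsilon)$; the loop only spot-checks finitely many indices, so it illustrates the definition rather than proving convergence.

```python
import math

def x(n):
    """Hypothetical example sequence x_n = 1/n, which converges to 0."""
    return 1.0 / n

def N_of_eps(eps):
    """One valid choice of N(eps) for x_n = 1/n: every n > N gives 1/n < eps."""
    return math.ceil(1.0 / eps)

for eps in (0.1, 0.01, 0.001):
    N = N_of_eps(eps)
    # Spot-check |x_n - 0| < eps on a finite range of indices n > N.
    ok = all(abs(x(n) - 0.0) < eps for n in range(N + 1, N + 1000))
    print(eps, N, ok)
```

Any larger value of $N$ would also satisfy the definition; only existence matters, not minimality.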
Theorem 2.4. (Continuous) If $f$ is a function defined on a set $X$ of real numbers and $x_0 \in X$, then the
following statements are equivalent:
• $f$ is continuous at $x_0$.
• If $\{x_n\}_{n=1}^{\infty}$ is any sequence in $X$ converging to $x_0$, then $\lim_{n \to \infty} f(x_n) = f(x_0)$.
Definition 2.5. (Differentiable) Let $f$ be a function defined in an open interval containing $x_0$. The function
$f$ is differentiable at $x_0$ if
\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]
exists. The number $f'(x_0)$ is called the derivative of $f$ at $x_0$. A function that has a derivative at each
number in a set $X$ is differentiable on $X$.
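To see the limit in this definition at work, the following sketch evaluates the difference quotient for shrinking step sizes. The function $f(x) = x^2$, the point $x_0 = 3$, and the name `difference_quotient` are assumptions chosen for illustration; for this $f$ the quotient equals $6 + h$ up to rounding, so it approaches $f'(3) = 6$.

```python
def difference_quotient(f, x0, h):
    """(f(x0 + h) - f(x0)) / h, i.e. the quotient in the definition with x = x0 + h."""
    return (f(x0 + h) - f(x0)) / h

def f(x):
    # Illustrative choice, not from the notes: f(x) = x^2, so f'(3) = 6.
    return x * x

x0 = 3.0
for k in range(1, 7):
    h = 10.0 ** (-k)
    print(h, difference_quotient(f, x0, h))
```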
Theorem 2.6. (Intermediate Value Theorem.) Let $f(x)$ be a continuous function on the closed interval
$[a, b]$. Then for every value $\alpha$ between $f(a)$ and $f(b)$, there is at least one point $\xi \in [a, b]$ for which $f(\xi) = \alpha$.
Theorem 2.7. (Mean Value Theorem.) If $f(x)$ is a continuous function on the closed interval $[a, b]$ and
differentiable on the open interval $(a, b)$, then there is at least one number $\xi \in (a, b)$ such that
\[ \frac{f(b) - f(a)}{b - a} = f'(\xi). \]
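As a concrete instance of the theorem, the sketch below assumes the illustrative function $f(x) = x^3$ on $[0, 2]$ (not taken from the notes). The secant slope is $4$, and solving $3\xi^2 = 4$ by hand gives $\xi = 2/\sqrt{3}$, which the code confirms lies in $(0, 2)$.

```python
import math

def f(x):
    # Illustrative choice: f(x) = x^3, continuous and differentiable everywhere.
    return x ** 3

def fprime(x):
    return 3 * x ** 2

a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)   # left-hand side of the theorem
xi = math.sqrt(slope / 3.0)       # solves f'(xi) = slope for this particular f
print(xi, fprime(xi), slope)
```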
Theorem 2.8. (Rolle's Theorem.) Let $f(x)$ be a continuous function on the closed interval $[a, b]$ and
differentiable on the open interval $(a, b)$. If $f(a) = f(b) = 0$, then there is at least one number $\xi \in (a, b)$ such
that
\[ f'(\xi) = 0. \]
Theorem 2.9. (Taylor's Theorem.) If $f(x)$ has $n + 1$ continuous derivatives on $[a, b]$, then for any points
$c$ and $x$ in $[a, b]$,
\[ f(x) = f(c) + f'(c)(x - c) + \frac{f''(c)(x - c)^2}{2!} + \cdots + \frac{f^{(n)}(c)(x - c)^n}{n!} + E_{n+1}, \tag{1} \]
where
\[ E_{n+1} = \frac{f^{(n+1)}(\xi)(x - c)^{n+1}}{(n + 1)!} = O((x - c)^{n+1}), \tag{2} \]
and $\xi$ lies between $c$ and $x$ and depends on both. Equation (1) is the Taylor expansion of the function $f(x)$
around the point $c$.
Notation (big O). $E_{n+1} = O((x - c)^{n+1})$ is shorthand for the inequality
\[ |E_{n+1}| \le C |x - c|^{n+1}, \]
where $C$ is a constant. For example, in equation (2), $C$ can be any upper bound on
$\frac{|f^{(n+1)}(\xi)|}{(n+1)!}$ for $\xi \in [a, b]$.
Remark 2.10. Taylor expansion is useful for estimating the function value around $x = c$ when
1. much is known about the function and its derivatives at $x = c$ (even if little is known in the
neighborhood of $x = c$);
2. $x$ is close to $c$, i.e. $E_{n+1} = O((x - c)^{n+1})$ is very small (within our tolerance for error).
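The big O statement can be observed numerically. The sketch below assumes $f(x) = e^x$, $c = 0$ and $n = 3$ as an illustrative case (the helper name `taylor_exp` is hypothetical) and prints the error together with the ratio error$/|x|^{n+1}$. By Taylor's theorem the ratio is bounded by $e^{\xi}/4!$ with $\xi$ between $0$ and $x$, so it should stay bounded as $x$ shrinks.

```python
import math

def taylor_exp(x, n):
    """Degree-n Taylor polynomial of e^x around c = 0."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

n = 3
for x in (0.4, 0.2, 0.1, 0.05):
    err = abs(math.exp(x) - taylor_exp(x, n))
    # If err = O(x^(n+1)), this ratio stays bounded as x -> 0.
    print(x, err, err / x ** (n + 1))
```

Halving $x$ cuts the error by roughly a factor of $2^{n+1} = 16$ here, which is the practical content of Remark 2.10, item 2.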
Example 2.11. Taylor expand the following functions.
1. $f(x) = e^x$ around $x = 0$
Solution. Since $(e^x)^{(i)} = e^x$ for any $i = 1, 2, \cdots$,
\[
\begin{aligned}
e^x &= f(0) + f'(0)x + \frac{f''(0)x^2}{2!} + \cdots + \frac{f^{(n)}(0)x^n}{n!} + \frac{f^{(n+1)}(\xi)x^{n+1}}{(n+1)!} \\
    &= e^0 + e^0 x + \frac{e^0 x^2}{2!} + \cdots + \frac{e^0 x^n}{n!} + \frac{e^{\xi} x^{n+1}}{(n+1)!} \\
    &= 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + e^{\xi}\frac{x^{n+1}}{(n+1)!} \\
    &= 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + O(x^{n+1}).
\end{aligned}
\tag{3}
\]
2. $\sin(x)$ around $x = 0$
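Solution. Since $\sin^{(2i+1)}(x) = (-1)^i \cos(x)$ for any $i = 0, 1, 2, \cdots$, and $\sin^{(2i)}(x) = (-1)^i \sin(x)$ for any
$i = 1, 2, \cdots$,
\[
\begin{aligned}
\sin(x) &= f(0) + f'(0)x + \frac{f''(0)x^2}{2!} + \cdots + \frac{f^{(n)}(0)x^n}{n!} + \frac{f^{(n+1)}(\xi)x^{n+1}}{(n+1)!} \\
        &= \sin(0) + \cos(0)x + \frac{-\sin(0)x^2}{2!} + \cdots + \frac{\sin^{(n)}(0)x^n}{n!} + \frac{\sin^{(n+1)}(\xi)x^{n+1}}{(n+1)!} \\
        &= x - \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots + (-1)^k \frac{x^{2k+1}}{(2k+1)!} + O(x^{2k+3}).
\end{aligned}
\tag{4}
\]
The last line takes $n = 2k + 2$: the even-order terms vanish because $\sin^{(2i)}(0) = 0$, so the remainder is of order $x^{2k+3}$.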
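Expansion (4) can be compared against a library sine. In the sketch below the evaluation point $x = 0.5$ and the helper name `taylor_sin` are illustrative assumptions; the printed error should fall quickly as $k$ grows, in line with the $O(x^{2k+3})$ remainder.

```python
import math

def taylor_sin(x, k):
    """Sum of (-1)^i x^(2i+1) / (2i+1)! for i = 0..k, the polynomial part of (4)."""
    return sum((-1) ** i * x ** (2 * i + 1) / math.factorial(2 * i + 1)
               for i in range(k + 1))

x = 0.5
for k in range(4):
    approx = taylor_sin(x, k)
    print(k, approx, abs(math.sin(x) - approx))
```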
Reading Assignments/References:
1. About the Intermediate Value Theorem: follow the link http://en.wikipedia.org/wiki/Intermediate_value_theorem.
2. About the Mean Value Theorem: follow the link http://en.wikipedia.org/wiki/Mean_value_theorem.
3. About the Big O Notation: follow the link http://en.wikipedia.org/wiki/Big_O_notation.
4. About Taylor Series: follow the link http://en.wikipedia.org/wiki/Taylor_expansion, or Chapter
1.2 of Cheney & Kincaid.
Exercise 2.12. Determine whether the following statements are true.
1. Let $f(x)$ be a continuous function on the closed interval $[a, b]$, and let
\[ M = \max_{a \le x \le b} f(x), \qquad m = \min_{a \le x \le b} f(x). \]
Then for every value $\alpha \in [m, M]$, there is at least one point $\xi \in [a, b]$ for which
$f(\xi) = \alpha$.
2. If $f(x)$ is a continuous function on the closed interval $[a, b]$ with $f(a)f(b) < 0$, then there is at least
one point $\xi \in [a, b]$ for which $f(\xi) = 0$.
3. Let $f(x)$ be a continuous function on the closed interval $[a, b]$ and differentiable on the open interval
$(a, b)$. If $f(a) = f(b) = 0$, then there is at least one number $\xi \in (a, b)$ such that $f'(\xi) = 0$.
Exercise 2.13. Taylor expand the following functions.
1. $\cos(x)$ around $x = 0$.
2. $\log(x)$ around $x = 1$.
Exercise 2.14. (Big O Notation) In mathematics, big O notation (so called because it uses the symbol O)
describes the limiting behavior of a function for very small or very large arguments, usually in terms of simpler
functions. Although developed as a part of pure mathematics, it is now frequently also used in computational
complexity theory to describe how the size of the input data affects an algorithm’s usage of computational
resources (usually running time or memory). Please see http://en.wikipedia.org/wiki/Big_O_notation
for the formal definition.
By the definition of Big O notation, determine whether the following statements are true.
1. $99x = O(x)$, as $x \to 0$.
2. $99x = O(x)$, as $x \to \infty$.
3. $99x^2 = O(x)$, as $x \to 0$.
4. $99x^2 = O(x)$, as $x \to \infty$.
5. $100N^3 = O(N^4)$, as $N \to \infty$.
6. $100N^5 = O(N^4)$, as $N \to \infty$.
3 Linear Algebra
KEY WORDS. Matrix, vector, transpose, inverse, determinant, linear system of equations.
GOAL. Be able to perform the basic matrix operations, such as transpose, inverse, product, and determinant;
be able to represent a linear system of equations in matrix-vector form.
3.1 Matrix
Definition 3.1. (Matrix) A matrix is a rectangular collection of numbers arranged in rows and columns. A
matrix with $n$ rows and $m$ columns is said to be of dimension $n \times m$. The element at the $i$th row and $j$th
column is denoted as $a_{ij}$.
Definition 3.2. (Transpose) The transpose of an $n \times m$ matrix $A$, denoted as $A^T$, is an $m \times n$ matrix whose
elements are $(A^T)_{ij} = a_{ji}$.
Definition 3.3. (Matrix Product) Let $A$ be an $n \times m$ matrix, and $B$ be an $m \times p$ matrix. The product of
$A$ and $B$ is an $n \times p$ matrix $C$ whose elements are defined by
\[ c_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj}. \]

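Example 3.4.
\[ A = \begin{pmatrix} 2 & -6 \\ -4 & 0 \\ 1 & 5 \end{pmatrix} \]
is a 3 by 2 matrix, and
\[ B = \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix} \]
is a 2 by 2 matrix. The product is
\[ AB = \begin{pmatrix} -10 & -16 \\ -4 & -4 \\ 11 & 16 \end{pmatrix}. \]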
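The element formula of Definition 3.3 translates directly into three nested loops. The sketch below is a plain-Python illustration (the name `matmul` and the list-of-rows storage are assumptions, not something the notes prescribe), applied to the matrices of Example 3.4.

```python
def matmul(A, B):
    """Product C = AB with c_ij = sum_k a_ik * b_kj; matrices are lists of rows."""
    n, m, p = len(A), len(B), len(B[0])
    if any(len(row) != m for row in A):
        raise ValueError("columns of A must match rows of B")
    C = [[0] * p for _ in range(n)]
    for i in range(n):          # row of A
        for j in range(p):      # column of B
            for k in range(m):  # summation index
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[2, -6], [-4, 0], [1, 5]]   # 3 by 2, from Example 3.4
B = [[1, 1], [2, 3]]             # 2 by 2, from Example 3.4
print(matmul(A, B))
```

Calling `matmul(B, A)` raises an error, because a 2 by 2 matrix cannot be multiplied by a 3 by 2 matrix in that order.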
Definition 3.5. (Inverse) Let $A$ be an $n \times n$ matrix. If there exists an $n \times n$ matrix $B$ such that $AB = BA = I$,
then $B$ is the inverse of $A$, denoted $A^{-1}$.
Definition 3.6. (Determinant) The determinant of a square $n \times n$ matrix $A$, $\det(A)$, is defined recursively
as follows.
1. When $n = 1$, $\det(A) = a_{11}$.
2. When $n > 1$,
\[ \det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} m_{ij} \]
for any choice of row $i$, or
\[ \det(A) = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} m_{ij} \]
for any choice of column $j$, where $m_{ij}$ is the determinant of the $(n - 1) \times (n - 1)$ matrix obtained by
deleting the $i$th row and $j$th column from $A$.
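Example 3.7. The determinant of a $2 \times 2$ matrix
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \]
is $a_{11} a_{22} - a_{12} a_{21}$. The determinant of
\[ A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \]
is $-2$.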
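The recursive definition of the determinant can be coded almost verbatim. The sketch below expands along the first row ($i = 1$); the function name `det` and the list-of-rows storage are illustrative assumptions. Its cost grows like $n!$, so it is meant to mirror the definition rather than to serve as a practical algorithm.

```python
def det(A):
    """Determinant by cofactor expansion along the first row (lists of rows)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor: delete the first row and column j (0-based indices).
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        # With 1-based i = 1 and column j + 1, the sign (-1)^(i+j) is (-1)^j here.
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[1, 2], [3, 4]]))
```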
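Exercise 3.8. Check that the inverse of a $2 \times 2$ matrix
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \]
is
\[ A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}. \]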
Exercise 3.9. Let $A$ be the $2 \times 2$ matrix
\[ A = \begin{pmatrix} 2 & 3 \\ 4 & 5 \end{pmatrix}. \]
What are $A^T$, $\det(A)$, and $A^{-1}$?
4 Computer representation of numbers
Definition 4.1. (Floating point representation in decimal system.) In the decimal system, any real
number $x$ can be represented in a normalized floating point form as
\[ x = \pm q \times 10^n = \pm (0.d_1 d_2 d_3 \ldots)_{10} \times 10^n, \]
where $d_1 \neq 0$, hence $0.1 \le q < 1$, and $n$ is an integer. The numbers $d_1, d_2, \ldots$ are decimal digits $0, 1, \cdots, 9$. For
example, $37.2 = 0.372 \times 10^2$.
Remark 4.2. The requirement $d_1 \neq 0$ ensures a unique representation of each real number. For example,
in normalized form 37.2 cannot be written as $0.0372 \times 10^3$.
What is the smallest unit of information in a computer? A bit. Information and numbers in computer chips
are stored as a series of on and off states (bits), representing 0 and 1. Therefore, numbers need to be
represented in binary form.
Definition 4.3. (Floating point representation in binary system.) In the binary system, any real
number $x$ can be represented as
\[ x = \pm q \times 2^m = \pm (0.b_1 b_2 b_3 \ldots)_2 \times 2^m, \]
where $b_1 \neq 0$, hence $b_1 = 1$ and $q \ge 1/2$. For example,
\[
\begin{aligned}
(0.1)_2 &= 2^{-1} = 0.5, & (0.1)_2 \times 2^{0} &= 0.5, \\
(1.1)_2 &= 2^{0} + 2^{-1} = 1.5, & (0.1)_2 \times 2^{-1} &= 0.25, \\
(11)_2 &= 2^{1} + 2^{0} = 3, & (0.1)_2 \times 2^{3} &= 4.
\end{aligned}
\]
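Python's standard library happens to expose this normalized binary form: `math.frexp(x)` returns a pair `(q, m)` with `x == q * 2**m` and `0.5 <= abs(q) < 1` for nonzero `x`. The sketch below applies it to the values from the examples above; the variable names are illustrative.

```python
import math

for x in (0.5, 0.25, 4.0, 1.5, 3.0):
    q, m = math.frexp(x)   # x == q * 2**m with 0.5 <= |q| < 1
    print(x, q, m)
```

For instance, `math.frexp(4.0)` gives `q = 0.5` and `m = 3`, matching $(0.1)_2 \times 2^3 = 4$.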
There are only a finite number of bits in a computer to represent a number. A single precision floating
point number is stored in 32 bits; a double precision floating point number is stored in 64 bits. The real
numbers that can be represented in a computer are called its machine numbers. Not all real numbers are
machine numbers.
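A quick way to see that a familiar decimal such as 0.1 is not a machine number is to inspect the exact value that gets stored. The sketch below assumes Python floats are 64-bit double precision numbers, which holds on common platforms; `Fraction(0.1)` converts the stored value to an exact rational without further rounding.

```python
from fractions import Fraction

stored = Fraction(0.1)      # exact value of the double closest to 0.1
exact = Fraction(1, 10)     # the real number 1/10
print(stored == exact)      # compares the stored value with 1/10
print(float(stored - exact))

# 0.5 = (0.1)_2 has a finite binary expansion, so it is stored exactly.
print(Fraction(0.5) == Fraction(1, 2))
```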
Definition 4.4. (Standard single-precision floating point representation) A machine number in
standard single-precision floating point representation is
\[ x = (-1)^s \times (1.f)_2 \times 2^{c-127}. \]
The number is represented by 32 bits.
• The leftmost bit is assigned to $s$ for the sign; $s = 0$ and $s = 1$ correspond to $+$ and $-$ respectively.
• The next 8 bits are assigned to $c$. The value of $c$ in representing a floating point number is restricted
by
\[ 0 < c < (11111111)_2 = 255. \]
The values 0 and 255 are reserved for special cases such as $0$ and $\pm\infty$. The range of representable machine
numbers is mainly determined by the number of bits assigned to $c$. Therefore, the largest single precision number is
\[ 2^{254-127} \times (1.f)_2 \approx 2^{128} \approx 3.4 \times 10^{38}, \]
and the smallest positive single precision number of this form is
\[ 2^{1-127} \times (1.0)_2 = 2^{-126} \approx 1.2 \times 10^{-38}. \]
• The last 23 bits are assigned to $f$. The value of $(1.f)_2$ is restricted by
\[ 1 \le (1.f)_2 \le (1.1\ldots1)_2 = 2 - 2^{-23}. \]
The number of significant decimal digits of accuracy is mainly determined by the number of bits assigned to $f$.
For a single precision number it is approximately 6, as
\[ 2^{-23} \approx 0.12 \times 10^{-6}. \]
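The three fields $s$, $c$ and $f$ can be read off an actual 32-bit pattern. The sketch below uses Python's `struct` module to round a value to single precision and split the bits; the helper names `decode_single` and `rebuild` are hypothetical, and `rebuild` covers only the case $0 < c < 255$ described above.

```python
import struct

def decode_single(x):
    """Round x to single precision and return the fields (s, c, f) as integers."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31            # leftmost bit: sign
    c = (bits >> 23) & 0xFF   # next 8 bits: biased exponent
    f = bits & 0x7FFFFF       # last 23 bits: fraction
    return s, c, f

def rebuild(s, c, f):
    """(-1)^s * (1.f)_2 * 2^(c-127), valid only for 0 < c < 255."""
    return (-1) ** s * (1 + f / 2 ** 23) * 2.0 ** (c - 127)

for x in (1.0, -0.15625, 37.2):
    s, c, f = decode_single(x)
    print(x, s, c, format(f, "023b"), rebuild(s, c, f))
```

The rebuilt value for 37.2 differs slightly from 37.2 itself, since 37.2 is not a single precision machine number.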
Definition 4.5. (Standard double-precision floating point representation) A machine number in
standard double-precision floating point representation is
\[ x = (-1)^s \times 2^{c-1023} \times (1.f)_2. \]
The number is represented by 64 bits.
• The leftmost bit is assigned to $s$ for the sign; $s = 0$ and $s = 1$ correspond to $+$ and $-$ respectively.
• The next 11 bits are assigned to $c$. The value of $c$ in representing a floating point number is restricted
by
\[ 0 < c < (11111111111)_2 = 2047. \]
The values 0 and 2047 are reserved for special cases such as $0$ and $\pm\infty$.
• The last 52 bits are assigned to $f$. The value of $(1.f)_2$ is restricted by
\[ 1 \le (1.f)_2 \le (1.1\ldots1)_2 = 2 - 2^{-52}. \]
The number of significant decimal digits of accuracy is mainly determined by the number of bits assigned to $f$.
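The role of the 52 fraction bits can be seen by stepping to the next machine number after 1. The sketch below assumes Python floats are 64-bit doubles and uses a hypothetical helper `next_double_up`, which adds 1 to the bit pattern of a positive finite number, i.e. flips the lowest bit of $f$ when $f$ is all zeros.

```python
import struct

def next_double_up(x):
    """Next larger double after a positive finite x, via its 64-bit pattern."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return struct.unpack(">d", struct.pack(">Q", bits + 1))[0]

gap = next_double_up(1.0) - 1.0
print(gap == 2.0 ** -52)          # spacing of machine numbers just above 1
print(1.0 + 2.0 ** -53 == 1.0)    # an increment of half that spacing is rounded away
```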
Exercise 4.6. Give an estimate of the largest and smallest positive double precision numbers. Give an estimate of
the number of significant decimal digits of a double precision number.