ON PSEUDOSPECTRA AND POWER GROWTH
THOMAS RANSFORD
Abstract. The celebrated Kreiss matrix theorem is one of several results relating the norms of
the powers of a matrix to its pseudospectra (i.e. the level curves of the norm of the resolvent). But
to what extent do the pseudospectra actually determine the norms of the powers? Specifically, let
$A, B$ be square matrices such that, with respect to the usual operator norm $\|\cdot\|$,
$$\|(zI - A)^{-1}\| = \|(zI - B)^{-1}\| \qquad (z \in \mathbb{C}). \tag{$*$}$$
Then it is known that $1/2 \le \|A\|/\|B\| \le 2$. Are there similar bounds for $\|A^n\|/\|B^n\|$ for $n \ge 2$?
Does the answer change if $A, B$ are diagonalizable? What if $(*)$ holds, not just for the norm $\|\cdot\|$,
but also for higher-order singular values? What if we use norms other than the usual operator
norm? The answers to all these questions turn out to be negative, and in a rather strong sense.
1. Introduction and statement of results
Let $N \ge 1$, let $\mathbb{C}^N$ be complex Euclidean $N$-space, and let $\mathbb{C}^{N\times N}$ be the algebra of complex
$N \times N$ matrices. We write $|\cdot|$ for the Euclidean norm on $\mathbb{C}^N$, defined by $|x| := (\sum_{j=1}^N |x_j|^2)^{1/2}$, and
$\|\cdot\|$ for the associated operator norm on $\mathbb{C}^{N\times N}$, defined by $\|A\| := \sup\{|Ax| : |x| = 1\}$.
It is well known that, given $A \in \mathbb{C}^{N\times N}$, the long-term growth of the norms of powers of $A$ is
governed by the spectral radius $\rho(A)$. Indeed, by the spectral radius formula, we have
$$\|A^n\| \ge \rho(A)^n \quad (n \ge 1) \qquad\text{and}\qquad \lim_{n\to\infty} \|A^n\|^{1/n} = \rho(A).$$
However, in the shorter term, $\|A^n\|$ may well be significantly larger than $\rho(A)^n$. The recent book
of Trefethen and Embree [4] contains an account of these transient effects, illustrated by examples
drawn from many different fields. One of the main themes of the book is that, to predict accurately
the growth of $\|A^n\|$, it is important to study not only the spectrum of $A$, but also its pseudospectra,
which we now define.
Given $A \in \mathbb{C}^{N\times N}$ and $\epsilon > 0$, the $\epsilon$-pseudospectrum of $A$ is defined to be the set
$$\sigma_\epsilon(A) := \{z \in \mathbb{C} : \|(zI - A)^{-1}\| > \epsilon^{-1}\}.$$
Here and in what follows we adopt the useful convention that $\|(zI - A)^{-1}\| = \infty$ if $z \in \sigma(A)$, the
spectrum of $A$. Thus $\sigma_\epsilon(A)$ shrinks to $\sigma(A)$ as $\epsilon \downarrow 0$. From a knowledge of the pseudospectra of
$A$, it is possible to deduce both upper and lower bounds on the growth of $\|A^n\|$. A well-known
result of this type is the Kreiss matrix theorem. We refer to [4] for this and several other such
results. In addition, there are efficient methods for numerical computation of pseudospectra (see
[4, Chapter IX]), so this approach is highly practical.
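To make the definition concrete, here is a minimal Python sketch (NumPy assumed) that evaluates the resolvent norm at a point and tests membership in an $\epsilon$-pseudospectrum. It is only an illustration, not one of the efficient methods of [4]: the function names `resolvent_norm` and `in_pseudospectrum` and the test matrix `J` are hypothetical choices, and the norm is obtained as the reciprocal of the smallest singular value of $zI - A$.

```python
import numpy as np

def resolvent_norm(A, z):
    """Operator 2-norm of (zI - A)^{-1}; by convention inf when z is an eigenvalue."""
    A = np.asarray(A, dtype=complex)
    # Singular values are returned in decreasing order, so the last one is s_N(zI - A).
    s_min = np.linalg.svd(z * np.eye(A.shape[0]) - A, compute_uv=False)[-1]
    return np.inf if s_min == 0.0 else 1.0 / s_min

def in_pseudospectrum(A, z, eps):
    """True when z lies in the eps-pseudospectrum, i.e. the resolvent norm exceeds 1/eps."""
    return resolvent_norm(A, z) > 1.0 / eps

# Hypothetical test matrix: a 2x2 Jordan block, whose spectrum is {0}.
J = np.array([[0.0, 1.0], [0.0, 0.0]])
r = resolvent_norm(J, 0.5)  # (0.5 I - J)^{-1} = [[2, 4], [0, 2]], of norm 2 + 2*sqrt(2)
```

For this non-normal example the point $z = 0.5$ lies at distance $0.5$ from the spectrum, yet it belongs to the $0.25$-pseudospectrum, since the resolvent norm there is about $4.83 > 4$.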
2000 Mathematics Subject Classification. Primary 47A10; Secondary 15A18, 15A60, 65F15.
Key words and phrases. matrix, norm, spectral radius, eigenvalue, singular value, pseudospectra.
Research supported by grants from NSERC and the Canada Research Chairs program.

The purpose of this note is to show that, in predicting power growth, not even pseudospectra
tell the whole story. The issue was already addressed by Greenbaum and Trefethen in [3] (see also
[4, §47]). Suppose that two $N \times N$ matrices $A, B$ have identical pseudospectra, i.e. that
$$\|(zI - A)^{-1}\| = \|(zI - B)^{-1}\| \qquad (z \in \mathbb{C}). \tag{1}$$
Does it then follow that $\|p(A)\| = \|p(B)\|$ for every polynomial $p$? In particular, can we conclude
that $\|A^n\| = \|B^n\|$ for all $n \ge 1$? The answer is no. An example was given in [3] (and again in
[4]) of two matrices $A, B$ with identical pseudospectra such that $\|A\| = 1$ and $\|B\| = \sqrt{2}$. But this
example leaves several basic questions unresolved:
1.1. What about higher powers? By adapting the Greenbaum–Trefethen example, one can
construct, for each $\epsilon > 0$, matrices $A, B$ with identical pseudospectra such that $\|A\|/\|B\| > 2 - \epsilon$.
On the other hand, it is known that if $A, B$ satisfy (1), then we must have
$$1/2 \le \|A\|/\|B\| \le 2. \tag{2}$$
(For a proof, see [4, pp.168–169]; see also the remark after Theorem 5.1 below.) Are there similar
bounds for $\|A^n\|/\|B^n\|$ for $n \ge 2$? If this were the case, then we could justifiably say that
pseudospectra determine power norms, at least up to a constant factor. However, our first result
answers this question negatively, and in a fairly strong sense.
Recall that a (finite or infinite) sequence $(\alpha_k)$ is called submultiplicative if $\alpha_{k+l} \le \alpha_k \alpha_l$ for all $k, l$
for which the inequality makes sense. For example, the sequence $(\|A^k\|)_{k \ge 1}$ is submultiplicative
for every matrix $A$.
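The definition can be checked mechanically. The following Python sketch (NumPy assumed) is only an illustration: the helper name `is_submultiplicative`, its `start` parameter (the index of the first term) and the example matrix are hypothetical choices, not part of the paper.

```python
import numpy as np

def is_submultiplicative(seq, start=1, rtol=1e-12):
    """seq[i] holds alpha_{start+i}. Check alpha_{k+l} <= alpha_k * alpha_l
    for all k, l such that k, l and k + l are all valid indices."""
    a = {start + i: v for i, v in enumerate(seq)}
    return all(a[k + l] <= a[k] * a[l] * (1.0 + rtol)
               for k in a for l in a if (k + l) in a)

# Hypothetical non-normal matrix; its power norms form a submultiplicative sequence.
A = np.array([[0.5, 3.0], [0.0, 0.5]])
power_norms = [np.linalg.norm(np.linalg.matrix_power(A, k), 2) for k in range(1, 7)]
```

Note that for a sequence indexed from $2$, such as $\alpha_2, \alpha_3$, the inequality never "makes sense" (the smallest sum of two indices is $4$), so any two positive numbers qualify; this is why `is_submultiplicative([5.0, 100.0], start=2)` holds.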
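Theorem 1.1. Let $n \ge 2$, and let $\alpha_2, \dots, \alpha_n$ and $\beta_2, \dots, \beta_n$ be positive submultiplicative sequences.
Then there exist $N \ge 1$ and matrices $A, B \in \mathbb{C}^{N\times N}$ such that
$$\|(zI - A)^{-1}\| = \|(zI - B)^{-1}\| \qquad (z \in \mathbb{C}) \tag{3}$$
and
$$\|A^k\| = \alpha_k \quad\text{and}\quad \|B^k\| = \beta_k \qquad (k = 2, \dots, n). \tag{4}$$
We may take $N = 2n + 3$.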
This shows that matrices can have identical pseudospectra and yet their second and higher
powers have norms that are completely unrelated to each other.
1.2. What about diagonalizable matrices? The matrices A, B in the Greenbaum–Trefethen
example are nilpotent, as are those constructed in the proof of Theorem 1.1 above. Obviously,
these are rather special. What happens if, instead, we consider more generic matrices, for example
diagonalizable matrices? (By ‘diagonalizable’, we mean similar to a diagonal matrix.) Could it be
that, for such matrices at least, the pseudospectra completely determine the power growth? Until
now, no counterexample was known. We obtain one by combining the construction in Theorem 1.1
with a perturbation argument.
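Theorem 1.2. Let $n \ge 2$, let $\alpha_2, \dots, \alpha_n$ and $\beta_2, \dots, \beta_n$ be positive submultiplicative sequences,
and let $\epsilon > 0$. Then there exist $N \ge 1$ and diagonalizable matrices $A, B \in \mathbb{C}^{N\times N}$ such that
$$\|(zI - A)^{-1}\| = \|(zI - B)^{-1}\| \qquad (z \in \mathbb{C}) \tag{5}$$
and
$$\alpha_k - \epsilon < \|A^k\| < \alpha_k + \epsilon \quad\text{and}\quad \beta_k - \epsilon < \|B^k\| < \beta_k + \epsilon \qquad (k = 2, \dots, n). \tag{6}$$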
1.3. What if we use ‘higher-order’ pseudospectra? Given a matrix $A \in \mathbb{C}^{N\times N}$, its singular
values $s_1(A), \dots, s_N(A)$ are the square roots of the eigenvalues of $A^*A$, listed in decreasing order.
In particular, $s_1(A) = \|A\|$. One of the principal methods for computing the pseudospectra of $A$ is
to calculate the singular values of $zI - A$ for $z \in \mathbb{C}$ (once done for one value of $z$, it is relatively
inexpensive to do it for many $z$), and then use the fact that
$$\|(zI - A)^{-1}\| = s_1((zI - A)^{-1}) = \frac{1}{s_N(zI - A)}.$$
It is thus reasonable to ask whether, by retaining the other singular values of $(zI - A)^{-1}$, it is
possible to determine $\|A^n\|$ for values of $n \ge 2$. The following result gives a partial positive answer.
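Theorem 1.3. Let $N \ge 1$ and let $A, B \in \mathbb{C}^{N\times N}$ be matrices satisfying
$$s_j((zI - A)^{-1}) = s_j((zI - B)^{-1}) \qquad (z \in \mathbb{C},\ j = 1, \dots, N). \tag{7}$$
Then, for every polynomial $p$,
$$\frac{1}{\sqrt{N}} \le \frac{\|p(A)\|}{\|p(B)\|} \le \sqrt{N}. \tag{8}$$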
It would be interesting to know if these bounds can be improved so as to be independent of N .
However, even if this were the case, the theorem would be a bit unrealistic, because it would require
us to keep track of all N singular values, which is probably too expensive in practice. Is there an
analogous result where, by keeping track of just a few singular values, we can obtain inequalities
like (8) at least for polynomials of low degree? The following generalization of Theorem 1.1 gives
a negative answer.
Theorem 1.4. Let $n \ge 2$, let $\alpha_2, \dots, \alpha_n$ and $\beta_2, \dots, \beta_n$ be positive submultiplicative sequences,
and let $m \ge 1$. Then there exist $N \ge 1$ and matrices $A, B \in \mathbb{C}^{N\times N}$ such that
$$s_j((zI - A)^{-1}) = s_j((zI - B)^{-1}) \qquad (z \in \mathbb{C},\ j = 1, \dots, m) \tag{9}$$
and
$$\|A^k\| = \alpha_k \quad\text{and}\quad \|B^k\| = \beta_k \qquad (k = 2, \dots, n). \tag{10}$$
We may take $N = (m + 1)(n + 2) - 1$.
1.4. What about other norms? Though the Euclidean-norm case is undoubtedly the most
important one, there are instances where it is more appropriate to consider pseudospectra defined
with respect to other norms. In [4, §§56,57], several examples are given based on the 1-norm
on $\mathbb{C}^N$, defined by $|x|_1 := \sum_{j=1}^N |x_j|$, and the associated operator norm $\|\cdot\|_1$, given by
$\|A\|_1 := \sup\{|Ax|_1 : |x|_1 = 1\}$. There is no analogue of the Greenbaum–Trefethen example for this norm,
because of the following theorem.
Theorem 1.5. Let $N \ge 1$ and let $A, B \in \mathbb{C}^{N\times N}$ be matrices satisfying
$$\|(zI - A)^{-1}\|_1 = \|(zI - B)^{-1}\|_1 \qquad (z \in \mathbb{C}). \tag{11}$$
Then $\|A\|_1 = \|B\|_1$.
Can we also deduce that $\|A^n\|_1 = \|B^n\|_1$ for $n \ge 2$? Once again, the answer turns out to be
no, and not just for $\|\cdot\|_1$, but for a whole variety of possible norms. To make this precise, it is
convenient to introduce some notation and terminology.
Given square matrices $A, B$, perhaps of different sizes, we shall write $A \oplus B$ for the block matrix
$$\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}.$$
A norm $|||\cdot|||$ on $\mathbb{C}^{N\times N}$ will be called admissible if it satisfies the following three conditions:
• $|||\cdot|||$ is an algebra norm, i.e. $|||AB||| \le |||A|||\,|||B|||$ for all $A, B \in \mathbb{C}^{N\times N}$ and $|||I||| = 1$;
• every permutation matrix $Q \in \mathbb{C}^{N\times N}$ satisfies $|||Q||| = 1$;
• every block matrix $A \oplus B \in \mathbb{C}^{N\times N}$ satisfies $|||A \oplus B||| = \max(|||A \oplus 0|||, |||0 \oplus B|||)$.
For example, if $|\cdot|_p$ is the usual $p$-norm on $\mathbb{C}^N$, given by $|x|_p := (\sum_{j=1}^N |x_j|^p)^{1/p}$, then the associated
operator norm on $\mathbb{C}^{N\times N}$ is admissible.
The following result is a generalization of Theorem 1.1 to this context.
Theorem 1.6. Let $n \ge 2$, and let $\alpha_2, \dots, \alpha_n$ and $\beta_2, \dots, \beta_n$ be positive submultiplicative sequences.
Then there exist $N \ge 1$ and $A, B \in \mathbb{C}^{N\times N}$ such that, for every admissible norm $|||\cdot|||$ on $\mathbb{C}^{N\times N}$,
$$|||(zI - A)^{-1}||| = |||(zI - B)^{-1}||| \qquad (z \in \mathbb{C}) \tag{12}$$
and
$$|||A^k||| = \alpha_k \quad\text{and}\quad |||B^k||| = \beta_k \qquad (k = 2, \dots, n). \tag{13}$$
We may take $N = 2n + 3$.
We conclude by remarking that there is at least one well-known norm on $\mathbb{C}^{N\times N}$ for which
matrices A, B with identical pseudospectra do have identical power growth. This is the Hilbert–
Schmidt (or Frobenius) norm, as was shown by Greenbaum and Trefethen in [3]. We shall need
their result in §4, where more details will be given. Of course, the Hilbert–Schmidt norm is not
an admissible norm in our sense; in fact it fails all three parts of the definition.
The rest of the paper is devoted to the proofs of the six theorems above.
2. Proof of Theorem 1.1
The proof of Theorem 1.1 is based on a construction using weighted shifts, which will also serve
as a model in several other proofs to follow. It is therefore written in such a way as to be easy to
adapt to other situations.
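Given $\omega_1, \dots, \omega_n > 0$, we write
$$S(\omega_1, \dots, \omega_n) := \begin{pmatrix}
0 & \omega_1 & 0 & \dots & 0 \\
0 & 0 & \omega_2/\omega_1 & \dots & 0 \\
\vdots & \vdots & & \ddots & \vdots \\
0 & 0 & 0 & \dots & \omega_n/\omega_{n-1} \\
0 & 0 & 0 & \dots & 0
\end{pmatrix}. \tag{14}$$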
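Lemma 2.1. Let $\omega_1, \dots, \omega_n$ be a positive submultiplicative sequence, and let $S = S(\omega_1, \dots, \omega_n)$.
Then
$$\|S^k\| = \omega_k \qquad (k = 1, \dots, n) \tag{15}$$
and
$$1 + \max_{1 \le k \le n} \frac{\omega_k |z|^k}{2} \le \|(I - zS)^{-1}\| \le 1 + \sum_{k=1}^n \omega_k |z|^k \qquad (z \in \mathbb{C}). \tag{16}$$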
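Proof. Let $k \in \{1, \dots, n\}$. Taking $Q$ to be an appropriate cyclic permutation matrix, we have
$S^k Q = \mathrm{diag}(\omega_k, \omega_{k+1}/\omega_1, \dots, \omega_n/\omega_{n-k}, 0, \dots, 0)$. Using the submultiplicativity of the sequence
$\omega_1, \dots, \omega_n$, we obtain
$$\|S^k Q\| = \max(\omega_k, \omega_{k+1}/\omega_1, \dots, \omega_n/\omega_{n-k}, 0, \dots, 0) = \omega_k.$$
Since $\|Q\| = 1 = \|Q^{-1}\|$ and $\|\cdot\|$ is an algebra norm, it follows that $\|S^k\| = \omega_k$. This proves (15).
For the upper bound in (16), note that $(I - zS)^{-1} = \sum_{k=0}^n z^k S^k$, whence
$$\|(I - zS)^{-1}\| = \Bigl\|\sum_{k=0}^n z^k S^k\Bigr\| \le \sum_{k=0}^n |z|^k \|S^k\| = 1 + \sum_{k=1}^n |z|^k \omega_k \qquad (z \in \mathbb{C}).$$
For the lower bound, first fix $k \in \{1, \dots, n\}$ and $z \in \mathbb{C}$. Let $Q$ be the permutation matrix that
exchanges rows 2 and $k + 1$, and let $P := \mathrm{diag}(1, e^{i\theta}, 0, \dots, 0)$, where $\theta = -\arg(z^k)$. Then
$$P Q (I - zS)^{-1} Q P = \begin{pmatrix} 1 & |z|^k \omega_k \\ 0 & 1 \end{pmatrix} \oplus 0,$$
and hence
$$\|(I - zS)^{-1}\| \ge \left\| \begin{pmatrix} 1 & |z|^k \omega_k \\ 0 & 1 \end{pmatrix} \oplus 0 \right\|.$$
Conjugating by the permutation matrix that swaps the first two rows, we have
$$\left\| \begin{pmatrix} 1 & |z|^k \omega_k \\ 0 & 1 \end{pmatrix} \oplus 0 \right\| = \left\| \begin{pmatrix} 1 & 0 \\ |z|^k \omega_k & 1 \end{pmatrix} \oplus 0 \right\|.$$
Taking averages, it follows that
$$\|(I - zS)^{-1}\| \ge \left\| \begin{pmatrix} 1 & |z|^k \omega_k/2 \\ |z|^k \omega_k/2 & 1 \end{pmatrix} \oplus 0 \right\|.$$
Since $\|\cdot\|$ is always at least as large as the spectral radius, we deduce that
$$\|(I - zS)^{-1}\| \ge \rho\left( \begin{pmatrix} 1 & |z|^k \omega_k/2 \\ |z|^k \omega_k/2 & 1 \end{pmatrix} \oplus 0 \right) = 1 + \frac{|z|^k \omega_k}{2}.$$
As this holds for each $k \in \{1, \dots, n\}$ and each $z \in \mathbb{C}$, we obtain the lower bound in (16).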
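Lemma 2.1 is easy to test numerically. The sketch below (Python with NumPy assumed) builds the weighted shift (14) and checks (15) and (16) at a few sample points; the names `weighted_shift`, `op_norm`, `bounds_hold` and the sequence $(2, 3, 1)$ are hypothetical choices made for illustration, and the small relative tolerance only absorbs rounding error.

```python
import numpy as np

def weighted_shift(omega):
    """Return the (n+1)x(n+1) matrix S(omega_1, ..., omega_n) of (14)."""
    omega = np.asarray(omega, dtype=float)
    # Superdiagonal weights: omega_1, omega_2/omega_1, ..., omega_n/omega_{n-1}.
    weights = np.concatenate(([omega[0]], omega[1:] / omega[:-1]))
    return np.diag(weights, k=1)

def op_norm(M):
    """Operator norm associated with the Euclidean norm (largest singular value)."""
    return np.linalg.norm(M, 2)

omega = [2.0, 3.0, 1.0]  # hypothetical submultiplicative sequence: 3 <= 2*2, 1 <= 2*3
n = len(omega)
S = weighted_shift(omega)
power_norms = [op_norm(np.linalg.matrix_power(S, k)) for k in range(1, n + 1)]

def bounds_hold(z, rtol=1e-12):
    """Check the two-sided estimate (16) at the point z."""
    r = op_norm(np.linalg.inv(np.eye(n + 1) - z * S))
    terms = [w * abs(z) ** k for k, w in enumerate(omega, start=1)]
    return (1 + max(terms) / 2 <= r * (1 + rtol)) and (r <= (1 + sum(terms)) * (1 + rtol))
```

Here `power_norms` reproduces `omega`, as (15) predicts, because each power $S^k$ has at most one nonzero entry in every row and column.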
Proof of Theorem 1.1. First, choose $\alpha_1, \beta_1$ large enough so that the sequences $\alpha_1, \dots, \alpha_n$ and
$\beta_1, \dots, \beta_n$ are submultiplicative, and set
$$A_0 := S(\alpha_1, \dots, \alpha_n) \quad\text{and}\quad B_0 := S(\beta_1, \dots, \beta_n).$$
Then set $A := A_0 \oplus C_0$ and $B := B_0 \oplus C_0$, where $C_0$ is the $(n + 2) \times (n + 2)$ matrix defined by
$$C_0 := S(\Gamma, \gamma^2, \gamma^3, \dots, \gamma^{n+1}).$$
Here $\Gamma, \gamma$ are positive numbers to be chosen later. Note that the sequence $\Gamma, \gamma^2, \dots, \gamma^{n+1}$ is
submultiplicative provided that $\Gamma \ge \gamma$. Defined in this way, $A, B$ are $(2n + 3) \times (2n + 3)$ matrices,
and we shall show that they satisfy (3) and (4) if $\gamma, \Gamma$ are chosen suitably.
We first choose $\gamma > 0$ small enough so that $\gamma^k < \min(\alpha_k, \beta_k)$ $(k = 2, \dots, n)$. With this choice,
$$\|A^k\| = \max(\|A_0^k\|, \|C_0^k\|) = \max(\alpha_k, \gamma^k) = \alpha_k \qquad (k = 2, \dots, n),$$
and likewise $\|B^k\| = \beta_k$ $(k = 2, \dots, n)$.
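It remains to choose $\Gamma$ to ensure that $A, B$ have identical pseudospectra. This will be the case
provided that
$$\|(I - zC_0)^{-1}\| \ge \|(I - zA_0)^{-1}\| \quad\text{and}\quad \|(I - zC_0)^{-1}\| \ge \|(I - zB_0)^{-1}\| \qquad (z \in \mathbb{C}), \tag{17}$$
because the norm of a direct sum is the maximum of the norms of its summands, and
$(zI - M)^{-1} = z^{-1}(I - z^{-1}M)^{-1}$ for every square matrix $M$ and every $z \ne 0$.
By Lemma 2.1, (17) will be true if
$$\max\Bigl(\frac{\Gamma t}{2}, \frac{\gamma^{n+1} t^{n+1}}{2}\Bigr) \ge \sum_{k=1}^n \alpha_k t^k \quad\text{and}\quad \max\Bigl(\frac{\Gamma t}{2}, \frac{\gamma^{n+1} t^{n+1}}{2}\Bigr) \ge \sum_{k=1}^n \beta_k t^k \qquad (t \ge 0).$$
Now there exists $t_0$, depending on $\alpha_1, \dots, \alpha_n, \beta_1, \dots, \beta_n, \gamma$, but not on $\Gamma$, such that
$$\frac{\gamma^{n+1} t^{n+1}}{2} \ge \sum_{k=1}^n \alpha_k t^k \quad\text{and}\quad \frac{\gamma^{n+1} t^{n+1}}{2} \ge \sum_{k=1}^n \beta_k t^k \qquad (t \ge t_0).$$
Hence it will suffice that
$$\frac{\Gamma t}{2} \ge \sum_{k=1}^n \alpha_k t^k \quad\text{and}\quad \frac{\Gamma t}{2} \ge \sum_{k=1}^n \beta_k t^k \qquad (0 \le t \le t_0).$$
This will certainly be true provided we choose $\Gamma$ large enough. With this choice, the construction
is complete.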
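As a sanity check, the construction can be carried out numerically in the smallest case $n = 2$. The Python sketch below (NumPy assumed) is an illustration under hypothetical parameter choices, not part of the proof: we take $\alpha_2 = 1$, $\beta_2 = 4$, then $\alpha_1 = 1$, $\beta_1 = 2$, $\gamma = 0.5$ and $\Gamma = 600$. With these values one checks by hand that $t^3/16 \ge 2t + 4t^2$ for $t \ge 65$ and $300t \ge 2t + 4t^2$ for $t \le 74.5$, so the two conditions on $\Gamma$ and $\gamma$ hold for all $t \ge 0$. The resolvent norms are compared only on a finite grid of sample points.

```python
import numpy as np

def weighted_shift(omega):
    """The matrix S(omega_1, ..., omega_n) of (14)."""
    omega = np.asarray(omega, dtype=float)
    return np.diag(np.concatenate(([omega[0]], omega[1:] / omega[:-1])), k=1)

def direct_sum(X, Y):
    """Block-diagonal matrix X (+) Y."""
    Z = np.zeros((X.shape[0] + Y.shape[0], X.shape[1] + Y.shape[1]))
    Z[:X.shape[0], :X.shape[1]] = X
    Z[X.shape[0]:, X.shape[1]:] = Y
    return Z

def resolvent_norm(M, z):
    """||(zI - M)^{-1}|| as the reciprocal of the smallest singular value of zI - M."""
    s = np.linalg.svd(z * np.eye(M.shape[0]) - M, compute_uv=False)
    return 1.0 / s[-1]

n = 2
alpha = [1.0, 1.0]        # alpha_1 chosen by us, alpha_2 = 1 prescribed (hypothetical)
beta = [2.0, 4.0]         # beta_1 chosen by us, beta_2 = 4 prescribed (hypothetical)
gamma, Gamma = 0.5, 600.0  # gamma^2 < min(alpha_2, beta_2); Gamma large enough

A0, B0 = weighted_shift(alpha), weighted_shift(beta)
C0 = weighted_shift([Gamma] + [gamma ** k for k in range(2, n + 2)])
A, B = direct_sum(A0, C0), direct_sum(B0, C0)

norm_A2 = np.linalg.norm(A @ A, 2)
norm_B2 = np.linalg.norm(B @ B, 2)

# Sample points away from the common spectrum {0}.
grid = [r * np.exp(1j * t) for r in (0.05, 0.5, 2.0, 20.0, 200.0, 2000.0)
        for t in (0.0, 1.0, 2.5)]
same_resolvent_norms = all(
    np.isclose(resolvent_norm(A, z), resolvent_norm(B, z), rtol=1e-6) for z in grid)
```

In this example $A$ and $B$ are $7 \times 7$, their resolvent norms agree at every sampled point, and yet $\|A^2\| = 1$ while $\|B^2\| = 4$.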
3. Proof of Theorem 1.2
The basic idea is to perturb the construction in the proof of Theorem 1.1. However, keeping
track of the norms of the resolvents requires a certain amount of care. We shall need two lemmas.
Lemma 3.1. Let $V, W \in \mathbb{C}^{N\times N}$. If $V$ is invertible and $\|V - W\| < 1/(2\|V^{-1}\|)$, then $W$ is
invertible and
$$\|V^{-1} - W^{-1}\| \le 2\|V^{-1}\|^2 \|V - W\|.$$
Proof. We have
$$\|I - V^{-1}W\| = \|V^{-1}(V - W)\| \le \|V^{-1}\| \|V - W\| \le 1/2.$$
Therefore $V^{-1}W$ is invertible, and hence so too is $W$.
We also have
$$\|V^{-1} - W^{-1}\| = \|V^{-1}(W - V)W^{-1}\| \le \|V^{-1}\| \|W - V\| \|W^{-1}\|. \tag{18}$$
Since $\|W - V\| \le 1/(2\|V^{-1}\|)$, it follows that $\|V^{-1} - W^{-1}\| \le \|W^{-1}\|/2$, whence $\|W^{-1}\| \le 2\|V^{-1}\|$.
Substituting this back into (18) gives the result.
In the next lemma, adj denotes adjugate and ρ denotes spectral radius.
Lemma 3.2. Let $V \in \mathbb{C}^{N\times N}$. Then
$$\|\mathrm{adj}(V) - (-V)^{N-1}\| \le 2^N \rho(V) \|V\|^{N-2}.$$
Proof. It suffices to prove this when $V$ is invertible, since invertible matrices are dense in $\mathbb{C}^{N\times N}$.
Let $p(z)$ be the characteristic polynomial of $V$. Since $p(z) = \prod_{j=1}^N (z - \lambda_j)$, where $\lambda_1, \dots, \lambda_N$
are the eigenvalues of $V$, we have $p(z) = \sum_{j=0}^N a_j z^j$, where
$$a_N = 1, \qquad a_0 = (-1)^N \det(V) \qquad\text{and}\qquad |a_j| \le \binom{N}{j} \rho(V)^{N-j} \quad (j = 0, \dots, N). \tag{19}$$
Now by the Cayley–Hamilton theorem $p(V) = 0$. Multiplying by $V^{-1}$ and rearranging gives
$$a_0 V^{-1} + a_N V^{N-1} = -\sum_{j=1}^{N-1} a_j V^{j-1}.$$
Using (19), it follows that
$$\|(-1)^N \det(V) V^{-1} + V^{N-1}\| \le \sum_{j=1}^{N-1} \binom{N}{j} \rho(V)^{N-j} \|V\|^{j-1}.$$
Since $\det(V)V^{-1} = \mathrm{adj}(V)$ and $\rho(V) \le \|V\|$, we deduce that
$$\|(-1)^N \mathrm{adj}(V) + V^{N-1}\| \le \sum_{j=1}^{N-1} \binom{N}{j} \rho(V) \|V\|^{N-j-1} \|V\|^{j-1} \le 2^N \rho(V) \|V\|^{N-2},$$
whence the result.
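The inequality of Lemma 3.2 can be spot-checked on random matrices. The following Python sketch (NumPy assumed) is only a numerical illustration; the helper names `adjugate` and `lemma_3_2_sides`, the matrix size and the random seed are hypothetical choices, and the adjugate is computed as $\det(V)V^{-1}$, which presumes that each sample is invertible.

```python
import numpy as np

def adjugate(V):
    """Adjugate of an invertible matrix, via adj(V) = det(V) V^{-1}."""
    return np.linalg.det(V) * np.linalg.inv(V)

def lemma_3_2_sides(V):
    """Return both sides of ||adj(V) - (-V)^{N-1}|| <= 2^N rho(V) ||V||^{N-2}."""
    N = V.shape[0]
    lhs = np.linalg.norm(adjugate(V) - np.linalg.matrix_power(-V, N - 1), 2)
    rho = max(abs(np.linalg.eigvals(V)))
    rhs = 2.0 ** N * rho * np.linalg.norm(V, 2) ** (N - 2)
    return lhs, rhs

rng = np.random.default_rng(0)  # fixed seed so that the check is reproducible
samples = [rng.standard_normal((4, 4)) for _ in range(20)]
all_ok = all(lhs <= rhs for lhs, rhs in map(lemma_3_2_sides, samples))

# In the 2x2 case the left-hand side is explicit: adj(V) + V = trace(V) I.
V2 = np.array([[1.0, 2.0], [3.0, 4.0]])
```

For $N = 2$ the lemma reduces to $|\mathrm{trace}(V)| \le 4\rho(V)$, which gives a quick way to see that the statement is plausible.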
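Proof of Theorem 1.2. Define $A_0, B_0, C_0$ as in the proof of Theorem 1.1. The choice of $\gamma$ is the
same as before, so that $\|(A_0 \oplus C_0)^k\| = \alpha_k$ and $\|(B_0 \oplus C_0)^k\| = \beta_k$ for $k = 2, \dots, n$. This time,
however, we choose $\Gamma$ a little differently, stipulating that $\Gamma \ge \gamma$ and
$$\frac{\Gamma t}{2} \ge \sum_{k=1}^n \alpha_k t^k + 2t \quad\text{and}\quad \frac{\Gamma t}{2} \ge \sum_{k=1}^n \beta_k t^k + 2t \qquad (0 \le t \le 1/\gamma). \tag{20}$$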
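The next step is to perturb $A_0, B_0, C_0$ so as to obtain diagonalizable matrices. To this end, we
fix distinct complex numbers $\zeta_1, \dots, \zeta_{n+1}$ of modulus $1/2$, and set
$$D := \mathrm{diag}(\zeta_1, \dots, \zeta_{n+1}) \quad\text{and}\quad D' := \mathrm{diag}(\zeta_1, \dots, \zeta_{n+1}, 0).$$
Then, for each $\delta > 0$, we define
$$A_\delta := A_0 + \delta D, \qquad B_\delta := B_0 + \delta D \qquad\text{and}\qquad C_\delta := C_0 + \delta D'.$$
Each of $A_\delta, B_\delta, C_\delta$ has distinct eigenvalues, so $A_\delta \oplus C_\delta$ and $B_\delta \oplus C_\delta$ are diagonalizable.
By continuity, if $\delta > 0$ is chosen small enough, then
$$\bigl| \|(A_\delta \oplus C_\delta)^k\| - \alpha_k \bigr| < \epsilon \quad\text{and}\quad \bigl| \|(B_\delta \oplus C_\delta)^k\| - \beta_k \bigr| < \epsilon \qquad (k = 2, \dots, n).$$
We next show that, reducing $\delta$ if necessary, we have
$$\|(C_\delta - zI)^{-1}\| \ge \|(A_\delta - zI)^{-1}\| \quad\text{and}\quad \|(C_\delta - zI)^{-1}\| \ge \|(B_\delta - zI)^{-1}\| \qquad (|z| \ge \gamma). \tag{21}$$
By Lemma 2.1, we have
$$\|(I - wC_0)^{-1}\| - \|(I - wA_0)^{-1}\| \ge (1 + \Gamma|w|/2) - \Bigl(1 + \sum_{k=1}^n |\alpha_k| |w|^k\Bigr) \qquad (w \in \mathbb{C}).$$
From our choice of $\Gamma$ in (20), it follows that
$$\|(I - wC_0)^{-1}\| - \|(I - wA_0)^{-1}\| \ge 2|w| \qquad (|w| \le 1/\gamma).$$
We now apply Lemma 3.1 with $V = I - wA_0$ and $W = I - wA_\delta$. We find that, if
$\delta |w| \|D\| \le 1/(2\|(I - wA_0)^{-1}\|)$, then
$$\|(I - wA_\delta)^{-1} - (I - wA_0)^{-1}\| \le 2\|(I - wA_0)^{-1}\|^2 \delta |w| \|D\|.$$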
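It follows that, if $\delta$ is chosen small enough, then
$$\|(I - wA_\delta)^{-1} - (I - wA_0)^{-1}\| \le |w| \qquad (|w| \le 1/\gamma).$$
Likewise, if $\delta$ is small enough, then
$$\|(I - wC_\delta)^{-1} - (I - wC_0)^{-1}\| \le |w| \qquad (|w| \le 1/\gamma).$$
Putting all of this together, we find that, if $\delta$ is sufficiently small, then
$$\|(I - wC_\delta)^{-1}\| \ge \|(I - wA_\delta)^{-1}\| \qquad (|w| \le 1/\gamma),$$
from which (21) follows for $A$. Evidently, a similar argument applies to $B$.
The next step is to show that, by reducing $\delta$ yet further, we may ensure that
$$\|(C_\delta - zI)^{-1}\| \ge \|(A_\delta - zI)^{-1}\| \quad\text{and}\quad \|(C_\delta - zI)^{-1}\| \ge \|(B_\delta - zI)^{-1}\| \qquad (|z| \le \delta). \tag{22}$$
For this we use Lemma 3.2. Applying this lemma with $V = A_\delta - zI$, and recalling that this is an
$(n + 1) \times (n + 1)$ matrix, we obtain
$$\|\mathrm{adj}(A_\delta - zI) - (zI - A_\delta)^n\| \le 2^{n+1} \rho(A_\delta - zI) \|A_\delta - zI\|^{n-1}.$$
Now $\sigma(A_\delta - zI) = \sigma(\delta D - zI)$, so
$$\rho(A_\delta - zI) = \rho(\delta D - zI) \le \|\delta D - zI\| \le \delta + |z|.$$
It follows that
$$\|\mathrm{adj}(A_\delta - zI) - (zI - A_\delta)^n\| \le 2^{n+1} (\delta + |z|) \|A_\delta - zI\|^{n-1}.$$
Note also that $\sup_{|z| \le \delta} \|(zI - A_\delta)^n - (-A_0)^n\| = O(\delta)$ as $\delta \to 0$. Hence there exists a constant $K$,
independent of $z, \delta$, such that
$$\|\mathrm{adj}(A_\delta - zI) - (-A_0)^n\| \le K\delta \qquad (|z| \le \delta).$$
Similarly, as $C_\delta$ is an $(n + 2) \times (n + 2)$ matrix, there exists a constant $K'$ such that
$$\|\mathrm{adj}(C_\delta - zI) - (-C_0)^{n+1}\| \le K'\delta \qquad (|z| \le \delta).$$
Since $A_0^n \ne 0$ and $C_0^{n+1} \ne 0$, it follows that, if $\delta$ is small enough, then
$$\|\mathrm{adj}(C_\delta - zI)\| \ge \|C_0^{n+1}\|/2 \quad\text{and}\quad \|\mathrm{adj}(A_\delta - zI)\| \le 2\|A_0^n\| \qquad (|z| \le \delta).$$
Now
$$\mathrm{adj}(C_\delta - zI) = \det(C_\delta - zI)(C_\delta - zI)^{-1}, \qquad \mathrm{adj}(A_\delta - zI) = \det(A_\delta - zI)(A_\delta - zI)^{-1}$$
and
$$\frac{\det(C_\delta - zI)}{\det(A_\delta - zI)} = \frac{\det(\delta D' - zI)}{\det(\delta D - zI)} = -z.$$
Combining these facts, we obtain that, for sufficiently small $\delta > 0$,
$$\frac{\|(C_\delta - zI)^{-1}\|}{\|(A_\delta - zI)^{-1}\|} \ge \frac{1}{|z|} \frac{\|C_0^{n+1}\|}{4\|A_0^n\|} \ge \frac{1}{\delta} \frac{\|C_0^{n+1}\|}{4\|A_0^n\|} \qquad (|z| \le \delta).$$
Thus, if $\delta$ is chosen small enough, then (22) holds for $A$. The argument for $B$ is similar.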
Fix $\delta > 0$ so that (21) and (22) hold. Summarizing what we have achieved so far, if we define
$A := A_\delta \oplus C_\delta$ and $B := B_\delta \oplus C_\delta$, then $A, B$ are diagonalizable matrices satisfying (6) and
$$\|(A - zI)^{-1}\| = \|(B - zI)^{-1}\| \qquad (z \in \mathbb{C} \setminus Q),$$
where $Q$ is the annulus $\{z \in \mathbb{C} : \delta < |z| < \gamma\}$. It remains to deal with the case $z \in Q$, which
we do as follows. Let $L$ be the maximum of $\sup_{z \in Q} \|(A - zI)^{-1}\|$ and $\sup_{z \in Q} \|(B - zI)^{-1}\|$.
Cover $Q$ by a finite number of disks of radius $1/L$, with centres $\mu_1, \dots, \mu_m \in Q$ say. Define
$E := \mathrm{diag}(\mu_1, \dots, \mu_m)$. Then we have
$$\|(E - zI)^{-1}\| \ge \|(A - zI)^{-1}\| \quad\text{and}\quad \|(E - zI)^{-1}\| \ge \|(B - zI)^{-1}\| \qquad (z \in Q).$$
Thus, if we replace $A, B$ by $A \oplus E, B \oplus E$ respectively, then they have identical pseudospectra.
Evidently the new $A, B$ are still diagonalizable. Finally, as $\mu_1, \dots, \mu_m \in Q$, we have $\|E^k\| \le \gamma^k \le
\min(\alpha_k, \beta_k)$ for $k = 2, \dots, n$, and so (6) still holds. The construction is complete.
Remark. The construction yields matrices A, B having eigenvalues of multiplicity at most two. It
would be interesting to obtain an example where the eigenvalues were all of multiplicity one.
4. Proofs of Theorems 1.3 and 1.4
Theorem 1.3 is an easy consequence of the following result of Greenbaum and Trefethen. Recall
that the Hilbert–Schmidt norm of a square matrix $A$ is defined by
$$\|A\|_{HS} := \sqrt{\mathrm{trace}(A^*A)}.$$
Theorem 4.1 ([3, Theorem 3]). Let $A, B \in \mathbb{C}^{N\times N}$, and suppose that
$$\|(zI - A)^{-1}\|_{HS} = \|(zI - B)^{-1}\|_{HS} \qquad (z \in \mathbb{C}). \tag{23}$$
Then, for every polynomial $p$,
$$\|p(A)\|_{HS} = \|p(B)\|_{HS}. \tag{24}$$
Since [3] was never published, we also include a brief proof for the reader’s convenience.
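Proof. Setting $\zeta = 1/z$, we see that (23) is equivalent to
$$\mathrm{trace}[(I - \bar{\zeta}A^*)^{-1}(I - \zeta A)^{-1}] = \mathrm{trace}[(I - \bar{\zeta}B^*)^{-1}(I - \zeta B)^{-1}] \qquad (\zeta \in \mathbb{C}). \tag{25}$$
Expanding, we deduce that, for some $r > 0$,
$$\sum_{k,l \ge 0} \mathrm{trace}(A^{*k}A^l)\, \bar{\zeta}^k \zeta^l = \sum_{k,l \ge 0} \mathrm{trace}(B^{*k}B^l)\, \bar{\zeta}^k \zeta^l \qquad (|\zeta| < r).$$
Taking $\frac{\partial^k}{\partial \bar{\zeta}^k} \frac{\partial^l}{\partial \zeta^l}$ of both sides and then setting $\zeta = 0$, we obtain
$$\mathrm{trace}(A^{*k}A^l) = \mathrm{trace}(B^{*k}B^l) \qquad (k, l \ge 0).$$
Now, let $p$ be a polynomial, say $p(z) = \sum_{j=0}^n a_j z^j$. Then
$$\mathrm{trace}(p(A)^* p(A)) = \sum_{k,l=0}^n \bar{a}_k a_l\, \mathrm{trace}(A^{*k}A^l) = \sum_{k,l=0}^n \bar{a}_k a_l\, \mathrm{trace}(B^{*k}B^l) = \mathrm{trace}(p(B)^* p(B)),$$
whence $\|p(A)\|_{HS} = \|p(B)\|_{HS}$. This completes the proof.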
Proof of Theorem 1.3. Observe that $\|A\|_{HS}^2 = \sum_{j=1}^N s_j(A)^2$. Thus, hypothesis (7) implies that
(23) holds, and consequently also (24). For each polynomial $p$, we therefore have
$$\sum_{j=1}^N s_j(p(A))^2 = \sum_{j=1}^N s_j(p(B))^2.$$
Recalling that the usual operator norm $\|\cdot\|$ is just the first singular value $s_1$, we thus obtain
$$\|p(A)\|^2 = s_1(p(A))^2 \le \sum_{j=1}^N s_j(p(A))^2 = \sum_{j=1}^N s_j(p(B))^2 \le N s_1(p(B))^2 = N \|p(B)\|^2.$$
This gives the right-hand side of (8), and the left-hand side is proved similarly.
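The two elementary facts used in this proof, namely $\|M\|_{HS}^2 = \sum_j s_j(M)^2$ and $s_1(M)^2 \le \sum_j s_j(M)^2 \le N s_1(M)^2$, can be illustrated as follows. This is a minimal Python sketch (NumPy assumed); the helper name `hs_norm`, the size $N = 5$ and the random test matrix are hypothetical choices.

```python
import numpy as np

def hs_norm(M):
    """Hilbert-Schmidt (Frobenius) norm, sqrt(trace(M* M))."""
    return np.sqrt(np.trace(M.conj().T @ M).real)

rng = np.random.default_rng(1)  # fixed seed so that the check is reproducible
N = 5
M = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
s = np.linalg.svd(M, compute_uv=False)  # singular values in decreasing order

sum_of_squares = np.sum(s ** 2)
```

The factor $N$ in the second inequality is what produces the $\sqrt{N}$ in (8).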
Remark. In going from (7) to (23), we are losing some information. In fact (7) is equivalent to the
following, more complicated version of (25):
$$\mathrm{trace}\bigl([(I - \bar{\zeta}A^*)^{-1}(I - \zeta A)^{-1}]^n\bigr) = \mathrm{trace}\bigl([(I - \bar{\zeta}B^*)^{-1}(I - \zeta B)^{-1}]^n\bigr) \qquad (\zeta \in \mathbb{C},\ n \ge 1). \tag{26}$$
So far, we have not seen how to exploit this.
Proof of Theorem 1.4. We repeat the construction in the proof of Theorem 1.1, defining $A_0, B_0, C_0$
exactly as in that proof. This time, however, we define
$$A := A_0 \oplus \underbrace{C_0 \oplus \dots \oplus C_0}_{m} \quad\text{and}\quad B := B_0 \oplus \underbrace{C_0 \oplus \dots \oplus C_0}_{m}.$$
Then $A, B \in \mathbb{C}^{N\times N}$, where $N = (n + 1) + m(n + 2) = (m + 1)(n + 2) - 1$. Just as before,
$$\|A^k\| = \max(\|A_0^k\|, \|C_0^k\|, \dots, \|C_0^k\|) = \alpha_k \qquad (k = 2, \dots, n),$$
and similarly for $B$, so (10) holds. Also, since
$$(zI - A)^{-1} = (zI - A_0)^{-1} \oplus \underbrace{(zI - C_0)^{-1} \oplus \dots \oplus (zI - C_0)^{-1}}_{m},$$
and $\|(zI - C_0)^{-1}\| \ge \|(zI - A_0)^{-1}\|$ for all $z \in \mathbb{C}$ (see (17)), it follows that
$$s_j((zI - A)^{-1}) = \|(zI - C_0)^{-1}\| \qquad (z \in \mathbb{C},\ j = 1, \dots, m).$$
Likewise, the same is true with $A$ replaced by $B$. Thus (9) holds, and the proof is complete.
5. Proofs of Theorems 1.5 and 1.6
In fact we shall prove the following slight generalization of Theorem 1.5.
Theorem 5.1. Let $A, B \in \mathbb{C}^{N\times N}$, and suppose that
$$\|(I - \zeta A)^{-1}\|_1 = \|(I - \zeta B)^{-1}\|_1 + o(\zeta) \qquad \text{as } \zeta \to 0,\ \zeta \in \mathbb{C}. \tag{27}$$
Then $\|A\|_1 = \|B\|_1$.
Proof. The norm $\|\cdot\|_1$ has the particularity that $\|A\|_1 = \max(|Ae_1|_1, \dots, |Ae_N|_1)$, where $e_1, \dots, e_N$
is the standard unit vector basis of $\mathbb{C}^N$. Fix a $j$ so that $\|A\|_1 = |Ae_j|_1$. Multiplying $A$ and $B$ by
the same unimodular constant, we may suppose that $a_{jj} \ge 0$, in other words the $j$-th entry in $Ae_j$
is non-negative. It then follows that, for all $t \ge 0$,
$$|(I + tA)e_j|_1 = 1 + t|Ae_j|_1 = 1 + t\|A\|_1.$$
On the other hand, as $t \to 0^+$, we have
$$|(I + tA)e_j|_1 \le \|I + tA\|_1 = \|(I - tA)^{-1}\|_1 + o(t) = \|(I - tB)^{-1}\|_1 + o(t) \le 1 + t\|B\|_1 + o(t).$$
Combining these facts, we deduce that $\|A\|_1 \le \|B\|_1$. By symmetry $\|B\|_1 \le \|A\|_1$ as well.
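The key identity of the proof, $|(I + tA)e_j|_1 = 1 + t\|A\|_1$ when column $j$ attains the norm and $a_{jj} \ge 0$, can be seen on a small example. The following Python sketch (NumPy assumed) is an illustration only; the helper name `one_norm`, the matrix `A` and the value of `t` are hypothetical choices.

```python
import numpy as np

def one_norm(A):
    """Operator norm induced by the vector 1-norm: the largest absolute column sum."""
    return max(np.abs(A[:, j]).sum() for j in range(A.shape[1]))

A = np.array([[1.0, -2.0], [3.0, 0.5]])      # hypothetical example
j = int(np.argmax(np.abs(A).sum(axis=0)))    # index of a column attaining the norm
t = 0.1
e_j = np.eye(2)[:, j]
lhs = np.abs((np.eye(2) + t * A) @ e_j).sum()  # |(I + tA) e_j|_1
```

Here column $j = 0$ attains $\|A\|_1 = 4$ and its diagonal entry $a_{00} = 1$ is already non-negative, so no unimodular rescaling is needed and `lhs` equals $1 + 4t$.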
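Remark. This theorem may be viewed as a result about numerical ranges in Banach algebras.
For background on numerical ranges, we refer to [1, 2]. Let $(\mathcal{A}, \|\cdot\|_{\mathcal{A}})$ be a Banach algebra with
identity 1, and given $a \in \mathcal{A}$, let $\nu_{\mathcal{A}}(a)$ denote the numerical radius of $a$. It is well known that
$$\nu_{\mathcal{A}}(a) = \limsup_{\zeta \to 0} \frac{\|1 + \zeta a\|_{\mathcal{A}} - 1}{|\zeta|} \qquad (a \in \mathcal{A}),$$
and also that there exists a constant $n(\mathcal{A}) \in [e^{-1}, 1]$, called the numerical index of $\mathcal{A}$, such that
$$n(\mathcal{A}) \|a\|_{\mathcal{A}} \le \nu_{\mathcal{A}}(a) \le \|a\|_{\mathcal{A}} \qquad (a \in \mathcal{A}).$$
From these facts it follows easily that, if $a, b \in \mathcal{A}$ satisfy
$$\|(1 - \zeta a)^{-1}\|_{\mathcal{A}} = \|(1 - \zeta b)^{-1}\|_{\mathcal{A}} + o(\zeta) \qquad \text{as } \zeta \to 0,\ \zeta \in \mathbb{C},$$
then $\nu_{\mathcal{A}}(a) = \nu_{\mathcal{A}}(b)$, and hence
$$n(\mathcal{A}) \le \|a\|_{\mathcal{A}} / \|b\|_{\mathcal{A}} \le n(\mathcal{A})^{-1}.$$
Moreover, it is known that the numerical indices of $(\mathbb{C}^{N\times N}, \|\cdot\|)$ and $(\mathbb{C}^{N\times N}, \|\cdot\|_1)$ are equal to
$1/2$ and $1$ respectively. We thus recover as special cases both the result (2) mentioned earlier and
Theorem 5.1 above.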
We now turn to the proof of Theorem 1.6. Recall that the notion of admissible norm on $\mathbb{C}^{N\times N}$
was defined in the introduction, and that the weighted shift $S(\omega_1, \dots, \omega_n)$ was defined in (14).
Lemma 5.2. Let $\omega_1, \dots, \omega_n$ be a positive submultiplicative sequence, and let $S = S(\omega_1, \dots, \omega_n)$.
Then, for every admissible norm $|||\cdot|||$ on $\mathbb{C}^{(n+1)\times(n+1)}$,
$$|||S^k||| = \omega_k \qquad (k = 1, \dots, n) \tag{28}$$
and
$$1 + \max_{1 \le k \le n} \frac{\omega_k |z|^k}{2} \le |||(I - zS)^{-1}||| \le 1 + \sum_{k=1}^n \omega_k |z|^k \qquad (z \in \mathbb{C}). \tag{29}$$
Proof. Repeat the proof of Lemma 2.1, observing that it is valid for every admissible norm.
Proof of Theorem 1.6. Repeat the proof of Theorem 1.1, using Lemma 5.2 in place of Lemma 2.1.
Note that the choices of γ and Γ depend only on the αj and βj , and not on the particular norm.
Thus the same pair of matrices A, B works simultaneously for all admissible norms ||| · |||.
Acknowledgements. I am greatly indebted to Nick Trefethen for introducing me to this topic,
for making available the article [3], and for numerous invaluable discussions. This work was carried
out while I was visiting the Mathematical Institute of the University of Oxford and the Oxford
University Computing Laboratory, and I am grateful to both institutions for their hospitality.
References
[1] F. F. Bonsall, J. Duncan, Numerical Ranges of Operators on Normed Spaces and of Elements of Normed Algebras,
Cambridge University Press, 1971.
[2] F. F. Bonsall, J. Duncan, Numerical Ranges II, Cambridge University Press, 1973.
[3] A. Greenbaum, L. N. Trefethen, ‘Do the pseudospectra of a matrix determine its behavior?’, Technical Report
TR 93-1371, Computer Science Department, Cornell University, 1993.
[4] L. N. Trefethen, M. Embree, Spectra and Pseudospectra, Princeton University Press, Princeton, 2005.
Département de mathématiques et de statistique, Université Laval, Québec (QC), Canada G1K 7P4
E-mail address: [email protected]