ON PSEUDOSPECTRA AND POWER GROWTH

THOMAS RANSFORD

Abstract. The celebrated Kreiss matrix theorem is one of several results relating the norms of the powers of a matrix to its pseudospectra (i.e. the level curves of the norm of the resolvent). But to what extent do the pseudospectra actually determine the norms of the powers? Specifically, let A, B be square matrices such that, with respect to the usual operator norm ‖·‖,

(∗) ‖(zI − A)⁻¹‖ = ‖(zI − B)⁻¹‖ (z ∈ C).

Then it is known that 1/2 ≤ ‖A‖/‖B‖ ≤ 2. Are there similar bounds for ‖Aⁿ‖/‖Bⁿ‖ for n ≥ 2? Does the answer change if A, B are diagonalizable? What if (∗) holds, not just for the norm ‖·‖, but also for higher-order singular values? What if we use norms other than the usual operator norm? The answers to all these questions turn out to be negative, and in a rather strong sense.

1. Introduction and statement of results

Let N ≥ 1, let C^N be complex Euclidean N-space, and let C^{N×N} be the algebra of complex N × N matrices. We write |·| for the Euclidean norm on C^N, defined by |x| := (Σ_{j=1}^N |x_j|²)^{1/2}, and ‖·‖ for the associated operator norm on C^{N×N}, defined by ‖A‖ := sup{|Ax| : |x| = 1}.

It is well known that, given A ∈ C^{N×N}, the long-term growth of the norms of powers of A is governed by the spectral radius ρ(A). Indeed, by the spectral radius formula, we have

‖Aⁿ‖ ≥ ρ(A)ⁿ (n ≥ 1) and lim_{n→∞} ‖Aⁿ‖^{1/n} = ρ(A).

However, in the shorter term, ‖Aⁿ‖ may well be significantly larger than ρ(A)ⁿ. The recent book of Trefethen and Embree [4] contains an account of these transient effects, illustrated by examples drawn from many different fields. One of the main themes of the book is that, to predict accurately the growth of ‖Aⁿ‖, it is important to study not only the spectrum of A, but also its pseudospectra, which we now define. Given A ∈ C^{N×N} and ε > 0, the ε-pseudospectrum of A is defined to be the set

σ_ε(A) := {z ∈ C : ‖(zI − A)⁻¹‖ > ε⁻¹}.
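This definition is easy to make concrete numerically: ‖(zI − A)⁻¹‖ is the reciprocal of the smallest singular value of zI − A, so membership in σ_ε(A) costs one SVD per sample point z. The following is a minimal sketch (not from the paper; the Jordan block J and the sample points are arbitrary illustrations):

```python
import numpy as np

def resolvent_norm(A, z):
    """||(zI - A)^{-1}|| = 1 / s_min(zI - A); infinite when z is an eigenvalue."""
    s = np.linalg.svd(z * np.eye(A.shape[0]) - A, compute_uv=False)
    return np.inf if s[-1] == 0 else 1.0 / s[-1]

def in_pseudospectrum(A, z, eps):
    """True iff z lies in the eps-pseudospectrum sigma_eps(A)."""
    return resolvent_norm(A, z) > 1.0 / eps

# A 2x2 Jordan block: sigma(J) = {0}, yet nearby points have large resolvent norm.
J = np.array([[0.0, 1.0],
              [0.0, 0.0]])
print(in_pseudospectrum(J, 0.5, 0.3))   # True: 0.5 lies in the 0.3-pseudospectrum
print(in_pseudospectrum(J, 5.0, 0.3))   # False: far from the spectrum
```

For the Jordan block, ‖(0.5·I − J)⁻¹‖ ≈ 4.83 > 1/0.3, even though 0.5 is well away from the spectrum {0} — a small instance of the transient effects described above.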
Here and in what follows we adopt the useful convention that ‖(zI − A)⁻¹‖ = ∞ if z ∈ σ(A), the spectrum of A. Thus σ_ε(A) shrinks to σ(A) as ε ↓ 0.

From a knowledge of the pseudospectra of A, it is possible to deduce both upper and lower bounds on the growth of ‖Aⁿ‖. A well-known result of this type is the Kreiss matrix theorem. We refer to [4] for this and several other such results. In addition, there are efficient methods for numerical computation of pseudospectra (see [4, Chapter IX]), so this approach is highly practical.

The purpose of this note is to show that, in predicting power growth, not even pseudospectra tell the whole story. The issue was already addressed by Greenbaum and Trefethen in [3] (see also [4, §47]). Suppose that two N × N matrices A, B have identical pseudospectra, i.e. that

(1) ‖(zI − A)⁻¹‖ = ‖(zI − B)⁻¹‖ (z ∈ C).

Does it then follow that ‖p(A)‖ = ‖p(B)‖ for every polynomial p? In particular, can we conclude that ‖Aⁿ‖ = ‖Bⁿ‖ for all n ≥ 1? The answer is no. An example was given in [3] (and again in [4]) of two matrices A, B with identical pseudospectra such that ‖A‖ = 1 and ‖B‖ = √2. But this example leaves several basic questions unresolved.

2000 Mathematics Subject Classification. Primary 47A10; Secondary 15A18, 15A60, 65F15.
Key words and phrases. matrix, norm, spectral radius, eigenvalue, singular value, pseudospectra.
Research supported by grants from NSERC and the Canada Research Chairs program.

1.1. What about higher powers? By adapting the Greenbaum–Trefethen example, one can construct, for each ε > 0, matrices A, B with identical pseudospectra such that ‖A‖/‖B‖ > 2 − ε. On the other hand, it is known that if A, B satisfy (1), then we must have

(2) 1/2 ≤ ‖A‖/‖B‖ ≤ 2.

(For a proof, see [4, pp. 168–169]; see also the remark after Theorem 5.1 below.) Are there similar bounds for ‖Aⁿ‖/‖Bⁿ‖ for n ≥ 2? If this were the case, then we could justifiably say that pseudospectra determine power norms, at least up to a constant factor.
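Condition (1) can be checked numerically by comparing the smallest singular value of zI − A with that of zI − B over a grid of sample points (equality of resolvent norms is equivalent to equality of these values). The sketch below is purely illustrative and uses the trivial pair A, Aᵀ — which always satisfies (1), since transposition preserves all singular values — rather than the Greenbaum–Trefethen example; the grid and tolerance are arbitrary choices:

```python
import numpy as np

def s_min(M, z):
    """Smallest singular value of zI - M (reciprocal of the resolvent norm)."""
    return np.linalg.svd(z * np.eye(M.shape[0]) - M, compute_uv=False)[-1]

def same_pseudospectra(A, B, grid, tol=1e-12):
    """Numerically test condition (1) at every sample point of `grid`."""
    return all(abs(s_min(A, z) - s_min(B, z)) <= tol for z in grid)

A = np.array([[0.0, 2.0, 0.0],
              [0.0, 0.0, 0.5],
              [0.0, 0.0, 0.0]])
grid = [x + 1j * y for x in np.linspace(-2, 2, 9) for y in np.linspace(-2, 2, 9)]
print(same_pseudospectra(A, A.T, grid))   # True: s_j(zI - A) = s_j(zI - A^T)
```

Comparing s_min rather than its reciprocal avoids spurious infinities at points of the spectrum. Of course, such a grid test can only ever certify (1) approximately.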
However, our first result answers this question negatively, and in a fairly strong sense. Recall that a (finite or infinite) sequence (α_k) is called submultiplicative if α_{k+l} ≤ α_k α_l for all k, l for which the inequality makes sense. For example, the sequence (‖A^k‖)_{k≥1} is submultiplicative for every matrix A.

Theorem 1.1. Let n ≥ 2, and let α_2, …, α_n and β_2, …, β_n be positive submultiplicative sequences. Then there exist N ≥ 1 and matrices A, B ∈ C^{N×N} such that

(3) ‖(zI − A)⁻¹‖ = ‖(zI − B)⁻¹‖ (z ∈ C)

and

(4) ‖A^k‖ = α_k and ‖B^k‖ = β_k (k = 2, …, n).

We may take N = 2n + 3.

This shows that matrices can have identical pseudospectra and yet their second and higher powers have norms that are completely unrelated to each other.

1.2. What about diagonalizable matrices? The matrices A, B in the Greenbaum–Trefethen example are nilpotent, as are those constructed in the proof of Theorem 1.1 above. Obviously, these are rather special. What happens if, instead, we consider more generic matrices, for example diagonalizable matrices? (By 'diagonalizable', we mean similar to a diagonal matrix.) Could it be that, for such matrices at least, the pseudospectra completely determine the power growth? Until now, no counterexample was known. We obtain one by combining the construction in Theorem 1.1 with a perturbation argument.

Theorem 1.2. Let n ≥ 2, let α_2, …, α_n and β_2, …, β_n be positive submultiplicative sequences, and let ε > 0. Then there exist N ≥ 1 and diagonalizable matrices A, B ∈ C^{N×N} such that

(5) ‖(zI − A)⁻¹‖ = ‖(zI − B)⁻¹‖ (z ∈ C)

and

(6) α_k − ε < ‖A^k‖ < α_k + ε and β_k − ε < ‖B^k‖ < β_k + ε (k = 2, …, n).

1.3. What if we use 'higher-order' pseudospectra? Given a matrix A ∈ C^{N×N}, its singular values s_1(A), …, s_N(A) are the square roots of the eigenvalues of A*A, listed in decreasing order. In particular, s_1(A) = ‖A‖.
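These two facts — s_j(A) = √(λ_j(A*A)) and s_1(A) = ‖A‖ — can be verified directly in NumPy. A small sanity sketch (the random test matrix is an arbitrary choice, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# Singular values: square roots of the eigenvalues of A* A, in decreasing order.
eigs = np.linalg.eigvalsh(A.conj().T @ A)   # ascending, real, non-negative
s = np.sqrt(eigs[::-1])                     # reverse to get decreasing order
assert np.allclose(s, np.linalg.svd(A, compute_uv=False))

# In particular, s_1(A) equals the operator norm ||A|| (the spectral norm).
assert np.isclose(s[0], np.linalg.norm(A, 2))
print("ok")
```

Here `np.linalg.norm(A, 2)` computes the matrix 2-norm, i.e. the largest singular value, matching the identity s_1(A) = ‖A‖.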
One of the principal methods for computing the pseudospectra of A is to calculate the singular values of zI − A for z ∈ C (once done for one value of z, it is relatively inexpensive to do it for many z), and then use the fact that

‖(zI − A)⁻¹‖ = s_1((zI − A)⁻¹) = 1/s_N(zI − A).

It is thus reasonable to ask whether, by retaining the other singular values of (zI − A)⁻¹, it is possible to determine ‖Aⁿ‖ for values of n ≥ 2. The following result gives a partial positive answer.

Theorem 1.3. Let N ≥ 1 and let A, B ∈ C^{N×N} be matrices satisfying

(7) s_j((zI − A)⁻¹) = s_j((zI − B)⁻¹) (z ∈ C, j = 1, …, N).

Then, for every polynomial p,

(8) 1/√N ≤ ‖p(A)‖/‖p(B)‖ ≤ √N.

It would be interesting to know if these bounds can be improved so as to be independent of N. However, even if this were the case, the theorem would be a bit unrealistic, because it would require us to keep track of all N singular values, which is probably too expensive in practice. Is there an analogous result where, by keeping track of just a few singular values, we can obtain inequalities like (8) at least for polynomials of low degree? The following generalization of Theorem 1.1 gives a negative answer.

Theorem 1.4. Let n ≥ 2, let α_2, …, α_n and β_2, …, β_n be positive submultiplicative sequences, and let m ≥ 1. Then there exist N ≥ 1 and matrices A, B ∈ C^{N×N} such that

(9) s_j((zI − A)⁻¹) = s_j((zI − B)⁻¹) (z ∈ C, j = 1, …, m)

and

(10) ‖A^k‖ = α_k and ‖B^k‖ = β_k (k = 2, …, n).

We may take N = (m + 1)(n + 2) − 1.

1.4. What about other norms? Though the Euclidean-norm case is undoubtedly the most important one, there are instances where it is more appropriate to consider pseudospectra defined with respect to other norms. In [4, §§56,57], several examples are given based on the 1-norm on C^N, defined by |x|_1 := Σ_{j=1}^N |x_j|, and the associated operator norm ‖·‖_1, given by ‖A‖_1 := sup{|Ax|_1 : |x|_1 = 1}. There is no analogue of the Greenbaum–Trefethen example for this norm, because of the following theorem.
Theorem 1.5. Let N ≥ 1 and let A, B ∈ C^{N×N} be matrices satisfying

(11) ‖(zI − A)⁻¹‖_1 = ‖(zI − B)⁻¹‖_1 (z ∈ C).

Then ‖A‖_1 = ‖B‖_1.

Can we also deduce that ‖Aⁿ‖_1 = ‖Bⁿ‖_1 for n ≥ 2? Once again, the answer turns out to be no, and not just for ‖·‖_1, but for a whole variety of possible norms. To make this precise, it is convenient to introduce some notation and terminology.

Given square matrices A, B, perhaps of different sizes, we shall write A ⊕ B for the block matrix

( A 0 )
( 0 B ).

A norm |||·||| on C^{N×N} will be called admissible if it satisfies the following three conditions:
• |||·||| is an algebra norm, i.e. |||AB||| ≤ |||A|||·|||B||| for all A, B ∈ C^{N×N}, and |||I||| = 1;
• every permutation matrix Q ∈ C^{N×N} satisfies |||Q||| = 1;
• every block matrix A ⊕ B ∈ C^{N×N} satisfies |||A ⊕ B||| = max(|||A ⊕ 0|||, |||0 ⊕ B|||).

For example, if |·|_p is the usual p-norm on C^N, given by |x|_p := (Σ_{j=1}^N |x_j|^p)^{1/p}, then the associated operator norm on C^{N×N} is admissible. The following result is a generalization of Theorem 1.1 to this context.

Theorem 1.6. Let n ≥ 2, and let α_2, …, α_n and β_2, …, β_n be positive submultiplicative sequences. Then there exist N ≥ 1 and A, B ∈ C^{N×N} such that, for every admissible norm |||·||| on C^{N×N},

(12) |||(zI − A)⁻¹||| = |||(zI − B)⁻¹||| (z ∈ C)

and

(13) |||A^k||| = α_k and |||B^k||| = β_k (k = 2, …, n).

We may take N = 2n + 3.

We conclude by remarking that there is at least one well-known norm on C^{N×N} for which matrices A, B with identical pseudospectra do have identical power growth. This is the Hilbert–Schmidt (or Frobenius) norm, as was shown by Greenbaum and Trefethen in [3]. We shall need their result in §4, where more details will be given. Of course, the Hilbert–Schmidt norm is not an admissible norm in our sense; in fact it fails all three parts of the definition.

The rest of the paper is devoted to the proofs of the six theorems above.
2. Proof of Theorem 1.1

The proof of Theorem 1.1 is based on a construction using weighted shifts, which will also serve as a model in several other proofs to follow. It is therefore written in such a way as to be easy to adapt to other situations. Given ω_1, …, ω_n > 0, we write

(14)  S(ω_1, …, ω_n) :=
      ( 0   ω_1   0         …   0           )
      ( 0   0     ω_2/ω_1   …   0           )
      ( ⋮   ⋮     ⋮         ⋱   ⋮           )
      ( 0   0     0         …   ω_n/ω_{n−1} )
      ( 0   0     0         …   0           ),

an (n + 1) × (n + 1) matrix whose only non-zero entries lie on the superdiagonal.

Lemma 2.1. Let ω_1, …, ω_n be a positive submultiplicative sequence, and let S = S(ω_1, …, ω_n). Then

(15) ‖S^k‖ = ω_k (k = 1, …, n)

and

(16) 1 + max_{1≤k≤n} ω_k|z|^k/2 ≤ ‖(I − zS)⁻¹‖ ≤ 1 + Σ_{k=1}^n ω_k|z|^k (z ∈ C).

Proof. Let k ∈ {1, …, n}. Taking Q to be an appropriate cyclic permutation matrix, we have

S^k Q = diag(ω_k, ω_{k+1}/ω_1, …, ω_n/ω_{n−k}, 0, …, 0).

Using the submultiplicativity of the sequence ω_1, …, ω_n, we obtain

‖S^k Q‖ = max(ω_k, ω_{k+1}/ω_1, …, ω_n/ω_{n−k}, 0, …, 0) = ω_k.

Since ‖Q‖ = 1 = ‖Q⁻¹‖ and ‖·‖ is an algebra norm, it follows that ‖S^k‖ = ω_k. This proves (15).

For the upper bound in (16), note that (I − zS)⁻¹ = Σ_{k=0}^n z^k S^k, whence

‖(I − zS)⁻¹‖ = ‖Σ_{k=0}^n z^k S^k‖ ≤ Σ_{k=0}^n |z|^k ‖S^k‖ = 1 + Σ_{k=1}^n |z|^k ω_k (z ∈ C).

For the lower bound, first fix k ∈ {1, …, n} and z ∈ C. Let Q be the permutation matrix that exchanges rows 2 and k + 1, and let P := diag(1, e^{iθ}, 0, …, 0), where θ = −arg(z^k). Then

P Q (I − zS)⁻¹ Q P = [ 1, |z|^k ω_k ; 0, 1 ] ⊕ 0,

and hence

‖(I − zS)⁻¹‖ ≥ ‖[ 1, |z|^k ω_k ; 0, 1 ] ⊕ 0‖.

Conjugating by the permutation matrix that swaps the first two rows, we have

‖[ 1, 0 ; |z|^k ω_k, 1 ] ⊕ 0‖ = ‖[ 1, |z|^k ω_k ; 0, 1 ] ⊕ 0‖.

Taking averages, it follows that

‖(I − zS)⁻¹‖ ≥ ‖[ 1, |z|^k ω_k/2 ; |z|^k ω_k/2, 1 ] ⊕ 0‖.

Since ‖·‖ is always at least as large as the spectral radius, we deduce that

‖(I − zS)⁻¹‖ ≥ ρ([ 1, |z|^k ω_k/2 ; |z|^k ω_k/2, 1 ] ⊕ 0) = 1 + ω_k|z|^k/2.

As this holds for each k ∈ {1, …, n} and each z ∈ C, we obtain the lower bound in (16).

Proof of Theorem 1.1.
First, choose α_1, β_1 large enough so that the sequences α_1, …, α_n and β_1, …, β_n are submultiplicative, and set A_0 := S(α_1, …, α_n) and B_0 := S(β_1, …, β_n). Then set A := A_0 ⊕ C_0 and B := B_0 ⊕ C_0, where C_0 is the (n + 2) × (n + 2) matrix defined by C_0 := S(Γ, γ², γ³, …, γ^{n+1}). Here Γ, γ are positive numbers to be chosen later. Note that the sequence Γ, γ², …, γ^{n+1} is submultiplicative provided that Γ ≥ γ.

Defined in this way, A, B are (2n + 3) × (2n + 3) matrices, and we shall show that they satisfy (3) and (4) if γ, Γ are chosen suitably. We first choose γ > 0 small enough so that γ^k < min(α_k, β_k) (k = 2, …, n). With this choice,

‖A^k‖ = max(‖A_0^k‖, ‖C_0^k‖) = max(α_k, γ^k) = α_k (k = 2, …, n),

and likewise ‖B^k‖ = β_k (k = 2, …, n).

It remains to choose Γ to ensure that A, B have identical pseudospectra. This will be the case provided that

(17) ‖(I − zC_0)⁻¹‖ ≥ ‖(I − zA_0)⁻¹‖ and ‖(I − zC_0)⁻¹‖ ≥ ‖(I − zB_0)⁻¹‖ (z ∈ C).

By Lemma 2.1, this will be true if

max(Γt/2, γ^{n+1}t^{n+1}/2) ≥ Σ_{k=1}^n α_k t^k and max(Γt/2, γ^{n+1}t^{n+1}/2) ≥ Σ_{k=1}^n β_k t^k (t ≥ 0).

Now there exists t_0, depending on α_1, …, α_n, β_1, …, β_n, γ, but not on Γ, such that

γ^{n+1}t^{n+1}/2 ≥ Σ_{k=1}^n α_k t^k and γ^{n+1}t^{n+1}/2 ≥ Σ_{k=1}^n β_k t^k (t ≥ t_0).

Hence it will suffice that

Γt/2 ≥ Σ_{k=1}^n α_k t^k and Γt/2 ≥ Σ_{k=1}^n β_k t^k (0 ≤ t ≤ t_0).

This will certainly be true provided we choose Γ large enough. With this choice, the construction is complete.

3. Proof of Theorem 1.2

The basic idea is to perturb the construction in the proof of Theorem 1.1. However, keeping track of the norms of the resolvents requires a certain amount of care. We shall need two lemmas.

Lemma 3.1. Let V, W ∈ C^{N×N}. If V is invertible and ‖V − W‖ < 1/(2‖V⁻¹‖), then W is invertible and

‖V⁻¹ − W⁻¹‖ ≤ 2‖V⁻¹‖² ‖V − W‖.

Proof. We have ‖I − V⁻¹W‖ = ‖V⁻¹(V − W)‖ ≤ ‖V⁻¹‖ ‖V − W‖ ≤ 1/2. Therefore V⁻¹W is invertible, and hence so too is W.
We also have

(18) ‖V⁻¹ − W⁻¹‖ = ‖V⁻¹(W − V)W⁻¹‖ ≤ ‖V⁻¹‖ ‖W − V‖ ‖W⁻¹‖.

Since ‖W − V‖ ≤ 1/(2‖V⁻¹‖), it follows that ‖V⁻¹ − W⁻¹‖ ≤ ‖W⁻¹‖/2, whence ‖W⁻¹‖ ≤ 2‖V⁻¹‖. Substituting this back into (18) gives the result.

In the next lemma, adj denotes adjugate and ρ denotes spectral radius.

Lemma 3.2. Let V ∈ C^{N×N}. Then

‖adj(V) − (−V)^{N−1}‖ ≤ 2^N ρ(V) ‖V‖^{N−2}.

Proof. It suffices to prove this when V is invertible, since invertible matrices are dense in C^{N×N}. Let p(z) be the characteristic polynomial of V. Since p(z) = Π_{j=1}^N (z − λ_j), where λ_1, …, λ_N are the eigenvalues of V, we have p(z) = Σ_{j=0}^N a_j z^j, where

(19) a_N = 1, a_0 = (−1)^N det(V) and |a_j| ≤ (N choose j) ρ(V)^{N−j} (j = 0, …, N).

Now by the Cayley–Hamilton theorem p(V) = 0. Multiplying by V⁻¹ and rearranging gives

a_0 V⁻¹ + a_N V^{N−1} = − Σ_{j=1}^{N−1} a_j V^{j−1}.

Using (19), it follows that

‖(−1)^N det(V)V⁻¹ + V^{N−1}‖ ≤ Σ_{j=1}^{N−1} (N choose j) ρ(V)^{N−j} ‖V‖^{j−1}.

Since det(V)V⁻¹ = adj(V) and ρ(V) ≤ ‖V‖, we deduce that

‖(−1)^N adj(V) + V^{N−1}‖ ≤ Σ_{j=1}^{N−1} (N choose j) ρ(V) ‖V‖^{N−j−1} ‖V‖^{j−1} ≤ 2^N ρ(V) ‖V‖^{N−2},

whence the result.

Proof of Theorem 1.2. Define A_0, B_0, C_0 as in the proof of Theorem 1.1. The choice of γ is the same as before, so that ‖(A_0 ⊕ C_0)^k‖ = α_k and ‖(B_0 ⊕ C_0)^k‖ = β_k for k = 2, …, n. This time, however, we choose Γ a little differently, stipulating that Γ ≥ γ and

(20) Γt/2 ≥ Σ_{k=1}^n α_k t^k + 2t and Γt/2 ≥ Σ_{k=1}^n β_k t^k + 2t (0 ≤ t ≤ 1/γ).

The next step is to perturb A_0, B_0, C_0 so as to obtain diagonalizable matrices. To this end, we fix distinct complex numbers ζ_1, …, ζ_{n+1} of modulus 1/2, and set

D := diag(ζ_1, …, ζ_{n+1}) and D′ := diag(ζ_1, …, ζ_{n+1}, 0).

Then, for each δ > 0, we define

A_δ := A_0 + δD, B_δ := B_0 + δD and C_δ := C_0 + δD′.

Each of A_δ, B_δ, C_δ has distinct eigenvalues, so A_δ ⊕ C_δ and B_δ ⊕ C_δ are diagonalizable. By continuity, if δ > 0 is chosen small enough, then
|‖(A_δ ⊕ C_δ)^k‖ − α_k| < ε and |‖(B_δ ⊕ C_δ)^k‖ − β_k| < ε (k = 2, …, n).

We next show that, reducing δ if necessary, we have

(21) ‖(C_δ − zI)⁻¹‖ ≥ ‖(A_δ − zI)⁻¹‖ and ‖(C_δ − zI)⁻¹‖ ≥ ‖(B_δ − zI)⁻¹‖ (|z| ≥ γ).

By Lemma 2.1, we have

‖(I − wC_0)⁻¹‖ − ‖(I − wA_0)⁻¹‖ ≥ (1 + Γ|w|/2) − (1 + Σ_{k=1}^n α_k|w|^k) (w ∈ C).

From our choice of Γ in (20), it follows that

‖(I − wC_0)⁻¹‖ − ‖(I − wA_0)⁻¹‖ ≥ 2|w| (|w| ≤ 1/γ).

We now apply Lemma 3.1 with V = I − wA_0 and W = I − wA_δ. We find that, if δ|w|‖D‖ ≤ 1/(2‖(I − wA_0)⁻¹‖), then

‖(I − wA_δ)⁻¹ − (I − wA_0)⁻¹‖ ≤ 2‖(I − wA_0)⁻¹‖² δ|w|‖D‖.

It follows that, if δ is chosen small enough, then

‖(I − wA_δ)⁻¹ − (I − wA_0)⁻¹‖ ≤ |w| (|w| ≤ 1/γ).

Likewise, if δ is small enough, then

‖(I − wC_δ)⁻¹ − (I − wC_0)⁻¹‖ ≤ |w| (|w| ≤ 1/γ).

Putting all of this together, we find that, if δ is sufficiently small, then

‖(I − wC_δ)⁻¹‖ ≥ ‖(I − wA_δ)⁻¹‖ (|w| ≤ 1/γ),

from which (21) follows for A. Evidently, a similar argument applies to B.

The next step is to show that, by reducing δ yet further, we may ensure that

(22) ‖(C_δ − zI)⁻¹‖ ≥ ‖(A_δ − zI)⁻¹‖ and ‖(C_δ − zI)⁻¹‖ ≥ ‖(B_δ − zI)⁻¹‖ (|z| ≤ δ).

For this we use Lemma 3.2. Applying this lemma with V = A_δ − zI, and recalling that this is an (n + 1) × (n + 1) matrix, we obtain

‖adj(A_δ − zI) − (zI − A_δ)^n‖ ≤ 2^{n+1} ρ(A_δ − zI) ‖A_δ − zI‖^{n−1}.

Now σ(A_δ − zI) = σ(δD − zI), so ρ(A_δ − zI) = ρ(δD − zI) ≤ ‖δD − zI‖ ≤ δ + |z|. It follows that

‖adj(A_δ − zI) − (zI − A_δ)^n‖ ≤ 2^{n+1}(δ + |z|) ‖A_δ − zI‖^{n−1}.

Note also that sup_{|z|≤δ} ‖(zI − A_δ)^n − (−A_0)^n‖ = O(δ) as δ → 0. Hence there exists a constant K, independent of z, δ, such that

‖adj(A_δ − zI) − (−A_0)^n‖ ≤ Kδ (|z| ≤ δ).

Similarly, as C_δ is an (n + 2) × (n + 2) matrix, there exists a constant K′ such that

‖adj(C_δ − zI) − (−C_0)^{n+1}‖ ≤ K′δ (|z| ≤ δ).

Since A_0^n ≠ 0 and C_0^{n+1} ≠ 0, it follows that, if δ is small enough, then

‖adj(C_δ − zI)‖ ≥ ‖C_0^{n+1}‖/2 and ‖adj(A_δ − zI)‖ ≤ 2‖A_0^n‖ (|z| ≤ δ).

Now

adj(C_δ − zI) = det(C_δ − zI)(C_δ − zI)⁻¹, adj(A_δ − zI) = det(A_δ − zI)(A_δ − zI)⁻¹,

and

det(C_δ − zI)/det(A_δ − zI) = det(δD′ − zI)/det(δD − zI) = −z.
Combining these facts, we obtain that, for sufficiently small δ > 0,

‖(C_δ − zI)⁻¹‖/‖(A_δ − zI)⁻¹‖ ≥ (1/|z|)·‖C_0^{n+1}‖/(4‖A_0^n‖) ≥ (1/δ)·‖C_0^{n+1}‖/(4‖A_0^n‖) (0 < |z| ≤ δ).

Thus, if δ is chosen small enough, then (22) holds for A. The argument for B is similar.

Fix δ > 0 so that (21) and (22) hold. Summarizing what we have achieved so far, if we define A := A_δ ⊕ C_δ and B := B_δ ⊕ C_δ, then A, B are diagonalizable matrices satisfying (6) and

‖(A − zI)⁻¹‖ = ‖(B − zI)⁻¹‖ (z ∈ C \ Q),

where Q is the annulus {z ∈ C : δ < |z| < γ}. It remains to deal with the case z ∈ Q, which we do as follows. Let L be the maximum of sup_{z∈Q} ‖(A − zI)⁻¹‖ and sup_{z∈Q} ‖(B − zI)⁻¹‖. Cover Q by a finite number of disks of radius 1/L, with centres µ_1, …, µ_m ∈ Q say. Define E := diag(µ_1, …, µ_m). Then we have

‖(E − zI)⁻¹‖ ≥ ‖(A − zI)⁻¹‖ and ‖(E − zI)⁻¹‖ ≥ ‖(B − zI)⁻¹‖ (z ∈ Q).

Thus, if we replace A, B by A ⊕ E, B ⊕ E respectively, then they have identical pseudospectra. Evidently the new A, B are still diagonalizable. Finally, as µ_1, …, µ_m ∈ Q, we have ‖E^k‖ ≤ γ^k ≤ min(α_k, β_k) for k = 2, …, n, and so (6) still holds. The construction is complete.

Remark. The construction yields matrices A, B having eigenvalues of multiplicity at most two. It would be interesting to obtain an example where the eigenvalues were all of multiplicity one.

4. Proofs of Theorems 1.3 and 1.4

Theorem 1.3 is an easy consequence of the following result of Greenbaum and Trefethen. Recall that the Hilbert–Schmidt norm of a square matrix A is defined by ‖A‖_HS := (trace(A*A))^{1/2}.

Theorem 4.1 ([3, Theorem 3]). Let A, B ∈ C^{N×N}, and suppose that

(23) ‖(zI − A)⁻¹‖_HS = ‖(zI − B)⁻¹‖_HS (z ∈ C).

Then, for every polynomial p,

(24) ‖p(A)‖_HS = ‖p(B)‖_HS.

Since [3] was never published, we also include a brief proof for the reader's convenience.

Proof.
Setting ζ = 1/z, we see that (23) is equivalent to

(25) trace[(I − ζA*)⁻¹(I − ζA)⁻¹] = trace[(I − ζB*)⁻¹(I − ζB)⁻¹] (ζ ∈ C).

Expanding, we deduce that, for some r > 0,

Σ_{k,l≥0} trace(A^{*k}A^l) ζ̄^k ζ^l = Σ_{k,l≥0} trace(B^{*k}B^l) ζ̄^k ζ^l (|ζ| < r).

Taking ∂^k/∂ζ̄^k ∂^l/∂ζ^l of both sides and then setting ζ = 0, we obtain

trace(A^{*k}A^l) = trace(B^{*k}B^l) (k, l ≥ 0).

Now, let p be a polynomial, say p(z) = Σ_{j=0}^n a_j z^j. Then

trace(p(A)*p(A)) = Σ_{k,l=0}^n ā_k a_l trace(A^{*k}A^l) = Σ_{k,l=0}^n ā_k a_l trace(B^{*k}B^l) = trace(p(B)*p(B)),

whence ‖p(A)‖_HS = ‖p(B)‖_HS. This completes the proof.

Proof of Theorem 1.3. Observe that ‖A‖²_HS = Σ_{j=1}^N s_j(A)². Thus, hypothesis (7) implies that (23) holds, and consequently also (24). For each polynomial p, we therefore have

Σ_{j=1}^N s_j(p(A))² = Σ_{j=1}^N s_j(p(B))².

Recalling that the usual operator norm ‖·‖ is just the first singular value s_1, we thus obtain

‖p(A)‖² = s_1(p(A))² ≤ Σ_{j=1}^N s_j(p(A))² = Σ_{j=1}^N s_j(p(B))² ≤ N s_1(p(B))² = N ‖p(B)‖².

This gives the right-hand side of (8), and the left-hand side is proved similarly.

Remark. In going from (7) to (23), we are losing some information. In fact (7) is equivalent to the following, more complicated version of (25):

(26) trace([(I − ζA*)⁻¹(I − ζA)⁻¹]^n) = trace([(I − ζB*)⁻¹(I − ζB)⁻¹]^n) (ζ ∈ C, n ≥ 1).

Till now, we have not seen how to exploit this.

Proof of Theorem 1.4. We repeat the construction in the proof of Theorem 1.1, defining A_0, B_0, C_0 exactly as in that proof. This time, however, we define

A := A_0 ⊕ C_0 ⊕ ⋯ ⊕ C_0 and B := B_0 ⊕ C_0 ⊕ ⋯ ⊕ C_0,

each with m copies of C_0. Then A, B ∈ C^{N×N}, where N = (n + 1) + m(n + 2) = (m + 1)(n + 2) − 1. Just as before,

‖A^k‖ = max(‖A_0^k‖, ‖C_0^k‖, …, ‖C_0^k‖) = α_k (k = 2, …, n),

and similarly for B, so (10) holds. Also, since

(zI − A)⁻¹ = (zI − A_0)⁻¹ ⊕ (zI − C_0)⁻¹ ⊕ ⋯ ⊕ (zI − C_0)⁻¹,

and ‖(zI − C_0)⁻¹‖ ≥ ‖(zI − A_0)⁻¹‖ for all z ∈ C (see (17)), it follows that
s_j((zI − A)⁻¹) = ‖(zI − C_0)⁻¹‖ (z ∈ C, j = 1, …, m).

Likewise, the same is true with A replaced by B. Thus (9) holds, and the proof is complete.

5. Proofs of Theorems 1.5 and 1.6

In fact we shall prove the following slight generalization of Theorem 1.5.

Theorem 5.1. Let A, B ∈ C^{N×N}, and suppose that

(27) ‖(I − ζA)⁻¹‖_1 = ‖(I − ζB)⁻¹‖_1 + o(ζ) as ζ → 0, ζ ∈ C.

Then ‖A‖_1 = ‖B‖_1.

Proof. The norm ‖·‖_1 has the particularity that ‖A‖_1 = max(|Ae_1|_1, …, |Ae_N|_1), where e_1, …, e_N is the standard unit vector basis of C^N. Fix a j so that ‖A‖_1 = |Ae_j|_1. Multiplying A and B by the same unimodular constant, we may suppose that a_{jj} ≥ 0, in other words the j-th entry in Ae_j is non-negative. It then follows that, for all t ≥ 0,

|(I + tA)e_j|_1 = 1 + t|Ae_j|_1 = 1 + t‖A‖_1.

On the other hand, as t → 0⁺, we have

|(I + tA)e_j|_1 ≤ ‖I + tA‖_1 = ‖(I − tA)⁻¹‖_1 + o(t) = ‖(I − tB)⁻¹‖_1 + o(t) ≤ 1 + t‖B‖_1 + o(t).

Combining these facts, we deduce that ‖A‖_1 ≤ ‖B‖_1. By symmetry ‖B‖_1 ≤ ‖A‖_1 as well.

Remark. This theorem may be viewed as a result about numerical ranges in Banach algebras. For background on numerical ranges, we refer to [1, 2]. Let (A, ‖·‖_A) be a Banach algebra with identity 1, and given a ∈ A, let ν_A(a) denote the numerical radius of a. It is well known that

ν_A(a) = lim sup_{ζ→0} (‖1 + ζa‖_A − 1)/|ζ| (a ∈ A),

and also that there exists a constant n(A) ∈ [e⁻¹, 1], called the numerical index of A, such that

n(A)‖a‖_A ≤ ν_A(a) ≤ ‖a‖_A (a ∈ A).

From these facts it follows easily that, if a, b ∈ A satisfy

‖(1 − ζa)⁻¹‖ = ‖(1 − ζb)⁻¹‖ + o(ζ) as ζ → 0, ζ ∈ C,

then ν_A(a) = ν_A(b), and hence n(A) ≤ ‖a‖_A/‖b‖_A ≤ n(A)⁻¹. Moreover, it is known that the numerical indices of (C^{N×N}, ‖·‖) and (C^{N×N}, ‖·‖_1) are equal to 1/2 and 1 respectively. We thus recover as special cases both the result (2) mentioned earlier and Theorem 5.1 above.

We now turn to the proof of Theorem 1.6. Recall that the notion of admissible norm on C^{N×N} was defined in the introduction, and that the weighted shift S(ω_1, …, ω_n)
was defined in (14).

Lemma 5.2. Let ω_1, …, ω_n be a positive submultiplicative sequence, and let S = S(ω_1, …, ω_n). Then, for every admissible norm |||·||| on C^{(n+1)×(n+1)},

(28) |||S^k||| = ω_k (k = 1, …, n)

and

(29) 1 + max_{1≤k≤n} ω_k|z|^k/2 ≤ |||(I − zS)⁻¹||| ≤ 1 + Σ_{k=1}^n ω_k|z|^k (z ∈ C).

Proof. Repeat the proof of Lemma 2.1, observing that it is valid for every admissible norm.

Proof of Theorem 1.6. Repeat the proof of Theorem 1.1, using Lemma 5.2 in place of Lemma 2.1. Note that the choices of γ and Γ depend only on the α_j and β_j, and not on the particular norm. Thus the same pair of matrices A, B works simultaneously for all admissible norms |||·|||.

Acknowledgements. I am greatly indebted to Nick Trefethen for introducing me to this topic, for making available the article [3], and for numerous invaluable discussions. This work was carried out while I was visiting the Mathematical Institute of the University of Oxford and the Oxford University Computing Laboratory, and I am grateful to both institutions for their hospitality.

References

[1] F. F. Bonsall, J. Duncan, Numerical Ranges of Operators on Normed Spaces and of Elements of Normed Algebras, Cambridge University Press, 1971.
[2] F. F. Bonsall, J. Duncan, Numerical Ranges II, Cambridge University Press, 1973.
[3] A. Greenbaum, L. N. Trefethen, 'Do the pseudospectra of a matrix determine its behavior?', Technical Report TR 93-1371, Computer Science Department, Cornell University, 1993.
[4] L. N. Trefethen, M. Embree, Spectra and Pseudospectra, Princeton University Press, Princeton, 2005.

Département de mathématiques et de statistique, Université Laval, Québec (QC), Canada G1K 7P4
E-mail address: [email protected]