The inaugural International Conference on Music Communication Science 5-7 December 2007, Sydney, Australia http://marcs.uws.edu.au/links/ICoMusic
Proceedings of ICoMCS December 2007

MUSICAL SCALES AND CANTONESE LEVEL TONES

Rerrario Shui-Ching Ho (1), Sui-Fong Ho (2)
(1) Englisches Seminar, Universität Basel, Switzerland
(2) Cross-Culture Chinese Communications Centre, Hong Kong

ABSTRACT

There are strong links between music and speech [1], especially in those aspects concerning tone and intonation. However, there is little communication between musicologists and linguists. Whereas the musical aspects of speech have traditionally not been an object of study for musicologists, linguists tackle these aspects without studying music seriously. The level tones of a tone language form a system comparable to a musical scale. The traditional description of canonical tone patterns has relied on musically and mathematically ill-defined tools and approaches [7]. This paper examines the general feasibility of characterizing the pitch levels of a tone language with reference to musical scales. By comparing the level tones of Cantonese to a musical scale and generalizing the static nature of the latter to a dynamic one, we formulate a mathematical framework to re-interpret and compare different authors’ accounts and to characterize the principle governing the pitch levels associated with a tone system based on the ratio of the pitch distances among the level tones rather than the pitch heights.

Index Terms: Cantonese, lexical tone, musical scale, musical notation, pitch distance, tone-letter notation

1. INTRODUCTION

There are strong links between music and speech [1]. Acoustically, both musical melody and speech melody (henceforth speech prosody) manifest themselves as temporal variation of pitch, stress and duration. For the former, pitch and duration are highly discrete and mathematically well defined. Despite the virtually infinite number of possible musical melodies that can be constructed out of the large set of discrete notes, description of the highly complex structure of a musical melody is objective and precise. By contrast, identifiable categories and patterns in speech prosody are far fewer in number. Yet they are continuous in nature and defy straightforward quantification. This is especially true for pitch description.

Pitch is probably the most subtle variable in speech. Linguistically, it is manifested not only as intonation at the phrase level but also as lexical tone at the syllabic level. As manifested in intonation, recurring pitch patterns are not fully identified or remain unidentifiable for most languages. Lexical tones are less fluid, but they are by no means as easy to describe as musical notes. Although they are perceived and stored as discrete entities in the mind of native speakers, and can be retrieved intuitively and with high consistency by those with good ears, they can hardly be pinned down mathematically. The ‘music’ of speech is subtle, and finding its ‘key’ or ‘notes’ remains the biggest challenge in phonology.

So far, linguistic description of intonation and tone patterns has been largely impressionistic. Most impressionistic accounts still rely on two sets of primitive qualifiers: ‘high’, ‘mid’ and ‘low’ to indicate pitch height, and ‘rising’, ‘falling’, ‘level’, etc. to denote the direction of the pitch contour. A system of standardized tools for quantitative characterization like that in music is lacking. For tone languages in particular, which are highly constrained in both the pitch and the temporal dimension, linguists content themselves with the use of a so-called short-hand of musical notation [2], which is musically ill-defined. This is the motivation for us to reconsider musical notation seriously. We focus on level tones, which are distinguished essentially by pitch height.
Cantonese is of particular interest as it possesses four tones which are essentially level and are indistinguishable except by their pitch height. As a matter of fact, a slight deviation of pitch in foreigners’ speech is easily perceived as out of tune. In that sense, the Cantonese level tones form a system comparable to a musical scale, which we henceforth call the Cantonese level-tone scale (CLTS). The aim of the present paper is to find the mathematical principle governing their pitch height and the pitch distance among them. We investigate the feasibility of characterizing the level tones with respect to musical scales by reviewing and comparing conventional accounts of their relative pitch height in musical notation and tone-letter notation. By analyzing the properties of the CLTS and generalizing the static nature of a musical scale to a dynamic one, we formulate a mathematical framework to re-interpret and compare both the musical and tone-letter notations given by different authors [2, 3, 4] and to characterize the pitch levels associated with a tone system based on the ratio of their consecutive pitch distances.

2. CANTONESE TONE SYSTEM

Cantonese is a major southern Chinese dialect spoken in Hong Kong, Guangdong, part of Guangxi province and among many overseas Chinese communities in Southeast Asia and English-speaking western countries. Depending on definition, Cantonese can be seen as having six or nine lexical tones. In the tradition of Chinese linguistics, the three ‘entering’ (‘checked’ or ‘clipped’) tones, which end with an unreleased /p/, /t/ or /k/, are listed as separate tones, T7, T8 and T9, in addition to the six non-entering tones (Table 1).
In the framework of western phonetics [3], the three entering tones are not in minimal distinctive opposition with the others due to the additional final phoneme. Among the six tones, T1, T3, T4 and T6 are essentially level. They define what we call the Cantonese level-tone scale (CLTS). For convenience of discussion, they are put between square brackets in ascending order of pitch: [T4 T6 T3 T1]. So are all their corresponding pitch notations.

Chinese character | Tone number | Phonetic transcription | Pitch pattern | Tone-letter notation [2]
詩 | T1 | /si/ | high level | 55
史 | T2 | /si/ | high rise | 35
試 | T3 | /si/ | mid high level | 33
時 | T4 | /si/ | low level | 21
市 | T5 | /si/ | low rise | 13
事 | T6 | /si/ | mid low level | 22
色 | T7 | /sik/ | high stop | 5
  | T8 | /sik/ | mid high stop | 3
食 | T9 | /sik/ | mid low stop | 2

Table 1: Cantonese tone system.

3. STATIC SCALE

There are two main ways of quantitative representation. The first is musical notation itself and the other is a reduced version of it, the five-point scale of tone-letters proposed by Chao [2] in 1930. We compare the impressionistic accounts of lexical pitch height given by three major authors: Jones [3], Chao [2] and Hashimoto [4].

3.1 Musical notation

3.1.1 Conventional description

All three authors tried to describe Cantonese tones in terms of musical notation. Jones [3] did so almost a century ago. The sequence of musical notes he used for the CLTS is [G3 A3 B3 D4]. Chao seemed to agree with Jones, but at one point [5] he commented that T1 was better represented by an augmented fifth. Thus, his musical notation of the CLTS could be rendered as [G3 A3 B3 D#4]. Hashimoto’s perception, [G3 A3 C4 E4], disagrees not only in the pitch range but also in the second highest tone. T3 is one semitone higher than in the other two representations (C4 rather than B3), and T1 is a major 6th higher than T4 rather than a perfect 5th or augmented 5th. One will easily be tempted to ask: which representation is more realistic?

3.1.2 Problems

According to our native perception, none of the above musical notations is satisfactory. Judging from the pitch range between T4 and T1, even the narrowest range among the three descriptions is too wide for most ordinary speech. Jones’ perfect fifth, which corresponds to a frequency ratio of 3/2, would sound too harmonious, if not exaggerated, to native ears. This agrees with our experience of westerners’ speech, which always sounds comically melodious. When learning tone languages in general and Cantonese in particular, foreign learners very often follow too rigidly a static framework prescribed in textbooks or conceived of by themselves in terms of musical scales. A perfect 5th is certainly easier to memorize and reproduce. If a perfect 5th sounds too wide, an augmented 5th or a major 6th will sound even wider. This can be confirmed easily by acoustical measurement. If all these pitch ranges are too wide, one might ask whether it is more appropriate to represent the range as a diminished 5th or a perfect 4th. A quick answer to this question is that the question itself is ill posed, as we shall see shortly, because the interval is indefinite without reference to a third tone.

3.2 Tone-letter notation

3.2.1 Conventional description

In 1930, Chao introduced the tone-letter notation [2], a kind of musical short-hand, to represent the pitch contours of lexical tones. He divided the pitch range of a speaker’s voice into four equal intervals, resulting in five pitch levels. The integers 1 to 5 are assigned to the five levels in ascending order of pitch. He exemplified its use by representing the six Cantonese tones as 55 (or 53), 35, 33, 11 (or 21), 23, 22 (Table 1), where the two numbers for each tone refer to the starting and ending points on the five-point scale. As we are focusing on the four level tones only, their denotation can be simplified to four single tone-letters. Arranged in ascending order of pitch and put in square brackets, the representation of the CLTS in Chao’s system can be reduced to [1 2 3 5]. Hashimoto’s tone-letter notation [4] runs as 22 33 44 55 (or 21 33 44 55), which can be converted to [1 2 3 4] (or [1 3 4 5]).

3.2.2 Problems

To tell which author’s tone-letter representation of the CLTS is more realistic, we need to be able to interpret Chao’s five-point scale [2, 5] and to correlate it with actual musical scales and frequencies. However, this short-hand of musical notation is musically loosely defined and mathematically unformulated. In medieval times, music used to be a discipline under mathematics [6]. Nevertheless, traditional linguists and phoneticians have studied speech melody without bothering to learn music and mathematics. The problems with the tone-letter notation are not merely perceptual but conceptual. We would need a separate article to query the validity of the system and to call for the IPA to abandon it.

4. DYNAMIC SCALE

4.1 Observations

It is not hard to imagine that a musical scale and a level-tone scale are both relative in the sense that the pitch of a starting note or syllable (or simply tone) in a musical or speech melodic phrase is arbitrary.
A melody can be transposed by re-adjusting the absolute pitch of each note or tone so as to preserve the relative distance between every note or tone. However, a musical scale and a level-tone scale (LTS) differ in the way the second note or tone is defined after the pitch of the starting note or tone is fixed. Whereas in a musical scale the second note is automatically fixed after the initial one, it is not so for an LTS. To illustrate, let’s consider an utterance consisting of three different tones picked out from the LTS. Needless to say, the pitch of the first tone is arbitrary. Extensive observation shows that the pitch of the second tone is not yet determined even after the first tone is fixed. Otherwise, all tones would be generated from the antecedent ones and Cantonese speech would be rendered as real singing. Cantonese songs illustrate the point. Everyday exposure to Cantonese pop songs will easily turn up parts of lyrics whose tone patterns conform very closely to their music. No matter how close they sound, their tone patterns are never identical to the corresponding music. Of course, the second tone is not as free as the first. Given the first tone, the second is subject to narrower constraints. To account for the fact that foreigners’ tiny deviations of pitch level can be heard as out of tune by native speakers, the pitch of the third tone must be defined once the pitch of the second tone is. In other words, the pitch of the third tone is defined in terms of both the first and the second. In that respect, a lexical tone step has a ‘second-order relativity’. Based on this essential difference, we say that a musical scale is static and that a lexical tone scale is dynamic.
Due to this two-fold relative nature, the pitch interval of the canonical form of any two level tones can never be pinned down to any absolute musical step. Since a lexical-tone scale is not static, any musical notation can represent only one particular instance of tone-level realisation. It can be observed in acoustical measurement that in one utterance of [T4 T6 T3 T1] the pitch range between T4 and T1 may be close to a perfect fourth, whereas in another utterance it turns out to be near a perfect fifth. This answers our question in section 3.1.2. In general, an LTS can never be mapped to any musical scale. Only a particular manifestation of the LTS can be accounted for by a musical scale.

4.2 Pitch interval analysis

Through intensive observation and measurement, we have arrived only at a first hint on the problem, namely that the second interval (rather than the second tone) is defined in terms of the first. But we still do not know how their relationship is governed mathematically. The mathematical relationship might be too subtle to perceive. Before we can find it out experimentally, it is most intuitive and straightforward at this stage to postulate that this relationship observes a simple ratio. With this assumption, we can re-analyze the musical and tone-letter notations in a new light.

4.2.1 Musical intervals

An impression of musical distance presumes not only discretization but also a musical scale. Depending on how a span of pitch is discretized, different musical scales are available. Depending on the musical scale with reference to which one interprets his or her pitch perception, different impressions of perceptual distance arise. If a listener is predisposed to discretize his or her perception of pitch in accordance with the western traditional major scale, Jones’ sequence of notes corresponds to, starting with the tonic and using the anglicized solfege syllables, [d r m s] in the G major scale.
If one is ‘deaf’ to accidentals (flats or sharps) because one has never been trained to hear them, he or she is inclined to claim that, in the [d r m s] sequence, there is no potential note between d and r nor between r and m, but there is one note, f, between m and s. This person will tend to feel that the rise in pitch from G3 to A3 (d to r in the solfege) and that from A3 to B3 (r to m) both correspond to a distance of one step, but that the distance between B3 and D4 (m to s) corresponds to two steps. Hence, from G3 to D4, there are 4 steps or intervals. The musical steps between consecutive levels of the CLTS are in the ratios 1:1:2. By the same token, Hashimoto’s musical representation of the four level tones conforms to [d r f l] in G major. Instead of a perfect 5th, the range between the highest and lowest tone spans a major 6th. The musical distance between the two middle tones corresponds to two steps in the major scale instead of one. If the insertion of accidentals is prohibited, Hashimoto’s musical step ratios for the CLTS become 1:2:2. Chao’s musical step ratios become 1:1:2.5, since his D#4 falls halfway between two degrees of the major scale. To represent the scale by a sequence of numbers comparable to Chao’s scale, we assign the integer 1 to the lowest level tone and the integer 2 to the second lowest. The numbers representing the remaining two levels can be found easily, resulting in a sequence of 4 integers or half-integers representing the CLTS, which we call the CLTS sequence. Jones’, Chao’s and Hashimoto’s CLTS sequences become [1 2 3 5], [1 2 3 5.5] and [1 2 4 6] respectively. Which musical scale one is disposed to adopt is conditioned by cultural and educational background. Different musical traditions employ different scales, which include different numbers of pitches and different intervals. For example, the major scale is a kind of diatonic scale, in which there are seven notes within an octave.
However, in many cultures, including China, the pentatonic scale, which consists of only five notes within an octave, is more popular. A Chinese listener who has never been exposed to western music would tend to perceive music with reference to a pentatonic scale. Beginning on the pitch class C, this scale would run in ascending order as C D E G A, or d r m s l. Jones’ sequence, though it fits into this kind of scale, would be interpreted very differently. The musical steps between successive tones will now be in the ratios 1:1:1, whereas Hashimoto’s musical notation will have step ratios of 1:1:2, which correspond to the CLTS sequences [1 2 3 4] and [1 2 3 5] respectively.

4.2.2 Tone-letter intervals

By the same token, Chao’s short-hand of musical notation can be re-interpreted by neglecting the arbitrary assignment of 1 and 5 to the lower and upper limits of the tone space and by regarding the difference between tone-letters as proportional to perceptual distance. In this way, Chao’s Cantonese tone-letter sequence, [1 2 3 5], corresponds to the interval ratios 1:1:2, whereas that of Bauer, [1 2 4 5], corresponds to 1:2:1. If the pitch distance of mathematically well-defined musical notations is susceptible to subjective interpretation with respect to musical scales, how should we interpret the perceptual distance between successive levels of Chao’s tone-letter scale musically? What kind of musical scale does this short-hand of music represent? Does it correspond to a major scale or a pentatonic scale? We will not be able to talk about perceptual distance objectively without agreeing on a mathematically justified scale which is fine enough to accommodate all the above interpretations.
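The scale-dependence of these step ratios can be checked mechanically. The following Python sketch is an illustration only and not part of the original analysis. The helper names (`midi_number`, `scale_position`, `clts_sequence`) are hypothetical, note names are mapped to semitone numbers in the usual MIDI-style convention, and a pitch falling between two scale degrees is assumed to lie half a step above the lower one, as in the 2.5-step reading of Chao’s augmented fifth.

```python
# Semitone offsets of note names within an octave (C = 0).
SEMITONE = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
            "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

# Scales written as semitone offsets from the tonic.
MAJOR = [0, 2, 4, 5, 7, 9, 11]
PENTATONIC = [0, 2, 4, 7, 9]


def midi_number(note):
    """Convert a note name such as 'D#4' to a MIDI-style semitone number."""
    name, octave = note[:-1], int(note[-1])
    return 12 * (octave + 1) + SEMITONE[name]


def scale_position(note, tonic, scale):
    """Position of `note`, counted in steps of `scale` above `tonic`.

    Assumption: a pitch lying between two scale degrees is placed
    half a step above the lower degree.
    """
    octaves, within = divmod(midi_number(note) - midi_number(tonic), 12)
    if within in scale:
        degree = float(scale.index(within))
    else:
        degree = sum(1 for s in scale if s < within) - 0.5
    return octaves * len(scale) + degree


def clts_sequence(notes, tonic, scale):
    """Number the lowest tone 1 and scale all steps so the first step is 1."""
    pos = [scale_position(n, tonic, scale) for n in notes]
    unit = pos[1] - pos[0]
    return [1 + (p - pos[0]) / unit for p in pos]


jones = ["G3", "A3", "B3", "D4"]
chao = ["G3", "A3", "B3", "D#4"]
hashimoto = ["G3", "A3", "C4", "E4"]

print(clts_sequence(jones, "G3", MAJOR))           # [1.0, 2.0, 3.0, 5.0]
print(clts_sequence(chao, "G3", MAJOR))            # [1.0, 2.0, 3.0, 5.5]
print(clts_sequence(hashimoto, "G3", MAJOR))       # [1.0, 2.0, 4.0, 6.0]
print(clts_sequence(jones, "G3", PENTATONIC))      # [1.0, 2.0, 3.0, 4.0]
print(clts_sequence(hashimoto, "C3", PENTATONIC))  # [1.0, 2.0, 3.0, 5.0]
```

In this sketch Jones’ notes are read against a pentatonic scale on G and Hashimoto’s against one on C, because those are the pentatonic scales that contain the respective notes. The same four notes thus yield different sequences depending only on the reference scale.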
No matter to which scale the original designer’s personal perception tended to conform, we are forced to abandon all cultural prejudice if the system is to be used across languages and cultures. This problem is analogous to the tuning problem in the middle ages, which led to the appearance of an equal-tempered scale, e.g. the chromatic scale. Our modern perception of music is predominantly based on the European tradition of the diatonic scale, including minor and major scales, which has seven notes within an octave. In the chromatic scale, each of the five whole-tone steps is divided further into semitones so that there are twelve steps in an octave. In terms of the chromatic scale, the musical interval ratios of the CLTS of Jones, Chao and Hashimoto become 2:2:3, 1:1:2 and 2:3:4 respectively. The respective CLTS sequences become [1 2 3 4.5], [1 2 3 5] and [1 2 3.5 5.5]. In a similar vein, we need to calibrate our perception by a mathematically justified scale which is more general than the chromatic scale and does not presume fixed pitch intervals.

4.3 Mathematical framework

Mathematically, the frequencies of the successive notes of an equal-tempered scale can be derived from the formula $f = A R^{n}$, where $A$ is an arbitrary pitch, $n$ is an integer and $R$ is the $m$-th root of 2. In western music with 12-tone equal temperament, $m$ is equal to 12. In some cultures, $m$ is equal to 32. To generalize the mathematical form so as to represent a dynamic scale, $m$ (hence $R$) is taken to be arbitrary. Pitch levels generated by consecutive integers $n$ should now be perceived as separated by equal pitch distances. Let the essentially level pitches that can be elicited from any series of citation forms of T4, T6, T3 and T1 have frequencies $f_1$, $f_2$, $f_3$ and $f_4$ respectively. They can be expressed as

$$f_1 = a r^{n_1} \tag{1a}$$
$$f_2 = a r^{n_2} \tag{1b}$$
$$f_3 = a r^{n_3} \tag{1c}$$
$$f_4 = a r^{n_4} \tag{1d}$$

where $a$, $r$ and the $n_i$ are arbitrary.
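As a numerical illustration of the model in (1a)-(1d), the Python sketch below generates four frequencies from arbitrarily chosen values of a, r and the exponents. It then recovers the ratios of the exponent differences from the frequencies alone, by taking logarithms of successive frequency ratios. The function names and all numerical values are hypothetical and chosen only for illustration; they are not measurements.

```python
import math


def level_frequencies(a, r, exponents):
    """Frequencies f_i = a * r**n_i for the given exponents n_i."""
    return [a * r ** n for n in exponents]


def interval_ratios(freqs):
    """Log-frequency distances between successive levels,
    each divided by the first distance."""
    logs = [math.log(hi / lo) for lo, hi in zip(freqs, freqs[1:])]
    return [d / logs[0] for d in logs]


def level_sequence(freqs):
    """Number the lowest level 1 and accumulate the normalised distances."""
    seq = [1.0]
    for step in interval_ratios(freqs):
        seq.append(seq[-1] + step)
    return seq


# Equal-tempered special case: r is the 12th root of 2 and the exponents
# are semitone counts above an arbitrary reference pitch (here 196 Hz).
semitone = 2 ** (1 / 12)
freqs = level_frequencies(196.0, semitone, [0, 2, 4, 7])
print([round(x, 3) for x in interval_ratios(freqs)])  # [1.0, 1.0, 1.5]
print([round(x, 3) for x in level_sequence(freqs)])   # [1.0, 2.0, 3.0, 4.5]

# The same exponents with a different a and r give the same ratios.
other = level_frequencies(120.0, 1.03, [0, 2, 4, 7])
print([round(x, 3) for x in interval_ratios(other)])  # [1.0, 1.0, 1.5]
```

The exponents 0, 2, 4 and 7 are the semitone positions of [G3 A3 B3 D4], so the first result reproduces the chromatic reading [1 2 3 4.5] of Jones’ notation given above. The second shows that the ratios depend on neither a nor r.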
Taking the ratio of each successive pair of equations and then the logarithm, the successive differences in the exponents of $r$ are obtained:

$$n_2 - n_1 = \log(f_2/f_1) / \log(r) \tag{2a}$$
$$n_3 - n_2 = \log(f_3/f_2) / \log(r) \tag{2b}$$
$$n_4 - n_3 = \log(f_4/f_3) / \log(r) \tag{2c}$$

Thus, the interval ratios of the CLTS can be expressed as $(n_2 - n_1) : (n_3 - n_2) : (n_4 - n_3)$. Since $r$ is arbitrary, it can be chosen so that $n_2 - n_1 = 1$. The normalized perceptual interval ratios of the CLTS become

$$1 : \frac{\log(f_3/f_2)}{\log(f_2/f_1)} : \frac{\log(f_4/f_3)}{\log(f_2/f_1)} \tag{3}$$

Now that we have assigned 1 to the step distance between the lowest and the second lowest level, assigning the lowest level as 1 ($n_1 = 1$) will make the second lowest level become 2 ($n_2 = 2$). From the above ratios, the remaining two level tones can be deduced, and hence the sequence of the CLTS

$$\left[\; 1 \quad 2 \quad \left(2 + \frac{\log(f_3/f_2)}{\log(f_2/f_1)}\right) \quad \left(2 + \frac{\log(f_3/f_2)}{\log(f_2/f_1)} + \frac{\log(f_4/f_3)}{\log(f_2/f_1)}\right) \right] \tag{4}$$

conforming to the first two steps of Chao’s scale, is obtained.

5. DISCUSSION

To sum up, we have cast doubt not only on the validity of the conventional tone-letter notation of Cantonese tones in particular, but also on the validity of the notation system itself in general. We have formulated a simple mathematical framework to describe the pitch intervals among the level tones of Cantonese, which can be extended to other tone languages. This framework not only enables a more realistic description of a tone system, which should be given in terms of the ratios of pitch intervals among the level tones according to equations (2) and (3), but also accommodates a unified re-interpretation of both musical and tone-letter notations by transformation to equation (4). With the new framework, we can proceed [7] with serious acoustical measurements to test the validity of various linguistic and engineering claims [8, 9] more objectively.

6. ACKNOWLEDGEMENTS

We would like to thank the Swiss National Science Foundation and the Cross-Culture Chinese Communications Centre for funding and subsidies.

7. REFERENCES

1. Sloboda, J., The Musical Mind: The Cognitive Psychology of Music, Oxford University Press, New York, 1989.
2. Chao, Y.-R., “A system of tone letters”, Le Maître Phonétique 45: 24-27, 1930.
3. Jones, D. and Woo, K.T., A Cantonese phonetic reader, University of London Press, London, 1912.
4. Hashimoto, O. K.Y., Studies in Yue dialects. 1. Phonology of Cantonese, Cambridge University Press, Cambridge, 1972, p. 92, 122.
5. Chao, Y.-R., Cantonese Primer, Greenwood Press, New York, 1947.
6. Schellenberg, E. G., Trehub, S. E., “Natural musical intervals: Evidence from infant listeners”, Psychological Science, Vol. 7, No. 5, 1996.
7. Ho, R. S.-C., Sagisaka, Y., “F0 analysis of perceptual distance among Cantonese level tones”, Proc. Interspeech 2007: 1038-1041.
8. Ho, R. S.-C., “English teaching and learning in Hong Kong”, International Cooper Series on English and Literature, Vol. 10, Schwabe Verlag, Basel, 2005.
9. Li, Y.J., Lee, T. and Qian, Y., “Acoustical F0 analysis of continuous Cantonese speech”, Proc. ISCSLP 2002: 127-130.