The inaugural International Conference on Music Communication Science 5-7 December 2007, Sydney, Australia
http://marcs.uws.edu.au/links/ICoMusic
MUSICAL SCALES AND CANTONESE LEVEL TONES
Rerrario Shui-Ching Ho¹, Sui-Fong Ho²
¹ Englisches Seminar, Universität Basel, Switzerland
² Cross-Culture Chinese Communications Centre, Hong Kong
[email protected], [email protected]
ABSTRACT
There are strong links between music and speech [1] --- especially
those aspects concerning tone and intonation. However, there is
little communication between musicologists and linguists.
Whereas traditionally the musical aspects of speech have not been
the object of study for musicologists, linguists tackle these aspects
without studying music seriously. The level tones of a tone
language form a system comparable to a musical scale. The
traditional description of canonical tone patterns has relied on
musically and mathematically ill-defined tools and approaches
[7]. This paper examines the general feasibility of characterizing
the pitch levels of a tone language with reference to musical
scales. By comparing the level tones of Cantonese to a musical
scale and generalizing the static nature of the latter to a dynamic
one, we formulate a mathematical framework to re-interpret and
compare different authors’ accounts and to characterize the
principle governing the pitch levels associated with a tone system
based on the ratio of the pitch distance among the level tones
rather than the pitch heights.
Index Terms: Cantonese, lexical tone, musical scale, musical
notation, pitch distance, tone-letter notation
1. INTRODUCTION
There are strong links between music and speech [1].
Acoustically, both musical melody and speech melody
(henceforth speech prosody) manifest themselves as temporal
variation of pitch, stress and duration. For the former, pitch and
duration are highly discrete and mathematically well defined.
Despite the virtually infinite number of possible musical melodies
that can be constructed out of the large set of discrete notes,
description of the highly complex structure of a musical melody is
objective and precise. By contrast, identifiable categories and
patterns in speech prosody are far fewer in number, yet they are
continuous in nature and defy straightforward quantification. This
is especially true for pitch description. Pitch is probably the most
subtle variable in speech. Linguistically, it is manifested not only
as intonation at the phrase level but also as lexical tone at the
syllabic level. As manifested in intonation, recurring pitch
patterns are not fully identified or remain unidentifiable for most
languages. Lexical tones are less fluid but no easier to pin down
than musical notes. Although they are perceived and
stored as discrete entities in the mind of native speakers, and can
be retrieved intuitively by those with good ears with high
Proceedings of ICoMCS December 2007
consistency, they can hardly be pinned down mathematically. The
‘music’ of speech is subtle, and finding its ‘key’ or ‘notes’ remains
the biggest challenge in phonology.
So far, linguistic description of intonation and tone patterns has
been largely impressionistic. Most impressionistic accounts still
rely on two sets of primitive qualifiers --- ‘high’, ‘mid’ and ‘low’
to indicate pitch height and ‘rising’, ‘falling’, ‘level’, etc. to denote
the direction of pitch contour. A system of standardized tools for
quantitative characterization like that in music is lacking. For tone
languages in particular, which are highly constrained both in the
pitch and temporal dimensions, linguists content themselves with
the use of a so-called short-hand of musical notation [2], which is
musically ill-defined. This is the motivation for us to reconsider
musical notation seriously.
We focus on level tones, which are distinguished essentially by
pitch height. Cantonese is of particular interest as it possesses four
level tones which are essentially level and are otherwise
indistinguishable except by their pitch height. As a matter of fact, a
slight deviation of pitch in foreigners’ speech is easily perceived as
out of tune. In that sense, the Cantonese level tones form a system
comparable to a musical scale, which we henceforth call the
Cantonese level-tone scale (CLTS). The aim of our present paper
is to find out the mathematical principle governing their pitch
height and the pitch distance among them. We investigate the
feasibility of characterizing the level tones with respect to musical
scales by reviewing and comparing conventional accounts on their
relative pitch height by musical notation and tone-letter notation.
By analyzing the properties of CLTS and generalizing the static
nature of a musical scale to a dynamic one, we formulate a
mathematical framework to re-interpret and compare both the
musical and tone-letter notations given by different authors [2, 3,
4] and to characterize the pitch levels associated with a tone
system based on the ratio of their consecutive pitch distance.
2. CANTONESE TONE SYSTEM
Cantonese is a major southern Chinese dialect spoken in Hong
Kong, Guangdong, part of Guangxi province and among many
overseas Chinese communities in Southeast Asia and English-speaking western countries. Depending on definition, Cantonese
can be seen as having six or nine lexical tones. In the tradition of
Chinese linguistics, the three ‘entering’ (‘checked’ or ‘clipped’)
tones, which end with an unreleased /p/, /t/ or /k/, are listed as
separate tones --- T7, T8 and T9, in addition to the six non-entering
tones (Table 1). In the framework of western phonetics [3], the
three entering tones are not in minimal distinctive opposition
with the others due to the additional final phoneme. Among the
six tones, T1, T3, T4 and T6 are essentially level. They define what
we call the Cantonese level-tone scale (CLTS). For
convenience of discussion, they are put between square brackets
in ascending order of pitch --- [T4 T6 T3 T1]. So are all their
corresponding pitch notations.
Chinese     Tone     Phonetic        Pitch            Tone-letter
character   number   transcription   pattern          notation [2]
詩          T1       /si/            high level       55
史          T2       /si/            high rise        35
試          T3       /si/            mid high level   33
時          T4       /si/            low level        21
市          T5       /si/            low rise         13
事          T6       /si/            mid low level    22
色          T7       /sik/           high stop        5
(missing)   T8       /sik/           mid high stop    3
食          T9       /sik/           mid low stop     2
Table 1: Cantonese tone system.
3. STATIC SCALE
There are two main ways of quantitative representation. The first
is musical notation itself and the other is a reduced
version of it --- a five-point scale of tone-letters proposed by Chao
[2] in 1930. We compare the impressionistic accounts of the
lexical pitch height given by three major authors --- Jones [3],
Chao [2] and Hashimoto [4].
3.1 Musical notation
3.1.1 Conventional description
All three authors have tried to describe Cantonese tones in terms of
musical notation. Jones [3] did so almost a century ago. The
sequence of musical notes used by him corresponding to the
CLTS is [G3 A3 B3 D4]. Chao seemed to agree with Jones, but at
one point [5] he commented that T1 was better represented by an
augmented fifth. Thus, his musical notation of the CLTS could be
rendered as [G3 A3 B3 D#4]. Hashimoto’s perception, [G3 A3 C4
E4], disagrees not only in the pitch range but also in the second
highest tone. T3 is a semitone higher than in the other two
representations and T1 is a major 6th higher than T4 rather than a
perfect 5th or augmented 5th. One will easily be tempted to ask:
which representation is more realistic?
3.1.2 Problems
According to our native perception, none of the above musical
notations is satisfactory. Judging from the pitch range between
T4 and T1, even the narrowest range among the three descriptions is
too wide for most ordinary speech. Jones’ perfect
fifth, which corresponds to a frequency ratio of 3/2, would sound
too harmonious, if not exaggerated, to native ears. This agrees
with our experience of westerners’ speech, which always
sounds oddly melodious. Foreign learners very often follow
too rigidly a static framework prescribed in textbooks or conceived
of by themselves in terms of musical scales when learning tone
languages in general and Cantonese in particular. A perfect 5th is
certainly easier to memorize and reproduce. If a perfect 5th sounds
too wide, an augmented 5th or a major 6th will sound even more so.
This can be confirmed easily by acoustical measurement. If all
these pitch ranges are too wide, one might ask, is it more
appropriate to represent the range as a diminished 5th or a perfect 4th?
A quick answer to this question is that the question itself is ill posed,
as we shall see shortly, because the interval is indefinite without
reference to a third tone.
3.2 Tone-letter notation
3.2.1 Conventional description
In 1930, Chao introduced the tone-letter notation [2], a kind of
musical short-hand, to represent the pitch contours of lexical tones.
He divided the pitch range of a speaker’s voice into four equal
intervals, resulting in five pitch levels. The integers 1 to 5 are
assigned to the five levels in ascending order of pitch.
He exemplified its use by representing the six Cantonese tones as
55 (or 53), 35, 33, 11 (or 21), 23, 22 (Table 1), where the two
numbers for each tone refer to the starting and ending points
respectively on the five-point scale. As we are
focusing on the four level tones only, their denotation can be
simplified to four single tone-letters. Arranged in ascending
order of pitch and put in square brackets, the representation of
CLTS in Chao’s system can be reduced to [1 2 3 5]. Hashimoto’s
tone-letter notation [4] runs as 22 33 44 55 (or 21 33 44 55), which
can be converted to [1 2 3 4] (or [1 3 4 5]).
3.2.2 Problems
To tell which author’s tone-letter representation of CLTS is more
realistic, we need to be able to interpret Chao’s five-point scale [2,
5] and to correlate it to actual musical scales and frequencies.
However, musically, this musical notation short-hand is loosely
defined and mathematically unformulated. In medieval times,
music used to be a discipline under mathematics [6]. Nevertheless,
the traditional linguists and phoneticians have studied speech
melody without bothering to learn music and mathematics. The
problems with the tone-letter notation are not merely perceptual
but conceptual. We would need a separate article to query the
validity of the system and to call for the IPA to abandon it.
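The looseness of the five-point scale can be illustrated with a small sketch. The phrase "four equal intervals" admits at least two readings: equal steps in hertz, or equal steps in log-frequency (equal musical intervals). The function names and the 100-200 Hz pitch range below are hypothetical, and neither reading is claimed to be the definition intended in [2]; the point is only that the same tone letter maps to different frequencies depending on the reading.

```python
# Illustrative sketch only: two possible readings of "four equal intervals".
# The 100-200 Hz range and the function names are assumptions, not from [2].

def levels_linear(f_low, f_high, n_levels=5):
    """Pitch levels equally spaced in hertz."""
    step = (f_high - f_low) / (n_levels - 1)
    return [f_low + i * step for i in range(n_levels)]


def levels_log(f_low, f_high, n_levels=5):
    """Pitch levels equally spaced in log-frequency (equal musical intervals)."""
    ratio = (f_high / f_low) ** (1.0 / (n_levels - 1))
    return [f_low * ratio ** i for i in range(n_levels)]


if __name__ == "__main__":
    for name, fn in (("linear", levels_linear), ("log", levels_log)):
        levels = fn(100.0, 200.0)
        # Tone letters 1..5 correspond to list positions 0..4.
        print(name, [round(f, 1) for f in levels])
```

Under these assumptions, tone letter 3 falls at 150 Hz on the linear reading but at about 141 Hz on the logarithmic one, so the notation alone does not fix the frequencies.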
4. DYNAMIC SCALE
4.1 Observations
It is not hard to imagine that a musical scale and a level-tone scale
are both relative in the sense that the pitch of a starting note or
syllable (or simply tone) in a musical or speech melodic phrase is
arbitrary. A melody can be transposed to another by re-adjusting
the absolute pitch of each note or tone so as to preserve the relative
distance between every note or tone. However, they differ in the
way the second note or tone is defined after the pitch of the
starting note or tone is fixed. Whereas in a musical scale, the
second note is automatically fixed after the initial one, it is not so
for a level-tone scale (LTS). To illustrate, let’s consider an utterance
consisting of three different tones picked out from the LTS. Needless
to say, the pitch of the first tone is arbitrary. Extensive observation
shows that the pitch of the second tone is not yet determined even after
the first tone is fixed. Otherwise, all tones would be generated from the
antecedent ones and Cantonese speech would be rendered as real
singing. Cantonese songs provide a further illustration. Everyday
exposure to Cantonese pop songs makes it easy to come across
passages of lyrics whose tone patterns conform
very closely to their music. No matter how close they sound, their
tone patterns are never identical to the corresponding music. Of
course, the second tone is not as totally free as the first tone.
Given the first tone, the second is subject to more narrow
constraints than the first one. To account for the fact that
foreigners’ tiny deviation of pitch levels can be heard as out of
tune by native speakers requires that the pitch of the third be
defined after the pitch of the second tone is. In other words, the
pitch of the third tone is defined in terms of both the first and the
second. In that respect, a lexical tone step has a ‘second-order
relativity’. Based on this essential difference, we say that a
musical scale is static and that a lexical tone scale is dynamic.
Due to this two-fold relative nature, the pitch interval of the
canonical form of any two level tones can never be pinned down
to any absolute musical step. Since a lexical-tone scale is not
static, any musical notation can help represent only one particular
instance of tone-level realisation. It can be observed in acoustical
measurement that in one utterance of [T4 T6 T3 T1], the pitch range
between T4 and T1 may be close to a perfect fourth whereas in
another utterance, it turns out to be near a perfect 5th. This answers
our question in section 3.1.2. In general, an LTS can never be
mapped to any musical scale. Only a particular manifestation of
the LTS can be accounted for by a musical scale.
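The size of the interval between two measured frequencies can be expressed in equal-tempered semitones as 12·log2(f_high / f_low). The following minimal sketch uses hypothetical F0 values, not measured data, to show how the T4-T1 range of two utterances could land near a fourth (5 semitones) in one case and near a fifth (7 semitones) in the other.

```python
import math


def semitones(f_low, f_high):
    """Interval between two frequencies in equal-tempered semitones."""
    return 12.0 * math.log2(f_high / f_low)


# Hypothetical (T4, T1) F0 pairs in Hz, chosen only for illustration.
UTTERANCES = {
    "utterance A": (100.0, 133.0),
    "utterance B": (100.0, 150.0),
}

if __name__ == "__main__":
    for label, (t4, t1) in UTTERANCES.items():
        print(f"{label}: {semitones(t4, t1):.2f} semitones")
```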
4.2 Pitch interval analysis
Through intense observations and measurements, we have only
arrived at the very first hint on the problem, namely the second
interval (rather than the tone) is defined in terms of the first. But
we still do not know how their relationship is governed
mathematically. The mathematical relationship might be too
subtle to perceive. Before we can find it out experimentally, it is
most intuitive and straightforward at this stage to postulate that
this relationship observes a simple ratio. With this assumption, we
can re-analyze the musical and tone-letter notations in a new light.
4.2.1 Musical intervals
Impression of musical distance presumes not only discretization
but also a musical scale. Depending on how a span of pitch is
discretized, different musical scales are available. Depending on
the musical scale with reference to which one interprets his or her
pitch perception, different impressions of perceptual distance
arise. If a listener is predisposed to discretize his or her perception
of pitch in accordance with the western traditional major scale,
Jones’ sequence of notes corresponds to, starting with the tonic and
using the anglicized solfege syllables, [d r m s] in the G major
scale. If one is ‘deaf’ to accidentals (flats or sharps) because one is
never trained to hear them, he or she is inclined to claim that, in
the [d r m s] sequence, there is no potential note between d and r
nor between r and m but there is one note --- f, between m and s.
This person will tend to feel that the rise in pitch from G3 to A3 (d
to r in the solfege) and that from A3 to B3 (r to m in the solfege)
both correspond to a distance of one step but the distance between
B3 and D4 (between m and s in the solfege) corresponds to two
steps. Hence, from G3 to D4, there are 4 steps or intervals. The
musical steps between consecutive levels of CLTS are in the ratios
of 1:1:2. By the same token, Hashimoto’s musical representation
of the four level tones conforms to [d r f l] in G major. Instead of a
perfect 5th, the range between the highest and lowest tone spans a
major 6th. The musical distance between the two middle tones
corresponds to two steps in the major scale instead of one.
Prohibiting insertion of accidentals, Hashimoto’s musical step
ratios of CLTS become 1:2:2. Chao’s musical step ratios become
1:1:2.5. To represent the scale by a sequence of
numbers comparable to Chao’s scale, we assign the integer 1 to the
lowest level tone and the integer 2 to the second-lowest. The
numbers representing the remaining two levels can be found easily,
resulting in a sequence of 4 integers or half integers representing
the CLTS --- the CLTS sequence. Jones’, Chao’s and Hashimoto’s
CLTS sequences become [1 2 3 5], [1 2 3 5.5] and [1 2 4 6]
respectively.
What musical scale one is disposed to adopt is conditioned by
cultural and education background. Different musical traditions
employ different scales, which include different numbers of pitches
and different intervals. For example, the major scale is a kind of
diatonic scale, where there are seven notes within an octave.
However, in many cultures, including China, the pentatonic scale
is more popular, which consists only of five notes within an
octave. A Chinese listener who has never been exposed to western
music would tend to perceive music with reference to a
pentatonic scale. Beginning on the pitch class C, this scale would
run in ascending order as C D E G A or d r m s l. Jones’ sequence,
though being able to fit into this scale, would be interpreted very
differently. The musical steps between successive tones will now
be expressed in ratios of 1: 1: 1 whereas Hashimoto’s musical
notation will have step ratios of 1: 1: 2, which correspond to the
CLTS sequence [1 2 3 4] and [1 2 3 5] respectively.
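The step counting described in this section can be sketched as follows. Each note is located on a reference scale and its position is counted in scale steps from the tonic. The helper names are hypothetical, and placing a non-member pitch between its two neighbouring degrees by linear interpolation is an assumption of this sketch (it reproduces the half step used for Chao's D#4). Hashimoto's pentatonic reading depends on how the non-member note C4 is heard and is not reproduced here.

```python
# Semitone offsets above the tonic, one octave each.
MAJOR = [0, 2, 4, 5, 7, 9, 11, 12]   # d r m f s l t d'
PENTATONIC = [0, 2, 4, 7, 9, 12]     # d r m s l d'

# Semitone offsets above G3 for the notes used in the three accounts.
NOTE = {"G3": 0, "A3": 2, "B3": 4, "C4": 5, "D4": 7, "D#4": 8, "E4": 9}


def scale_position(offset, scale):
    """Position of a pitch on a scale, in scale steps from the tonic.

    A pitch between two neighbouring degrees gets a fractional position
    by linear interpolation (an assumption of this sketch).
    """
    for i in range(len(scale) - 1):
        lo, hi = scale[i], scale[i + 1]
        if lo <= offset <= hi:
            return i + (offset - lo) / (hi - lo)
    raise ValueError("offset lies outside one octave above the tonic")


def sequence_on_scale(notes, scale):
    """Map notes to a CLTS sequence whose lowest level is 1."""
    return [1 + scale_position(NOTE[n], scale) for n in notes]


def step_ratios(sequence):
    """Differences between consecutive levels of a CLTS sequence."""
    return [b - a for a, b in zip(sequence, sequence[1:])]


if __name__ == "__main__":
    jones = ["G3", "A3", "B3", "D4"]
    chao = ["G3", "A3", "B3", "D#4"]
    hashimoto = ["G3", "A3", "C4", "E4"]
    for name, notes in (("Jones", jones), ("Chao", chao), ("Hashimoto", hashimoto)):
        print(name, "major:", sequence_on_scale(notes, MAJOR))
    print("Jones pentatonic:", sequence_on_scale(jones, PENTATONIC))
```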
4.2.2 Tone-letter intervals
By the same token, Chao’s musical notation short-hand can be re-interpreted by neglecting the arbitrary assignment of 1 and 5 to the
lower and upper limits of tone space and by regarding the
difference between tone-letters as proportional to perceptual
distance. In this way, Chao’s Cantonese tone-letter sequence --- [1
2 3 5], corresponds to the interval ratios of 1: 1: 2 whereas that of
Bauer --- [1 2 4 5], corresponds to 1: 2: 1. If the pitch distance of
mathematically well-defined musical notations is susceptible to
subjective interpretation with respect to musical scales, how should
we interpret the perceptual distance of successive levels of Chao’s
tone-letter scale musically? What kind of musical scale does this
short-hand of music represent? Does it correspond to a major
scale or pentatonic scale? We will not be able to talk about
perceptual distance objectively without agreeing on a
mathematically justified scale which is fine enough to
accommodate all the above interpretations. No matter to which
scale the original designer’s personal perception tended to
conform, we are forced to abandon all the cultural prejudice, if the
system is to be used across languages and cultures. This problem
is analogous to the tuning problem in the middle ages. This led to
the appearance of an equal-tempered scale, e.g. the chromatic
scale. Our modern perception of music is predominantly based on
the European tradition of diatonic scale, including minor and
major scales, which has seven notes within an octave. In the
chromatic scale, each of the five whole-tone steps can be divided
further into semi-tones so that there are twelve steps in an octave.
In terms of the chromatic scale, the musical interval ratios of
CLTS of Jones, Chao and Hashimoto become 2:2:3, 1:1:2 and
2:3:4 respectively. The respective CLTS sequences become [1 2 3
4.5], [1 2 3 5] and [1 2 3.5 5.5]. In a similar vein, we need to
calibrate our perception by a mathematically justified scale which
is more general than the chromatic scale and does not presume
fixed pitch intervals.
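The chromatic figures quoted above can be checked by counting semitones between consecutive notes and normalizing the first step to 1. This is a minimal sketch; the dictionary and function names are hypothetical.

```python
# Semitone offsets above G3 for the notes used in the three accounts.
SEMITONES_ABOVE_G3 = {
    "G3": 0, "A3": 2, "B3": 4, "C4": 5, "D4": 7, "D#4": 8, "E4": 9,
}

ACCOUNTS = {
    "Jones": ["G3", "A3", "B3", "D4"],
    "Chao": ["G3", "A3", "B3", "D#4"],
    "Hashimoto": ["G3", "A3", "C4", "E4"],
}


def semitone_steps(notes):
    """Number of semitones between each pair of consecutive notes."""
    offsets = [SEMITONES_ABOVE_G3[n] for n in notes]
    return [b - a for a, b in zip(offsets, offsets[1:])]


def normalized_sequence(steps):
    """CLTS sequence with lowest level 1 and first step normalized to 1."""
    sequence = [1.0]
    for step in steps:
        sequence.append(sequence[-1] + step / steps[0])
    return sequence


if __name__ == "__main__":
    for author, notes in ACCOUNTS.items():
        steps = semitone_steps(notes)
        print(author, steps, normalized_sequence(steps))
```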
4.3 Mathematical framework
Mathematically, the frequencies of the successive notes of a well-tempered
scale can be derived from the formula $f = AR^{n}$,
where A is an arbitrary pitch, n is an integer and R is the m-th root
of 2. In the twelve-tone equal temperament of western music, m is equal to
12. In some cultures, m is equal to 32. To generalize the
mathematical form so as to represent a dynamic scale, m (hence
R) is taken to be arbitrary. Pitch levels generated by consecutive integers n
should now be perceived as separated by equal pitch distance. Let
the essentially level pitches that can be elicited from any series of
citation forms of T4, T6, T3 and T1 have frequencies $f_1$, $f_2$, $f_3$ and $f_4$
respectively. They can be expressed as
\begin{align}
f_1 &= a r^{n_1} \tag{1a} \\
f_2 &= a r^{n_2} \tag{1b} \\
f_3 &= a r^{n_3} \tag{1c} \\
f_4 &= a r^{n_4} \tag{1d}
\end{align}
where a, r, and the exponents $n_1, \dots, n_4$ are arbitrary. Taking the ratio of each
successive pair of equations and then the logarithm, the successive
differences in the exponents of r are obtained:
\begin{align}
n_2 - n_1 &= \log(f_2 / f_1) / \log(r) \tag{2a} \\
n_3 - n_2 &= \log(f_3 / f_2) / \log(r) \tag{2b} \\
n_4 - n_3 &= \log(f_4 / f_3) / \log(r) \tag{2c}
\end{align}
Thus, the interval ratios of CLTS can be expressed as
$(n_2 - n_1) : (n_3 - n_2) : (n_4 - n_3)$. Since r is arbitrary, it can be chosen so that
$n_2 - n_1 = 1$. The normalized perceptual interval ratios of CLTS
become
\begin{equation}
1 : \frac{\log(f_3 / f_2)}{\log(f_2 / f_1)} : \frac{\log(f_4 / f_3)}{\log(f_2 / f_1)} \tag{3}
\end{equation}
Now that we have assigned 1 to the step distance between the
lowest and the second lowest level, assigning the lowest level as 1
($n_1 = 1$) will make the second lowest level become 2 ($n_2 = 2$). From
the above ratios, the remaining two level tones can be deduced and
hence the sequence of CLTS
\begin{equation}
\left[\; 1 \quad 2 \quad \left( 2 + \frac{\log(f_3 / f_2)}{\log(f_2 / f_1)} \right) \quad \left( 2 + \frac{\log(f_3 / f_2)}{\log(f_2 / f_1)} + \frac{\log(f_4 / f_3)}{\log(f_2 / f_1)} \right) \;\right] \tag{4}
\end{equation}
conforming to the first two steps of Chao’s scale, is obtained.
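As a numerical check, the ratios of equation (3) and the sequence of equation (4) can be computed directly from four frequencies. The sketch below is illustrative: the function names are hypothetical, and the test frequencies are generated from an equal-tempered scale on 196 Hz (approximately G3) rather than taken from speech data.

```python
import math


def interval_ratios(f1, f2, f3, f4):
    """Normalized interval ratios 1 : x : y of equation (3).

    Assumes f1 != f2, so that the first interval is non-zero.
    """
    base = math.log(f2 / f1)
    return [1.0, math.log(f3 / f2) / base, math.log(f4 / f3) / base]


def clts_sequence(f1, f2, f3, f4):
    """CLTS sequence of equation (4), with the lowest level set to 1."""
    _, x, y = interval_ratios(f1, f2, f3, f4)
    return [1.0, 2.0, 2.0 + x, 2.0 + x + y]


def equal_tempered(a, n, m=12):
    """Frequency a * R**n, where R is the m-th root of 2."""
    return a * 2.0 ** (n / m)


if __name__ == "__main__":
    # Notes 0, 2, 4 and 7 semitones above 196 Hz, i.e. the shape of [G3 A3 B3 D4].
    freqs = [equal_tempered(196.0, n) for n in (0, 2, 4, 7)]
    print([round(v, 3) for v in clts_sequence(*freqs)])
```

Because only ratios of logarithms enter the result, the choice of logarithm base and of the reference frequency does not affect the sequence.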
5. DISCUSSION
To sum up, we have cast doubt not only on the validity of the
conventional tone-letter notation of Cantonese tones in particular,
but also on the validity of the notation system itself in general. We have
formulated a simple mathematical framework to describe the pitch
intervals among level tones of Cantonese, which can be extended
to other tone languages. This framework not only enables a more
realistic description of a tone system, which should be given in
terms of the ratios of pitch intervals among the level tones
according to equations (2) and (3), but also accommodates a unified
re-interpretation of both musical and tone-letter notations by
transformation to equation (4). With the new framework, we can
proceed [7] with serious acoustical measurements to test the
validity of various linguistic and engineering claims [8, 9] more
objectively.
6. ACKNOWLEDGEMENTS
We would like to thank the Swiss National Science Foundation and
Cross-Culture Chinese Communications Centre for funding and
subsidies.
7. REFERENCES
1. Sloboda, J., The Musical Mind: The Cognitive Psychology of
Music, New York, Oxford University Press, 1989.
2. Chao, Y.-R. “A system of tone letters”, Maitre Phonétique 45:
24-27. 1930.
3. Jones, D. and Woo, K.T., A Cantonese phonetic reader,
University of London Press, London, 1912.
4. Hashimoto, O. K.Y., Studies in Yue dialects. 1. Phonology of
Cantonese, Cambridge University Press, Cambridge, 1972, p.
92, 122.
5. Chao, Y.-R., Cantonese Primer, Greenwood Press, New York,
1947.
6. Schellenberg, E. G., Trehub, S. E. “Natural musical intervals:
Evidence from infant listeners”, Psychological Science, Vol. 7,
No. 5, 1996.
7. Ho, R. S.-C., Sagisaka, Y. “F0 analysis of perceptual distance
among Cantonese level tones”, Proc. Interspeech 2007: 1038-1041.
8. Ho, R. S.-C. “English teaching and learning in Hong Kong”,
International Cooper Series on English and Literature, Vol. 10,
Schwabe Verlag, Basel, 2005.
9. Li, Y.J., Lee, T. and Qian, Y. "Acoustical F0 analysis of
continuous Cantonese speech", Proc. ISCSLP 2002: 127-130.