Programming an Emotion-Based Original Music Generator through Conventional
Musical Patterns
by William Locke
A Behavior Paper
Presented to
Junior Science, Engineering and Humanities Symposium
University of Missouri-St. Louis
by
William Alexander Locke IV
Senior
St. Charles West High School
3601 Droste
St. Charles, 63301
September 24 – Continuing Progress
Joan Twillman
Teacher
1606 Watson
For assistance in this original research, acknowledgement should be extended to:
Mrs. Joan Twillman, Authentic Science research teacher, for steady support and guidance
in this research,
Mr. Andrew Scott, choir and orchestra teacher, for indispensable information on standard
musical conventions used,
Dr. Annabel Cohen, professor of Music Psychology at UPEI, for continuing advice and
support,
Dr. David Huron, professor of Music Psychology at Ohio State University, for continuing
advice and support.
Abstract:
The first objective of the project detailed is to create a music generator capable of writing
original songs based upon standard musical conventions. To this end, a random note
generator has been programmed to put out a sequence of notes with random pitch values
and durations; this random note generator is then given rules and parameters restricting
and patterning its output so that the undefined, almost accidental sounds can be arranged
into harmonious, structured, and, ultimately, musical compositions. The random
generator ensures the notes are arranged in an original format; the musical parameters
ensure that the arrangement conforms to structures of key, meter, and chords.
The second objective of the project detailed is to direct the music generator’s
output toward specific emotional content. To this end, the rules and the parameters
employed by the generator are specified by selected emotional commands: happy, sad,
soothing, restless. The emotional command entered will determine the use of certain
rules, particularly of key (happy/sad) and meter (soothing/restless).
This project is tested through the participation of volunteers who listen to a
selection of musical pieces generated by this program and complete a survey in which
they rate the “listenability” of each piece (determining the effectiveness of the overall
musical structure) and the emotional content, if any, they detected in each (determining
the effectiveness of the specified emotional commands). The end result is an original
music generator capable of writing music with a specific emotional content, as validated
through human response.
Table of Contents
Statement of Problem
Review of Literature
Procedures
Results
Conclusions
Bibliography
Statement of Problem
In the computer/behavioral experiment, “Programming an Emotion-Based
Original Music Generator through Conventional Musical Patterns,” the purpose is twofold: first, to determine if it is possible to design a computer program that can generate
original music based on the random selection of various musical conventions and
patterns; second, to see if this program can generate original music specifically geared
towards a certain emotional content (happy/sad/restless/soothing) through the
specification and manipulation of certain of these musical tools.
Review of Literature
In studying the extensive and varied levels of emotional effect brought about by
music, a broad literary base is of paramount importance. Even pursuing specific ends
within the field, a general knowledge of the connection between the mind and musical
sound and pattern is necessary in any attempt to manipulate or even scientifically observe
this relationship.
For the past fifty years, one text, Meyer’s Emotion and Meaning in Music [6], has
been the foundation of any and all studies into the field of musical appreciation on the
psychological level. In it, Meyer details the subconscious patterns of expectation and
deviation observed in the minds of listeners and how music plays off of this subconscious
process. Music, though seeming to possess an inherent connection to the human species
throughout even centuries of evolution [3], is actually based very heavily on learned
behavior [2]. This is supported by the fact that, though many aspects of the response to
sound are common to all humans, the specific structures accorded to music are not
uniform throughout either history or culture [7]. As music therefore is, in effect, an
“acquired taste”, its evolution and diversification are understandable.
The issue of how music essentially plays to learned responses in people is the first
concern of this research, as these specific responses are the foundation of the generator’s
programmed rules. To appreciate music, the mind must first learn to recognize its
patterns and structure [4], much the same as the developing mind assimilates linguistic
information purely through exposure [1]. In simply hearing music for many years, the mind
develops a steadily more extensive recognition of its general tendencies and rules.
Beyond this, it will also begin associating certain of these rules with specific genres, styles,
or emotional content [6]. Though an individual who is not in some way trained or taught
the technical aspects of musical arrangement such as key, meter, and chords will not have
an explicit understanding of these terms even in a musical sense, they will still have an
implicit sense of whether or not the music they are hearing conforms to these rules;
further, they will regularly detect even minor variations in the musical structure, though
they will be unable to identify the specific change [9]. This is because, in developing and
refining its repertoire of musical patterns, the mind will begin making predictions about
what it thinks it should be hearing next. This process of prediction remains, for most
untrained listeners, unconscious; it is only brought to conscious awareness in the
event of surprising deviation [6]; it is in this event, however, that emotional response to the
music is most strongly defined [2].
The gathered rules of musical appreciation, then, are these:
1) that sound must be structured in such a fashion as to be recognized by
the brain as musical, this structure being based upon forms common to
most music;
2) that this structure should be defined enough to provide for some degree
of classification for the music, if such is its aim;
3) that this music must follow enough common patterns for the brain to
form predictions of its progress;
4) that, for a strong emotional reaction, the music must, at points, thwart
the expectations of the brain in some form or another while remaining
within “musical parameters.”
Based on these rules standard to musical composition, it was deemed plausible to
reproduce a number of the common conventions in a computer program. Following these
rules, convincing musical structure could be generated; specifying them, emotional
content might be added. Finally, owing to the random nature of the computer program, it
was thought likely that the proper degree of deviation from expectation might be
achieved as well. The only potential problem was that, due also to this random selection,
the listener might not be able to follow a common enough pattern to form predictions
from the music and therefore be capable of appreciating it fully.
Procedures
In order to test any of this theoretical design, a random note generator had to be
procured. This random note generator would provide an output of pitches of random
assignment; the generator itself could then be reprogrammed with the particular
rules set to structure these notes into music. However, finding such a versatile system
proved difficult, and so a random note generator was instead created using a simple
programming operation which would put out notes in written format, i.e. A, #A, C, et
cetera. This new program was then further edited and channeled so that the notes written,
when played, would possess the structure, and thereby sound, of music.
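As an illustration of this first stage, a minimal sketch of such a random note generator follows. The paper does not specify the implementation language or structure of the original program, so the Python form, function name, and pitch list here are illustrative assumptions only:

    import random

    # Hypothetical pitch list in the written format described above (A, #A, C, ...).
    PITCH_NAMES = ["A", "#A", "B", "C", "#C", "D", "#D", "E", "F", "#F", "G", "#G"]

    def random_notes(count):
        """Return `count` pitch names chosen uniformly at random."""
        return [random.choice(PITCH_NAMES) for _ in range(count)]

    print(random_notes(8))  # e.g. ['C', '#F', 'A', 'A', 'D', '#G', 'B', 'F']

Unconstrained, this output is exactly the unguided piano-playing described below; the rules that follow progressively restrict and pattern it.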
The development of musical rules thus became the focus for achieving the stated
ends of the music generator. Though it could denote the specific pitches of notes, the
actual execution of its original output would sound like nothing so much as the young
toddler’s first unguided foray onto the piano—all the right notes are hit in all the wrong
ways. Seeing as the computer program obviously could not learn and develop a musical
appreciation itself, as most humans can, the rules instead had to be directed to specifically
what options the program had available to choose from when arranging music. Here, the
key focus was turned to issues of meter, key, note selection, chords, note length, and
tempo as the primary agents of both musical recognition and emotional indication.
To begin with meter, the division of a musical piece into units of measures and
beats, a standard of four-four time was decided on as the most common and easy to use
time signature. Four-four time simply means that there will be four beats in a measure,
and a quarter note is counted as a single beat. In order to make this effective in the
programming, the generator was limited to only creating four “beats” (which were later to
be defined more fully in note length) in a given process and then repeating this process
as many times as necessary to create a song of a specified length (the length itself
was determined in number of measures, not actual time). With this rule in effect, the
generator would effectively compose, for purposes of musical expression, measure by measure. The
idea of changing the time set for different pieces, both for variety and possibly emotional
impact (five-four time was considered as a potential trigger to a “restless” reaction) was
briefly tested, but it has yet to be explored fully.
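Translated into code, this measure-based rule might look like the following sketch, again in the illustrative Python of the earlier example (BEATS_PER_MEASURE and generate_song are hypothetical names; PITCH_NAMES is reused from the sketch above):

    import random

    BEATS_PER_MEASURE = 4  # four-four time: four beats to a measure

    def generate_song(num_measures):
        """Compose measure by measure: generate four 'beats' per process,
        repeating the process as many times as the requested length needs."""
        song = []
        for _ in range(num_measures):
            measure = [random.choice(PITCH_NAMES) for _ in range(BEATS_PER_MEASURE)]
            song.append(measure)
        return song

Note that the song length is a count of measures, not a duration in time, matching the rule described above.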
The next essential aspect, that of key, was identified early on as both critical to
musical structure and emotional specification. The key of a piece determines which eight
tones of the twelve-tone scale may be played in a single song. It determines the
difference between a song played in C major and a song played in G minor, specifically
that the latter plays B-flat and E-flat. The transition from the musical rule to the
programming rule was actually easier in this case; the idea behind both was essentially
the same. For the computer program, all that was required was that the selection pool be
specified along rules of both the major and minor keys. The generator was to select a
random key by number, and, upon doing so, the list of available pitches for output was
automatically limited to only those pitches acceptable within the selected key. This rule
ensured a degree of harmony between notes played and heightened the musical sense of
the results perceptibly. However, after the creation of a recognizably musical piece
became possible, the selection of key was further refined. In music, it is most common
for triumphant, light, enjoyable, and otherwise happy music to be played in a major key;
by contrast, most dark, mournful, haunting, or otherwise sad music is composed in a
minor key. This became the first and most important distinction between happy and sad
music in the generator. If a request for happy music was made, the program selected any
major key, but, if the request was for sad music, the key would invariably be minor.
Thus, the emotional aspect began to be extended in importance to the overall project.
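A sketch of this key rule follows. The step patterns are the standard major and natural-minor scale intervals; the function name and the use of Python remain illustrative assumptions:

    import random

    CHROMATIC = ["C", "#C", "D", "#D", "E", "F", "#F", "G", "#G", "A", "#A", "B"]
    MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # semitone steps of a major scale
    MINOR_STEPS = [2, 1, 2, 2, 1, 2, 2]  # semitone steps of a natural minor scale

    def select_key(emotion):
        """Select a random key by number, then limit the available pitches
        to that key: any major key for happy, any minor key for sad."""
        tonic = random.randrange(12)
        steps = MAJOR_STEPS if emotion == "happy" else MINOR_STEPS
        allowed, i = [], tonic
        for step in steps:
            allowed.append(CHROMATIC[i % 12])
            i += step
        return allowed

    print(select_key("sad"))  # the seven distinct pitch names of a random minor key

Every later selection of a bass note, individual note, or chord tone would then draw only from the list this function returns.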
Directly after key, the issue of note selection should be addressed. Since some degree
of this process has already been touched upon in the preceding paragraph on key, and
more will follow in the paragraph on chords, the selection directly addressed
here is confined to the bass note and the selection of random single notes within the
piece. First, for each measure, a bass note is selected at random (conforming to rules of
key); from this bass note, all chords must be patterned, but single notes simply must stay
within the same key. The selection of individual notes within a measure is much like the
selection of the bass notes, except that there is only one bass note per measure (more on
this in the paragraph on note duration) whereas there may be multiple individual notes;
further, nothing else in the measure is based on an individual note, though the individual
note is based on the key. Further techniques for refining the actual selection of notes
have yet to be followed in experimentation.
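In sketch form, continuing the illustrative Python of the earlier examples, this selection rule is simple (make_measure is a hypothetical name; key_pitches is the list returned by select_key above):

    import random

    def make_measure(key_pitches):
        """One bass note per measure, chosen at random within the key;
        individual notes likewise need only stay within the key."""
        bass = random.choice(key_pitches)
        individual_notes = [random.choice(key_pitches) for _ in range(4)]
        return bass, individual_notes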
An issue tightly connected with note selection is chords, and it indeed caused the
generator no small amount of trouble for some time. Chords are generally a series of
three notes played in unison; as such, the harmony in these specific notes is essential to
their appreciation. In order to correctly select these three notes (though the program
could also play two notes together), an entirely new process of selection had to be
devised. Based on both the bass note and the key, the three potential notes for use in
chords were specified; the standard chord is one with the lowest note exactly one octave
above the bass note and then two other notes, each separated by two tones from the one
below. However, this only determines which three notes are available for selection; the
order may be altered randomly, and, as such, that is what the generator was programmed
to do. So long as it had a bass note to play off and the key to determine the separation of
the tones, it was possible to arrange the three tones in almost any order to be played
later. An area of intense interest for further exploration is that of chord progression,
which is the standard pattern set in selecting different chords in music, but this also
awaits further restructuring.
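The chord rule just described can be sketched as follows, treating "separated by two tones" as two scale steps within the key (chord_from_bass is a hypothetical name):

    import random

    def chord_from_bass(key_pitches, bass):
        """Build the three chord tones from the bass note and key: the bass
        pitch an octave above, plus two notes each two scale tones above
        the one below, then randomly rearrange their order."""
        root = key_pitches.index(bass)
        scale_len = len(key_pitches)
        tones = [key_pitches[(root + offset) % scale_len] for offset in (0, 2, 4)]
        random.shuffle(tones)
        return tones

With a bass note of C in a C major key, this yields the tones C, E, and G in a random order: a C major triad, as the description of the standard chord implies.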
Finally, by combining the rules for bass notes, individual notes, and chords
together, it became possible to create the full structure of a song purely in regards to
pitch. If the generator was to put out a single note, it only had to conform to the loose
rules of individual notes; if it was going to put out more than one note, then the more
stringent rules on chords were put to use in the selection. The generator had progressed
rapidly in its ability to put out notes which were arranged as recognizable music.
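Combined, the preceding rules suffice to sketch the full pitch structure of a song (pitch_structure and the chord_probability parameter are illustrative assumptions; chord_from_bass is the sketch above):

    import random

    def pitch_structure(key_pitches, num_measures, chord_probability=0.5):
        """Each beat is either a single in-key note under the loose
        individual-note rules, or a chord built from the bass note under
        the stricter chord rules."""
        song = []
        for _ in range(num_measures):
            bass = random.choice(key_pitches)
            beats = []
            for _ in range(4):
                if random.random() < chord_probability:
                    beats.append(chord_from_bass(key_pitches, bass))
                else:
                    beats.append([random.choice(key_pitches)])
            song.append((bass, beats))
        return song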
Coming finally to note length, this proved to be one of the most complicated
concepts to transfer from musical notation to computational parameter. In music, the
duration of a note is denoted by the symbol for it, whether it be a quarter note (one beat),
half note (two beats), dotted half note (three beats), or whole note (four beats). On the
music generator, the output was organized as follows:
|G|_|_|E|
|E|_|E|C|
|C|F|C|G|
=======
|C|C|C|C|
The repeated C below the double line is the bass note of the measure while the notes
above are chords and individual notes. However, seeing as there would always be four
notes (or at least a note per beat, with some being in chords), every tone had to be
denoted as a quarter note, a single beat. This was initially the only way to make the math
come out correctly. However, a method was finally devised to work around the lack of
standard musical technique in the program; an option was included for every “beat” (or
essentially every column in the figure above) that, instead of just selecting either one,
two, or three notes, the generator could also choose to leave the entire beat blank. In the
event that this did occur, the note(s) of the preceding beat were extended by the users of
the program to encompass both beats, making a rather tricky rendition of half notes, dotted
half notes, and whole notes possible. For the individual notes and chords, the option to
leave a beat blank was left entirely random; for the bass note, all the beats after the first
were blank by default. This was done to ensure that the bass note in each
measure would be a whole note, lasting the entire measure.
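This blank-beat workaround can be sketched directly (beats_with_blanks and the 0.3 blank rate are assumptions, as the paper leaves the probability entirely random; None marks a blank beat):

    import random

    def beats_with_blanks(key_pitches, blank_chance=0.3):
        """Each beat after the first may randomly be left blank; a blank
        beat extends the previous beat's note(s), approximating half,
        dotted-half, and whole notes. The bass row is blank after beat
        one by default, making the bass note a whole note."""
        beats = [random.choice(key_pitches)]
        for _ in range(3):
            if random.random() < blank_chance:
                beats.append(None)  # blank beat: the previous note is held
            else:
                beats.append(random.choice(key_pitches))
        bass_row = [random.choice(key_pitches), None, None, None]
        return bass_row, beats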
This last guideline for the actual output of the musical generator thus completed,
focus turned to one final issue of interest, one that had more effect upon emotional
recognition than upon musical structure. The tempo of the piece, or how quickly the beats are
counted out (“however many beats per minute”), was to be selected randomly by the
generator in correlation to the emotional state requested; the tempo itself would simply
appear as a number within its specific range which would be taken into account when the
piece was actually played. The range of the tempos was initially determined by the
traditional tempos of certain classical forms of music, and these ranges have thus far
remained in use. The tempo was set so that the highest would be played as restless,
followed by happy, sad, and soothing, respectively. The effect of the tempo would
eventually prove to be one of the most influential factors in emotional reaction when the
generator was tested on volunteer listeners.
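The tempo rule reduces to a table lookup plus one random draw. The ordering below (restless highest, then happy, sad, soothing) follows the text, but the paper gives no actual figures, so the BPM ranges shown are purely illustrative assumptions:

    import random

    # Assumed tempo ranges in beats per minute, ordered per the text above.
    TEMPO_RANGES = {
        "restless": (140, 180),
        "happy":    (110, 140),
        "sad":      (70, 100),
        "soothing": (50, 70),
    }

    def select_tempo(emotion):
        """Select a tempo at random within the emotion's range; the number
        is simply recorded and taken into account at playback time."""
        low, high = TEMPO_RANGES[emotion]
        return random.randint(low, high)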
Results
The initial testing of the original music generator proved interesting. First off,
time constraints and problems in programming prevented the generator from being fully
programmed with the necessary rules for all of its emotional commands to operate—
restless and soothing were temporarily put aside for these purposes. Even those two that
did operate, happy and sad, were still in fairly rough formats that needed some degree of
refinement. Thus, it was really a prototype that was to be tested for its efficiency in
generating music both listenable as music and recognizable as having an emotional
content. However, in early presentations of the capabilities of the music generator to
various individuals interested in the project, it conducted itself quite admirably in all
regards. When it was necessary to take actual data from volunteer subjects, however, the
fates took their due for the up-to-this-point remarkable advancements. A large
number of teachers were gathered together to listen to seven short pieces of generated
music and record on surveys their reactions, both musical and emotional. Due to
constraints, however, only five pieces had been successfully completed prior to the time for
testing, and of these pieces only one was of the quality previously displayed by the music
generated. Four people were tested on these five songs. In a rush during a lull after the
conclusion of that first test, two more songs were generated and put to another computer
program capable of playing them; these songs, however, were prepared in such a hurry that
their quality deteriorated severely in execution, and there had been no time to listen to
them prior to the actual test getting underway. Needless to say, it was a rather
unorthodox experience for scientific research.
Nonetheless, a considerable amount of data was provided, and, from this, both
aspects of continuing success and surprising failure in the generator have been further
analyzed for improvement. The results of the experiment, while not as expected, have
nonetheless proven enlightening.
All the pieces generated were rated, on a scale of one to ten in “listenability,”
higher than six-point-five; excluding the two pieces which suffered from considerable
user error due to rushed programming, this minimum figure rises to seven-point-eight-seven.
Thus, despite the personal opinion that the music generated for the test was,
overall, of an inferior quality to previous pieces, the results clearly indicate that an
audience could recognize the output as musical in nature. Though certainly with room
for improvement, the first stated goal of the generator had been achieved.
Observing the emotional reactions to the pieces, the results seemingly became far
more erratic. Of the seven pieces, only two were distinctly recognized by their intended
emotional output; interestingly, both of these songs were programmed and recognized as
happy (it bears noting, however, that five of the seven pieces were programmed as happy).
Next to this, two other pieces had intriguing enough responses to warrant further
discussion. The first piece, which had been programmed to be recognizable as happy,
ended up being played in a minor key due to user error; the recorded results identified
sad as the primary response. Though marked as incorrect because of its original
programming, the unintended change to a minor key could be seen as responsible for
causing the piece to play largely to the programming rules for “sad;” if this were deemed
the case, then the first selection also was correctly identified by emotional content. The
third song, programmed to be recognizable as sad, also had results showing a higher
number of “sad” responses than “happy”; however, it was not included among the
successful pieces due to the largest number of responses actually residing in the option of
“undecided.” This brings up another important aspect of the test: out of one-hundred and
fifty-three recorded emotional responses from all the pieces totaled, forty-three were
“undecided” and thirteen were “not applicable;” this is over thirty-six percent of the total
results. While this does not account for close to a majority of the results, lowering this
percentage is one of the continuing goals of the project.
The last observations on the results gathered are derived from viewing both the
“listenability” ratings and the emotional recognition together. It is certainly worth noting
that, in all but two of the pieces, the highest ratings were given by listeners who identified
the pieces as happy; furthermore, of the two exceptions, those categories with higher
rating averages had very few listeners actually choose them, so that the high scores of one
or two offset the averaged total of the majority. The highest average of listenability did
not, as might be expected, correspond to the intended emotional output; that is, the first
sad song (which came the closest to being correctly identified) also had higher ratings
from those listeners who identified it as happy. Further, responses designated as
“undecided,” though suggesting a lack of classification on the part of the music
generated, did not always have low ratings; in some few cases, the average ratings for
these responses proved higher than the ratings for definite responses.
Conclusions
The original music generator has proven an experiment with remarkable potential
for further research; in manipulating learned conventions to the extent that musical
structure can be recreated through random selection, it has shown strong potential for
the reconciliation of computational process and learned human behaviors; in the
continuing difficulty of consistently evoking specific emotional responses by these same
rules, it allows for continued exploration into the complicated process of learned reaction
that dominates human behavior in areas even extending beyond music appreciation. The
actual rules employed are temporary; conventions change and evolve as music does. In
its inability to purposefully change form, the generator could never wholly replace the
role of conscious composers, at least in this format; regardless, its immediate
entertainment value should be obvious. The merging of computational and behavioral
sciences remains, ultimately, the greatest and most promising achievement of this project.
Bibliography
[1] Bower, B. “Song Sung Blue.” Science News 28 Feb. 2004.
[2] Huron, David. Sweet Anticipation: Music and the Psychology of Expectation.
Cambridge: MIT Press, 2006.
[3] Huron, David. “Is music an evolutionary adaptation?” Annals of the New York
Academy of Sciences 930 (2001): 43-61.
[4] Kratus, John. “A Developmental Study of Children’s Interpretation of Emotion in
Music.” Psychology of Music 21.1 (1993).
[5] Levitin, Daniel J. This is Your Brain on Music: The Science of a Human Obsession.
New York: Dutton, 2006.
[6] Meyer, Leonard B. Emotion and Meaning in Music. Chicago: University of Chicago
Press, 1956.
[7] Reck, David. Music of the Whole Earth. New York: Charles Scribner’s Sons, 1977.
[8] Scott, Andrew. Personal Interview. 11 Nov. 2006.
[9] Zatorre, Robert J., and Carol L. Krumhansl. “Mental Models and Musical Minds.”
Science 13 Dec. 2002.