Programming an Emotion-Based Original Music Generator through Conventional Musical Patterns

A Behavior Paper Presented to the Junior Science, Engineering and Humanities Symposium, University of Missouri-St. Louis

William Alexander Locke IV, Senior
St. Charles West High School, 3601 Droste, St. Charles, MO 63301
September 24 – Continuing Progress
Sponsor: Joan Twillman, Teacher, 1606 Watson

Acknowledgements

For assistance in this original research, acknowledgement should be extended to: Mrs. Joan Twillman, Authentic Science Research teacher, for steady support and guidance in this research; Mr. Andrew Scott, choir and orchestra teacher, for indispensable information on the standard musical conventions used; Dr. Annabel Cohen, professor of Music Psychology at UPEI, for continuing advice and support; and Dr. David Huron, professor of Music Psychology at Ohio State University, for continuing advice and support.

Abstract

The first objective of this project is to create a music generator capable of writing original songs based upon standard musical conventions. To this end, a random note generator has been programmed to put out a sequence of notes with random pitch values and durations; this random note generator is then given rules and parameters restricting and patterning its output so that the undefined, almost accidental sounds can be arranged into harmonious, structured, and, ultimately, musical compositions. The random generator ensures the notes are arranged in an original format; the musical parameters ensure that the arrangement conforms to structures of key, meter, and chords. The second objective of the project is to direct the music generator's output toward specific emotional content. To this end, the rules and parameters employed by the generator are specified by selected emotional commands: happy, sad, soothing, restless. The emotional command entered determines the use of certain rules, particularly of key (happy/sad) and meter (soothing/restless). The project is tested through the participation of volunteers who listen to a selection of musical pieces generated by the program and complete a survey in which they rate the "listenability" of each piece (determining the effectiveness of the overall musical structure) and the emotional content, if any, they detected in each (determining the effectiveness of the specified emotional commands). The end result is an original music generator capable of writing music with a specific emotional content, as validated through human response.

Table of Contents

Statement of Problem
Review of Literature
Procedures
Results
Conclusions
Bibliography

Statement of Problem

In the computer/behavioral experiment, "Programming an Emotion-Based Original Music Generator through Conventional Musical Patterns," the purpose is twofold: first, to determine whether it is possible to design a computer program that can generate original music based on the random selection of various musical conventions and patterns; second, to see whether this program can generate original music specifically geared toward a certain emotional content (happy/sad/restless/soothing) through the specification and manipulation of certain of these musical tools.

Review of Literature

In studying the extensive and varied levels of emotional effect brought about by music, a broad literary base is of paramount importance.
Even when pursuing specific ends within the field, a general knowledge of the connection between the mind and musical sound and pattern is necessary in any attempt to manipulate, or even scientifically observe, this relationship. For the past fifty years, one text, Meyer's Emotion and Meaning in Music [6], has been the foundation of studies into musical appreciation on the psychological level. In it, Meyer details the subconscious patterns of expectation and deviation observed in the minds of listeners and how music plays off of this subconscious process. Music, though seeming to possess an inherent connection to the human species throughout centuries of evolution [3], is actually based very heavily on learned behavior [2]. This is supported by the fact that, though many aspects of the response to sound are common to all humans, the specific structures accorded to music are not uniform throughout either history or culture [7]. As music therefore is, in effect, an "acquired taste," its evolution and diversification are understandable.

The issue of how music plays to learned responses in people is the first concern of this research, as these specific responses are the foundation of the generator's programmed rules. To appreciate music, the mind must first learn to recognize its patterns and structure [4], much the same as the developing mind assimilates lingual information purely through exposure [1]. In simply hearing music for many years, the mind develops a steadily more extensive recognition of its general tendencies and rules. Beyond this, it will also begin associating certain of these rules with specific genres, styles, or emotional content [6]. Though an individual who has not in some way been trained in the technical aspects of musical arrangement, such as key, meter, and chords, will not have an explicit understanding of these terms, they will still have an implicit sense of whether or not the music they are hearing conforms to these rules; further, they will regularly detect even minor variations in the musical structure, though they will be unable to identify the specific change [9]. This is because, in developing and refining its repertoire of musical patterns, the mind begins making predictions about what it should hear next. This process of prediction remains, for most untrained listeners, unconscious; it is brought to conscious awareness only in the event of a surprising deviation [6]; it is in this event, however, that emotional response to the music is most strongly defined [2].

The gathered rules of musical appreciation, then, are these:
1) sound must be structured in such a fashion as to be recognized by the brain as musical, this structure being based upon forms common to most music;
2) this structure should be defined enough to provide for some degree of classification of the music, if such is its aim;
3) the music must follow enough common patterns for the brain to form predictions of its progress;
4) for a strong emotional reaction, the music must, at points, thwart the expectations of the brain in some form or another while remaining within "musical parameters."

Based on these rules standard to musical composition, it was deemed plausible to reproduce a number of the common conventions in a computer program. Following these rules, convincing musical structure could be generated; specifying them, emotional content might be added.
Finally, owing to the random nature of the computer program, it was thought likely that the proper degree of deviation from expectation might be achieved as well. The only potential problem was that, due also to this random selection, the listener might not be able to follow a common enough pattern to form predictions from the music and therefore might not be capable of appreciating it fully.

Procedures

In order to test any of this theoretical design, a random note generator had to be procured. This random note generator would provide an output of randomly assigned pitches; the generator itself could then be reprogrammed with the particular rule set to structure these notes into music. However, finding such a versatile system proved difficult, and so a random note generator was instead created using a simple programming operation which would put out notes in written format, i.e. A, #A, C, et cetera. This new program was then further edited and channeled so that the notes written, when played, would possess the structure, and thereby the sound, of music.

The development of musical rules thus became the focus for achieving the stated ends of the music generator. Though the program could denote the specific pitches of notes, the actual execution of its original output would sound like nothing so much as a young toddler's first unguided foray onto the piano: all the right notes hit in all the wrong ways. Seeing as the computer program obviously could not learn and develop a musical appreciation itself, as most humans can, the rules instead had to specify exactly what options the program had available to choose from when arranging music. Here, the key focus was turned to issues of meter, key, note selection, chords, note length, and tempo as the primary agents of both musical recognition and emotional indication.

To begin with meter, the division of a musical piece into units of measures and beats, a standard of four-four time was decided on as the most common and easiest-to-use time signature. Four-four time simply means that there are four beats in a measure, and a quarter note is counted as a single beat. In order to make this effective in the programming, the generator was limited to creating only four "beats" (which were later to be defined more fully under note length) in a given process and then repeating this process as many times as necessary to create a song of a specified length (the length itself was determined in number of measures, not actual time). With this rule in effect, the generator would, for purposes of musical expression, compose by measure. The idea of changing the time signature for different pieces, both for variety and possibly emotional impact (five-four time was considered as a potential trigger for a "restless" reaction), was briefly tested, but it has yet to be explored fully.

The next essential aspect, that of key, was identified early on as both critical to musical structure and useful for emotional specification. The key of a piece determines which eight tones of the twelve-tone scale may be played in a single song. It determines the difference between a song played in C major and a song played in G minor, specifically that the latter plays B-flat and E-flat. The transition from the musical rule to the programming rule was actually easier in this case; the idea behind both was essentially the same. For the computer program, all that was required was that the selection pool be specified along the rules of both the major and minor keys, as in the sketch below.
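The paper does not reproduce the generator's source code, so the following is only a minimal Python sketch of the two rules just described: a pitch pool restricted to the selected key, and measures built four beats at a time. The names CHROMATIC, build_scale, and random_measure, and the use of Python itself, are illustrative assumptions rather than the original program.

```python
import random

# Twelve-tone chromatic scale in the written format the generator outputs.
CHROMATIC = ["C", "#C", "D", "#D", "E", "F", "#F", "G", "#G", "A", "#A", "B"]

# Interval patterns (in semitones) for the major and natural minor scales.
MODE_STEPS = {"major": [2, 2, 1, 2, 2, 2, 1],
              "minor": [2, 1, 2, 2, 1, 2, 2]}

def build_scale(root, mode):
    """Return the eight tones of the key, root through its octave repeat."""
    index = CHROMATIC.index(root)
    scale = [root]
    for step in MODE_STEPS[mode]:
        index = (index + step) % len(CHROMATIC)
        scale.append(CHROMATIC[index])
    return scale

def random_measure(scale):
    """One measure in four-four time: four randomly chosen beats,
    each drawn only from the pitches the key allows."""
    return [random.choice(scale[:-1]) for _ in range(4)]

if __name__ == "__main__":
    key = build_scale("G", "minor")   # ['G', 'A', '#A', 'C', 'D', '#D', 'F', 'G']
    song = [random_measure(key) for _ in range(8)]  # an 8-measure piece
    for measure in song:
        print(measure)
```

Note that the octave repeat is dropped (scale[:-1]) before random selection simply to avoid double-weighting the tonic; G minor's "#A" and "#D" are the B-flat and E-flat mentioned above, written in the generator's sharps-only format.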
The generator was to select a random key by number and, upon doing so, the list of available pitches for output was automatically limited to only those pitches acceptable within the selected key. This rule ensured a degree of harmony between the notes played and heightened the musical sense of the results perceptibly. However, after the creation of a recognizably musical piece became possible, the selection of key was further refined. In music, it is most common for triumphant, light, enjoyable, and otherwise happy music to be played in a major key; by contrast, most dark, mournful, haunting, or otherwise sad music is composed in a minor key. This became the first and most important distinction between happy and sad music in the generator. If a request for happy music was made, the program selected any major key, but, if the request was for sad music, the key would invariably be minor. Thus, the emotional aspect began to grow in importance to the overall project.

Directly after key, the issue of note selection should be addressed. Because some of this process has already been touched upon in the preceding paragraph on key, and more follows in the paragraph on chords, the selection addressed here is confined to the bass note and the selection of random single notes within the piece. First, for each measure, a bass note is selected at random (conforming to the rules of the key); from this bass note, all chords must be patterned, but single notes simply must stay within the same key. The selection of individual notes within a measure is much like the selection of the bass notes, except that there is only one bass note per measure (more on this in the paragraph on note duration) whereas there may be multiple individual notes, and nothing else in the measure is based off of the individual note, though the individual note is based off of the key. Further techniques for refining the actual selection of notes have yet to be explored in experimentation.

An issue tightly connected with note selection is chords, and it indeed caused the generator no small amount of trouble for some time. Chords are generally a series of three notes played simultaneously; as such, the harmony among these specific notes is essential to their appreciation. In order to correctly select these three notes (though the program could also play two notes together), an entirely new process of selection had to be devised. Based off of both the bass note and the key, the three potential notes for use in chords were specified; the standard chord is one with the lowest note exactly one octave above the bass note and then two other notes, each separated by two scale tones from the one below. However, this only determines which three notes are available for selection; the order may be altered randomly, and, as such, that is what the generator was programmed to do. So long as it had a bass note to play off of and the key to determine the separation of the tones, it was possible to arrange the three tones in almost any order to be played later. An area of intense interest for further exploration is chord progression, the standard pattern followed in selecting successive chords in music, but this also awaits further restructuring.

Finally, by combining the rules for bass notes, individual notes, and chords, it became possible to create the full structure of a song purely in regard to pitch, as the sketch below illustrates.
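Continuing the same hypothetical sketch, the chord rule just described, three available tones starting an octave above the bass and spaced two scale tones apart, might look as follows. Reading "two tones" as two scale steps (a diatonic third) is an interpretive assumption, as are the function names and the shuffle used for random reordering.

```python
import random

def choose_bass(scale):
    """One bass note per measure, drawn from the key (octave repeat excluded)."""
    return random.choice(scale[:-1])

def chord_tones(scale, bass):
    """The three chord tones available above a bass note: the bass's pitch
    class one octave up, then two more tones, each two scale steps higher."""
    degree = scale.index(bass)   # position of the bass within the key
    steps = scale[:-1]           # the seven distinct tones of the key
    return [steps[(degree + offset) % len(steps)] for offset in (0, 2, 4)]

def random_chord(scale, bass):
    """Randomly reorder (voice) the three available chord tones."""
    tones = chord_tones(scale, bass)
    random.shuffle(tones)
    return tones

if __name__ == "__main__":
    g_minor = ["G", "A", "#A", "C", "D", "#D", "F", "G"]
    bass = choose_bass(g_minor)
    print(bass, random_chord(g_minor, bass))   # e.g. C ['#D', 'C', 'G']
```

Under this reading, a bass of C in G minor yields C, #D (E-flat), and G, i.e. the diatonic C minor triad, with the three tones then free to be stacked in any order.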
If the generator was to put out a single note, it only had to conform to the loose rules for individual notes; if it was going to put out more than one note, then the more stringent rules on chords were put to use in the selection. The generator had progressed rapidly in its ability to put out notes arranged as recognizable music.

Coming finally to note length, this proved to be one of the most complicated concepts to transfer from musical notation to computational parameter. In music, the duration of a note is denoted by its symbol, whether it be a quarter note (one beat), half note (two beats), dotted half note (three beats), or whole note (four beats). On the music generator, the output was organized as follows:

|G|_|_|E|
|E|_|E|C|
|C|F|C|G|
=========
|C|C|C|C|

The repeated C below the double line is the bass note of the measure, while the notes above it are chords and individual notes. However, seeing as there would always be four notes (or at least a note per beat, with some beats being chords), every tone had to be denoted as a quarter note, a single beat. This was initially the only way to make the math come out correctly. Eventually, a method was devised to work around the lack of standard musical notation in the program: an option was included for every "beat" (essentially every column in the figure above) whereby, instead of selecting one, two, or three notes, the generator could also choose to leave the entire beat blank. In the event that this occurred, the note(s) of the previous beat were extended by the users of the program to encompass both beats, making a rather tricky rendition of half notes, dotted half notes, and whole notes possible. For the individual notes and chords, the option to leave a beat blank was left entirely random; for the bass note, all the beats after the first were blank by default. This was done to show that the bass note in each measure would be a whole note, lasting the entire measure.

With this last guideline for the actual output of the music generator complete, focus turned to one final issue of interest, one that had more effect upon emotional recognition than upon musical structure. The tempo of the piece, or how quickly the beats are counted out (in beats per minute), was to be selected randomly by the generator in correlation to the emotional state requested; the tempo itself would simply appear as a number within a specific range, to be taken into account when the piece was actually played. The ranges of the tempos were initially determined by the traditional tempos of certain classical forms of music, and these ranges have thus far remained in use. The ranges were set so that the fastest tempos corresponded to restless, followed by happy, sad, and soothing, in descending order. The effect of the tempo would eventually prove to be one of the most influential factors in emotional reaction when the generator was tested on volunteer listeners.
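As a small illustration of this last rule, tempo selection keyed to the requested emotion might be sketched as below. The paper gives the ordering of the ranges (restless fastest, then happy, sad, soothing) but not the numbers, so the BPM values here are invented for the example.

```python
import random

# Hypothetical tempo ranges in beats per minute, ordered fastest to slowest:
# restless > happy > sad > soothing. The specific numbers are illustrative
# guesses; the paper states the ordering but not its exact ranges.
TEMPO_RANGES = {
    "restless": (140, 180),
    "happy":    (110, 140),
    "sad":      (70, 100),
    "soothing": (50, 70),
}

def choose_tempo(emotion):
    """Pick a random tempo within the range tied to the requested emotion."""
    low, high = TEMPO_RANGES[emotion]
    return random.randint(low, high)

print(choose_tempo("soothing"))   # e.g. 63
```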
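The blank-beat workaround for note lengths can be sketched in the same spirit. This toy version ignores the bass-note and chord-voicing rules from the earlier sketches for brevity; the one-in-four blank probability and all names are assumptions.

```python
import random

G_MINOR = ["G", "A", "#A", "C", "D", "#D", "F"]   # seven tones of the key

def fill_measure(scale):
    """Four beats per measure; each beat after the first may be left blank
    (None), extending the previous beat's notes, or may hold one to three
    randomly chosen tones from the key."""
    beats = []
    for i in range(4):
        if i > 0 and random.random() < 0.25:   # illustrative blank chance
            beats.append(None)
        else:
            beats.append(random.sample(scale, random.choice([1, 2, 3])))
    return beats

def beat_durations(beats):
    """Translate blanks into note lengths: each sounding beat lasts one beat
    plus one for every blank that follows it, yielding quarter, half,
    dotted-half, and whole notes."""
    durations = []
    for beat in beats:
        if beat is None:
            durations[-1] += 1   # extend the previous sounding beat
        else:
            durations.append(1)
    return durations

measure = fill_measure(G_MINOR)
print(measure, beat_durations(measure))
# e.g. [['C'], None, ['D', '#A'], ['F']] -> [2, 1, 1]
```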
Results

The initial testing of the original music generator proved interesting. First off, time constraints and problems in programming prevented the generator from being fully programmed with the necessary rules for all of its emotional commands to operate; restless and soothing were temporarily put aside for these purposes. Even the two commands that did operate, happy and sad, were still in fairly rough formats that needed some degree of refinement. Thus, it was really a prototype that was to be tested for its efficiency in generating music both listenable as music and recognizable as having an emotional content. However, in early presentations of the capabilities of the music generator to various individuals interested in the project, it conducted itself quite admirably in all regards.

When it was necessary to take actual data from volunteer subjects, however, the fates took their due return for the up-to-this-point remarkable advancements. A large number of teachers were gathered together to listen to seven short pieces of generated music and record on surveys their reactions, both musical and emotional. Due to time constraints, however, only five pieces had been successfully completed prior to the time for testing, and of these pieces only one was of the quality previously displayed by the music generator. Four people were tested on these five songs. In a rush during a lull after the conclusion of that first test, two more songs were generated and put to another computer program capable of playing them; these songs, however, were put up in such a hurry that their quality was severely deteriorated in execution, and there had been no time to listen to them prior to the actual test getting underway. Needless to say, it was a rather unorthodox experience for scientific research. Nonetheless, a considerable amount of data was provided, and, from this, both aspects of continuing success and surprising failure in the generator have been further analyzed for improvement.

The results of the experiment, while not expected, have nonetheless proven enlightening. All the pieces generated were rated higher than 6.5 on a scale of one to ten in "listenability"; excluding the two pieces which suffered from considerable user error due to rushed programming, this minimum figure rises to 7.87. Thus, despite the personal opinion that the music generated for the test was, overall, of an inferior quality to previous pieces, the results clearly indicate that an audience could recognize the output as musical in nature. Though certainly with room for improvement, the first stated goal of the generator had been achieved.

Observing the emotional reactions to the pieces, the results become far more erratic. Of the seven pieces, only two were distinctly recognized by their intended emotional output; interestingly, both of these songs were programmed and recognized as happy (it is worth noting, however, that five of the seven pieces were programmed as happy). Beyond this, two other pieces had intriguing enough responses to warrant further discussion. The first piece, which had been programmed to be recognizable as happy, ended up being played in a minor key due to user error; the recorded results identified sad as the primary response. Though marked as incorrect because of its original programming, the unintended change to a minor key could be seen as responsible for causing the piece to play largely by the programming rules for "sad"; if this were decided the case, then this first selection was also correctly identified by emotional content.
The third song, programmed to be recognizable as sad, also had results showing a higher number of "sad" responses than "happy"; however, it was not included among the successful pieces because the largest number of responses actually fell under the option of "undecided." This brings up another important aspect of the test: out of the 153 recorded emotional responses from all the pieces totaled, 43 were "undecided" and 13 were "not applicable," together over thirty-six percent of the total results. While not accounting for close to a majority of the results, lowering this percentage is one of the continuing goals of the project.

The last observations on the gathered results are derived from viewing the "listenability" ratings and the emotional recognition together. It is certainly worth noting that, in all but two of the pieces, the highest ratings were given by listeners who identified the pieces as happy; furthermore, in the two exceptions, the categories with higher rating averages had very few listeners actually choose them, so that the high scores of one or two listeners offset the averaged total of the majority. The highest average listenability did not, as might be expected, correspond to the intended emotional output; that is, the first sad song (which came the closest to being correctly identified) also had higher ratings from those listeners who identified it as happy. Further, responses recorded as "undecided," though suggesting a lack of classification on the part of the music generated, did not always carry low ratings; in some few cases, the average ratings for these responses proved higher than the ratings for definite responses.

Conclusions

The original music generator has proven an experiment with remarkable potential for further research. In manipulating learned conventions to the extent that musical structure can be recreated through random selection, it has shown a strong potential for the reconciliation of computational processes and learned human behaviors; in the continuing difficulty of consistently evoking specific emotional responses by these same rules, it allows for continued exploration into the complicated process of learned reaction that dominates human behavior even in areas extending beyond music appreciation. The actual rules employed are temporary; conventions change and evolve as music does. In its inability to purposefully change form, the generator could never wholly replace the role of conscious composers, at least in this format; regardless, its immediate entertainment value should be obvious. The merging of computational and behavioral sciences remains, ultimately, the greatest and most promising achievement of this project.

Bibliography

[1] Bower, B. "Song Sung Blue." Science News, 28 Feb. 2004.
[2] Huron, David. Sweet Anticipation: Music and the Psychology of Expectation. Cambridge: MIT Press, 2006.
[3] Huron, David. "Is Music an Evolutionary Adaptation?" Annals of the New York Academy of Sciences 930 (2001): 43-61.
[4] Kratus, John. "A Developmental Study of Children's Interpretation of Emotion in Music." Psychology of Music 21.1 (1993).
[5] Levitin, Daniel J. This Is Your Brain on Music: The Science of a Human Obsession. New York: Dutton, 2006.
[6] Meyer, Leonard B. Emotion and Meaning in Music. Chicago: University of Chicago Press, 1956.
[7] Reck, David. Music of the Whole Earth. New York: Charles Scribner's Sons, 1977.
[8] Scott, Andrew. Personal interview. 11 Nov. 2006.
[9] Zatorre, Robert J., and Carol L. Krumhansl. "Mental Models and Musical Minds." Science, 13 Dec. 2002.