Music Mood Classification using Intro and Refrain Parts of Lyrics

Seungwon Oh and Minsoo Hahn
Digital Media Lab, Korea Advanced Institute of Science and Technology (KAIST)
Daejeon, Korea
{swoh, mshahn1}@kaist.ac.kr

Jinsul Kim
Electronics and Computer Engineering, Chonnam National University
Gwangju, Korea
[email protected]
Abstract—In this paper, we propose a lyrics-based classification approach that estimates the mood of a song using only the intro and refrain parts of its lyrics. In general, the intro part creates the specific atmosphere of a song, and the refrain (chorus) part is the strongest part of the song. The proposed method detects important features that are significantly associated with the mood of songs from both parts. By calculating the similarity between the terms in these parts and eight basic emotions, it classifies songs according to mood.
Keywords—mood; classification; lyrics
I. INTRODUCTION
We live surrounded by media today, and music is one of its most popular forms. In general, songs are closely related to emotion and mood, and people often select the songs they want to listen to according to their mood. However, finding such music manually is not an easy task because there are so many songs, so a music classification and recommendation system is needed. In the iTunes software, users can attach tags to songs and then easily find the music they want to listen to [1]. However, users must attach the tag keywords entirely by hand in order to classify their songs, and they must already know the songs in order to tag them. Thus, automatic approaches are needed.
There are several methods to classify songs automatically. Saunders uses speech and sound signals to extract features from music [2]. However, the singing voice in music is quite different from normal speech, so it is very hard to recognize such speech signals. Another approach is to analyze the lyrics, which is much simpler than analyzing the audio signal.

This paper presents a new lyrics-based mood classification method that utilizes only the intro and refrain parts of the lyrics to classify songs. The intro part of the lyrics carries the information that creates the atmosphere of a song, and the refrain part holds the most important keywords of the song. Therefore, the proposed approach can improve classification accuracy by disregarding less meaningful words.

II. PLUTCHIK'S EMOTION MODEL

A. Decision of Mood
Plutchik's basic emotion model defines human emotion with eight basic emotions, as shown in Fig. 1 [3]. Plutchik argues that the other emotions not displayed in Fig. 1 can be represented by combinations of these basic emotions.
In this paper, the proposed approach utilizes the eight basic emotions: Joy, Acceptance, Anticipation, Anger, Disgust, Sadness, Surprise, and Fear. To analyze the similarity between lyrics and emotions, the method considers the correlation between vocabulary and these emotions.

B. Music Recommendation Based on the Mood of a User
In order to recommend songs to users, the proposed approach defines three types of user states, happy, angry, and sad, as shown in Table I. When a user feels happy, it plays songs whose mood is joy, acceptance, or anticipation. If a user feels sad, there are two kinds of results: one is a collection of sad songs that lets the user feel sympathy, and the other is a collection of happy, encouraging songs that helps the user overcome the sadness. Users in a sad mood can choose one of the two.
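The recommendation step of Table I amounts to a lookup from the user's state to a set of basic-emotion moods. Below is a minimal Java sketch of that lookup; the class, enum, and key names are our own illustrations and are not taken from the paper's implementation, only the mood combinations follow the table.

import java.util.List;
import java.util.Map;

// Illustrative sketch of the recommendation mapping in Table I.
// Names are hypothetical; only the mood combinations follow the table.
public class MoodRecommendation {

    // The eight basic emotions of Plutchik's model, used as song moods.
    enum Emotion { JOY, ACCEPTANCE, ANTICIPATION, ANGER, DISGUST, SADNESS, SURPRISE, FEAR }

    // User state -> recommended song moods; "sad" has two alternative cases.
    static final Map<String, List<Emotion>> TABLE = Map.of(
        "happy",     List.of(Emotion.JOY, Emotion.ACCEPTANCE, Emotion.ANTICIPATION),
        "angry",     List.of(Emotion.ANGER, Emotion.DISGUST),
        "sad-case1", List.of(Emotion.SADNESS, Emotion.FEAR, Emotion.SURPRISE),
        "sad-case2", List.of(Emotion.JOY, Emotion.ACCEPTANCE, Emotion.ANTICIPATION)
    );

    public static void main(String[] args) {
        System.out.println("sad-case1 -> " + TABLE.get("sad-case1"));
    }
}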
Figure 1. Plutchik's basic emotion model
TABLE I. MUSIC RECOMMENDATION TABLE

User's mood      Recommended mood of songs
Happy            joy + acceptance + anticipation
Angry            anger + disgust
Sad (case 1)     sadness + fear + surprise
Sad (case 2)     joy + acceptance + anticipation

III. FEATURE SELECTION
A. Feature Set Selection
Feature selection is an important part of any pattern classification process; it is often more important than the choice of learning algorithm. In text classification, the frequency of words is usually used as a feature.
The proposed classification method is based on the lyrics of a song, so it deals with features different from those of other text classification tasks.
First, we employ the term count as a feature for our classification, as shown in Fig. 2. In addition, we calculate the similarity between each word and each emotion.
Second, we focus on the intro part and the refrain part for the following two reasons:
1) The intro part of the lyrics opens the song and sets its atmosphere; therefore, it should include more important keywords.
2) The refrain part contains the words repeated throughout a song; it is also important because the songwriter writes the essential keywords repeatedly.
B. Feature: Term Count
We use the ANEW (Affective Norms for English Words) database to collect training samples, as shown in Fig. 3 [4]. The ANEW model describes each word with three factors: Valence, Arousal, and Dominance. However, our model needs eight factors to use Plutchik's emotion model, so we expand the ANEW model.

Figure 3. 3D feature plot for ANEW (Affective Norms for English Words)

To obtain eight-dimensional features from the ANEW model, we calculate the three-dimensional Euclidean distance from each word to the eight emotion terms Joy, Acceptance, Anticipation, Anger, Disgust, Sadness, Surprise, and Fear. Terms that express the emotion of Joy should lie closer to the term "Joy" than to the other seven emotion terms.
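The expansion from the three ANEW dimensions to eight emotion scores can be illustrated with a short Java sketch of the Euclidean distance computation in Valence-Arousal-Dominance space. The anchor values below are placeholders rather than the actual ANEW ratings; only the distance computation follows the description above.

import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: score a word against emotion anchor terms by Euclidean distance in
// ANEW Valence-Arousal-Dominance (VAD) space. The VAD numbers are placeholders;
// real values would come from the ANEW ratings [4].
public class EmotionDistance {

    record Vad(double valence, double arousal, double dominance) {}

    // Three-dimensional Euclidean distance between two VAD points.
    static double distance(Vad a, Vad b) {
        double dv = a.valence() - b.valence();
        double da = a.arousal() - b.arousal();
        double dd = a.dominance() - b.dominance();
        return Math.sqrt(dv * dv + da * da + dd * dd);
    }

    public static void main(String[] args) {
        // Placeholder anchor points for two of the eight emotion terms.
        Map<String, Vad> anchors = new LinkedHashMap<>();
        anchors.put("joy",     new Vad(8.6, 7.2, 6.9));
        anchors.put("sadness", new Vad(1.6, 4.1, 3.5));

        Vad word = new Vad(8.2, 6.5, 6.6);  // VAD rating of some lyric term
        // The closest anchor indicates the dominant emotion of the term.
        anchors.forEach((name, vad) ->
            System.out.printf("distance to %s = %.3f%n", name, distance(word, vad)));
    }
}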
C. Feature: Intro and Refrain Part
It may be more effective to handle only the important parts of the lyrics rather than all of them. We define two such parts as follows (a rough extraction sketch follows this list).
1) Intro part: Intuitively, the intro of a song uses more mood-intensive words in order to convey the mood of the song to listeners. As an initial setting, we take the first two sentences of the lyrics; this can be adjusted for optimization.
2) Refrain part: To find the refrain part of a song, we need to find repeating sentences. In general, the refrain repeats the same sentences, but parts of it may change within a song. Therefore, in order to detect the refrain part correctly, we treat refrains that include small changes as repetitions. If we do not consider such changes, the problem can be simplified to a 'longest common repeat' problem.
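As a rough sketch of how the two feature parts could be extracted (our own simplification, not the authors' exact algorithm): take the first two lines as the intro, and take the most frequently repeated line, after light normalization, as the refrain.

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified sketch of the intro/refrain extraction in Sec. III.C. This is an
// approximation: the intro is the first two lines, and the refrain is the most
// frequently repeated line (small variations ignored by lower-casing and
// stripping punctuation). The paper's exact procedure may differ.
public class LyricParts {

    static List<String> introPart(List<String> lines) {
        return lines.subList(0, Math.min(2, lines.size()));
    }

    static String refrainPart(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        Map<String, String> original = new HashMap<>();
        for (String line : lines) {
            String key = line.toLowerCase().replaceAll("[^a-z ]", "").trim();
            if (key.isEmpty()) continue;
            counts.merge(key, 1, Integer::sum);
            original.putIfAbsent(key, line);
        }
        // Pick the line that repeats most often; break ties by length.
        return counts.entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .max((a, b) -> a.getValue().equals(b.getValue())
                        ? Integer.compare(a.getKey().length(), b.getKey().length())
                        : Integer.compare(a.getValue(), b.getValue()))
                .map(e -> original.get(e.getKey()))
                .orElse("");
    }

    public static void main(String[] args) {
        List<String> lyrics = Arrays.asList(
                "I walk alone tonight", "the city lights are low",
                "hold on, hold on to me", "through the rain we go",
                "hold on, hold on to me");
        System.out.println("intro:   " + introPart(lyrics));
        System.out.println("refrain: " + refrainPart(lyrics));
    }
}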
IV. EXPERIMENT
A. Setup
In order to evaluate the proposed approach, we use a support vector machine algorithm. We implement a Java-based application for mood classification with LIBSVM, an open-source library, as shown in Fig. 4.
B. Building Training Set
We use eight kinds of moods. For each mood, we calculate the distances between each term and the eight mood classes. For a mood 'A', we assume two classes: class 'A', which consists of songs in mood 'A', and class 'Ā', which consists of songs not in mood 'A'. Next, we count the term frequency in each feature part of a song's lyrics and use it as a weight. We use the intro and refrain parts as the features for classification.
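This one-vs-rest training step can be sketched with the LIBSVM Java API, which the paper states was used. The feature vectors, labels, and SVM parameters below are toy placeholders of ours, not values from the paper; in the real system each song would be an eight-dimensional vector of emotion similarities weighted by term counts from its intro and refrain parts.

import libsvm.*;

// Sketch of one-vs-rest training for a single mood 'A' with the LIBSVM Java API.
// Label +1 = song in mood 'A', label -1 = song not in mood 'A'.
public class MoodSvmTrainer {

    // Convert a dense feature vector into LIBSVM's sparse node format.
    static svm_node[] toNodes(double[] features) {
        svm_node[] nodes = new svm_node[features.length];
        for (int i = 0; i < features.length; i++) {
            nodes[i] = new svm_node();
            nodes[i].index = i + 1;          // LIBSVM feature indices start at 1
            nodes[i].value = features[i];
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Toy training data: two songs with placeholder 8-dimensional features.
        double[][] x = { {0.9, 0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0},
                         {0.1, 0.0, 0.1, 0.8, 0.7, 0.2, 0.1, 0.1} };
        double[] y = { +1.0, -1.0 };

        svm_problem prob = new svm_problem();
        prob.l = x.length;
        prob.y = y;
        prob.x = new svm_node[x.length][];
        for (int i = 0; i < x.length; i++) prob.x[i] = toNodes(x[i]);

        svm_parameter param = new svm_parameter();
        param.svm_type = svm_parameter.C_SVC;
        param.kernel_type = svm_parameter.RBF;
        param.C = 1.0;
        param.gamma = 1.0 / 8;               // 1 / number of features
        param.cache_size = 100;
        param.eps = 1e-3;

        svm_model model = svm.svm_train(prob, param);
        double predicted = svm.svm_predict(model, toNodes(x[0]));
        System.out.println("predicted label for first song: " + predicted);
    }
}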
Figure 2. Term count plot
C. Building Testing Set
We randomly select one hundred songs from various music collections and evaluate the classification application on them.
TABLE II. EXPERIMENT RESULTS

Mood           # of songs   # of correct classifications   Accuracy
Joy            33           14                             42.4%
Acceptance     36           12                             33.3%
Fear           15           11                             73.3%
Surprise       3            0                              0%
Sadness        35           12                             34.3%
Disgust        30           11                             36.7%
Anger          28           16                             57.1%
Anticipation   57           40                             70.1%

Figure 4. Music mood classifier application

D. Results
Table II presents the results of the experiments. The accuracy for Fear and Anticipation is higher than for the other moods because many of the terms in ANEW are located near them. On the other hand, the terms related to Joy, Acceptance, and Sadness carry ambiguous meanings, so the classification accuracy for those moods is low. However, as shown in the recommendation table, the users' evaluation was rather positive because the recommendation results are decided by combinations of the basic emotions.

V. CONCLUSION
We proposed a method that uses Plutchik's emotion model to classify the mood of a song. The proposed approach can be used for automatic music classification in commercial music download services or Internet radio broadcasting services. In addition, it can provide recommendations according to the user's mood. If it utilized additional information about songs, such as genre, accent, and speed, it could provide even better classification and recommendation services.

ACKNOWLEDGMENT
This research is supported by the Ministry of Culture, Sports and Tourism (MCST) and the Korea Creative Content Agency (KOCCA) under the Culture Technology (CT) Research & Development Program.
REFERENCES
[1] http://www.apple.com/itunes/
[2] J. Saunders, "Real-time discrimination of broadcast speech/music," in Proc. ICASSP '96, vol. 2, Atlanta, GA, 1996, pp. 993-996.
[3] A. Ortony and T. J. Turner, "What's basic about basic emotions?," Psychological Review, 1990.
[4] M. M. Bradley, B. N. Cuthbert, and P. J. Lang, "Affective Norms for English Words (ANEW): Technical Manual and Affective Ratings," Gainesville, FL: The Center for Research in Psychophysiology, University of Florida, 1998.