Neural Networks and Musical Style Recognition
Sarah Callaghan, 3rd year Physics and Music BSc

Aims: The aims of this project were to:
1) Develop a way of symbolically mapping a melody line onto the input layer of a neural network, while retaining the pitch and the rhythmic structure of the line.
2) Using this mapping technique, train a neural net on a series of examples from two different composers, JS Bach and Igor Stravinsky, then test the network with some previously unseen examples.
3) Determine what criteria the neural net uses to classify each unseen piece.

Apparatus: The neural networks used were simulated by SNNS (Stuttgart Neural Network Simulator), a software simulator for neural networks on UNIX workstations developed at the Institute for Parallel and Distributed High Performance Systems (IPVR) at the University of Stuttgart. The SNNS simulator consists of two main components:
1) a simulator kernel written in C
2) a graphical user interface under X11R4 or X11R5
The simulator kernel operates on the internal network data structures of the neural nets and performs all operations of learning and recall. The testing and training pattern files were created using a simple text editor. The networks that were created, trained and tested were simple, with no more than fifty nodes, and used standard backpropagation as their learning function. The music samples were taken from JS Bach's "Six Sonatas for Violin and Klavier" and "Six Sonatas for Unaccompanied Violin"; the Stravinsky music sampled was "Duo Concertant pour Violon et Piano" and "Three Pieces for Clarinet Solo".

Methodology: The music had to be translated into a form that could be presented easily to the input layer of the neural network. It is possible to use MIDI data for this, but due to time constraints I chose a simplified way of translating music into numbers. I also chose to look only at melody lines, with rhythmic aspects added at a later stage in the project.

The melody lines were plotted across the input layer of the network as a function of space rather than time. Each pitch, as given by the sheet music, was assigned a number between 0 and 1, with the interval of a semitone between two notes represented by a difference of 0.01 between their numbers. Rests (periods of silence) were represented by 0. The basic unit of rhythm was taken to be the semiquaver, and one semiquaver was assigned to each input node. If a note was longer than a semiquaver, the number of nodes its pitch value was spread across was directly proportional to the number of semiquavers its rhythmic value corresponded to. Networks were also trained and tested using the frequency of occurrence of the notes in the samples as learning information. To discover whether the network developed notions of key from the samples it was trained on, it was trained and tested using samples all in the same key.

The output nodes were set up to give results of (1,0) for any sample by Bach and (0,1) for any sample by Stravinsky. The untrained network gave results of 0.5 for both nodes when shown any sample of music; over the training cycles a definite trend became evident, with the outputs moving closer and closer to the expected values. SNNS automatically plotted graphs of the sum of squared errors against the number of training cycles. The number of training cycles depended on the number of training samples and the size of the network.
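The report does not include a worked example of this encoding, so the sketch below shows one plausible reading of it in Python: pitches a semitone apart differ by 0.01, rests are 0, each input node holds one semiquaver, and longer notes are spread over a proportional number of nodes. The reference pitch (middle C mapped to 0.50), the use of MIDI note numbers and the function name encode_melody are my own assumptions for illustration and are not taken from the original project.

```python
# Hypothetical sketch of the pitch/rhythm encoding described above.
# Assumptions (not from the original report): middle C anchored at 0.50,
# MIDI note numbers for pitch, and a 20-node input window.

REFERENCE_MIDI = 60    # middle C, assumed anchor pitch
SEMITONE_STEP = 0.01   # a semitone changes the encoded value by 0.01
INPUT_NODES = 20       # size of the network's input layer

def encode_melody(notes, n_nodes=INPUT_NODES):
    """Encode a melody as one value per semiquaver slot.

    `notes` is a list of (midi_pitch, length_in_semiquavers) tuples;
    a rest is written as (None, length_in_semiquavers).
    """
    pattern = []
    for pitch, semiquavers in notes:
        if pitch is None:
            value = 0.0                                    # rests are represented by 0
        else:
            value = 0.5 + (pitch - REFERENCE_MIDI) * SEMITONE_STEP
        pattern.extend([value] * semiquavers)              # longer notes span more nodes
    # pad with rests, or truncate, so the pattern fits the input layer exactly
    return (pattern + [0.0] * n_nodes)[:n_nodes]

# Example: a crotchet C5 (4 semiquavers), a quaver rest, then semiquavers D5 and E5
example = [(72, 4), (None, 2), (74, 1), (76, 1)]
print(encode_melody(example))
```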
In general the network used had twenty input nodes, ten hidden nodes and two output nodes, the first output representing a result of Bach and the second a result of Stravinsky (a minimal sketch of an equivalent network is given at the end of this report). The training files contained samples from a number of pieces by the two composers, always with equal numbers of samples by each composer. In general there were twenty training patterns and a variable number of test patterns.

Results: The trained network demonstrated its ability to tell two pieces apart with very high accuracy. The Bach training sets were all taken from the Gigue of Partita No. 3 in E; the Stravinsky sets were all taken from the first of the Three Pieces for Clarinet Solo. Only the first hundred or so notes were used to make the training patterns; the last twenty notes were used to test the network. It consistently gave the correct answer for these test files. With large numbers of training patterns taken from many different pieces, however, the network did not perform well at all: it had difficulty generalising well enough to categorise the samples by composer. Nevertheless, when the internal logic of the network was examined, it gave results consistent with what it had been trained on. For example, one of the training patterns happened to be a Bach sample that began with a rest. In all testing after the network had been trained with this sample, it classified any test pattern that began with a rest as a sample of Bach.

Commentary: This project served as an interesting starting point for further work. Given unlimited time and funds, I would have loved to use an actual neural network instead of the simulator, and to use MIDI data to give a more accurate picture of what music really is. In essence, what I was actually teaching the neural net was how to recognise music from an arbitrary numerical pattern, much like trying to teach someone who cannot read music to recognise the difference between two pieces simply by looking at the sheet music.
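To make the setup described above more concrete, the sketch below reproduces the shape of the network used in the project (twenty input nodes, ten hidden nodes, two output nodes, plain backpropagation on a summed squared error, with targets of (1,0) for Bach and (0,1) for Stravinsky) as a small numpy program. It is an illustrative modern re-implementation under my own assumptions, not the actual SNNS configuration, and the training patterns it uses are random placeholders rather than the real Bach and Stravinsky samples.

```python
# Minimal sketch of a 20-10-2 network trained with plain backpropagation.
# Illustrative only: random placeholder patterns stand in for the encoded music.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyNet:
    def __init__(self, n_in=20, n_hidden=10, n_out=2):
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        self.h = sigmoid(x @ self.W1 + self.b1)   # hidden layer activations
        self.y = sigmoid(self.h @ self.W2 + self.b2)
        return self.y

    def train_step(self, x, target, lr=0.5):
        y = self.forward(x)
        # backpropagate the squared error through both sigmoid layers
        delta_out = (y - target) * y * (1.0 - y)
        delta_hid = (delta_out @ self.W2.T) * self.h * (1.0 - self.h)
        self.W2 -= lr * np.outer(self.h, delta_out)
        self.b2 -= lr * delta_out
        self.W1 -= lr * np.outer(x, delta_hid)
        self.b1 -= lr * delta_hid
        return np.sum((y - target) ** 2)

# Targets as in the report: (1, 0) for Bach, (0, 1) for Stravinsky.
bach_target = np.array([1.0, 0.0])
strav_target = np.array([0.0, 1.0])

# Placeholder training set: ten "Bach" and ten "Stravinsky" patterns of twenty values each.
patterns = [(rng.random(20), bach_target) for _ in range(10)] + \
           [(rng.random(20), strav_target) for _ in range(10)]

net = TinyNet()
for cycle in range(500):
    sse = sum(net.train_step(x, t) for x, t in patterns)
print(f"final summed squared error: {sse:.4f}")
```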