Gestural Melodic Interface
Mid-year report for IITG
January 10, 2013
Jim McElwaine, Paul Thayer, Keith Landa
Initial Concepts
The Gestural Melodic Interface (GMI) is custom music software designed to capture melodies quickly
and clearly in a digital audio domain using hand gestures on a digital tablet. It also allows editing
of the initial captures: motivic repetition and variation using conventional melodic procedures. All
melodic files will be editable and will allow overdub and compilation.
It will be capable of capturing melodies in a variety of tonalities and temperaments. That means
that frequency can either be 'corrected' to conventional scales, or left 'uncorrected.' All melodic
files will be exportable, in file-types suitable for further audio processing (MP3), and in file-types
suitable for digital notation software and timbral reassignment (MIDI, with duration quantizing
options).
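To illustrate the 'corrected' versus 'uncorrected' choice, the following sketch (hypothetical function name; not production code) snaps a captured frequency to the nearest pitch of twelve-tone equal temperament, or passes it through untouched:

    #import <Foundation/Foundation.h>
    #include <math.h>

    // Hypothetical sketch: snap a captured frequency to the nearest pitch of
    // 12-tone equal temperament (A4 = 440 Hz), or leave it 'uncorrected'.
    static double GMICorrectedFrequency(double hz, BOOL correct) {
        if (!correct || hz <= 0.0) return hz;            // uncorrected capture passes through
        double midi = 69.0 + 12.0 * log2(hz / 440.0);    // continuous MIDI note number
        return 440.0 * pow(2.0, (round(midi) - 69.0) / 12.0);
    }

A sung 452 Hz, for example, would correct to 440 Hz (A4), while the uncorrected path would preserve the 452 Hz capture.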
GMI is written in open code for iOS, suitable for both tablet and smartphone devices. Eventually,
it could evolve to the Android OS. Initially, the GMI will have three operating modes: Capture,
Edit, and eXport. Capture will acquire melody from screen translations of hand motions. Edit
mode will allow recall and change of captures. eXport will provide export of edits and captures as MIDI files via email.
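As a sketch of the email step of eXport (the file path, subject, and file name are placeholders, and the presenting view controller is assumed to adopt the mail-compose delegate), Apple's MessageUI framework can attach an already-rendered MIDI file to a mail composer:

    #import <MessageUI/MessageUI.h>

    // Hypothetical sketch of the eXport step: attach a rendered MIDI file to an
    // email. The presenting view controller must adopt
    // MFMailComposeViewControllerDelegate and dismiss the composer when done.
    static void GMIEmailMIDIFile(NSString *path,
                                 UIViewController<MFMailComposeViewControllerDelegate> *presenter) {
        if (![MFMailComposeViewController canSendMail]) return;   // no mail account configured
        MFMailComposeViewController *mail = [[MFMailComposeViewController alloc] init];
        mail.mailComposeDelegate = presenter;
        [mail setSubject:@"GMI melody export"];
        NSData *midi = [NSData dataWithContentsOfFile:path];
        [mail addAttachmentData:midi mimeType:@"audio/midi" fileName:@"capture.mid"];
        [presenter presentViewController:mail animated:YES completion:nil];
    }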
Although GMI sounds will be compatible with General MIDI Level 2 (GM2) specification, they
will not use GM2 sounds. Instead, captures will be purposely limited in their timbral options: only two or three sustained musical sounds will be available. Pitches will have simple graphic
representations, but will not immediately be adapted to conventional clefs and notations. The
screen output will follow common visual conventions for music software.
Frequency and Time
The GMI human user interface is a conventional grid, consonant with most other digital audio
software: the X-axis equals time, and the Y-axis equals frequency. Outer portions of each grid
member retain microtonal pitch variation. Initially, the GMI has a movable pitch compass (high-to-low) of a perfect 12th, or 18 semitones, approximating the human vocal span and providing for several different stepwise cadential approaches from above or below cadential points or pitches. Melodies that exceed the GMI compass will first require an expansion of the pitch compass, probably via the Edit mode. It is hoped that amplitude variation can be incorporated in a later version.
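One way the Y-axis mapping could be realized, sketched here with illustrative values for row height, lowest note, and the 50-cent bands (none of these are the shipped figures), is to snap the center of each grid row to its semitone and let the outer quarters of the row bend the pitch toward its neighbors:

    #include <math.h>

    typedef struct { int midiNote; double centOffset; } GMIPitch;

    // Hypothetical Y-axis mapping: the center of a grid row snaps to its
    // semitone; the outer quarters add a microtonal offset of up to 50 cents.
    static GMIPitch GMIPitchForTouchY(double y, double rowHeight,
                                      int lowestNote, int compassSemitones) {
        double raw = y / rowHeight;
        if (raw < 0.0) raw = 0.0;
        if (raw > compassSemitones) raw = (double)compassSemitones;      // clamp to the compass
        int row = (int)floor(raw);
        double within = raw - row;                                       // 0.0 .. 1.0 inside the row
        double cents = 0.0;
        if (within < 0.25)      cents = -50.0 * (0.25 - within) / 0.25;  // lower outer band
        else if (within > 0.75) cents =  50.0 * (within - 0.75) / 0.25;  // upper outer band
        GMIPitch p = { lowestNote + row, cents };
        return p;
    }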
Initial Design
Initial planning and conferences among the music specialist (J. McElwaine), software engineer (P. Thayer), and project manager (K. Landa) began in late August. Conversations revealed many
shared concerns:
fidelity of hand gesture to resultant sound;
note connections (portamento);
pitch resolution (conventional scales?);
silence as a recorded element.
A modest pace in our development process allowed for better discussion of initial objectives and
technological paths to them, and eventually a reduction in programming overhead. By late
October 2012, we had arrived at a functioning alpha-model that responded to screen data entry,
via a grid generated by the above musical requirements.
By December 2012, the alpha was 80% functional. Its look and feel remained a simple
approximation of what would result from a subsequent beta refactoring. That allowed for large-scale changes with quick turnarounds, and further reductions in programming overhead. We decided
not to incorporate a metronome. We also agreed that a moving grid, actually a moving
background along the x-axis beneath a static grid image, would eventually solve our extended
time grid needs. So, time was generated only by event entry, with no accompanying event
divisibility.
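In code terms, the current time model can be pictured roughly as follows (hypothetical class and function names; the real implementation may differ): each capture simply takes the next index in the sequence, so time exists only as order of entry.

    #import <Foundation/Foundation.h>

    // Sketch of the event-entry time model: no clock, no subdivision; an
    // event's time is just its position in the entered sequence.
    @interface GMICapture : NSObject
    @property (nonatomic) NSInteger eventIndex;   // order of entry
    @property (nonatomic) NSInteger midiNote;     // pitch from the grid row
    @end

    @implementation GMICapture
    @end

    static void GMIAppendEvent(NSMutableArray *sequence, NSInteger midiNote) {
        GMICapture *c = [[GMICapture alloc] init];
        c.eventIndex = (NSInteger)sequence.count;  // time advances only when an event is entered
        c.midiNote = midiNote;
        [sequence addObject:c];
    }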
Current Challenges and Solutions
We now have a clear idea of how we want the grid to work, and we have established an optimal
path for achieving it. The current grid is entirely event-oriented. We have not yet addressed the
issues of changing duration values (real-time mensuration), since this variability seems to require an invariant: a comparative clock. The current event-entry mode also precludes multiple rhythmic modalities, or prolations. Sometimes beats are divided in twos; at other times in threes or fours, and their subgroupings. Prolations are often ‘irrational’ as well, like the ‘swing’ feel of jazz.
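The dependence on a comparative clock can be made concrete with a small sketch (hypothetical names, illustrative ratios): once a reference beat length exists, a raw duration can be snapped to duple, triple, or quadruple prolation; without that beat value, none of the ratios can be computed.

    #include <math.h>

    typedef enum {
        GMIProlationDuple     = 2,
        GMIProlationTriple    = 3,
        GMIProlationQuadruple = 4
    } GMIProlation;

    // Quantize a raw duration (seconds) against a reference beat. A 'swing'
    // feel would need an asymmetric pair of units (e.g. 2/3 and 1/3 of the
    // beat), which still presupposes a known beat length.
    static double GMIQuantizedDuration(double seconds, double beatSeconds,
                                       GMIProlation prolation) {
        double unit = beatSeconds / (double)prolation;   // smallest allowed subdivision
        return round(seconds / unit) * unit;             // snap to that subdivision
    }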
Once the grid is fully functional, we will face the problem of devising a method for defining consecutive notes as tied together into one long note, as opposed to simply being a row of distinct short notes.
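One candidate approach, sketched below with hypothetical types and a gap threshold we have not yet chosen, would merge consecutive events on the same grid row whenever the silence between them falls under that threshold:

    #import <Foundation/Foundation.h>

    typedef struct { int midiNote; double start; double duration; } GMINote;

    // Merge consecutive same-pitch events whose gap is below maxGap into one
    // longer note, in place; returns the new note count.
    static NSUInteger GMIMergeTiedNotes(GMINote *notes, NSUInteger count, double maxGap) {
        if (count == 0) return 0;
        NSUInteger out = 0;                                // index of the last merged note
        for (NSUInteger i = 1; i < count; i++) {
            GMINote *prev = &notes[out];
            double gap = notes[i].start - (prev->start + prev->duration);
            if (notes[i].midiNote == prev->midiNote && gap <= maxGap) {
                // extend the previous note rather than starting a new one
                prev->duration = (notes[i].start + notes[i].duration) - prev->start;
            } else {
                notes[++out] = notes[i];
            }
        }
        return out + 1;
    }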
Not exactly a "problem" but a "solution" nonetheless: instead of programming audio features in Objective-C [3], the application uses libpd [4], which allows for easy incorporation of the Pure Data interactive audio programming environment, reducing a lot of programming overhead.
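A minimal sketch of that route is shown below, assuming a Pure Data patch named gmi-voice.pd with a 'freq' receive object (both placeholders); the calls follow the libpd Objective-C layer as of this writing and may differ in other versions.

    #import <Foundation/Foundation.h>
    #import "PdAudioController.h"
    #import "PdBase.h"

    // Sketch of the libpd route: the PD patch does the synthesis; the app only
    // starts audio, opens the patch, and sends control values to it.
    @interface GMIAudio : NSObject
    @property (nonatomic, strong) PdAudioController *audioController;
    @end

    @implementation GMIAudio

    - (void)start {
        self.audioController = [[PdAudioController alloc] init];
        [self.audioController configurePlaybackWithSampleRate:44100
                                               numberChannels:2
                                                 inputEnabled:NO
                                                mixingEnabled:YES];
        [PdBase openFile:@"gmi-voice.pd"
                    path:[[NSBundle mainBundle] resourcePath]];   // load the patch
        self.audioController.active = YES;                        // start audio
    }

    - (void)playFrequency:(float)hz {
        [PdBase sendFloat:hz toReceiver:@"freq"];   // 'freq' receive object in the patch
    }

    @end

Keeping synthesis inside the patch means timbral changes need only edits to the PD file, not to the Objective-C code.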
Next Steps
Next on our agenda is decommissioning the alpha model and constructing a new beta model for release to selected master-class faculty by January 25, and then to students in those master classes by February 10.
The alpha version of the application has been an exploration of the most efficient methods for
achieving the musical goals of the project. It has served its purpose: getting a more concrete
vision of an intuitive and gratifying interface. Since it has been experimental from the start, it
contains remnants of code that will not be needed. It also contains redundant code that will be refactored in the beta version to optimize the application and make it more accessible to others.
In the iOS development environment (Apple’s Xcode), there are several elements that need to be clear when a project is created. The alpha work has clarified many of those considerations. Beginning the beta version as a new Xcode project will simplify the integration of these ideas.
The most important of these are the need for a window layout that allows for interface elements
outside of the gestural grid and the need for a “paged” application as opposed to a “single view”
application.
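To illustrate what is meant here by a "paged" application (a sketch only; the Capture page controller and the data source supplying the Edit and eXport pages are placeholders), UIKit's UIPageViewController can host one page per operating mode:

    #import <UIKit/UIKit.h>

    // Sketch of the paged layout: a UIPageViewController with one page per
    // operating mode, starting on the Capture page.
    static UIPageViewController *GMIMakePagedRoot(UIViewController *capturePage) {
        UIPageViewController *pager = [[UIPageViewController alloc]
            initWithTransitionStyle:UIPageViewControllerTransitionStyleScroll
              navigationOrientation:UIPageViewControllerNavigationOrientationHorizontal
                            options:nil];
        [pager setViewControllers:@[ capturePage ]
                        direction:UIPageViewControllerNavigationDirectionForward
                         animated:NO
                       completion:nil];
        return pager;
    }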
Remaining Technical Challenges
Grid spacing
Grid marker appearance, if any
Retaining on-screen highlights of the melodic curve
Aesthetic Challenges
The aesthetic challenges have been revelatory. Is the event (individual note) duration really
important to the intuitive capture of a melodic idea? Humans seem to remember rhythms more
quickly than pitches [1]. If rhythm is short-term and more immediately recollectable than pitch
sequence, can the user retain the combined rhythm-pitch model (the ‘melody’) without reference
to rhythm?
Melodies are also very often defined by their phrases (or pauses or ‘breaths’). What happens if
silence is not recorded? Is the inclusion of a referent clock a necessity?
Will the exclusion of duration and event rhythms yield the same recollections, or will we adopt a dichotomized solution similar to the isorhythmic training of musicians in the 12th through the 14th centuries [2]?
What needs to be recorded? What are the minimum components of a melody?
Capturing these should ensure a reasonable facsimile of the initial human idea.
Evaluation
The GMI will be tested first by a few qualified professional composers and performers, familiar
with digital sketching software. That evaluation is scheduled for late January 2013. It will then be
offered in various master-class formats. Master classes are small, curricular groups of three to
four students, all engaged in music composition and production at Purchase College, SUNY.
These groups will conduct a variety of evaluative and comparative studies concerning effectiveness and accuracy of capture, speed of completion, and improved artistic development and incorporation of melody and counterpoint into their compositions.
Group evaluations will begin after faculty evaluation in mid-February, to be completed in April.
Data collected from beta-testing will be gathered in a database.
Evaluative instruments, with queries common to both groups, are being written.
Source
1. Music and the Perception of Rhythm, Candace Brower, 1993.
2. Metaphor and Musical Thought, Michael Spitzer, 2003.
3. Objective-C, B. Cox and T. Love, 1983.
4. libpd (layered on PD), M. Puckette, 1991.