Music Coach
Requirements
Submit (via email) a tentative project proposal, including: team member names, brief description of project (a paragraph or so), the main technical challenges, milestones/tasks to accomplish this, the primary obstacles/concerns, how you propose to evaluate the outcome.
Team Members
- David Johnson
- Dianna Han
Idea description
Use the Nokia N900 as a music coach that checks whether each note is played at the correct pitch and at the correct time. Optionally, it may also display sheet music from MIDI files and let the performer control the pace of the performance with movements of the body. It will act as a real-time feedback mechanism, alerting you to pitch and rhythm inaccuracies as you play, and will give an overall evaluation at the end of your performance. The pitch evaluation is especially useful for instruments which don't have discrete notes, such as a cello, violin, trombone or even the human voice, whereas the rhythm evaluation is suitable for all instruments. Additional features such as an instrument tuner and a metronome could be added with trivial software modification. It should also be possible to evaluate pitch alone, with no pre-loaded music score, by checking how close a note is to its nearest discrete note. This assumes an advanced student who plays notes with less than a quarter tone of inaccuracy; otherwise the played note would be rounded to a note different from the one intended.
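The score-free pitch check described above can be sketched as follows. This is a minimal Python illustration rather than the project's implementation: it assumes equal temperament with A4 tuned to 440 Hz, and the function name nearest_note is hypothetical. It rounds a measured frequency to the nearest semitone and reports the deviation in cents (hundredths of a semitone), so a quarter tone corresponds to 50 cents.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name, MIDI number, deviation in cents) for a frequency.

    Assumes equal temperament; a4_hz is the assumed tuning reference.
    """
    # Fractional MIDI note number: 69 is A4, 12 semitones per octave.
    midi_float = 69 + 12 * math.log2(freq_hz / a4_hz)
    # Round to the nearest semitone.
    midi = int(math.floor(midi_float + 0.5))
    # Positive cents: the note is sharp; negative: flat.
    cents = 100 * (midi_float - midi)
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, midi, cents
```

A note played more than 50 cents away from its target is rounded to the neighbouring semitone, which is the quarter-tone limit mentioned above.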
Different feedback mechanisms will be explored. Visual feedback could use a compass-like display in which North and South indicate a pitch that is above or below the correct note, and the length of the arrow represents the fractional difference between the notes. East and West indicate whether the note was played too early or too late, with the length of a second arrow representing the fractional timing difference. Audio feedback could also be explored: in the case of a pre-loaded score, the correct note is played on the speaker, and the beats between the two notes get closer together the further the played note is from the correct one. This would be ideal as a training module for someone learning scales on a cello, for example.
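As a rough sketch of how the two feedback ideas could be computed, the Python functions below turn a pitch error and a timing error into arrow lengths, and give the beat rate heard when the played note sounds against the reference. The function names, the full-scale values (50 cents, 0.25 s) and the choice of East for "late" are assumptions made for illustration only.

```python
def compass_arrows(cents_error, timing_error_s, max_cents=50.0, max_timing_s=0.25):
    """Return (north, east) arrow lengths, each clamped to [-1, 1].

    max_cents and max_timing_s are assumed full-scale values.
    """
    def clamp(value):
        return max(-1.0, min(1.0, value))

    # Positive: pitch above the correct note (North); negative: below (South).
    north = clamp(cents_error / max_cents)
    # Positive: note played late (East, an assumed convention); negative: early.
    east = clamp(timing_error_s / max_timing_s)
    return north, east


def beat_rate_hz(played_hz, reference_hz):
    """Beats per second heard when two close tones sound together."""
    return abs(played_hz - reference_hz)
```

For example, a note played at 443 Hz against a 440 Hz reference beats three times per second, and the beats speed up as the played note drifts further away.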
One problem with a pre-loaded score is making allowances for musical expressiveness. This takes the form of slowing down and speeding up in sections of the music, as well as adding vibrato to a note in the case of stringed instruments and the human voice. Vibrato is an oscillation in pitch which falls slightly below and rises slightly above a note. The accelerometer could be used to sense rhythm by monitoring swaying movements of the body if the phone were strapped to the performer. For example, if the phone were strapped to the head, faster rocking movements of the head would speed up the performance and slower rocking movements would slow it down. Vibrato can be allowed for by calculating the average pitch of the performed note over a specific time window.
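The vibrato allowance could look something like the sketch below, which averages a series of pitch estimates taken within one time window. Averaging on a logarithmic (semitone) scale rather than in Hz is a design choice assumed here, since pitch perception is roughly logarithmic; the function name average_pitch is hypothetical.

```python
import math


def average_pitch(freqs_hz, ref_hz=440.0):
    """Average pitch estimates from one time window on a semitone scale.

    freqs_hz holds the pitch estimates (in Hz) that fall inside the window.
    """
    # Convert each estimate to semitones relative to the reference.
    semitones = [12 * math.log2(f / ref_hz) for f in freqs_hz]
    mean = sum(semitones) / len(semitones)
    # Convert the mean back to Hz.
    return ref_hz * 2 ** (mean / 12)
```

If the window covers a whole number of vibrato cycles, the oscillation averages out and the result sits on the centre pitch of the note.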
Main challenges
- Real-time processing of the audio input to recognise the frequency of a note.
- Different instruments have a range of timbre with transient harmonics which can dominate the fundamental frequency momentarily - the challenge will be to decide on an optimal time window to use for the pitch recognition. This challenge may simply limit the scope of instruments to ones which have a dominant fundamental from the attack to the decay of the note.
- Real-time feedback display. The performer needs feedback as quickly as possible for it to be useful, but if it comes too soon it may be inaccurate, whereas if it's too late it will be more accurate but less useful.
- If time permits, inferring rhythm from a trace of accelerometer inputs should not require the user to make movements of the body that would be unnatural. A gentle rocking of the head should be all that's required to alter the pace at which the pre-loaded music progresses.
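The trade-off between window length and pitch accuracy noted in the challenges above can be made concrete with a little arithmetic. For a plain FFT, the spacing between frequency bins is the sample rate divided by the number of samples, which equals one over the window duration. The sketch below is a rule of thumb under that assumption (interpolation between bins or time-domain methods may do better); the function names are hypothetical.

```python
def bin_spacing_hz(sample_rate_hz, n_samples):
    """Spacing between adjacent FFT bins for a window of n_samples."""
    return sample_rate_hz / n_samples


def semitone_gap_hz(freq_hz):
    """Distance in Hz from freq_hz to the next equal-tempered semitone up."""
    return freq_hz * (2 ** (1 / 12) - 1)


def min_window_s(freq_hz):
    """Window duration at which the bin spacing equals the semitone gap."""
    return 1.0 / semitone_gap_hz(freq_hz)
```

By this rule, separating adjacent semitones near the cello's lowest C (about 65.4 Hz) needs a window of roughly a quarter of a second, whereas around A4 (440 Hz) about 38 ms is enough. Low notes therefore force either slower feedback or a different estimation method.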
Milestones
- Test some FFT code on a PC which takes the sample rate and number of bins as input and displays the frequency of a single note on the screen
- Run some experiments which compare time from note attack to display vs note accuracy.
- Make use of some existing software which can read in a MIDI file and display the notes, and reformat this for the Nokia N900 screen size.
- Experiment with comparing a MIDI file to a live performance in terms of pitch accuracy - find optimal time windows for calculating the average pitch of a performed note.
- Evaluate the accuracy of the attack and release times of the live performance against the start and stop times of the notes in the MIDI file. Make use of a metronome on the phone for the user to synchronise their performance.
- Write a graphical interface on the Maemo simulator which gives feedback on the pitch and timing accuracy of the performance using a compass-like display (other, more creative graphical displays may be developed). Also display momentary pitch and timing accuracy as well as cumulative accuracy as a percentage.
- If both the compass display and the MIDI file are to be displayed, then a blend of these two could be explored
- Test the FFT code for pitch recognition on the N900
- Test the MIDI file input on the N900
- Test the graphical feedback display on the N900
- Time permitting, add the option to change the speed of the performance using the accelerometer
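The first milestone, FFT code that takes a sample rate and a number of bins and reports the frequency of a single note, might start from something like the following pure-Python sketch. It is an illustration only, not the FFT code the project will use: it implements a textbook recursive radix-2 FFT, applies a Hann window, and reports the strongest bin, which is only the fundamental when the fundamental dominates the spectrum. All function names are hypothetical, and the input length must be a power of two.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sample_rate_hz):
    """Return the centre frequency (Hz) of the strongest non-DC FFT bin."""
    n = len(samples)
    # A Hann window reduces leakage from notes that fall between bins.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    magnitudes = [abs(c) for c in fft(windowed)[: n // 2]]
    peak_bin = max(range(1, n // 2), key=lambda k: magnitudes[k])
    return peak_bin * sample_rate_hz / n


def sine(freq_hz, sample_rate_hz, n_samples):
    """Synthesise a test tone so the estimator can be tried without audio input."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(n_samples)]
```

Calling dominant_frequency(sine(440.0, 8000, 2048), 8000) gives an estimate within one bin spacing (8000 / 2048, about 3.9 Hz) of 440 Hz. A library such as FFTW would replace the fft function on the device.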
Primary obstacles
- Processing capability to carry out a real-time FFT. If the on-board CPU is not sufficient, we might need to use the DSP, which adds an extra level of complexity to the problem
- Allowing for a certain level of musical expressiveness in a performance and not seeing this as pitch or rhythm inaccuracy.
- Inferring rhythm from small, subtle movements of the phone when attached to the body depends on the accuracy of the accelerometer.
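One simple way to infer pace from an accelerometer trace, offered here only as a sketch, is to count upward zero crossings of the mean-removed signal along one axis. It assumes one sway cycle per beat and a reasonably clean, roughly periodic trace; real data would probably need smoothing first. The function name estimate_bpm and the one-sway-per-beat mapping are assumptions, not part of the proposal.

```python
def estimate_bpm(trace, sample_rate_hz):
    """Estimate beats per minute from a single-axis accelerometer trace.

    Returns None if fewer than two upward zero crossings are found.
    """
    mean = sum(trace) / len(trace)
    centred = [a - mean for a in trace]  # remove gravity / constant offset
    # Indices where the signal crosses zero going upwards.
    crossings = [i for i in range(1, len(centred))
                 if centred[i - 1] < 0 <= centred[i]]
    if len(crossings) < 2:
        return None
    # Average number of samples per sway cycle, converted to seconds.
    period_s = (crossings[-1] - crossings[0]) / (len(crossings) - 1) / sample_rate_hz
    return 60.0 / period_s
```

With a 50 Hz sample rate, the timing of each crossing is only known to within 20 ms, so the estimate becomes steadier as more sway cycles are included in the trace.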
Evaluation
- The real-time pitch recognition software on the N900 can be compared against some more accurate non-real-time pitch recognition on a standard PC.
- Musician's feedback on whether the device was useful in improving accuracy of performance.
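To put a number on such comparisons, the cumulative accuracy percentage mentioned in the milestones could be computed along these lines. This sketch assumes that performed notes have already been matched one-to-one with the expected notes, that pitches are given as fractional MIDI note numbers, and that the tolerances (50 cents, 0.1 s) are placeholder values; score_performance is a hypothetical name.

```python
def score_performance(expected, played, cents_tol=50.0, time_tol_s=0.1):
    """Return the percentage of notes within both pitch and timing tolerance.

    expected and played are equal-length lists of (onset_seconds, midi_note)
    pairs, already matched by index; midi_note may be fractional.
    """
    hits = 0
    for (t_exp, n_exp), (t_play, n_play) in zip(expected, played):
        pitch_ok = abs(n_play - n_exp) * 100 <= cents_tol  # 100 cents per semitone
        time_ok = abs(t_play - t_exp) <= time_tol_s
        if pitch_ok and time_ok:
            hits += 1
    return 100.0 * hits / len(expected)
```

The same function could compare the N900's real-time estimates against an offline reference by passing the reference analysis as the expected list.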
Using pulseaudio
- possible command
pacat --record | sox -t raw -r 44100 -s -L -b 16 -c 2 - "output.wav"
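Once a recording such as output.wav exists, it could be loaded for analysis with Python's standard wave module. The sketch below assumes a 16-bit little-endian PCM file like the one the command above is meant to produce, and averages the channels to mono; the function name read_wav_mono is hypothetical.

```python
import struct
import wave


def read_wav_mono(path):
    """Return (sample_rate, samples) with channels averaged to mono.

    Samples are floats in the range [-1, 1). Only 16-bit PCM is handled.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # WAV PCM data is little-endian; each frame holds one sample per channel.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    mono = [sum(values[i:i + channels]) / (channels * 32768.0)
            for i in range(0, len(values), channels)]
    return rate, mono
```

The returned sample list can be cut into fixed-length windows and passed to whichever pitch estimator is being tested.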
Code that can be used
- A metronome with visual and audio feedback will help the player know the pace at which the application expects the music to be played. Some code segments from the open source GTick software can be used for this.
- To display notes on the screen, MuseScore, a very well written Qt application, could provide some useful code segments.
Useful links
Pitch recognition
- http://www.nicholson.com/rhn/dsp.html
- http://www.hotpaw.com/intuna/singintuna.readme.html
- http://cs.ucsb.edu/~davidj/Files/thesis1995.pdf
FFT
- FFT on a symbian phone (http://wiki.forum.nokia.com/index.php/FFT_algorithm)
- Most stable FFT system - fftw (http://www.fftw.org/)
Multimedia system Maemo 5
- Maemo 5 multimedia domain (http://wiki.maemo.org/Documentation/Maemo_5_Developer_Guide/Architecture/Multimedia_Domain#Audio_Subsystem)
- Maemo 5 uses pulseaudio (http://www.pulseaudio.org)
- Examples of using pulseaudio - parec (http://grangerx.wordpress.com/2009/08/03/fedora-11-recording-audio-from-pulseaudio-using-parec-and-sox/)
Accelerometer
- Using accelerometer trace to detect shaking (http://msdn.microsoft.com/en-us/magazine/ee413721.aspx)
- Smart phones as instruments (http://www.cnn.com/2009/TECH/10/15/iphone.music.zoozbeat/index.html)
- Zoozbeat (http://www.youtube.com/watch?v=YEsC-lklOA4)
MIDI importing
- C++ MIDI Library (http://www.jdkoftinoff.com/main/Free_Projects/C++_MIDI_Library/)
- The Computer Music Project Software (http://www.cs.cmu.edu/~music/music.software.html)
- Very Simple MIDI parsing code (http://cap-lore.com/EnglishSuites/code/code.html)
MIDI file format explanation
- David's MIDI Spec (http://www.srm.com/qtma/davidsmidispec.html)
Notation software
- GUIDOLib Qt music notation library (http://guidolib.sourceforge.net/)
Companies that have done music recognition
- WIDI 1998-2009 Recognition System ($79 standard, $159 professional) (http://www.widisoft.com/english/mp3-midi-products.html)
- Shazam iPhone song recognition (http://www.shazam.com/music/web/pages/getshazam.html)
Academic references
- Google scholar results for "transcription of musical sound" (http://scholar.google.com/scholar?hl=en&q=transcription+of+musical+sound&btnG=Search&as_sdt=2000&as_ylo=&as_vis=0)
- Bibliography on automatic music transcription (http://www.recognisoft.com/cgi-bin/main.cgi?id=62&lan=en&n=&p=&d=&o=&r=y&s=)
Instrument Ranges
- Of all orchestral instruments (http://www.orchestralibrary.com/reftables/rang.html)