SMART SHARPEYE Optical Music Recognition

The abbreviation OMR has lately become common on the Web and in the media. It stands for Optical Music Recognition (by analogy with OCR, Optical Character Recognition). OMR is a very promising technique for easing and speeding up the initial conversion of printed music into digital form.

Attempts at music recognition began as early as the 1980s; until recently, however, the results were unsatisfactory. This can partly be attributed to the use of MIDI as the data exchange format. That is why, in all my public and private discussions of Finale, I unhesitatingly recommended entering music with the Speedy tool.

The situation changed with the emergence of SharpEye, software developed by the English mathematician Graham Jones 1 and sold by the company Recordare 2. Together with the MusicXML data exchange format, also developed and promoted by Recordare, it makes it possible to take OMR seriously.

Below I describe a method of music entry using OMR which, under certain preconditions, can substantially speed up the initial entry and, most importantly, eliminate tedious manual work. What are these preconditions?

You are entering music from a printed original of high or medium quality. A book is fine; however, a hard copy of music set in a scorewriter is even better.
The music does not contain a large number of notation elements that are rhythmically or graphically complicated. It is not worth trying to recognize 20th-century avant-garde music or mensural notation. The easiest task for OMR is classical music from the Baroque to late Romanticism, dance music or any other light instrumental music.
The recognition of the following elements, as well as their interpretation in Finale, may be difficult:
a) beams over a bar line;

b) tremolo;

c) cross-staff voices 3.
If the original document that you are going to scan and recognize contains many graphics of this kind, my advice is not to use OMR at all.
The music is not vocal music with Russian lyrics. SharpEye can recognize texts in European languages and Latin pretty well, but it does not know Russian.
The printed original is clean: no pencil or ballpoint pen marks left by a teacher or by a careless student. Of course, you can remove all the garbage in Photoshop, but first consider what takes more time: correcting the damaged fragment in Photoshop or entering it manually in your favorite music notation program.
Software. The following programs were used: Adobe Photoshop 7.0, SharpEye 2.33, Finale (hereinafter φ) 2003a, and the Dolet 1.1 plug-in by Recordare. All of these programs were used under Windows only.

Printed music. I used music issued by Henle Verlag, Breitkopf und Härtel, Muzyka Publishers, Quadrivium Publishers and others.
The results of my tests, using the processing chain SharpEye→MusicXML→Finale, are given at the end of this article.

***
Preparation of the Image for SharpEye

Scan at 300 dpi. A slightly higher resolution, 400 dpi, is only worthwhile for a pocket-size score. A resolution higher than 400 dpi never makes sense. When placing the original on the scanner, align it so that the staves in the scan are horizontal.
Scan the page as a monochrome (black and white) bitmap. If you have time, you may first scan it as grayscale, then increase the sharpness by adjusting the gray levels in Photoshop: Image→Adjust→Levels (Ctrl+L), and then save it as a monochrome bitmap (LineArt, 1 bpp). You can safely trim off white margins, page numbers and other elements that you don’t need to recognize. Save the scan as a TIFF or BMP image. When saving as TIFF, do not use LZW compression, because SharpEye cannot read it.
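The two file requirements above (1 bpp, no LZW) can be checked before opening the scan in SharpEye. Below is a minimal sketch in Python, not part of the original workflow: it reads only the first image directory of a TIFF file and looks at the standard Compression (259) and BitsPerSample (258) tags. The function names read_tiff_tags and check_scan are my own, hypothetical choices.

```python
import struct

# Tag numbers and values as defined in the TIFF 6.0 specification.
TAG_BITS_PER_SAMPLE = 258
TAG_COMPRESSION = 259
COMPRESSION_LZW = 5


def read_tiff_tags(data: bytes) -> dict:
    """Return {tag: value} for the first IFD; value is None if not decoded."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")
    (entry_count,) = struct.unpack_from(order + "H", data, ifd_offset)
    tags = {}
    for i in range(entry_count):
        pos = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes long
        tag, ftype, count = struct.unpack_from(order + "HHI", data, pos)
        if count == 1 and ftype == 3:    # a single SHORT
            (value,) = struct.unpack_from(order + "H", data, pos + 8)
        elif count == 1 and ftype == 4:  # a single LONG
            (value,) = struct.unpack_from(order + "I", data, pos + 8)
        else:
            value = None  # multi-valued or non-integer tag, not decoded here
        tags[tag] = value
    return tags


def check_scan(data: bytes) -> list:
    """List the reasons why this TIFF does not follow the advice above."""
    tags = read_tiff_tags(data)
    problems = []
    if tags.get(TAG_COMPRESSION, 1) == COMPRESSION_LZW:
        problems.append("LZW compression: re-save without it")
    if tags.get(TAG_BITS_PER_SAMPLE, 1) != 1:
        problems.append("not a 1 bpp image: convert to monochrome")
    return problems
```

To use it, read the file in binary mode, for example check_scan(open("page1.tif", "rb").read()), where page1.tif is a placeholder name. An empty list means that neither problem was found.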
SharpEye. General Settings

Before changing SharpEye settings, check the original for the following:

Voices/supporting voices. If it has voices/supporting voices, check the box Options→Rhythm analysis options→Relaxed, no overlong measures;
Text elements of the score. If the score has no lyrics, uncheck the box Options→Text recognitions options→Read lyrics and check the box Read other text for the recognition of the other text notations;
Grace notes and other small notes. If the score has such notes, be sure to check the box Options→Music recognitions options→Grace notes. By default, SharpEye does not recognize them.
Now it is time to open the scanned image. This can only be done from the menu, because there is no hot key for this operation. The image of the music sheet will appear in a separate window.

SharpEye. Edit the Recognition Result

To read the scan, press the blue button. The program needs some time to interpret the image. After a short wait the music appears, as if by magic, in the second SharpEye window, now as a logically structured and intelligible score.
Position the source window, which contains the original scan, and the target window as you find convenient. Having done this, I suggest saving the window positions: Options→Save window positions. From now on, SharpEye will use them in every new session.
Now it is time to edit the recognition result.
A general recommendation: go easy on editing in SharpEye! Layout work is much more convenient and faster in φ. The most important editing action is the correction of recognition mistakes (no hot key; use the menu: Edit music→Goto next rythmic warning, or click the corresponding icon with the mouse). «Dubious» elements, whose recognition was uncertain, are highlighted in gray, and a blue triangle appears at the end of the measure where a mistake was detected. That is what we need.
To correct a rhythmical mistake in a note or rest, it is often simplest to re-enter the gray grapheme:

a) select it with the left mouse button (written ô below);
b) delete it [Del];
c) press the right mouse button (written õ below) at any point in the score. The graphic palette of note values (GPV) appears;
d) choose ô the required value from the GPV;
e) using (attention!) õ, place the chosen value on the corresponding staff line;
f) to leave the editing mode, click ô on any graphic element of the score, then click ô once again on any non-graphic (white) place in the score.
Attention! SharpEye can only undo a single operation, so do not right-click randomly in the score, and do not press [Esc] or Ctrl-Z. This will do no good. If you have made a mistake, undo it immediately using the menu: Edit Music→Undo.
Sometimes it proves impossible to find out why a grapheme is «grayed out»; correspondingly, it is impossible to clear the aforementioned blue triangle indicating the mistake. Such gray graphemes will be lost when exporting to XML!
Consider some typical recognition mistakes.
Typical situation 1. Mistakes in the interpretation of tuplets. To correct them, proceed as follows. Highlight a group of notes.
In SharpEye, to add a note to a highlighted group, click it while holding Ctrl. To highlight a group of neighboring notes, it is convenient to hold Shift and drag the mouse along the diagonal of the surrounding rectangle, as when selecting in φ.
If SharpEye recognized a tuplet but with the wrong rhythm, click on the circle. If SharpEye missed a tuplet completely, define it in the field using the scheme N:T, where N is the number of notes of value X in the group and T is the time they occupy, measured in the same value X. For example, SharpEye recognized six sixteenths but did not understand that they form a sextuplet. Write 6:4 to obtain the desired sextuplet.
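The arithmetic behind the N:T scheme can be written out explicitly. The following Python sketch is only an illustration of the notation, with durations expressed as fractions of a whole note; the function name tuplet_durations is my own and has nothing to do with SharpEye itself.

```python
from fractions import Fraction


def tuplet_durations(n: int, t: int, base: Fraction):
    """For an N:T tuplet of notes with written value `base`, return
    (sounding length of each note, total length of the group)."""
    total = t * base  # the group fills the time of T notes of value `base`
    return total / n, total


# The sextuplet from the text: six sixteenths in the time of four.
each, total = tuplet_durations(6, 4, Fraction(1, 16))
print(each, total)  # 1/24 1/4
```

So a 6:4 group of sixteenths occupies one quarter note in total, which is what the rhythm analysis needs for the measure to add up.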
Typical situation 2. SharpEye failed to recognize the time signature. Proceed as follows. First enter the default signature 4/4, then choose ô the entered signature 4/4 and change it as necessary. To this end, enter it in the time signature field:
In order to insert a new symbol into the recognition result, click õ anywhere on the staff, and the graphic palette will open. Then click ô to choose the necessary symbol from the palette. Click õ again, this time at the required place in the score.
In our example, the second click õ will be done at the beginning of the second staff of the first system.
In principle, it is not necessary to put the time signature at the beginning of the first system on the second and further pages of a multi-page score. In this case, φ will use its own analyzer to infer the right regular beat. However, if the beat interpretation is wrong, put the time signature in SharpEye explicitly; φ will hide it anyway.
Typical situation 3. SharpEye failed to recognize the key signature. Highlight the wrong key signature and choose the right one from the palette (see the previous paragraph).
If you put the key signature on one staff, for example the upper one, SharpEye can propagate it to all the staves of the system. To do this, use Edit Music→Copy key sig to→All staves from here on. A copy of the original signature will appear on every staff along the whole vertical.
Typical situation 4. SharpEye recognized small notes (Ossia etc.) as a series of grace notes. My advice is to reformat the small notes into large ones. Later, in φ you may use the function «Resize Notes» to reformat them back. Otherwise, when importing XML, Dolet will understand them all as grace notes, and you will have to correct the rhythmical pattern manually.
Remark for experts. Initially, SharpEye does not group staves. Instead, it analyzes the rhythm horizontally, in each staff separately. It might seem obvious that rhythmical consistency should be checked vertically as well as horizontally. However, do not rush to group staves. Real notated music is full of atypical rhythmical situations, compounded by the human factor: after all, a human typeset your original. Add to this SharpEye’s own rhythmical doubts: SharpEye is entitled to doubts, and in the context of real music they are even an advantage. Under these conditions, grouping staves only propagates rhythmical mistakes, while the source of the problem becomes very difficult to locate. That is why Graham Jones, in the current version of his brainchild, limited the number of staves verified vertically to two.

Group staves only after you have finished fixing the rhythmical mistakes in every single staff. To group two neighboring staves, click on the black rectangle to the left. To undo, click on the rectangle again.
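The difference between the horizontal and the vertical check can be illustrated with a small Python sketch. This is not SharpEye’s actual analysis, only a simplified model under my own assumptions: each staff is a list of measures, each measure is a list of note values in a single voice, and pickup measures are ignored. All names and the sample measures are hypothetical.

```python
from fractions import Fraction


def measure_length(durations):
    """Sum of the note and rest values in one measure."""
    return sum(durations, Fraction(0))


def horizontal_warnings(staff, expected):
    """Indices of measures in one staff whose length differs from `expected`."""
    return [i for i, m in enumerate(staff) if measure_length(m) != expected]


def vertical_warnings(upper, lower):
    """Indices of measures where two grouped staves disagree in length."""
    return [i for i, (a, b) in enumerate(zip(upper, lower))
            if measure_length(a) != measure_length(b)]


q, e = Fraction(1, 4), Fraction(1, 8)
upper = [[q, q, q], [q, e, e, q]]  # two correct measures in 3/4
lower = [[q, q, q], [q, q, q, e]]  # second measure is one eighth too long

print(horizontal_warnings(upper, Fraction(3, 4)))  # []
print(horizontal_warnings(lower, Fraction(3, 4)))  # [1]
print(vertical_warnings(upper, lower))             # [1]
```

In this model the horizontal check points directly at the faulty staff, while the vertical check only says that the two staves disagree, which is one reason to fix each staff first and group afterwards.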

Indication of separated staves:

Indication of grouped staves:

After editing, it is useful to play the result as MIDI without exiting SharpEye. Then you may catch a few more mistakes, provided you have a good ear for music.
A. To play the score from an arbitrary measure rather than from the beginning, highlight any grapheme in that measure and press the Play button. The clever SharpEye will then play from there.
B. Do not be surprised that grace notes are not played. If they are not «grayed out» on the display, they will be correctly exported to XML.
Periodically save your work in SharpEye’s native format (*.MRO) by pressing the save button. When you have finished editing, save it as XML.
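Before importing into φ, the exported file can be given a quick sanity check. The Python sketch below counts measures, notes, grace notes and rests using the standard MusicXML element names measure, note, grace and rest. The function name and the sample are my own; the sample is a trimmed, hypothetical fragment rather than a complete, DTD-valid export.

```python
import xml.etree.ElementTree as ET


def summarize_musicxml(xml_text: str) -> dict:
    """Count a few element types in a MusicXML document given as text."""
    root = ET.fromstring(xml_text)
    notes = root.findall(".//note")
    return {
        # measures are counted across all parts, not per part
        "measures": len(root.findall(".//measure")),
        "notes": len(notes),
        "grace_notes": sum(1 for n in notes if n.find("grace") is not None),
        "rests": sum(1 for n in notes if n.find("rest") is not None),
    }


SAMPLE = """<score-partwise>
  <part id="P1">
    <measure number="1">
      <note><grace/><pitch><step>D</step><octave>5</octave></pitch></note>
      <note><pitch><step>C</step><octave>5</octave></pitch><duration>4</duration></note>
    </measure>
    <measure number="2">
      <note><rest/><duration>4</duration></note>
    </measure>
  </part>
</score-partwise>"""

print(summarize_musicxml(SAMPLE))
# {'measures': 2, 'notes': 3, 'grace_notes': 1, 'rests': 1}
```

An unexpectedly high grace-note count is a hint that small notes were taken for grace notes, as described in Typical situation 4. For a real file, read its content first or use ET.parse on the path.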
What You Should Do in Finale

Open a new file. Any template will do; the only important thing is to create the file from scratch.
Use the Dolet plug-in 5 by Recordare (Plug-Ins→Music XML Import). The following window will appear:

The window title reads: Dolet Light Music XML Importer. You may now ask: is it possible to obtain Dolet without the word Light? My answer: it is not worth the effort. The «full» Dolet provides support for some other format; it can also be integrated into φ2001 and φ2002. The company states that version 1.1 of Dolet fixes some bugs that are still present in 1.0.1 (which comes with φ2003). However, experiments indicate that these corrections do not substantially improve the import quality. On the other hand, instead of a «lifelong» version of Dolet, you will only get an evaluation copy valid for 30 days. The good news is that the full version will not remove the default Light Dolet plug-in 6.
Caveat 1. If you clear the imported music with the Tractor tool and run the plug-in in the supposedly clean file, φ will hang.
This bug is fixed in the full version of Dolet. In other words, a repeated import into staves that were already used but cleared with the Tractor tool will not cause φ to hang.
Caveat 2. It is impossible to add a new fragment to an existing MUS file (for example, in a multi-page score). There are two solutions:
a) Create a new file A in φ. Import the recognition result for the first page of the original into it. Now create a second file B and import the recognition result for the second page into it. Copy the full content of file B to the clipboard (using Tractor) and paste it into file A. With this method, the layout of the staves, which is kept by XML, will be lost in φ.
b) Create in advance a multi-page MRO file, this time in SharpEye. This will produce a multi-page XML file.
Doing this is easy:
Recognize and edit each page separately. Do not use the batch recognition mode, because it achieves speed by ignoring some settings. Save each page as a file with the native MRO extension. Having finished editing,
– open the first file: File→Open music;
– open the second file by using the same command. The following window will open:

Press the Append button, and SharpEye will cleverly append it to the end of the first file, and so on.

If you are recognizing a multi-staff score, the number of staves may vary from system to system. In «finalistas'» slang, such scores are termed «optimized». On such optimized pages, SharpEye will tacitly add empty staves (placeholders) at the bottom after the append operation. Their only purpose is to keep the same number of staves on every page. In other words, SharpEye, like any modern scorewriter, assumes that every page must, one way or another, contain all the instruments, even those that are currently silent.
Naturally, such an «equating» approach may result in a wrong joining of the instrumental voices in the MRO file. For example, the second staff in page 1 contains the oboe part, and in page 2 the oboe becomes silent. The second staff in page 2 contains a second violin part. SharpEye doesn’t interpret the score, it just joins one second staff with another second one. Therefore, the oboe part will smoothly, but wrongly, pass into the second violin part. To achieve the correct musical and logical joining, use the green arrows, which can be found at the border between the previous and following pages of the score. Drag the arrow head up or down, with the mouse.
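The oboe example can be modelled in a few lines of Python. This is a sketch of the joining logic only, under my own assumptions, and not of SharpEye’s internals: pages are lists of staff labels, the first violin staff is a hypothetical addition to the example, and the explicit mapping stands for the decision you make by dragging the green arrows.

```python
from itertools import zip_longest


def join_by_position(page1, page2):
    """Naive join: the n-th staff of page 1 continues into the n-th staff of
    page 2; a shorter page is padded with None (an empty placeholder staff)."""
    return list(zip_longest(page1, page2))


def join_by_mapping(page1, page2, mapping):
    """Join with an explicit {staff index on page 1: index on page 2 or None}."""
    return [(staff, None if mapping[i] is None else page2[mapping[i]])
            for i, staff in enumerate(page1)]


page1 = ["violin 1", "oboe", "violin 2"]
page2 = ["violin 1", "violin 2"]  # the oboe staff is omitted while it rests

print(join_by_position(page1, page2))
# [('violin 1', 'violin 1'), ('oboe', 'violin 2'), ('violin 2', None)]

print(join_by_mapping(page1, page2, {0: 0, 1: None, 2: 1}))
# [('violin 1', 'violin 1'), ('oboe', None), ('violin 2', 'violin 2')]
```

The first result shows the wrong join described above, with the oboe flowing into the second violin; the second shows the intended one.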

Do not try to put a whole enormous score into one MRO file. The reasonable maximum, in my view, is about 50 pages. Although SharpEye allows assembling such a large file, the work required to join all the staves correctly will exceed any sensible limit. Every time you edit another join, you will have to export an XML file and import it into the scorewriter (e.g., Finale), which becomes too laborious for a large number of pages.

Tests for Music Recognition (for Advanced Users)

Test 1. Beethoven. Sonata No. 14 for piano, second movement. (Original text issued by Könemann Verlag, Budapest; high-quality print).

Original

Recognition result

This is an example of flawless recognition. Both SharpEye and φ have correctly interpreted the pickup measure. We don’t show the score in Finale, because it has no problems. Ties, slurs, articulation, and a clef change inside a measure: everything is fine. In short, there is almost nothing left to perfect in φ.

Test 2. Beethoven. Sonata No. 7 for piano, fourth movement. (Original text issued by Henle Verlag, München; high-quality print).

Original

Recognition result

The result of XML import in φ

Very bad. The clever SharpEye correctly understood the beams across the bar line. But this didn’t help much: the rhythmical problems defeated SharpEye. Because of the non-standard beams, the pickup measure was interpreted incorrectly (cf. the successful variant in Test 1).

Take a closer look at the middle screenshot. Count the note values, first in the upper staff, then in the lower staff of the second measure. The totals are the same; everything seems to be OK. Still, SharpEye is puzzled by the «strange» rhythmical logic of Beethoven, an awfully «strange» composer of the standard Classical era. This is indicated by the blue triangle in the lower staff, at the end of the second measure. Can you see it? That is what I was unable to eliminate in SharpEye, despite all my efforts.

Locating the source of the problem is difficult. It may lie in some shortcomings of the SharpEye database; possibly, this is compounded by problems of the MusicXML exchange format, which is not yet sufficiently developed. SharpEye was so upset that it even failed to recognize a grace note, which usually is not a problem. It also lost a few ties, which cannot be blamed on a bad scan: Henle always issues high-quality prints.

Test 3. Mozart. Don Giovanni. Act I. Donna Elvira’s Aria (piano-vocal score issued by Breitkopf und Härtel, Leipzig; a medium-quality print produced in the German Democratic Republic).

Original

Recognition result

Tremolo is beyond SharpEye’s powers, which may be due to the ambiguity of its export formats. Caveat: when neighboring staves are close together, SharpEye has difficulty deciding which staff the notes belong to. Good news: OCR of both the first (German) and the second (Italian) vocal text is not difficult for SharpEye. Note that SharpEye sets the recognized lyrics in a sans-serif font (Arial), while other text is set in a serif font. In particular, we can see that the word Lepor., indicating Leporello’s entry, is recognized as «other text» and set in Times New Roman. It is therefore advisable to correct text classification mistakes already at the editing stage in SharpEye. It is known that φ has special arrangements for Lyrics.

Test 4. Borodin. Small suite for piano: No. 6. Serenade (print, Quadrivium Publishers, Moscow).

Original

Recognition result

The whole voice is cross-staff. SharpEye can confidently recognize the notes themselves, but it doesn’t know what to do with them. Pedal on/off is not recognized: SharpEye only knows a limited number of articulation marks: staccato, tenuto, and accent.

Test 5. Tchaikovsky. Songs for Children: On the Bank. Muzyka Publishers, low-quality print.

Original

Recognition result

SharpEye fails at the OCR of Cyrillic letters 7. Instead of the Russian word Как (how), SharpEye reads Kalt, German for cold; one can really grow cold on seeing such a recognition. Other problems: we clearly see mistakes in the recognition of rhythmical patterns (extra dots) and of graphics (a half note in the fourth measure was lost). This is caused by the bad quality of Soviet prints.

In these tests, I deliberately demonstrated SharpEye’s flaws, which by no means diminishes the achievements of Graham Jones, who created this sharp-eyed, sensible and inexpensive OMR program.

Sergey Lebedev, PhD, Moscow Conservatory, University of Vienna
