[MEI-L] Issues with the Sample Encodings

Wed Aug 13 23:34:17 CEST 2014

Hi Christopher,

Comments in-line below --

--
p.

> -----Original Message-----
> From: mei-l [mailto:mei-l-bounces at lists.uni-paderborn.de] On Behalf Of
> Christopher Antila
> Sent: Tuesday, August 12, 2014 3:05 PM
> To: mei-l at lists.uni-paderborn.de
> Subject: [MEI-L] Issues with the Sample Encodings
> 
> Greetings MEI-L Members:
> 
> I have a set of questions/comments related to the sample encodings in
> the "music-encoding" repository, and the musicxml2mei XSLT stylesheets.
> 
> It seems that all of the sample files produced with the musicxml2mei
> convertor either don't follow the Guidelines completely, or include
> errors. While writing the MEI-to-music21 importer, I've found that
> importing these files correctly is generally more difficult and less
> efficient than it could be, and sometimes impossible.

This is not a problem with MEI per se, but rather with the nature of music notation in general.  As you say, importers should be forgiving, issuing warnings when a passable conversion is possible, but dying (with an appropriate message, of course) when it is not.  It's my belief that refusing to continue is a legitimate response to flawed input -- often better than settling for a syntactically correct, but semantically meaningless, conversion.  A complete and totally accurate conversion is only possible in the simplest of cases; that is, where the starting format and the ending format are syntactically and semantically very similar.  I think it's better to think of machine translation as a starting point for human intervention.
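
As a rough illustration of that warn-or-die policy, here is a minimal Python sketch.  Everything in it is hypothetical: the ConversionError class, the quarter_length function, the small table of @dur values, and the decision to treat an unreadable @dots value as recoverable are assumptions made for the example, not part of MEI or music21.

```python
import warnings


class ConversionError(Exception):
    """Raised when no meaningful conversion of the input is possible."""


# Hypothetical subset of written durations, expressed in quarter notes.
BASE_LENGTHS = {"1": 4.0, "2": 2.0, "4": 1.0, "8": 0.5, "16": 0.25}


def quarter_length(attrs):
    """Convert a dict of note attributes to a length in quarter notes.

    An unusable @dur is fatal; an unusable @dots only earns a warning.
    """
    dur = attrs.get("dur")
    if dur not in BASE_LENGTHS:
        # Nothing sensible can be guessed here, so refuse to continue.
        raise ConversionError("cannot convert note: unusable @dur %r" % (dur,))

    try:
        dots = int(attrs.get("dots", "0"))
        if dots < 0:
            raise ValueError("negative dot count")
    except ValueError:
        # A passable conversion is still possible: warn and assume no dots.
        warnings.warn("ignoring unusable @dots %r" % (attrs.get("dots"),))
        dots = 0

    # Each dot adds half the value of the previous one.
    return BASE_LENGTHS[dur] * (2 - 0.5 ** dots)
```

With this sketch, a dotted quarter note converts silently, a bad dot count produces a warning and an undotted result, and a missing @dur stops the conversion with a message.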

> One of the most troublesome recurring problems is <tupletSpan> elements,
> which are supposed to use a @plist attribute to indicate all the
> elements within the tuplet (pg.97 of the Guidelines). None of the
> <tupletSpan> elements I've seen so far actually use @plist, which leaves
> my importer guessing which elements are involved ("git grep" tells me
> only four sample files use @plist at all). With my current algorithm,
> any tuplet with components in three or more <measure> elements will
> necessarily be imported incorrectly (unless it has a @plist).
> Furthermore, this guesswork constitutes more than half the processing
> time in a sizeable file like Beethoven_op.18.mei.

Re-reading the current text, I can see how one could interpret it to mean that @plist, @startid, and @endid are all expected to be present in every case. However, that's not its intended meaning.  "Supposed to" is too strong a phrase.  "Can" or "may" is more appropriate.

> I do believe that importers should be as forgiving as possible, and that
> the tuplet-guessing algorithm must therefore remain in the code, along
> with all the other just-in-case algorithms for elements missing a
> @plist. However, I also believe that the MEI-provided sample encodings
> should be stand-out examples of the highest quality, and that they
> should therefore follow all of the guidelines in the Guidelines, even
> when not strictly required.

The reason they're called "guidelines", and not "requirements", is that not all recommendations are applicable in every case.  The nature of music notation makes it impossible to distill it down to a simple set of universal rules.  The best one can do is provide encoding possibilities, best practices as it were.  It is then up to the markup creator to use these practice recommendations appropriately.  Markup consistency is a more realistic goal than perfection. :-)

Since @plist isn't required, the <tupletSpan> elements that only have @startid and @endid are following all *appropriate* recommendations.  In fact, while <tupletSpan> typically has @startid and @endid (because those are relatively easy to generate), the actual requirement for <tupletSpan> is that it have some kind of start-type attribute (@startid, @tstamp, @tstamp.ges, or @tstamp.real) and some kind of end-type attribute (@dur, @dur.ges, @endid, or @tstamp2).  If you expect a certain combination of these attributes, you might do as Johannes suggests and create a profile that enforces your expectation.  That does put the onus on the user to transform the input file so that it matches the profile before running the MEI-to-music21 converter, but you can provide that transformation yourself as the first step in the conversion process.
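
To make the start-type/end-type requirement concrete, here is a minimal Python sketch of a check an importer might run before converting.  The function name check_tuplet_spans and the sample fragment are hypothetical, and the sketch assumes the file uses the MEI namespace http://www.music-encoding.org/ns/mei; it only tests for the presence of the attributes listed above and is not a substitute for schema or profile validation.

```python
import xml.etree.ElementTree as ET

# Assumed MEI namespace; adjust if the files use a different one.
MEI_NS = "http://www.music-encoding.org/ns/mei"

# The attribute groups named in the paragraph above.
START_ATTRS = ("startid", "tstamp", "tstamp.ges", "tstamp.real")
END_ATTRS = ("dur", "dur.ges", "endid", "tstamp2")


def check_tuplet_spans(root):
    """Return (index, message) pairs for <tupletSpan> elements that lack
    a start-type or an end-type attribute."""
    problems = []
    for index, span in enumerate(root.iter("{%s}tupletSpan" % MEI_NS)):
        if not any(name in span.attrib for name in START_ATTRS):
            problems.append((index, "no start-type attribute"))
        if not any(name in span.attrib for name in END_ATTRS):
            problems.append((index, "no end-type attribute"))
    return problems


# Hypothetical fragment: the second <tupletSpan> has no end-type attribute.
SAMPLE = """<measure xmlns="http://www.music-encoding.org/ns/mei">
  <tupletSpan startid="#n1" endid="#n3" num="3" numbase="2"/>
  <tupletSpan startid="#n4" num="3" numbase="2"/>
</measure>"""

print(check_tuplet_spans(ET.fromstring(SAMPLE)))
```

A converter could run a check like this first and decide, per problem, whether to warn and guess or to stop.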

By the way, I just noticed that it should be possible for <tupletSpan> (and probably all similar elements) to provide @plist *as a substitute for* the currently required start-end pair.  There's a report for me to file.  :-)

> Considering this, my virtually non-existent knowledge of XSLT, and the
> current possibility of a move from Google Code to GitHub, I'm wondering
> where and how to report my issues. Some of the problems can be fixed
> easily by hand-editing certain MEI files (to correct the duration of a
> <space> in Joplin_Elite_Syncopations.mei, for instance). Many of them
> are difficult to fix (like an enormous swath of missing <tupletSpan>
> elements and correspondingly incorrect @tuplet attributes in
> Brahms_StringQuartet_Op51_No1.mei), and might be more profitably solved
> by adjusting/regenerating the musicxml2mei stylesheet and re-converting
> the sample encodings from their MusicXML source.

In many cases, the problem lies in the varying quality of the MusicXML sources used to create the sample encodings.  In addition, conversion to MEI from MusicXML isn't exactly a walk in the park either -- there's a lot of time-consuming and inefficient guessing there too.  (See my earlier comment about the input and output being syntactically and semantically similar.)  Also, some of the sample files were created by hand in an XML editor, certainly the most powerful, but probably also the most error-prone, method.  So, while a lot more can be done, I sincerely believe MEI is a better representation in that it prevents more errors than MusicXML.  Still, creating flawless MEI markup is no easier than making perfect MusicXML files, and may even be more difficult at this point in time, but I'm confident that's a solvable problem.  (These are good reasons why it's important to create software for generating MEI directly from existing score editors -- or an MEI-based editor.  But that's another topic for another time.)

We'll certainly look into the errors you find in the sample encodings.  I don't expect the switch from Google to GitHub to happen overnight, so I say we just keep putting them in the Google Code repo for the time being.  You can create an error report for each *class of error* -- one for incorrect durations and one for tuplet problems, for example -- then provide pointers to specific instances of the error class in the sample files.  Sound reasonable?

> 
> I'd be happy to lend a hand in any way that I'm able!

By calling attention to inaccuracies and inconsistencies, you're already helping to improve MEI!

Best wishes,

--
p.

> 
> 
> Thanks,
> Christopher