[mei-catalog-ig] Persistent identifier for digital music editions

Jim DeLaHunt from.mei-l at jdlh.com
Thu Nov 22 08:21:51 CET 2018


Dr. Dubowy:

On 2018-11-15 01:17, Norbert Dubowy Internationale Stiftung Mozarteum wrote:
> Dear metadata and cataloging IG,
> As we are close to publishing our first set of digital editions, I 
> would like to hear your opinion on the following subject: Does it make 
> sense to provide a persistent identifier with the edition (in the 
> individual file / MEI header)? What is the benefit, and is there any 
> preference of what kind of identifier (DOI, or urn, or ...) to use in 
> the case of a strictly digital music edition (no hybrid edition, not 
> the rendering, just the coding)? ...

This is a very insightful question. I am no musicologist or academic, 
but I would like to reply as someone interested in transcribing the 
legacy corpus of public domain scores to digital score formats 
<http://keyboardphilharmonic.org/>, and as an experienced software 
engineer.

Yes, I encourage you to put in a stable machine-readable identifier with 
the edition. And, I encourage you to be thoughtful and clear about what 
that identifier refers to, and what it does not refer to.

Identifiers make many tasks easier. For a digital score, being able to 
refer to an edition, and the work of musical composition it represents, 
and the specific revision within an edition, are all important and 
useful. The identifier should be in a format which software can find and 
read within the digital score, and which humans can also find, and copy 
and paste (e.g. into an email). The identifier should be stable — once 
assigned, it should not change. The identifier should be unique — no 
other thing should ever have an identifier which could be confused with 
this identifier.

I am not aware of a system of identifiers for music score editions. If 
there is one, someone please cite it, so I can learn about it. If there 
is not one, you may have to make your own system of identifiers.  There 
are identifiers for works of musical composition, but there may be many 
editions of the same work. There are library catalogue call numbers, 
e.g. in WorldCat, but as far as I know these are specific to each 
library, and two identical copies of the same printed score could have 
different WorldCat numbers from different libraries. It would be 
valuable to have a catalogue of edition identifiers which would lead 
different people in those libraries to come up with the same edition 
number for their identical copies.

Consider how you will handle revisions. I believe the lesson of 
electronic text and software publishing is that revisions become more 
and more numerous and frequent as the representation becomes more 
symbolic and digital. Where a book might have only one or two editions, 
and one or four or ten printings, the same content as a wiki or software 
repository might have hundreds of numbered versions, for one "edition". 
It is valuable to have a stable identifier which represents what is 
stable and continuing about the edition, and a separate revision 
identifier which represents what changes within those stable boundaries.
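
To make the split between a stable identifier and a revision 
identifier concrete, here is a minimal sketch in Python. The class 
name EditionRef, the "@r" separator, and the example identifier are 
all hypothetical choices for illustration, not part of MEI or of any 
existing identifier scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditionRef:
    """A reference to one revision of one edition (hypothetical scheme)."""

    edition_id: str  # stable: never changes once assigned
    revision: int    # changes with every published revision

    def __str__(self) -> str:
        # Hypothetical text form: "<edition_id>@r<revision>"
        return f"{self.edition_id}@r{self.revision}"

    @classmethod
    def parse(cls, text: str) -> "EditionRef":
        # Split on the last "@r" so the edition id itself may contain "@".
        edition_id, _, rev = text.rpartition("@r")
        if not edition_id or not rev.isdigit():
            raise ValueError(f"not an edition reference: {text!r}")
        return cls(edition_id, int(rev))

    def same_edition(self, other: "EditionRef") -> bool:
        # Two references name the same edition if the stable part matches,
        # regardless of revision.
        return self.edition_id == other.edition_id


first = EditionRef("example-edition-0001", 1)
later = EditionRef("example-edition-0001", 37)
print(first, later, first.same_edition(later))
```

A citation can then use the full form when it needs an exact 
revision, and the edition_id alone when it means the edition as a 
whole.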

There are a few widely-used and trustworthy systems of identifiers I 
know of, which might be useful for your purposes.

MusicBrainz <https://musicbrainz.org/> assigns Work MBID 
<https://musicbrainz.org/doc/Work>s for compositions, and Artist MBID 
<https://musicbrainz.org/doc/Artist>s for people and groups. For 
instance, the identifier 
<https://musicbrainz.org/artist/9ddd7abc-9e1b-471d-8031-583bc6bc8be9> is 
a stable identifier for a person known as Чайковский, Tchaikovsky, and 
Tschaikowski. The identifier 
<https://musicbrainz.org/work/ba27a04b-1a26-4771-909e-81b2f8449ff7> 
refers to the 1875 version of 9ddd7abc's /Concerto for Piano and 
Orchestra no. 1 op. 23/, while the identifier 
<https://musicbrainz.org/work/71b01883-7b2d-488b-91cc-d0d44f182743> 
refers to an 1879 revision of that same concerto. MusicBrainz does not 
have score edition identifiers, that I know of.  But where they have 
identifiers (for Work and Artist and Release and more), they are unique, 
stable, reliable, and useful.

Wikidata assigns identifiers for many kinds of things. The identifier 
Q7315 <https://www.wikidata.org/wiki/Q7315> is the identifier for the 
composer MusicBrainz identifies as 
artist/9ddd7abc-9e1b-471d-8031-583bc6bc8be9. The identifier Q162935 
<https://www.wikidata.org/wiki/Q162935> refers to both the 1875 and the 
1879 versions of that concerto which MusicBrainz identifies as 
work/ba27a04b-1a26-4771-909e-81b2f8449ff7 and 
work/71b01883-7b2d-488b-91cc-d0d44f182743 .  Wikidata is also good at 
maintaining lists of identifiers in other systems for the things it 
identifies, so it is a powerful identifier cross-reference resource.
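
As an illustration of that cross-referencing, the sketch below builds 
(but does not send) a request to the Wikidata Query Service which 
asks for the item carrying a given MusicBrainz artist identifier. It 
assumes that Wikidata property P434 is "MusicBrainz artist ID" and 
that the public endpoint is https://query.wikidata.org/sparql; the 
function name is my own invention.

```python
import uuid
from urllib.parse import urlencode

# Assumed public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def wikidata_lookup_url(musicbrainz_artist_id: str) -> str:
    """Build a query URL for the Wikidata item with this MusicBrainz artist ID."""
    # MBIDs are UUIDs; parsing rejects anything else before it reaches the query.
    mbid = str(uuid.UUID(musicbrainz_artist_id))
    # P434 is assumed to be the "MusicBrainz artist ID" property.
    query = f'SELECT ?item WHERE {{ ?item wdt:P434 "{mbid}" . }}'
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


print(wikidata_lookup_url("9ddd7abc-9e1b-471d-8031-583bc6bc8be9"))
```

Fetching the printed URL should return the Wikidata item for that 
artist, if Wikidata records the MusicBrainz identifier.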

It is also possible to compute stable unique identifiers for the content 
of "just the coding", that is, of arbitrary content. This is done with tools 
which can reduce any content (byte sequence) of any length down to a 
"digest", consisting of a predefined number of binary bytes, that is 
unlikely to match the digest of any other content.  Such digests are 
known by names like "md5", "sha1", and "sha256". The sha-1 digest of the 
preceding paragraph is 426eae7043658f83d2c1e50d77f29b20bb3434e1 . The 
sha-256 digest is 
71fa49b4fa7c6fbe77d4e681c6168ba0eb49c5ba816ae218750a03ca5f265103 . So, 
you could choose to define the edition identifier for "just the coding" 
as the digest of the MEI-format text of the digital score itself. Every 
change to the content, the "coding", will result in a different digest. 
A definition like this requires defining a "canonical form" for the 
MEI-format text, and defining what to do about the space in the score 
for edition identifier when calculating the edition identifier. 
Standards like XML signing and XML comparison have come up with those 
definitions, however.
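
A minimal sketch of that idea in Python follows. It assumes, purely 
for illustration, that the digest is stored in an MEI <identifier> 
element marked type="edition-digest"; that attribute value and the 
function name are hypothetical. For the canonical form it uses the 
Canonical XML (C14N 2.0) support in Python's standard library 
(xml.etree.ElementTree.canonicalize, Python 3.8 or later), which is 
one possible choice and not the only one.

```python
import hashlib
import xml.etree.ElementTree as ET

MEI_NS = "http://www.music-encoding.org/ns/mei"


def edition_digest(mei_text: str) -> str:
    """Return the SHA-256 hex digest of the canonicalized MEI text."""
    root = ET.fromstring(mei_text)
    # Blank the slot which will hold the digest, so that writing the digest
    # into the file does not change the digest itself.
    # Assumption: the slot is <identifier type="edition-digest">.
    for el in root.iter(f"{{{MEI_NS}}}identifier"):
        if el.get("type") == "edition-digest":
            el.text = ""
    # Canonical XML sorts attributes and normalizes tag syntax, so purely
    # cosmetic differences in the markup do not alter the digest.
    canonical = ET.canonicalize(
        ET.tostring(root, encoding="unicode"), rewrite_prefixes=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


sample = (
    f'<mei xmlns="{MEI_NS}">'
    '<identifier type="edition-digest"></identifier>'
    '<note pname="c" oct="4"/>'
    "</mei>"
)
print(edition_digest(sample))
```

Note that whitespace between elements still counts as content here, 
so a real definition would also have to say how indentation is 
treated.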

I am sorry this is not a specific reply to your question. I hope it is 
helpful in giving you ideas for deciding what edition identifiers you 
will use.

Best regards,
         —Jim DeLaHunt, Vancouver, Canada



-- 
     --Jim DeLaHunt,jdlh at jdlh.com      http://blog.jdlh.com/  (http://jdlh.com/)
       multilingual websites consultant

       355-1027 Davie St, Vancouver BC V6E 4L2, Canada
          Canada mobile +1-604-376-8953
