[mei-catalog-ig] Persistent identifier for digital music editions
Jim DeLaHunt
from.mei-l at jdlh.com
Thu Nov 22 08:21:51 CET 2018
Dr. Dubowy:
On 2018-11-15 01:17, Norbert Dubowy Internationale Stiftung Mozarteum wrote:
> Dear metadata and cataloging IG,
> As we are close to publishing our first set of digital editions, I
> would like to hear your opinion on the following subject: Does it make
> sense to provide a persistent identifier with the edition (in the
> individual file / MEI header)? What is the benefit, and is there any
> preference of what kind of identifier (DOI, or urn, or ...) to use in
> the case of a strictly digital music edition (no hybrid edition, not
> the rendering, just the coding)? ...
This is a very insightful question. I am no musicologist or academic,
but I would like to reply as someone interested in transcribing the
legacy corpus of public domain scores to digital score formats
<http://keyboardphilharmonic.org/>, and as an experienced software
engineer.
Yes, I encourage you to put a stable machine-readable identifier in
the edition. And I encourage you to be thoughtful and clear about what
that identifier refers to, and what it does not refer to.
Identifiers make many tasks easier. For a digital score, being able to
refer to an edition, and the work of musical composition it represents,
and the specific revision within an edition, are all important and
useful. The identifier should be in a format which software can find and
read within the digital score, and which humans can also find, and copy
and paste (e.g. into an email). The identifier should be stable — once
assigned, it should not change. The identifier should be unique — no
other thing should ever have an identifier which could be confused with
this identifier.
I am not aware of a system of identifiers for music score editions. If
there is one, someone please cite it, so I can learn about it. If there
is not one, you may have to make your own system of identifiers. There
are identifiers for works of musical composition, but there may be many
editions of the same work. There are library catalogue call numbers,
e.g. in WorldCat, but as far as I know these are specific to each
library, and two identical copies of the same printed score could have
different WorldCat numbers from different libraries. It would be
valuable to have a catalogue of edition identifiers which would lead
different people in those libraries to come up with the same edition
number for their identical copies.
Consider how you will handle revisions. I believe the lesson of
electronic text and software publishing is that revisions become more
and more numerous and frequent as the representation becomes more
symbolic and digital. Where a book might have only one or two editions,
and one or four or ten printings, the same content as a wiki or software
repository might have hundreds of numbered versions, for one "edition".
It is valuable to have a stable identifier which represents what is
stable and continuing about the edition, and a separate revision
identifier which represents what changes within those stable boundaries.
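As a rough sketch of that split, assuming Python: the class name
EditionRef, the example identifier "example-edition-0001", and the "@r"
separator are all hypothetical, invented here for illustration. The point
is only that the stable part and the changing part are kept in separate
fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditionRef:
    """Hypothetical reference to one revision of one edition."""

    edition_id: str  # stable part: assigned once, never changes
    revision: int    # changing part: bumped for every published revision

    def __str__(self) -> str:
        # "@r" is an illustrative separator, not an existing convention.
        return f"{self.edition_id}@r{self.revision}"

    def same_edition(self, other: "EditionRef") -> bool:
        # Two references name the same edition when the stable part
        # matches, whatever their revision numbers are.
        return self.edition_id == other.edition_id


first = EditionRef("example-edition-0001", 1)
later = EditionRef("example-edition-0001", 137)
print(first, later, first.same_edition(later))
```

A citation that must survive future corrections would quote only the
edition_id; a citation that must point at exact content would quote both
parts.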
There are a few widely-used and trustworthy systems of identifiers I
know of, which might be useful for your purposes.
MusicBrainz <https://musicbrainz.org/> assigns Work MBIDs
<https://musicbrainz.org/doc/Work> for compositions, and Artist MBIDs
<https://musicbrainz.org/doc/Artist> for people and groups. For
instance, the identifier
<https://musicbrainz.org/artist/9ddd7abc-9e1b-471d-8031-583bc6bc8be9> is
a stable identifier for a person known as Чайковский, Tchaikovsky, and
Tschaikowski. The identifier
<https://musicbrainz.org/work/ba27a04b-1a26-4771-909e-81b2f8449ff7>
refers to the 1875 version of 9ddd7abc's /Concerto for Piano and
Orchestra no. 1 op. 23/, while the identifier
<https://musicbrainz.org/work/71b01883-7b2d-488b-91cc-d0d44f182743>
refers to an 1879 revision of that same concerto. MusicBrainz does not
have score edition identifiers, as far as I know. But where they have
identifiers (for Work and Artist and Release and more), they are unique,
stable, reliable, and useful.
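MBIDs are UUIDs, so software can check that one is well-formed before
storing it in a score. Here is a minimal sketch in Python, using only
the standard library. The function name parse_musicbrainz_url is my own
invention, and the sketch assumes URLs of the two-part form shown above
(entity type, then MBID).

```python
import uuid
from typing import Tuple
from urllib.parse import urlparse


def parse_musicbrainz_url(url: str) -> Tuple[str, str]:
    """Split a MusicBrainz entity URL into (entity_type, mbid)."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 2:
        raise ValueError(f"not an entity URL of the assumed form: {url}")
    entity_type, mbid = parts
    # uuid.UUID raises ValueError on malformed input. Comparing against
    # the canonical string form also rejects variants such as a UUID
    # written without hyphens.
    if str(uuid.UUID(mbid)) != mbid.lower():
        raise ValueError(f"MBID is not in canonical UUID form: {mbid}")
    return entity_type, mbid.lower()


print(parse_musicbrainz_url(
    "https://musicbrainz.org/artist/9ddd7abc-9e1b-471d-8031-583bc6bc8be9"
))
```

A check like this catches an identifier that lost a character in copying,
which is easy to do with 36-character strings.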
Wikidata assigns identifiers for many kinds of things. The identifier
Q7315 <https://www.wikidata.org/wiki/Q7315> is the identifier for the
composer MusicBrainz identifies as
artist/9ddd7abc-9e1b-471d-8031-583bc6bc8be9. The identifier Q162935
<https://www.wikidata.org/wiki/Q162935> refers to both the 1875 and the
1879 versions of that concerto which MusicBrainz identifies as
work/ba27a04b-1a26-4771-909e-81b2f8449ff7 and
work/71b01883-7b2d-488b-91cc-d0d44f182743 . Wikidata is also good at
maintaining lists of identifiers in other systems for the things it
identifies, so it is a powerful identifier cross-reference resource.
It is also possible to compute stable unique identifiers for the content
of "just the coding", that is, of arbitrary content. This is done with tools
which can reduce any content (byte sequence) of any length down to a
"digest", consisting of a fixed number of bytes, that is
unlikely to match the digest of any other content. Such digests are
known by names like "md5", "sha1", and "sha256". The sha-1 digest of the
preceding paragraph is 426eae7043658f83d2c1e50d77f29b20bb3434e1 . The
sha-256 digest is
71fa49b4fa7c6fbe77d4e681c6168ba0eb49c5ba816ae218750a03ca5f265103 . So,
you could choose to define the edition identifier for "just the coding"
as the digest of the MEI-format text of the digital score itself. Every
change to the content, the "coding", will result in a different digest.
A definition like this requires defining a "canonical form" for the
MEI-format text, and defining what to do about the space in the score
for the edition identifier when calculating the edition identifier.
Standards for XML signing and XML comparison, such as Canonical XML,
have already come up with such definitions, however.
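A minimal sketch of the digest-as-identifier idea, in Python. It makes
several assumptions which are mine, not an MEI convention: Python 3.8 or
later, whose xml.etree.ElementTree.canonicalize() writes C14N 2.0 output,
serves as the "canonical form"; the slot for the edition identifier is an
element named "identifier" in the MEI namespace, and it is left out of
the hash; leading and trailing whitespace in text is treated as
insignificant (strip_text=True), which may not suit every edition. The
function name edition_digest is hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET

MEI_NS = "http://www.music-encoding.org/ns/mei"
# Assumed location of the identifier slot; adjust to your own header layout.
ID_TAG = f"{{{MEI_NS}}}identifier"


def edition_digest(mei_text: str, id_tag: str = ID_TAG) -> str:
    """Return the SHA-256 hex digest of a canonicalized MEI document."""
    # Canonicalization sorts attributes and normalizes serialization, so
    # that cosmetic differences do not change the digest. exclude_tags
    # drops the identifier element (and its content) before hashing, so
    # that storing the digest in the file does not change the digest.
    canonical = ET.canonicalize(
        xml_data=mei_text,
        strip_text=True,
        exclude_tags=[id_tag],
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With these choices, two files that differ only in indentation, attribute
order, or the content of the identifier element get the same digest,
while any change to the rest of the coding gives a different one.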
I am sorry this is not a specific reply to your question. I hope it is
helpful in giving you ideas for deciding what edition identifiers you
will use.
Best regards,
—Jim DeLaHunt, Vancouver, Canada
--
--Jim DeLaHunt, jdlh at jdlh.com http://blog.jdlh.com/ (http://jdlh.com/)
multilingual websites consultant
355-1027 Davie St, Vancouver BC V6E 4L2, Canada
Canada mobile +1-604-376-8953