[MEI-L] FRBR in MEI

Tue Nov 13 12:32:47 CET 2012

Dear Axel,

thanks for putting that together. See my comments inline. 

On 13.11.2012 at 11:33, Axel Teich Geertinger <atge at kb.dk> wrote:

> Dear list,
> 
> as some of you know, we have been experimenting for some time with implementing FRBR group 1 entities (work, expression, manifestation, item; see http://www.ifla.org/publications/functional-requirements-for-bibliographic-records) in <meiHead> to be able to clearly distinguish these four levels of description. With <source> translating to FRBR manifestation, we have two of them already (work and manifestation), so we have added the two others along with some container and linking elements. In my opinion, the results so far are very promising, and I think this is the time to discuss whether and how to go ahead with it. As with the <bibl> discussion, I need to know which way to head in order to get MerMEId ready for release. So, I hope we can get to an agreement about what is going to be in the next MEI schema, at least in general terms. This will probably get a bit lengthy, but I hope some of you will have a look at it.
> 
> For the time being, we are using a customization (thanks to Johannes), extending the MEI 2012 schema as follows:
> 
> 1)	In <work>, we have introduced the element <expressionList>, a container for <expression> elements, which have the same content model as <work>. <expression> has a child <componentGrp>, an ordered list allowing <expression> child elements; it is intended to represent a sequence of sections such as movements. This can be nested; it enables us to describe the structure of, say, an opera, divided into acts, acts into scenes, and scenes into subsections of scenes (recitative, aria etc.), each with their own incipit etc. 
> Actually, <work> also allows <componentGrp> as a child, though we are not using it; I am not sure whether it may be useful or not.
> 
> 2)	Likewise, in <source>, we have an <itemList> containing <item> elements, i.e. descriptions of individual copies/exemplars of a source. As with <work> and <expression>, <source> and <item> share the same content model (though we do not use all elements at both levels - more on that later). Also <source> and <item> have an optional <componentGrp> to describe their constituents.
> 
> 3)	All four FRBR entity elements have a <relationList> child, containing <relation> elements. These establish the relations between entities not immediately deducible from the XML tree, such as expression-to-manifestation (these are the main links between the <workDesc> and <sourceDesc> sub-trees), manifestation-to-manifestation (for instance, identifying one source as a reprint or copy of another one; this also allows the encoding of a stemma), or external relations such as work-to-work.
> These new elements replace <relatedItem> now found in <work> and <source>.
> 
> Apart from an agreement on the overall structure, there are a number of issues to address. I will try to list them here as well as I can, though I am sure there are more. 
> 
> 1)	The naming of elements. For now, we have defined a generic <componentGrp> element available at all four levels, meaning that the schema allows for such rubbish as putting works into items, since <componentGrp> allows <work>, <expression>, <source>, and <item>. We could rename them to <expressionComponents> etc., or something like that, in order to control their different contents, or leave it to the individual encoder (or Schematron) to avoid such nonsense.

While Schematron rules are not as well supported as the RelaxNG schema itself, they are part of the schema, and thus an official part of the MEI specification. If a given application is not capable of enforcing the rules expressed in Schematron by validating against them, this doesn't mean that they are obsolete. In this case, Schematron allows a less complex definition of the intended schema, and that's the reason why we chose it. It is by no means optional, just as the many other MEI rules that are expressed in Schematron aren't optional. 
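
To illustrate the kind of constraint such a rule would express, here is a minimal sketch in Python rather than Schematron. It assumes the intended rule is "a <componentGrp> may only contain elements of the same kind as its parent", which is my reading of the proposal and not something we have agreed on. The function name `component_violations` and the sample fragment are made up for this example.

```python
import xml.etree.ElementTree as ET

# The four FRBR group 1 entity elements discussed in this thread.
FRBR_ELEMENTS = {"work", "expression", "source", "item"}


def local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def component_violations(root):
    """Return (parent, child) name pairs where a <componentGrp> holds an
    FRBR element of a different kind than its parent (assumed rule)."""
    violations = []
    for parent in root.iter():
        parent_name = local(parent.tag)
        if parent_name not in FRBR_ELEMENTS:
            continue
        for grp in parent:
            if local(grp.tag) != "componentGrp":
                continue
            for child in grp:
                child_name = local(child.tag)
                if child_name in FRBR_ELEMENTS and child_name != parent_name:
                    violations.append((parent_name, child_name))
    return violations


# Hypothetical fragment: an <item> whose components include a <work>.
SAMPLE = """
<source xmlns="http://www.music-encoding.org/ns/mei">
  <itemList>
    <item>
      <componentGrp>
        <work/>
        <item/>
      </componentGrp>
    </item>
  </itemList>
</source>
"""

violations = component_violations(ET.fromstring(SAMPLE))
print(violations)
```

For the sample, the only pair reported is the <work> placed inside an <item>. A Schematron rule would state the same condition declaratively and keep it inside the schema, which is the point I am making above.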

> Speaking of element names, the good old name <sourceDesc> does not sound quite right to me. Especially if we introduce <expressionList> and <itemList>, I think <sourceList> would be more appropriate. The actual description of sources is what I would expect to find *inside* <source>. But I know that may be too late to change... 

As this whole FRBR thing introduces a model which is quite distinct from TEI, it seems not unreasonable to reflect that by choosing a different name for the sources' container. This would have some consequences, though, and for me, it is connected with the question of whether sourceDesc (or whatever we decide to call it) is a child of fileDesc or not. Currently, sourceDesc is a child of fileDesc, while workDesc is a sibling. I see the reason for putting sourceDesc in there in the first place (these are the sources used to create the file, i.e. the MEI instance), but I wonder if this is still true. What about a catalogue of works and sources, which may not even have any music in it? Wouldn't it be better to add a pointer from somewhere in fileDesc to the extracted sourceDesc, indicating which source was used for the transcription? Or is it safe to rely on @source references down in the music subtree?

> 
> 2)	The content models of <source> and <item>, respectively. Obviously, some (well, most) elements will be needed at both levels, but to minimize confusion, I would suggest a few restrictions. The most obvious one would be to move <physLoc> out of <physDesc> and allow it in <item> only. <watermark> may or may not be banned from <source> (would it make sense to describe the watermark of a print edition, or of individual copies only? I am not sure). 

The question here is whether we want to allow people to use MEI without FRBR, i.e. without making the distinction between manifestations (the print run) and items (the individual copies). Although I think that FRBR is a very clever model, I wonder if we can or should enforce it. I could imagine putting it in a separate module, which would add (besides the elements) a couple of Schematron rules (see their importance above) that keep the usage of source in line with the FRBR model. If this module is turned off, people would be free to use MEI like before. But this would create inconsistencies, so we clearly have to make a decision here about balancing between flexibility and standardization. 

> 
> 3)	Variations FRBR (http://www.dlib.indiana.edu/projects/vfrbr/). Would it be desirable to aim at offering fully VFRBR compliant encoding? It seems we are pretty close already, though not all VFRBR attributes are matched precisely by MEI elements. I must admit I have no opinion on whether it is important.
> Perhaps the only element really missing for VFRBR compliance is <extent> in <expression>. It should be no problem introducing it, I guess. 
> And while we're at it, certain projects have requested to be able to specify the duration (<extent>) of a work. Usually, this could and should be placed in <expression> rather than <work>, but what about the situation where the composer actually prescribes a specific duration for her work - an instruction that the actual expressions may or may not follow?

Like you, I have no particular opinion on this. I would just argue that if Variations uses a subset of FRBR (and I don't know if they do), we should still keep the full, non-flavored FRBR. If all they did was add things, I'm fine with including them. We should avoid enforcing models which are useful for a certain project, but might not be for others. 

> 
> 4)	There is a problem possibly emerging from the notation-centric nature of MEI, or perhaps it is really a FRBR problem; namely the handling of performances and recordings. FRBR treats them both as expressions, i.e. as "siblings" to what I (and MerMEId) would regard as different versions of the work. We encode performances using <eventList> elements within expression/history, i.e. as (grand-)children of <expression>, which really makes sense to me. A performance must be of a certain version (form, instrumentation) of the work, so I strongly believe we should keep it this way. It's just not how FRBR sees it. On the other hand, as far as I can see there is nothing (except the practical and conceptual difficulties) that prevents users from encoding a performance or a recording as an expression, so FRBR compliance is probably possible also in this respect. I just wouldn't recommend it, and I actually suspect FRBR having a problem there rather than MEI. 

I haven't looked this up, but are you sure that performances and recordings are on the same level? I would see performances as expressions, while recordings are manifestations. Of course a performance follows a certain version of a work, like the piano version (=expression). But the musician moves that to a different domain (graphical to audio), and he may or may not play the repeats, and he may or may not follow the dynamic indications of the score. There certainly is a strong relationship between both expressions, but they are distinct to me. I see your reasons for putting everything into an eventList, and thus subsuming it under one expression, but that might not always be the most appropriate model. Sometimes, it might be better to use separate expressions for the piano version and its performances and connect them with one or more relations. 
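
As a rough sketch of that alternative, the following Python snippet builds a <work> with two sibling <expression> elements and links the performance to the piano version through a <relation>. Everything beyond the element names mentioned in this thread is an assumption on my side: the @rel and @target attributes, the value "isPerformanceOf", the xml:id values, and the use of <titleStmt>/<title> inside <expression> are placeholders. The MEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Clark-notation name for xml:id; ElementTree serializes it as 'xml:id'.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def add_expression(expression_list, xml_id, title):
    """Append an <expression> with an id and a title (illustrative content)."""
    expr = ET.SubElement(expression_list, "expression", {XML_ID: xml_id})
    title_stmt = ET.SubElement(expr, "titleStmt")
    ET.SubElement(title_stmt, "title").text = title
    return expr


def relate(expr, rel, target_id):
    """Add a <relation> to expr's <relationList>, creating the list if needed.
    The attribute names 'rel' and 'target' are assumptions."""
    rel_list = expr.find("relationList")
    if rel_list is None:
        rel_list = ET.SubElement(expr, "relationList")
    return ET.SubElement(
        rel_list, "relation", {"rel": rel, "target": "#" + target_id}
    )


work = ET.Element("work")
expressions = ET.SubElement(work, "expressionList")

piano = add_expression(expressions, "expr_piano", "Piano version")
perf = add_expression(expressions, "expr_perf1", "Performance of the piano version")

# Hypothetical relation type; an agreed vocabulary would be needed here.
relate(perf, "isPerformanceOf", "expr_piano")

print(ET.tostring(work, encoding="unicode"))
```

The performance stays a sibling of the piano version, as FRBR would have it, while the relation keeps the information that it is a performance of that particular version.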

> I haven't looked into the details of recordings metadata yet, but I guess we'll have to address that too at some point. Without having given it much thought, I see two options here and now (apart from <expression>): <bibl> and <source>, depending on the recording's relation to the encoding. We may want to add a number of elements to <source> to accommodate recording information.
> 
> 5)	Finally, an issue related to the FRBR discussion, though not directly a consequence of it: MEI 2012 allows multiple <work> elements within <workDesc>. I can't think of any situation, however, in which it may be desirable to describe more than one work in a single file. On the contrary, it could easily cause a lot of confusion, so I would actually suggest allowing only one <work> element; in other words: either skip <workDesc> and have 1 optional <work> in <meiHead>, or keep <workDesc>, and change its content model to be the one used by <work> now.

Again, I think that this perspective is biased by your application, where it makes perfect sense. Suppose you're working on Wagner's Ring. You might want to say something about all these works in just one file. All I want to say is that this is a modeling question, which is clearly project-specific. It seems perfectly reasonable to restrict MerMEId to MEI instances with only one work, but I wouldn't restrict MEI to one work per file. This may result in preprocessing files before operating on them with MerMEId, but we have similar situations for many other aspects of MEI, so this isn't bad per se. 
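
Such a preprocessing step could look roughly like the Python sketch below, which turns one MEI instance with several <work> elements into several copies with one <work> each. The function name `split_by_work` and the sample document are invented for illustration (the sample is not valid MEI), and the sketch deliberately ignores relations or sources that point across works, which a real tool would have to handle.

```python
import copy
import xml.etree.ElementTree as ET

MEI_NS = "http://www.music-encoding.org/ns/mei"
NS = {"mei": MEI_NS}
# Serialize MEI elements without a prefix when writing the results out.
ET.register_namespace("", MEI_NS)


def split_by_work(mei_root):
    """Return one deep copy of the document per <work> in <workDesc>,
    each copy keeping only that single <work>."""
    works = mei_root.findall(".//mei:workDesc/mei:work", NS)
    documents = []
    for keep_index in range(len(works)):
        doc = copy.deepcopy(mei_root)
        work_desc = doc.find(".//mei:workDesc", NS)
        for index, work in enumerate(work_desc.findall("mei:work", NS)):
            if index != keep_index:
                work_desc.remove(work)
        documents.append(doc)
    return documents


# Hypothetical, heavily shortened instance describing the four Ring operas.
SAMPLE = """
<mei xmlns="http://www.music-encoding.org/ns/mei">
  <meiHead>
    <fileDesc/>
    <workDesc>
      <work n="1"/>
      <work n="2"/>
      <work n="3"/>
      <work n="4"/>
    </workDesc>
  </meiHead>
</mei>
"""

parts = split_by_work(ET.fromstring(SAMPLE))
print(len(parts))
```

Each resulting document could then be handed to a one-work-per-file application, while the combined file remains the place where statements about the whole cycle live.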

> 
> Any comments greatly appreciated :-)
> 
> All the best,
> Axel
> 

Thanks again for putting that together!!!
Best,
Johannes
