[MEI-L] syllable connectors

Fri Jul 11 11:53:49 CEST 2014

I also like Zoltan's solution because it disentangles the problem, although
personally I would try to stay away from it as much as possible and rely on
@wordpos, which is more descriptive. I would only use it, for example, in a
diplomatic transcription -- an exercise that is akin to what Mr. OMR does.

So I also like Perry's idea of introducing an attribute that models the
function of the connector, so that I can encode what @wordpos doesn't cover
(such as elision). I would make @con (or a better name) an open list,
providing at least the "elided" value.
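
A minimal sketch of that workflow in Python, under stated assumptions: the
<con> child element is only the proposal from Zoltan's message (it is not an
existing MEI element), the @wordpos values 'i', 'm' and 't' are the ones Perry
lists, and the sample verse and function names are made up for illustration.
It strips the connectors and rebuilds the words from @wordpos alone:

```python
import xml.etree.ElementTree as ET

MEI_NS = "http://www.music-encoding.org/ns/mei"
NS = {"mei": MEI_NS}

# Hypothetical sample; <con> is the element proposed in this thread.
SAMPLE = """
<lyrics xmlns="http://www.music-encoding.org/ns/mei">
  <verse>
    <syl wordpos="i">wan<con>-</con></syl>
    <syl wordpos="t">ton</syl>
    <syl wordpos="i">a<con>-</con></syl>
    <syl wordpos="m">ban<con>-</con></syl>
    <syl wordpos="t">don</syl>
  </verse>
</lyrics>
"""


def syllable_text(syl):
    """Return the character data of a <syl>, skipping any <con> children."""
    parts = [syl.text or ""]
    for child in syl:
        if child.tag != f"{{{MEI_NS}}}con":
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts).strip()


def verse_text(verse):
    """Join syllables into words, using only @wordpos to find word ends."""
    words = []
    current = ""
    for syl in verse.findall("mei:syl", NS):
        current += syllable_text(syl)
        # 'i' and 'm' mean the word continues; anything else closes it.
        if syl.get("wordpos") not in ("i", "m"):
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return " ".join(words)


root = ET.fromstring(SAMPLE)
print(verse_text(root.find("mei:verse", NS)))
```

A syllable without @wordpos is treated here as a complete word, which is a
simplification; elision and extenders would need their own handling.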

Raff

On Fri, Jul 11, 2014 at 11:40 AM, Johannes Kepper <kepper at edirom.de> wrote:

> I would second that. If we're about to go down that road and change the
> model of coding syllables as suggested, I'd like to have the option of
> marking up the connectors as such. This allows Mr. MakeUseOfTheEncoding to
> just strip these elements, rely on the values of @wordpos, and connect
> things as necessary. Of course this scenario depends on a whole lot of
> things, but still I see great benefit in the possibility. It will be
> optional, but at least I would like to use it that way.
>
> TLDR:
> +1 for <con> (without being married to that tag name)
>
> jo
>
>
>
>
> On 11.07.2014 at 11:34, Kőmíves Zoltán <zolaemil at gmail.com> wrote:
>
> > Hi Perry,
> >
> > Maybe we could have our cake and eat it too, if we kept the connector
> data structured under an element within <syl>:
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl>Dios</syl>
> >     <syl con="elided" >que<con>˘</con></syl>
> >     <syl>al</syl>
> >     <syl>mun<con>-</con></syl>
> >     <syl>do</syl>
> >   </verse>
> > </lyrics>
> >
> > Maybe this would make the situations mentioned above easier to deal with
> in processors, i.e.
> >
> > <syl wordpos="i">long-<con>-</con></syl>
> > <syl wordpos="t">term</syl>
> >
> > <syl>wan<con>- - - - - - - - - - -</con></syl>
> >
> > while still allowing more flexibility as to the connector's data type.
> >
> > I'm not closely acquainted with Mr. OMR, so I do not know how important
> it is, to accommodate his inaccuracy in the schema (by decreasing the level
> of structuredness of the encoding), but in fact, even <syl>wan-</syl> could
> be allowed, stating _nothing_ about the separation of the connector from
> the syllable.
> >
> > Best,
> > Zoltan
> >
> >
> >
> > 2014-07-10 22:38 GMT+01:00 Roland, Perry D. (pdr4h) <
> pdr4h at eservices.virginia.edu>:
> >
> >
> > Hi Craig,
> >
> >
> >
> > Let’s get rid of the (relatively) easy stuff first --
> >
> >
> >
> > 1. The position of a syllable within a word is handled by syl/@wordpos,
> where the value can be ‘i’, ‘m’, or ‘t’.  There probably should also be a
> value for ‘complete word’ as well, ‘w’ perhaps?
> >
> >
> >
> > 2. There was a typo in the elision example.  It should’ve been --
> >
> >
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >
> >   <verse>
> >
> >     <syl>Dios</syl>
> >
> >     <syl con="elided" >que˘</syl>
> >
> >     <syl>al</syl>
> >
> >     <syl>mun-</syl>
> >
> >     <syl>do</syl>
> >
> >   </verse>
> >
> > </lyrics>
> >
> >
> >
> > 3. Since @n is restricted to a single NMTOKEN, use verse/@label to
> capture multiple verse numbers:
> >
> >
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >
> >   <verse label="1.,2.,6">
> >
> >     <syl>Dios</syl>
> >
> >     <syl con="elided" >que˘</syl>
> >
> >     <syl>al</syl>
> >
> >     <syl>mun-</syl>
> >
> >     <syl>do</syl>
> >
> >   </verse>
> >
> > </lyrics>
> >
> >
> >
> > Now, for the real “meat”.  We can’t forget that one of the functions of
> MEI, if not the primary one, is to represent existing documents.  The
> reality is that real-world documents are going to contain errors and
> inconsistencies that have to be dealt with.  If MEI could mandate the
> “correct” way of doing things, as SCORE can, for example that divisions
> between syllables are always indicated by either a dash or an underscore,
> then life would be much easier.  What we need, however, is to accommodate
> multiple purposes -- recording what the document says, what it *ought to
> say*, and what it means (when we’re interested in that sort of thing).
> >
> >
> >
> > What other XML representations typically do (and what MEI does in some
> other places) is to record what the document says as the content of
> elements and any necessary annotations of that content as embedded elements
> and/or attributes.  My proposed changes are an attempt to do that with
> <syl>.  In the case of the syllable “que˘” above, the elision marker is “in
> the text” of the document.  I would prefer to leave it there and record the
> fact that this syllable is elided with the next in the @con attribute.
>  Sure, the connector can be moved to an attribute, but there are several
> problems with that approach:
> >
> >
> >
> > a. the ability to say anything about the visual aspect of the connector,
> for example that it’s bold, is lost;
> >
> > b. omissions or errors can’t be explicitly indicated, for example that a
> connector isn’t present but ought to be or is there when it shouldn’t be;
> >
> > c. it’s difficult (but not impossible) to handle multiple values when,
> for example multiple connectors are present (erroneously or not), as in
> ‘que-˘’.  The rules of SCORE and modern notation aren’t universally
> followed.  :-)
> >
> >
> >
> > In other words, the markup <syl con="separated">wan-</syl> contains the
> same data as <syl con="d">wan</syl>.  The difference is simply where the
> data is.  BUT, as I indicated in the list above, the latter is much more
> restrictive regarding what data can appear.  Unless @con is also allowed to
> contain unrestricted CDATA, it will never be able to accommodate all the
> symbols that have been (or even could be) used to connect syllables.
> >
> >
> >
> > I recognize that you’d like the data to be as regular as possible and so
> want to place the connector(s) in an attribute.  But what would you do with
> (c)?  Splitting the data by putting ‘que’ in <syl> and ‘-˘’ in @con (if
> such a thing were allowed) could still result in “something [going] wrong
> in some obscure case”, whether the splitting is done by a human or by a
> software agent (Mr. OMR).  In my opinion, it’s less dangerous to leave the
> data together in just one place.
> >
> >
> >
> > I agree with you that putting "wan- - - - - - - - - - -" inside <syl> is
> not likely to be useful for (re-)rendering notation from the markup (and
> should probably be processed to be simply “wan-”), but it does preserve
> information about the original document.  Again, it’s less dangerous (and
> less work) to just leave it in place than to create the markup <syl
> con="d">wan</syl>, especially if you’re not interested in (re-)rendering
> or reconstructing the text (i.e., prose), which might be the case if the
> creation of the markup is done in stages.
> >
> >
> >
> > Under no circumstances is this change (putting the connector in the
> character data of <syl>) meant to take the place of intelligent, dynamic
> rendering using SCORE or any other processor.  For example, even when a
> processor encounters “wan- - - - - - - - - - -” it shouldn’t render this
> literally.  The size and number of dashes, underscores, etc. should be
> automagically calculated.  But this data could be useful to those dealing
> strictly with an image of the notation, and so shouldn’t be discarded.
> >
> >
> >
> > --
> >
> > p.
> >
> >
> >
> >
> >
> >
> >
> > From: mei-l [mailto:mei-l-bounces+pdr4h=
> virginia.edu at lists.uni-paderborn.de] On Behalf Of Craig Sapp
> > Sent: Wednesday, July 09, 2014 5:39 PM
> >
> >
> > To: Music Encoding Initiative
> > Subject: Re: [MEI-L] syllable connectors
> >
> >
> >
> > Hi Perry,
> >
> >
> >
> > <<But now I believe it would be better to use @con to record the
> *function* of the connector and put the lyric transcription/visual
> rendition *inside* the syllable element itself as is done in many other
> places in MEI. >>
> >
> > << Repetitions of a connector, "wan - - - - - - - - - - -" for example,
> would be allowed inside <syl> so that no data is lost (well, except for the
> location of each dash), but could be compressed to a single dash for
> presentation purposes.>>
> >
> >
> >
> > I am not too keen on placing the visual aspect of the lyrics text inside
> of <syl> CDATA since it is mixing the underlying prose content with its
> graphical presentation in the music.  The <syl> character data should only
> contain the prose of the text.  If the text extracted from the music should
> include "-  -  -  -" after the word "wan", then it should be in the <syl>
> character data; otherwise, it should not.
> >
> >
> >
> > For text extraction from lyrics, I would want to know if the <syl> data
> is at the start, middle or end of word, so that I can extract the data
> segmented by words instead of syllables by adding spaces or not between the
> <syl> character data (primarily for searching purposes, but also for
> displaying as regular prose/verse).  It would be preferable if I do not
> have to delete any characters from the <syl> data when reconstructing the
> prose, since something will go wrong in some obscure case.
> >
> >
> >
> > When two syllables of a word are separated by a long distance between
> two notes in a graphical score, multiple dashes are used.  If the layout
> of the music changes, then the single/multiple dash display should change
> (automatically).  Hard-encoding of single/double dashes distinction is not
> very useful for manipulation of the layout unless you are intent on
> encoding the static layout of a specific edition.
> >
> >
> >
> > As an aside, I often come across the reverse case: when two syllables are
> too close to comfortably be separated by a hyphen, the hyphen should be
> dropped and the two syllables should be displayed as a single word.  I do
> not know any notation editor which handles this case, and I have to do it
> manually when necessary (attaching the word to a single note, and leaving
> the next note without a syllable).
> >
> >
> >
> > SCORE always uses a dashed line between two syllables, and multiple
> dashes appear automatically as the line is extended.
> >
> > The number, size and distance between the dashes is controllable on this
> line.  In other words SCORE does not use a character-encoding of a hyphen
> to display the word separators.  The same goes for word extenders which are
> not literally a sequence of underscores.
> >
> >
> >
> >
> >
> > <<The text inside <syl> can be processed (using regular expression
> matching) to create any output needed, for example, the text "as-is" with
> hyphenated words (e.g., "wan-ton a-ban-don") or "joined-up" in a more
> poetic style (e.g., "wanton abandon"). >>
> >
> >
> >
> > This is how the Humdrum representation for lyrics works.  In general it
> works well, but there are complications.  In particular when there is a
> hyphen between two syllables in prose, you need a way of indicating that it
> should remain.  I don't come across that much in lyrics, but I would encode
> the word "long-term" as two syllables:
> >
> > "long--" and "-term", with the double hyphen indicating that when the
> lyrics are extracted from the music, the final prose should include a
> hyphen between those two syllables.  Such a system should be spelled out.
>  This system works well in 7-bit ASCII data, but I wonder what will happen
> if someone uses strange or inconsistent Unicode hyphen characters.  Also,
> this would not be great if graphic-like display is used, for example "wan -
> - - - - - - - - - " could be compensated for in a regular expression, but
> only after discovering that someone was doing such a thing in the data, and
> would make the regular expression quite complicated for removing the
> extended hyphen.
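
A small Python sketch of the convention described above, offered as one
reading of it rather than as Humdrum's actual rules: a trailing "--" is taken
to mean "keep one literal hyphen in the prose", a single trailing "-" is a
syllable divider that gets dropped, and a leading "-" marks a non-initial
syllable. The function name and the two Unicode hyphens that are normalised
first (U+2010 and U+2011) are assumptions made for this example.

```python
# Map two Unicode hyphen variants onto ASCII hyphen-minus before parsing.
# Any other look-alike characters would have to be added by hand.
HYPHENS = str.maketrans({"\u2010": "-", "\u2011": "-"})


def join_lyrics(syllables):
    """Rebuild prose from syllables that carry hyphen markers."""
    text = ""
    for syl in syllables:
        syl = syl.translate(HYPHENS)
        if syl.startswith("-"):
            syl = syl[1:]        # continuation marker on a non-initial syllable
        if syl.endswith("--"):
            text += syl[:-1]     # keep one literal hyphen, no space
        elif syl.endswith("-"):
            text += syl[:-1]     # drop the divider, no space
        else:
            text += syl + " "    # end of word
    return text.rstrip()


print(join_lyrics(["long--", "-term", "wan-", "-ton"]))
```

The sketch does nothing about graphic runs such as "wan - - - -"; those would
need to be collapsed before this step.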
> >
> >
> >
> > Another complication when extracting text prose is how I am to detect
> an elision character in the CDATA since, as you have pointed out, so many of
> them occur in Unicode. :-)  This seems to make a case for a functional elision
> tag which contains an optional attribute for how it should be rendered as
> character(s) for separating two syllables.
> >
> >
> >
> >
> >
> > I don't understand this encoding; could you explain it further?
> >
> >
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl>Dios</syl>
> >     <syl con="elided">que˘al</syl>
> >     <syl>mun-</syl>
> >     <syl>do</syl>
> >   </verse>
> > </lyrics>
> >
> >
> >
> > I would expect that the syl@con attribute describes how the current
> syllable connects to the following syllable, not an internal connector:
> >
> >
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl>Dios</syl>
> >
> >     <syl con="elided">que</syl>
> >     <syl>al</syl>
> >     <syl>mun-</syl>
> >     <syl>do</syl>
> >   </verse>
> > </lyrics>
> >
> >
> >
> > Also remember that a few months ago we were having problems representing
> verse numbers (in rondeaux), such as "1.,2.,6" for indicating that the
> line of music is for the 1st, 2nd and 6th verses.  How should this be
> encoded?  In most musical editors, this has to be treated as regular text
> with a space elision before the first syllable in the lyrics.
> >
> >
> >
> >
> >
> > -=+Craig
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On 9 July 2014 11:50, Roland, Perry D. (pdr4h) <
> pdr4h at eservices.virginia.edu> wrote:
> >
> >
> > Hi everybody,
> >
> > I knew that eventually someone would trip over this.  :-)  And that we'd
> need to fix it.
> >
> > This is another one of those places where the original purpose/form of
> MEI conflicts with later developments.  One of the original purposes of
> syl/@con was to allow a hand-encoder to mark the *function* of a syllable
> connector just by indicating what they saw -- if the score contained a
> dash, the encoder would write <syl con="d"> and so on.  Another was to make
> it easier to convert existing representations into MEI.  For me, however,
> the main point was to get at the function of the individual syllable.
> >
> > But now I believe it would be better to use @con to record the
> *function* of the connector and put the lyric transcription/visual
> rendition *inside* the syllable element itself as is done in many other
> places in MEI.  Consider for a moment --
> >
> > "wan", "ton", and "wanton" are all English words.  The difference
> between the word "wan" followed by the word "ton" and the single word
> "wanton" divided syllabically is all in the connectors between the
> syllables.  For example:
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl>wan-</syl>
> >     <syl>ton</syl>
> >   </verse>
> >   <verse>
> >     <syl>wan</syl>
> >     <syl>ton</syl>
> >   </verse>
> > </lyrics>
> >
> > Of course, in the following markup, because a connector is absent the
> difference is not discernible:
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl>wan</syl>
> >     <syl>ton</syl>
> >   </verse>
> >   <verse>
> >     <syl>wan</syl>
> >     <syl>ton</syl>
> >   </verse>
> > </lyrics>
> >
> > But, if we allow @con to have a value of "none", we're really no better
> off because we still don't know which visual connector *ought* to be
> present or what its (supposed) purpose is.  The following is still
> semantically indistinguishable from the preceding example because the
> orthography of the word "wan" and that of the first syllable of "wanton"
> (without its hyphen) are the same thing:
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl con="none">wan</syl>
> >     <syl>ton</syl>
> >   </verse>
> >   <verse>
> >     <syl>wan</syl>
> >     <syl>ton</syl>
> >   </verse>
> > </lyrics>
> >
> > But, to record which connector *should* be present we can use <supplied>:
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl>wan<supplied>-</supplied></syl>
> >     <syl>ton</syl>
> >   </verse>
> >   <verse>
> >     <syl>wan</syl>
> >     <syl>ton</syl>
> >   </verse>
> > </lyrics>
> >
> > Or use <gap> to record a missing connector without supplying one:
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl>wan<gap reason="missing hyphen"/></syl>
> >     <syl>ton</syl>
> >   </verse>
> >   <verse>
> >     <syl>wan</syl>
> >     <syl>ton</syl>
> >   </verse>
> > </lyrics>
> >
> > Having put the connector *inside* <syl>, @con can be used to record the
> function of the connector:
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl con="separated">wan<supplied>-</supplied></syl>
> >     <syl>ton</syl>
> >   </verse>
> >   <verse>
> >     <syl>wan</syl>
> >     <syl>ton</syl>
> >   </verse>
> > </lyrics>
> >
> > Actually, I think I prefer @con to record info *about the syllable*
> since it's an attribute *of* the syllable.  The new values for @con (or for
> a new attribute if we want to keep @con around but deprecate it) could be
> "separated", "extended", "elided", and "unknown".  But I could be persuaded
> otherwise.
> >
> > This also works in the (hopefully) more usual case when the connector is
> present but our favorite naïve encoder (Mr. OMR) can't (or doesn't want to)
> determine the function of the connector:
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl>wan-</syl>
> >     <syl>ton</syl>
> >   </verse>
> >   <verse>
> >     <syl>wan</syl>
> >     <syl>ton</syl>
> >   </verse>
> > </lyrics>
> >
> > It would be better to have this info, of course, because depending on
> the rhythm of the vocal line and the prevailing notational style, a dash
> can be used for both separation and extension.  For example, when the first
> syllable is to be sung on multiple notes the markup could be:
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl con="extended">wan-</syl>
> >     <syl>ton</syl>
> >   </verse>
> >   <verse>
> >     <syl>wan</syl>
> >     <syl>ton</syl>
> >   </verse>
> > </lyrics>
> >
> > In fact, there could be (and often are) multiple dashes filling the
> space between the first and last notes of the melisma or just one depending
> on the source document or on the rendering processor (when the MEI is to be
> rendered).  The same thing occurs with the underscore separator.
> >
> > But this kind of many-visual-representations-to-one-function situation
> is particularly acute when it comes to elision.  Various symbols have been
> used to indicate syllable elision -- breve, inverted breve, caron,
> circumflex, and tilde just to name a few.  The following example indicates
> an elision of "que" and "al":
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl>Dios</syl>
> >     <syl con="elided">que˘al</syl>
> >     <syl>mun-</syl>
> >     <syl>do</syl>
> >   </verse>
> > </lyrics>
> >
> > But so does this:
> >
> > <lyrics xmlns="http://www.music-encoding.org/ns/mei">
> >   <verse>
> >     <syl>Dios</syl>
> >     <syl con="elided">que^al</syl>
> >     <syl>mun-</syl>
> >     <syl>do</syl>
> >   </verse>
> > </lyrics>
> >
> > One could encounter any number of visual renditions indicating elision.
> And one should be able to use any appropriate Unicode or SMuFL code point
> for the connector.  SMuFL provides
> >
> >       U+E550
> >       lyricsElisionNarrow
> >       Narrow elision
> >
> >       U+E551
> >       lyricsElision
> >       Elision
> >
> >       U+E552
> >       lyricsElisionWide
> >       Wide elision
> >
> >       U+E553
> >       lyricsHyphenBaseline
> >       Baseline hyphen
> >
> >       U+E554
> >       lyricsHyphenBaselineNonBreaking
> >       Non-breaking baseline hyphen
> >
> >      (See the attached image or
> http://www.smufl.org/version/latest/range/lyrics/ for visual examples)
> >
> > The text inside <syl> can be processed (using regular expression
> matching) to create any output needed, for example, the text "as-is" with
> hyphenated words (e.g., "wan-ton a-ban-don") or "joined-up" in a more
> poetic style (e.g., "wanton abandon").  Repetitions of a connector, "wan -
> - - - - - - - - - -" for example, would be allowed inside <syl> so that no
> data is lost (well, except for the location of each dash), but could be
> compressed to a single dash for presentation purposes.
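
The regular-expression processing described above might be sketched in Python
as follows. Everything named here is an assumption for illustration: only
hyphen-like characters (hyphen-minus, U+2010, U+2011) are treated as
connectors, underscores and elision marks are ignored, and the function names
are invented.

```python
import re

# Assumed connector set: hyphen-minus, Unicode hyphen, non-breaking hyphen.
CONNECTOR = r"[\-\u2010\u2011]"
# A trailing run of connectors, possibly separated by spaces.
TRAILING_RUN = re.compile(rf"\s*({CONNECTOR})(?:\s*{CONNECTOR})*\s*$")


def compress(syl_text):
    """Collapse a trailing run such as 'wan - - - -' to 'wan-'."""
    return TRAILING_RUN.sub(r"\1", syl_text)


def join_syllables(syllables, keep_connectors=True):
    """Render syllable strings 'as-is' or 'joined-up'."""
    pieces = []
    for raw in syllables:
        text = compress(raw.strip())
        continues = bool(re.search(rf"{CONNECTOR}$", text))
        if continues and not keep_connectors:
            text = text[:-1]
        # No space after a syllable that ends in a connector.
        pieces.append(text if continues else text + " ")
    return "".join(pieces).strip()


lyric = ["wan - - - - - - - - - - -", "ton", "a-", "ban-", "don"]
print(join_syllables(lyric))                         # "as-is"
print(join_syllables(lyric, keep_connectors=False))  # "joined-up"
```

The original run of dashes is still available in the source string, so nothing
is lost by compressing only at presentation time.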
> >
> > I believe this will work better than the old system (it's clearer, no
> info is lost), but I'd like to hear other viewpoints.
> >
> > --
> > p.
> >
> >
> >
> >
> > > -----Original Message-----
> > > From: mei-l [mailto:mei-l-bounces at lists.uni-paderborn.de] On Behalf Of
> >
> > > Christine Siegert
> > > Sent: Friday, July 04, 2014 9:58 AM
> > > To: Music Encoding Initiative
> > > Subject: Re: [MEI-L] syllable connectors
> > >
> > > Dear Johannes, dear list,
> > > The Sarti project agrees, too.
> > > All the best,
> > > Christine
> > >
> > >
> > > Prof. Dr. Christine Siegert
> > > Universität der Künste Berlin
> > > Fakultät Musik, Musikwissenschaft
> > > Fasanenstraße 1B
> > > D-10623 Berlin
> > >
> > > Tel.: +49 (0)30 3185 2318
> > > siegert at udk-berlin.de
> > > -----Original Message-----
> > > From: Karen McAulay
> > > Sent: Friday, July 04, 2014 12:16 PM
> > > To: Music Encoding Initiative
> > > Subject: Re: [MEI-L] syllable connectors
> > >
> > > Yes!
> > >
> > > Best wishes
> > > Karen
> > >
> > > Dr. Karen McAulay
> > > Music and Academic Services Librarian
> > > +44 (0)141 270 8267 (direct)
> > > K.McAulay at rcs.ac.uk
> > > -----Original Message-----
> > > From: mei-l [mailto:mei-l-bounces at lists.uni-paderborn.de] On Behalf Of
> > > Johannes Kepper
> > > Sent: 04 July 2014 10:56
> > > To: Music Encoding Initiative
> > > Subject: [MEI-L] syllable connectors
> > >
> > > Dear MEI-Listeners,
> > >
> > > doing some manual coding of vocal music, we ran across a situation where
> > > the layout of the printed score did not allow us to put in any separator
> > > (well, better: connector) between two syllables of a word. The current list
> > > of allowed connectors does not have an explicit option of "no connector at
> > > all". Do we all agree that there should be one?
> > >
> > > Best,
> > > Johannes
> > >
> > > _______________________________________________
> > > mei-l mailing list
> > > mei-l at lists.uni-paderborn.de
> > > https://lists.uni-paderborn.de/mailman/listinfo/mei-l