[MEI-L] syllable connectors

Kőmíves Zoltán zolaemil at gmail.com
Fri Jul 11 11:34:10 CEST 2014


Hi Perry,

Maybe we could have our cake and eat it too, if we kept the connector data
structured under an element within <syl>:

<lyrics xmlns="http://www.music-encoding.org/ns/mei">
  <verse>
    <syl>Dios</syl>
    <syl con="elided" >que<con>˘</con></syl>
    <syl>al</syl>
    <syl>mun<con>-</con></syl>
    <syl>do</syl>
  </verse>
</lyrics>

Maybe this would make the situations mentioned above easier to deal with
in processors, e.g.

<syl wordpos="i">long-<con>-</con></syl>
<syl wordpos="t">term</syl>

<syl>wan<con>- - - - - - - - - - -</con></syl>

while still allowing more flexibility as to the connector's data type.

I'm not closely acquainted with Mr. OMR, so I do not know how important it
is to accommodate his inaccuracy in the schema (by decreasing the level of
structure in the encoding), but in fact, even <syl>wan-</syl> could be
allowed, stating _nothing_ about the separation of the connector from the
syllable.
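
For what it's worth, here is a minimal Python sketch of how a processor
might read that structure. It assumes the <con> child proposed above (which
is not part of the current schema), and the names split_syl and read_verse
are made up for illustration:

```python
import xml.etree.ElementTree as ET

MEI_NS = "http://www.music-encoding.org/ns/mei"

# Sample data; the <con> child element is the hypothetical one proposed here.
SAMPLE = """
<lyrics xmlns="http://www.music-encoding.org/ns/mei">
  <verse>
    <syl>Dios</syl>
    <syl con="elided">que<con>\u02d8</con></syl>
    <syl>al</syl>
    <syl>mun<con>-</con></syl>
    <syl>do</syl>
  </verse>
</lyrics>
"""


def split_syl(syl):
    """Return (text, connector); connector is None when there is no <con>."""
    text = (syl.text or "").strip()
    con = syl.find(f"{{{MEI_NS}}}con")
    connector = con.text if con is not None else None
    return text, connector


def read_verse(xml_text):
    """Collect (text, connector) pairs for every <syl> in document order."""
    root = ET.fromstring(xml_text.strip())
    return [split_syl(s) for s in root.iter(f"{{{MEI_NS}}}syl")]


pairs = read_verse(SAMPLE)
```

The point is only that the syllable text and the connector come out as two
separate values, with no string surgery on the <syl> content.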

Best,
Zoltan



2014-07-10 22:38 GMT+01:00 Roland, Perry D. (pdr4h) <
pdr4h at eservices.virginia.edu>:

>
>
> Hi Craig,
>
>
>
> Let’s get rid of the (relatively) easy stuff first --
>
>
>
> 1. The position of a syllable within a word is handled by syl/@wordpos,
> where the value can be ‘i’, ‘m’, or ‘t’.  There should probably also be a
> value for ‘complete word’, ‘w’ perhaps?
>
>
>
> 2. There was a typo in the elision example.  It should’ve been --
>
>
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>Dios</syl>
>     <syl con="elided">que˘</syl>
>     <syl>al</syl>
>     <syl>mun-</syl>
>     <syl>do</syl>
>   </verse>
> </lyrics>
>
>
>
> 3. Since @n is restricted to a single NMTOKEN, use verse/@label to capture
> multiple verse numbers:
>
>
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse label="1.,2.,6">
>     <syl>Dios</syl>
>     <syl con="elided">que˘</syl>
>     <syl>al</syl>
>     <syl>mun-</syl>
>     <syl>do</syl>
>   </verse>
> </lyrics>
>
>
>
> Now, for the real “meat”.  We can’t forget that one of the functions of
> MEI, if not the primary one, is to represent existing documents.  The
> reality is that real-world documents are going to contain errors and
> inconsistencies that have to be dealt with.  If MEI could mandate the
> “correct” way of doing things, as SCORE can, for example that divisions
> between syllables are always indicated by either a dash or an underscore,
> then life would be much easier.  What we need, however, is to accommodate
> multiple purposes -- recording what the document says, what it *ought to
> say*, and what it means (when we’re interested in that sort of thing).
>
>
>
> What other XML representations typically do (and what MEI does in some
> other places) is to record what the document says as the content of
> elements and any necessary annotations of that content as embedded elements
> and/or attributes.  My proposed changes are an attempt to do that with
> <syl>.  In the case of the syllable “que˘” above, the elision marker is “in
> the text” of the document.  I would prefer to leave it there and record the
> fact that this syllable is elided with the next in the @con attribute.
> Sure, the connector can be moved to an attribute, but there are several
> problems with that approach:
>
>
>
> a. the ability to say anything about the visual aspect of the connector,
> for example that it’s bold, is lost;
>
> b. omissions or errors can’t be explicitly indicated, for example that a
> connector isn’t present but ought to be or is there when it shouldn’t be;
>
> c. it’s difficult (but not impossible) to handle multiple values when, for
> example, multiple connectors are present (erroneously or not), as in
> ‘que-˘’.  The rules of SCORE and modern notation aren’t universally
> followed.  :-)
>
>
>
> In other words, the markup <syl con="separated">wan-</syl> contains the
> same data as <syl con="d">wan</syl>.  The difference is simply where the
> data is.  BUT, as I indicated in the list above, the latter is much more
> restrictive regarding what data can appear.  Unless @con is also allowed to
> contain unrestricted CDATA, it will never be able to accommodate all the
> symbols that have been (or even could be) used to connect syllables.
>
>
>
> I recognize that you’d like the data to be as regular as possible and so
> want to place the connector(s) in an attribute.  But what would you do with
> (c)?  Splitting the data by putting ‘que’ in <syl> and ‘-˘’ in @con (if
> such a thing were allowed) could still result in “something [going] wrong
> in some obscure case”, whether the splitting is done by a human or by a
> software agent (Mr. OMR).  In my opinion, it’s less dangerous to leave the
> data together in just one place.
>
>
>
> I agree with you that putting "wan- - - - - - - - - - -" inside <syl> is
> not likely to be useful for (re-)rendering notation from the markup (and
> should probably be processed to be simply "wan-"), but it does preserve
> information about the original document.  Again, it’s less dangerous (and
> less work) to just leave it in place than to create the markup <syl
> con="d">wan</syl>, especially if you’re not interested in (re-)rendering
> or reconstructing the text (i.e., prose), which might be the case if the
> creation of the markup is done in stages.
>
>
>
> Under no circumstances is this change (putting the connector in the
> character data of <syl>) meant to take the place of intelligent, dynamic
> rendering using SCORE or any other processor.  For example, even when a
> processor encounters “wan- - - - - - - - - - -" it shouldn’t render this
> literally.  The size and number of dashes, underscores, etc. should be
> automagically calculated.  But this data could be useful to those dealing
> strictly with an image of the notation, and so shouldn’t be discarded.
>
>
>
> --
>
> p.
>
>
>
>
>
>
>
> *From:* mei-l [mailto:mei-l-bounces+pdr4h=
> virginia.edu at lists.uni-paderborn.de] *On Behalf Of *Craig Sapp
> *Sent:* Wednesday, July 09, 2014 5:39 PM
>
> *To:* Music Encoding Initiative
> *Subject:* Re: [MEI-L] syllable connectors
>
>
>
> Hi Perry,
>
>
>
> <<But now I believe it would be better to use @con to record the
> *function* of the connector and put the lyric transcription/visual
> rendition *inside* the syllable element itself as is done in many other
> places in MEI. >>
>
> << Repetitions of a connector, "wan - - - - - - - - - - -" for example,
> would be allowed inside <syl> so that no data is lost (well, except for the
> location of each dash), but could be compressed to a single dash for
> presentation purposes.>>
>
>
>
> I am not too keen on placing the visual aspect of the lyrics text inside
> of <syl> CDATA since it is mixing the underlying prose content with its
> graphical presentation in the music.  The <syl> character data should only
> contain the prose of the text.  If the text extracted from the music should
> include "-  -  -  -" after the word "wan", then it should be in the <syl>
> character data; otherwise, it should not.
>
>
>
> For text extraction from lyrics, I would want to know if the <syl> data is
> at the start, middle or end of word, so that I can extract the data
> segmented by words instead of syllables by adding spaces or not between the
> <syl> character data (primarily for searching purposes, but also for
> displaying as regular prose/verse).  It would be preferable if I do not
> have to delete any characters from the <syl> data when reconstructing the
> prose, since something will go wrong in some obscure case.
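
A rough sketch of the word-joining described here, assuming each <syl>
carries only prose text plus a wordpos of 'i', 'm' or 't' (with an absent
wordpos meaning a complete word). join_syllables is a hypothetical helper,
not an existing tool:

```python
def join_syllables(syllables):
    """Rebuild prose from (text, wordpos) pairs.

    wordpos is 'i' (initial), 'm' (middle), 't' (terminal) or None.
    No characters are deleted from the syllable text.
    """
    words = []
    current = ""
    for text, wordpos in syllables:
        current += text
        # 'i' and 'm' mean further syllables of the same word follow.
        if wordpos not in ("i", "m"):
            words.append(current)
            current = ""
    if current:  # input ended in the middle of a word
        words.append(current)
    return " ".join(words)


one_word = join_syllables([("wan", "i"), ("ton", "t")])
two_words = join_syllables([("wan", None), ("ton", None)])
print(one_word)   # wanton
print(two_words)  # wan ton
```

Because the decision to add a space depends only on wordpos, nothing has
to be stripped from the character data.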
>
>
>
> When two syllables of a word are separated by a long distance between two
> notes in a graphical score, multiple dashes are used.  If the layout of the
> music changes, then the single/multiple dash display should change
> (automatically).  Hard-encoding the single/multiple dash distinction is not
> very useful for manipulation of the layout unless you are intent on
> encoding the static layout of a specific edition.
>
>
>
> As an aside: I often come across the reverse case.  When two syllables are
> too close to be comfortably separated by a hyphen, the hyphen should be
> dropped and the two syllables should be displayed as a single word.  I do
> not know of any notation editor which handles this case, and I have to do it
> manually when necessary (attaching the word to a single note, and leaving
> the next note without a syllable).
>
>
>
> SCORE always uses a dashed line between two syllables, and multiple dashes
> appear automatically as the line is extended.
>
> The number and size of the dashes and the distance between them are
> controllable on this line.  In other words SCORE does not use a
> character-encoding of a hyphen
> to display the word separators.  The same goes for word extenders which are
> not literally a sequence of underscores.
>
>
>
>
>
> <<The text inside <syl> can be processed (using regular expression
> matching) to create any output needed, for example, the text "as-is" with
> hyphenated words (e.g., "wan-ton a-ban-don") or "joined-up" in a more
> poetic style (e.g., "wanton abandon"). >>
>
>
>
> This is how the Humdrum representation for lyrics works.  In general it
> works well, but there are complications.  In particular when there is a
> hyphen between two syllables in prose, you need a way of indicating that it
> should remain.  I don't come across that much in lyrics, but I would encode
> the word "long-term" as two syllables:
>
> "long--" and "-term", with the double hyphen indicating that when the
> lyrics are extracted from the music, the final prose should include a
> hyphen between those two syllables.  Such a system should be spelled out.
>  This system works well in 7-bit ASCII data, but I wonder what will happen
> if someone uses strange or inconsistent Unicode hyphen characters.  Also,
> this would not be great if graphic-like display is used; for example, "wan -
> - - - - - - - - - " could be compensated for in a regular expression, but
> only after discovering that someone was doing such a thing in the data, and
> it would make the regular expression for removing the extended hyphen quite
> complicated.
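
To give an idea of what such an expression might look like, here is a small
Python sketch. The set of hyphen-like code points is my own guess at what
could turn up in data, not an agreed list, and strip_connector is a
hypothetical name. The double-hyphen convention for "long--" is deliberately
not handled:

```python
import re

# Hyphen-like code points that might appear as connectors (an assumption):
# hyphen-minus, hyphen, non-breaking hyphen, figure dash, en dash, em dash,
# minus sign.
HYPHENS = "\\-\u2010\u2011\u2012\u2013\u2014\u2212"

# A trailing run of hyphen-like characters, possibly separated by spaces.
TRAILING = re.compile(r"\s*[" + HYPHENS + r"][\s" + HYPHENS + r"]*$")


def strip_connector(syl_text):
    """Remove a trailing (possibly repeated, possibly spaced) connector."""
    return TRAILING.sub("", syl_text)


extended = strip_connector("wan - - - - - - - - - - ")
unicode_hyphen = strip_connector("mun\u2010")
untouched = strip_connector("do")
```

Even this short version already has to guess which characters count as
hyphens, which is exactly the fragility being discussed.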
>
>
>
> Another complication when extracting text prose: how am I to detect an
> elision character in the CDATA when, as you have pointed out, so many of
> them occur in Unicode? :-)  This seems to make a case for a functional
> elision tag which contains an optional attribute for how it should be
> rendered as character(s) for separating two syllables.
>
>
>
>
>
> I don't understand this encoding; could you explain it more?
>
>
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>Dios</syl>
>     <syl con="elided">que˘al</syl>
>     <syl>mun-</syl>
>     <syl>do</syl>
>   </verse>
> </lyrics>
>
>
>
> I would expect that the syl@con attribute describes how the current
> syllable connects to the following syllable, not an internal connector:
>
>
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>Dios</syl>
>
>     <syl con="elided">que</syl>
>     <syl>al</syl>
>     <syl>mun-</syl>
>     <syl>do</syl>
>   </verse>
> </lyrics>
>
>
>
> Also remember that a few months ago we were having problems representing
> verse numbers (in rondeaux), such as "1.,2.,6" for indicating that the
> line of music is for the 1st, 2nd and 6th verses.  How should this be
> encoded?  In most musical editors, this has to be treated as regular text
> with a space elision before the first syllable in the lyrics.
>
>
>
>
>
> -=+Craig
>
>
>
>
>
>
>
>
>
> On 9 July 2014 11:50, Roland, Perry D. (pdr4h) <
> pdr4h at eservices.virginia.edu> wrote:
>
>
> Hi everybody,
>
> I knew that eventually someone would trip over this.  :-)  And that we'd
> need to fix it.
>
> This is another one of those places where the original purpose/form of MEI
> conflicts with later developments.  One of the original purposes of
> syl/@con was to allow a hand-encoder to mark the *function* of a syllable
> connector simply by indicating what they saw -- if the score contained a
> dash, the encoder would write <syl con="d"> and so on.  Another was to make
> it easier to convert existing representations into MEI.  For me, however,
> the main point was to get at the function of the individual syllable.
>
> But now I believe it would be better to use @con to record the *function*
> of the connector and put the lyric transcription/visual rendition *inside*
> the syllable element itself as is done in many other places in MEI.
>  Consider for a moment --
>
> "wan", "ton", and "wanton" are all English words.  The difference between
> the word "wan" followed by the word "ton" and the single word "wanton"
> divided syllabically is all in the connectors between the syllables.  For
> example:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>wan-</syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> Of course, in the following markup, because a connector is absent the
> difference is not discernible:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> But, if we allow @con to have a value of "none", we're really no better
> off because we still don't know which visual connector *ought* to be
> present or what its (supposed) purpose is.  The following is still
> semantically indistinguishable from the preceding example because the
> orthography of the word "wan" and that of the first syllable of "wanton"
> (without its hyphen) are the same thing:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl con="none">wan</syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> But, to record which connector *should* be present we can use <supplied>:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>wan<supplied>-</supplied></syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> Or use <gap> to record a missing connector without supplying one:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>wan<gap reason="missing hyphen"/></syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> Having put the connector *inside* <syl>, @con can be used to record the
> function of the connector:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl con="separated">wan<supplied>-</supplied></syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> Actually, I think I prefer @con to record info *about the syllable* since
> it's an attribute *of* the syllable.  The new values for @con (or for a new
> attribute if we want to keep @con around but deprecate it) could be
> "separated", "extended", "elided", and "unknown".  But I could be persuaded
> otherwise.
>
> This also works in the (hopefully) more usual case when the connector is
> present but our favorite naïve encoder (Mr. OMR) can't (or doesn't want to)
> determine the function of the connector:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>wan-</syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> It would be better to have this info, of course, because depending on the
> rhythm of the vocal line and the prevailing notational style, a dash can be
> used for both separation and extension.  For example, when the first
> syllable is to be sung on multiple notes the markup could be:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl con="extended">wan-</syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> In fact, there could be (and often are) multiple dashes filling the space
> between the first and last notes of the melisma or just one depending on
> the source document or on the rendering processor (when the MEI is to be
> rendered).  The same thing occurs with the underscore separator.
>
> But this kind of many-visual-representations-to-one-function situation is
> particularly acute when it comes to elision.  Various symbols have been
> used to indicate syllable elision -- breve, inverted breve, caron,
> circumflex, and tilde just to name a few.  The following example indicates
> an elision of "que" and "al":
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>Dios</syl>
>     <syl con="elided">que˘al</syl>
>     <syl>mun-</syl>
>     <syl>do</syl>
>   </verse>
> </lyrics>
>
> But so does this:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>Dios</syl>
>     <syl con="elided">que^al</syl>
>     <syl>mun-</syl>
>     <syl>do</syl>
>   </verse>
> </lyrics>
>
> One could encounter any number of visual renditions indicating elision.
> And one should be able to use any appropriate Unicode or SMuFL code point
> for the connector.  SMuFL provides
>
>       U+E550
>       lyricsElisionNarrow
>       Narrow elision
>
>       U+E551
>       lyricsElision
>       Elision
>
>       U+E552
>       lyricsElisionWide
>       Wide elision
>
>       U+E553
>       lyricsHyphenBaseline
>       Baseline hyphen
>
>       U+E554
>       lyricsHyphenBaselineNonBreaking
>       Non-breaking baseline hyphen
>
>      (See the attached image or
> http://www.smufl.org/version/latest/range/lyrics/ for visual examples)
>
> The text inside <syl> can be processed (using regular expression matching)
> to create any output needed, for example, the text "as-is" with hyphenated
> words (e.g., "wan-ton a-ban-don") or "joined-up" in a more poetic style
> (e.g., "wanton abandon").  Repetitions of a connector, "wan - - - - - - - -
> - - -" for example, would be allowed inside <syl> so that no data is lost
> (well, except for the location of each dash), but could be compressed to a
> single dash for presentation purposes.
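
As an illustration of the processing described in this paragraph, here is a
minimal Python sketch that compresses repeated dashes and then produces
either the "as-is" or the "joined-up" text. It assumes the connector is
always an ASCII hyphen at the end of the <syl> text; render is a
hypothetical name:

```python
import re

# One or more trailing ASCII hyphens, optionally separated by spaces.
REPEATED = re.compile(r"\s*-(?:\s*-)*\s*$")


def render(syllables, joined=False):
    """Join syllable strings into text.

    joined=False keeps one hyphen between syllables of a word;
    joined=True drops the hyphens and closes the word up.
    """
    pieces = []
    for raw in syllables:
        syl = REPEATED.sub("-", raw)  # "wan - - -" becomes "wan-"
        continues = syl.endswith("-")
        if joined and continues:
            syl = syl[:-1]
        # A syllable without a connector ends its word.
        pieces.append(syl if continues else syl + " ")
    return "".join(pieces).strip()


SYLS = ["wan - - - -", "ton", "a-", "ban-", "don"]
print(render(SYLS))               # wan-ton a-ban-don
print(render(SYLS, joined=True))  # wanton abandon
```

Note that this cannot tell a connector from a hyphen that belongs to the
prose (the "long-term" case raised elsewhere in this thread).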
>
> I believe this will work better than the old system (it's clearer, no info
> is lost), but I'd like to hear other viewpoints.
>
> --
> p.
>
>
>
>
> > -----Original Message-----
> > From: mei-l [mailto:mei-l-bounces at lists.uni-paderborn.de] On Behalf Of
>
> > Christine Siegert
> > Sent: Friday, July 04, 2014 9:58 AM
> > To: Music Encoding Initiative
> > Subject: Re: [MEI-L] syllable connectors
> >
> > Dear Johannes, dear list,
> > The Sarti project agrees, too.
> > All the best,
> > Christine
> >
> >
> > Prof. Dr. Christine Siegert
> > Universität der Künste Berlin
> > Fakultät Musik, Musikwissenschaft
> > Fasanenstraße 1B
> > D-10623 Berlin
> >
> > Tel.: +49 (0)30 3185 2318
> > siegert at udk-berlin.de
> > -----Original Message-----
> > From: Karen McAulay
> > Sent: Friday, July 04, 2014 12:16 PM
> > To: Music Encoding Initiative
> > Subject: Re: [MEI-L] syllable connectors
> >
> > Yes!
> >
> > Best wishes
> > Karen
> >
> > Dr. Karen McAulay
> > Music and Academic Services Librarian
> > +44 (0)141 270 8267 (direct)
> > K.McAulay at rcs.ac.uk
> > -----Original Message-----
> > From: mei-l [mailto:mei-l-bounces at lists.uni-paderborn.de] On Behalf Of
> > Johannes Kepper
> > Sent: 04 July 2014 10:56
> > To: Music Encoding Initiative
> > Subject: [MEI-L] syllable connectors
> >
> > Dear MEI-Listeners,
> >
> > doing some manual coding of vocal music, we ran across a situation where
> > the layout of the printed score did not allow us to put in any separator
> > (well, better: connector) between two syllables of a word. The current
> > list of allowed connectors does not have an explicit option of "no
> > connector at all". Do we all agree that there should be one?
> >
> > Best,
> > Johannes
> >
> > _______________________________________________
> > mei-l mailing list
> > mei-l at lists.uni-paderborn.de
> > https://lists.uni-paderborn.de/mailman/listinfo/mei-l