[MEI-L] syllable connectors

Wed Jul 9 23:39:17 CEST 2014

Hi Perry,

<<But now I believe it would be better to use @con to record the *function*
of the connector and put the lyric transcription/visual rendition *inside*
the syllable element itself as is done in many other places in MEI. >>
<< Repetitions of a connector, "wan - - - - - - - - - - -" for example,
would be allowed inside <syl> so that no data is lost (well, except for the
location of each dash), but could be compressed to a single dash for
presentation purposes.>>

I am not too keen on placing the visual aspect of the lyrics text inside of
<syl> CDATA since it is mixing the underlying prose content with its
graphical presentation in the music.  The <syl> character data should only
contain the prose of the text.  If the text extracted from the music should
include "-  -  -  -" after the word "wan", then it should be in the <syl>
character data; otherwise, it should not.

For text extraction from lyrics, I would want to know if the <syl> data is
at the start, middle or end of a word, so that I can extract the data
segmented by words instead of syllables by adding spaces or not between the
<syl> character data (primarily for searching purposes, but also for
displaying as regular prose/verse).  It would be preferable if I do not
have to delete any characters from the <syl> data when reconstructing the
prose, since something will go wrong in some obscure case.
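
As a rough sketch of the kind of extraction I mean: assume each syllable
arrives as a (text, wordpos) pair, where wordpos is "i", "m" or "t" for an
initial, medial or terminal syllable and None for a one-syllable word.  The
pair layout and the function name are hypothetical, chosen only for
illustration.  No character of the syllable text is ever deleted.

```python
def join_syllables(syllables):
    """Rebuild word-segmented prose from (text, wordpos) pairs."""
    words = []
    current = ""
    for text, wordpos in syllables:
        current += text
        # A word ends on a terminal syllable or on a stand-alone one.
        if wordpos in ("t", None):
            words.append(current)
            current = ""
    if current:  # unterminated word at the end of the input
        words.append(current)
    return " ".join(words)


verse = [("wan", "i"), ("ton", "t"), ("a", "i"), ("ban", "m"), ("don", "t")]
print(join_syllables(verse))  # wanton abandon
```

With the positions stored apart from the text, "wan" + "ton" as one word and
"wan", "ton" as two words differ only in the second field of each pair.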

When two syllables of a word are separated by a long distance between two
notes in a graphical score, multiple dashes are used.  If the layout of the
music changes, then the single/multiple dash display should change
(automatically).  Hard-coding the single/multiple dash distinction is not
very useful for manipulating the layout unless you are intent on
encoding the static layout of a specific edition.
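
To illustrate, a renderer could derive the dashes from the available space
rather than from the encoding.  This is only a sketch: the function names,
the spacing rule and the default value are all assumptions, not taken from
any particular notation program.

```python
import math


def dash_count(gap, min_spacing=4.0):
    """Number of dashes to draw in the gap between two syllables.

    gap and min_spacing share the same (arbitrary) layout unit;
    the rule "one dash per 2 * min_spacing" is a made-up example.
    """
    if gap < min_spacing:
        return 0  # too tight: draw no dash at all
    return max(1, math.floor(gap / (2 * min_spacing)))


def dash_positions(gap, min_spacing=4.0):
    """Evenly spaced dash centres, measured from the end of the first syllable."""
    n = dash_count(gap, min_spacing)
    return [gap * (k + 1) / (n + 1) for k in range(n)]


print(dash_count(5.0))    # 1
print(dash_count(40.0))   # 5
```

When the layout changes, only gap changes; the encoded syllables stay the
same.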

As an aside: I often come across the reverse case.  When two syllables are
too close to be comfortably separated by a hyphen, the hyphen should be
dropped and the two syllables should be displayed as a single word.  I do
not know of any notation editor which handles this case, and I have to do it
manually when necessary (attaching the word to a single note, and leaving
the next note without a syllable).

SCORE always uses a dashed line between two syllables, and multiple dashes
appear automatically as the line is extended.  The number and size of the
dashes, and the distance between them, are controllable on this line.  In
other words, SCORE does not use a character-encoding of a hyphen to display
the word separators.  The same goes for word extenders, which are not
literally a sequence of underscores.

<<The text inside <syl> can be processed (using regular expression
matching) to create any output needed, for example, the text "as-is" with
hyphenated words (e.g., "wan-ton a-ban-don") or "joined-up" in a more
poetic style (e.g., "wanton abandon"). >>

This is how the Humdrum representation for lyrics works.  In general it
works well, but there are complications.  In particular, when there is a
hyphen between two syllables in prose, you need a way of indicating that it
should remain.  I don't come across that much in lyrics, but I would encode
the word "long-term" as two syllables, "long--" and "-term", with the double
hyphen indicating that when the lyrics are extracted from the music, the
final prose should include a hyphen between those two syllables.  Such a
system should be spelled out.  It works well in 7-bit ASCII data, but I
wonder what will happen if someone uses strange or inconsistent Unicode
hyphen characters.  Also, it would not work well if a graphic-like display
is used.  For example, "wan - - - - - - - - - -" could be compensated for in
a regular expression, but only after discovering that someone was doing such
a thing in the data, and removing the extended hyphen would make the regular
expression quite complicated.
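
A minimal sketch of such a convention, under these assumptions: a trailing
"-" means the word continues, a trailing "--" means the word continues and
the prose keeps a literal hyphen, and a leading "-" is only a continuation
marker.  The set of Unicode look-alikes mapped to ASCII is illustrative and
certainly not complete, and the function names are hypothetical.

```python
import re

# Hyphen, non-breaking hyphen, figure dash, en dash, minus sign.
HYPHEN_LOOKALIKES = "\u2010\u2011\u2012\u2013\u2212"


def normalize(text):
    """Map the look-alike characters above to an ASCII hyphen."""
    return re.sub("[" + HYPHEN_LOOKALIKES + "]", "-", text)


def to_prose(syllables):
    """Join syllable strings into prose using the convention described above."""
    prose = ""
    for raw in syllables:
        text = normalize(raw)
        if text.startswith("-"):
            text = text[1:]          # leading continuation marker
        if text.endswith("--"):
            prose += text[:-1]       # keep one literal hyphen
        elif text.endswith("-"):
            prose += text[:-1]       # drop the syllable connector
        else:
            prose += text + " "      # end of word
    return prose.strip()


print(to_prose(["wan-", "ton", "a-", "ban-", "don"]))  # wanton abandon
print(to_prose(["long--", "-term"]))                   # long-term
```

Note that this does nothing for graphic-style runs such as "wan - - - -";
those would need a further rule, which is exactly the complication.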

Another complication when extracting text prose is how I am to detect an
elision character in the CDATA, since, as you have pointed out, so many of
them occur in Unicode. :-)  This seems to make a case for a functional
elision tag which contains an optional attribute for how it should be
rendered as character(s) for separating two syllables.
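
To show what detection in the character data would involve, here is a
sketch that splits syllable text at any of a list of candidate elision
marks.  The list is an assumption for illustration (a few of the characters
mentioned in this thread plus the three SMuFL elision code points); any mark
not in the list goes undetected, which is the weakness.

```python
import re

# Candidate elision marks; illustrative, not complete.
ELISION_MARKS = [
    "\u02D8",  # breve
    "\u02C7",  # caron
    "\u203F",  # undertie
    "^",       # circumflex accent (ASCII)
    "~",       # tilde (ASCII)
    "\uE550", "\uE551", "\uE552",  # SMuFL narrow / normal / wide elision
]
ELISION_RE = re.compile("[" + re.escape("".join(ELISION_MARKS)) + "]")


def split_elision(text):
    """Split syllable text at any candidate elision mark."""
    return [part for part in ELISION_RE.split(text) if part]


print(split_elision("que\u02D8al"))  # ['que', 'al']
print(split_elision("que^al"))       # ['que', 'al']
```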

I don't understand this encoding; could you explain it more?

<lyrics xmlns="http://www.music-encoding.org/ns/mei">
  <verse>
    <syl>Dios</syl>
    <syl con="elided">que˘al</syl>
    <syl>mun-</syl>
    <syl>do</syl>
  </verse>
</lyrics>

I would expect that the syl@con attribute describes how the current
syllable connects to the following syllable, not an internal connector:

<lyrics xmlns="http://www.music-encoding.org/ns/mei">
  <verse>
    <syl>Dios</syl>
    <syl con="elided">que</syl>
    <syl>al</syl>
    <syl>mun-</syl>
    <syl>do</syl>
  </verse>
</lyrics>
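
With that reading, extraction needs no knowledge of elision characters.  A
sketch using Python's standard xml.etree.ElementTree on the markup above;
the function names are hypothetical, and treating a trailing "-" in the
<syl> text as "word continues" is an assumption taken from the example, not
a rule of MEI.

```python
import xml.etree.ElementTree as ET

MEI_NS = "{http://www.music-encoding.org/ns/mei}"

SOURCE = """
<lyrics xmlns="http://www.music-encoding.org/ns/mei">
  <verse>
    <syl>Dios</syl>
    <syl con="elided">que</syl>
    <syl>al</syl>
    <syl>mun-</syl>
    <syl>do</syl>
  </verse>
</lyrics>
"""


def verse_prose(verse):
    """Join the <syl> texts of one verse into word-segmented prose."""
    prose = ""
    for syl in verse.iter(MEI_NS + "syl"):
        text = (syl.text or "").strip()
        if text.endswith("-"):
            prose += text[:-1]    # word continues in the next <syl>
        else:
            prose += text + " "   # word ends, whether elided or not
    return prose.strip()


def elisions(verse):
    """Pairs of syllables joined by con="elided" on the first of the pair."""
    syls = list(verse.iter(MEI_NS + "syl"))
    return [(a.text, b.text) for a, b in zip(syls, syls[1:])
            if a.get("con") == "elided"]


root = ET.fromstring(SOURCE)
verse = root.find(MEI_NS + "verse")
print(verse_prose(verse))  # Dios que al mundo
print(elisions(verse))     # [('que', 'al')]
```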

Also, remember that a few months ago we were having problems representing
verse numbers (in rondeaux), such as "1.,2.,6" for indicating that the
line of music is for the 1st, 2nd and 6th verses.  How should this be
encoded?  In most music editors, this has to be treated as regular text
with a space elision before the first syllable in the lyrics.

-=+Craig

On 9 July 2014 11:50, Roland, Perry D. (pdr4h) <pdr4h at eservices.virginia.edu> wrote:

>
> Hi everybody,
>
> I knew that eventually someone would trip over this.  :-)  And that we'd
> need to fix it.
>
> This is another one of those places where the original purpose/form of MEI
> conflicts with later developments.  One of the original purposes of
> syl/@con was to allow a hand-encoder to mark the *function* of a syllable
> connector just by indicating what they saw -- if the score contained a
> dash, the encoder would write <syl con="d"> and so on.  Another was to make
> it easier to convert existing representations into MEI.  For me, however,
> the main point was to get at the function of the individual syllable.
>
> But now I believe it would be better to use @con to record the *function*
> of the connector and put the lyric transcription/visual rendition *inside*
> the syllable element itself as is done in many other places in MEI.
>  Consider for a moment --
>
> "wan", "ton", and "wanton" are all English words.  The difference between
> the word "wan" followed by the word "ton" and the single word "wanton"
> divided syllabically is all in the connectors between the syllables.  For
> example:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>wan-</syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> Of course, in the following markup, because a connector is absent the
> difference is not discernible:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> But, if we allow @con to have a value of "none", we're really no better
> off because we still don't know which visual connector *ought* to be
> present or what its (supposed) purpose is.  The following is still
> semantically indistinguishable from the preceding example because the
> orthography of the word "wan" and that of the first syllable of "wanton"
> (without its hyphen) are the same thing:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl con="none">wan</syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> But, to record which connector *should* be present we can use <supplied>:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>wan<supplied>-</supplied></syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> Or use <gap> to record a missing connector without supplying one:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>wan<gap reason="missing hyphen"/></syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> Having put the connector *inside* <syl>, @con can be used to record the
> function of the connector:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl con="separated">wan<supplied>-</supplied></syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> Actually, I think I prefer @con to record info *about the syllable* since
> it's an attribute *of* the syllable.  The new values for @con (or for a new
> attribute if we want to keep @con around but deprecate it) could be
> "separated", "extended", "elided", and "unknown".  But I could be persuaded
> otherwise.
>
> This also works in the (hopefully) more usual case when the connector is
> present but our favorite naïve encoder (Mr. OMR) can't (or doesn't want to)
> determine the function of the connector:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>wan-</syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> It would be better to have this info, of course, because depending on the
> rhythm of the vocal line and the prevailing notational style, a dash can be
> used for both separation and extension.  For example, when the first
> syllable is to be sung on multiple notes the markup could be:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl con="extended">wan-</syl>
>     <syl>ton</syl>
>   </verse>
>   <verse>
>     <syl>wan</syl>
>     <syl>ton</syl>
>   </verse>
> </lyrics>
>
> In fact, there could be (and often are) multiple dashes filling the space
> between the first and last notes of the melisma or just one depending on
> the source document or on the rendering processor (when the MEI is to be
> rendered).  The same thing occurs with the underscore separator.
>
> But this kind of many-visual-representations-to-one-function situation is
> particularly acute when it comes to elision.  Various symbols have been
> used to indicate syllable elision -- breve, inverted breve, caron,
> circumflex, and tilde just to name a few.  The following example indicates
> an elision of "que" and "al":
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>Dios</syl>
>     <syl con="elided">que˘al</syl>
>     <syl>mun-</syl>
>     <syl>do</syl>
>   </verse>
> </lyrics>
>
> But so does this:
>
> <lyrics xmlns="http://www.music-encoding.org/ns/mei">
>   <verse>
>     <syl>Dios</syl>
>     <syl con="elided">que^al</syl>
>     <syl>mun-</syl>
>     <syl>do</syl>
>   </verse>
> </lyrics>
>
> One could encounter any number of visual renditions indicating elision.
> And one should be able to use any appropriate Unicode or SMuFL code point
> for the connector.  SMuFL provides
>
>       U+E550
>       lyricsElisionNarrow
>       Narrow elision
>
>       U+E551
>       lyricsElision
>       Elision
>
>       U+E552
>       lyricsElisionWide
>       Wide elision
>
>       U+E553
>       lyricsHyphenBaseline
>       Baseline hyphen
>
>       U+E554
>       lyricsHyphenBaselineNonBreaking
>       Non-breaking baseline hyphen
>
>      (See the attached image or
> http://www.smufl.org/version/latest/range/lyrics/ for visual examples)
>
> The text inside <syl> can be processed (using regular expression matching)
> to create any output needed, for example, the text "as-is" with hyphenated
> words (e.g., "wan-ton a-ban-don") or "joined-up" in a more poetic style
> (e.g., "wanton abandon").  Repetitions of a connector, "wan - - - - - - - -
> - - -" for example, would be allowed inside <syl> so that no data is lost
> (well, except for the location of each dash), but could be compressed to a
> single dash for presentation purposes.
>
> I believe this will work better than the old system (it's clearer, no info
> is lost), but I'd like to hear other viewpoints.
>
> --
> p.
>
>
>
> > -----Original Message-----
> > From: mei-l [mailto:mei-l-bounces at lists.uni-paderborn.de] On Behalf Of
> > Christine Siegert
> > Sent: Friday, July 04, 2014 9:58 AM
> > To: Music Encoding Initiative
> > Subject: Re: [MEI-L] syllable connectors
> >
> > Dear Johannes, dear list,
> > The Sarti project agrees, too.
> > All the best,
> > Christine
> >
> >
> > Prof. Dr. Christine Siegert
> > Universität der Künste Berlin
> > Fakultät Musik, Musikwissenschaft
> > Fasanenstraße 1B
> > D-10623 Berlin
> >
> > Tel.: +49 (0)30 3185 2318
> > siegert at udk-berlin.de
> > -----Original Message-----
> > From: Karen McAulay
> > Sent: Friday, July 04, 2014 12:16 PM
> > To: Music Encoding Initiative
> > Subject: Re: [MEI-L] syllable connectors
> >
> > Yes!
> >
> > Best wishes
> > Karen
> >
> > Dr. Karen McAulay
> > Music and Academic Services Librarian
> > +44 (0)141 270 8267 (direct)
> > K.McAulay at rcs.ac.uk
> > -----Original Message-----
> > From: mei-l [mailto:mei-l-bounces at lists.uni-paderborn.de] On Behalf Of
> > Johannes Kepper
> > Sent: 04 July 2014 10:56
> > To: Music Encoding Initiative
> > Subject: [MEI-L] syllable connectors
> >
> > Dear MEI-Listeners,
> >
> > doing some manual coding of vocal music, we ran across a situation where
> > the
> > layout of the printed score did not allow to put in any separator (well,
> > better connector) between two syllables of a word. The current list of
> > allowed connectors does not have an explicit option of "no connector at
> > all". Do we all agree that there should be one?
> >
> > Best,
> > Johannes
> >
> > _______________________________________________
> > mei-l mailing list
> > mei-l at lists.uni-paderborn.de
> > https://lists.uni-paderborn.de/mailman/listinfo/mei-l
> >
> >
>
>
>