                 WEBSTER FONTS
                 =============

* Overview

This file describes special symbols and markup entities used in the GNU
Collaborative International Dictionary of English.

* Introduction

The special characters used in the electronic version of the Webster 1913
are required for visualizing unusual characters used in the etymology and
pronunciation fields of the dictionary, in a form comparable to the way they
appear in the original.

The GCIDE markup provides two ways for representing such characters: using
special "escape sequences" and using special markup entities.  Historically,
"escape sequences" were used to indicate the character's ordinal position in
a special font, prepared by MICRA, Inc. to represent it on screen.  Although
nowadays this method is obsolete, the dictionary corpus still uses these
sequences.  This file describes their mapping to Unicode characters.

An escape sequence has the form \'xx, where "x" represent lowercase
hexadecimal digits.  For example, \'94 stands for "o" with diaeresis.  There
are only 256 such sequences.

Special markup entities are able to represent a wider range of characters.
A markup entity is similar to SGML one, but has a different format.  The
traditional &xx; format was judged inconvenient because the ampersand is
used frequently in the corpus.  Instead, GCIDE entities have the format
<WORD/, where "<" and "/" represent the beginning and end of the entity and
WORD represents the character itself.  Valid WORDs are in some cases
abbreviations (for compactness) of the ISO 8879 recommended symbols.
Characters representable by escape sequences can also be represented by
entities, but the reverse is not true, due to a limited range of the former.

The Greek words appearing in the etymologies, when they are included, are
typed in a roman-letter transcription, which is described below in chapter
"Greek transliteration".

* Unrecognized characters

Wherever the typists did not know the character to use, they usually
inserted a reverse-video question mark (decimal 176).  This appears in
full-ASCII versions as <?/.  This mark was used both for characters in
non-ASCII fonts, and for unreadable characters (i.e., characters smeared in
the original or distorted in the copies available to the typists. The type
in the original was in many places smeared and illegible at the left and
right page margins; occasionally, small parts of words were blotted out by
plain white space).

* Italics

In most places, italic font is represented by the tags <it>...</it>
surrounding the italic text, or by some other tag which also implies italic
font.  In the pronunciations, however, where italicized vowels are used
among non-italic and other special characters to indicate pronunciation, the
special codes <ait/, <eit/, <iit/, <oit/, <uit/, are also used to indicate
the italicized vowel.

* Diacritics

Vowels with a circle above (as in Swedish) are coded <xring/ (x with a ring,
or "degrees" mark over it); vowels with tilde over them are represented by
<xtil/, where "x" is the vowel, as in <etil/ (<atil/ also has code 238);
letters with a dot above are represented by <xdot/ -- letter with a dot
below are represented by <xsdot/ ("subdot"); vowels with the semi-long mark
(a macron with a short perpendicular vertical stroke attached above) are
represented by <xsl/; the circumflex vowels have codes on this list, but may
also be represented as <xcir/; vowels with macrons above are <xmac/
(including <oomac/, the "oo" with an unbroken macron above the two letters,
<aemac/ = the ligature ae with a macron [also 214 = \'d6], and <oemac/ the
ligature oe with a macron [also 215 = \'d7]); vowels with umlauts or a
crescent (breve) above have codes in this list, but may also be represented
by <xum/ and <xcr/ respectively.  There is an occasional hacek or caron mark
(an inverted circumflex) in the original; such letters are coded <xcar/.
The o with a caron has code 213, but no other letter with a caron is
representable by an escape sequence.

The diaeresis is treated typographically as identical to the umlaut.  A
special modification, used only for poetry (see entry "saturnian verse"
under "saturnian") is a vowel with a macron, in which the macron is lighter
than the usual macron, signifying a stressed syllable which has a short
vowel sound.  This is represented by <xsmac/ ("short mac").

Another special character used in pronunciations is an "n" with an underline
(like a macron, but below the letter), used to represent the "ng" sound.
This is coded <nsm/ ("n sub-macron").  The ligated th used in pronunciations
to depict the "th" sound of "the" is coded as <th/.

NOTE: the letter combinations "fi" and "fl" are invariably printed as the
ligatures &filig; and &fllig;, but these ligatures are not marked as such in
this transcription, and the two letters are left as individuals.

* Special symbols

The dagger <dag/, double dagger <ddag/, and paragraph mark <para/ are rarely
used.

The double prime, or "seconds" of a degree is sometimes represented by a
double "light accent" (code 183 = \'b7).  In other places, and in later
versions, it is represented by <sec/ = \'a9.

The symbols "greater than" <gt/ and "less than" are encountered only once,
but are distinguished from the right- and left-angle brackets (> and <)
because of possible typographical differences in some fonts.

The schwa is symbolized by <schwa/.  It is not used in the pronunciations,
but is mentioned as a symbol.  The right-pointing arrow is <rarr/,
consistent with ISO 8879.

Two special entities <and/ and <or/ represent words "and" and "or" in
italics font.

* Symbol summary

Below is a complete list of the symbols used in the Webster, together with
their "webfont" number (escape sequence), corresponding markup entity, and
corresponding symbols in ISO 8879 and Tex coding.  Much of this table was
prepared by Rik Faith, to whom we express our appreciation.

The "Uc" column gives the Unicode representation of the character.  The
"nearest ASCII" equivalents are given for those who want to display the data
as best one can in 7-bit simple ASCII symbols without using the "entity"
symbols.

Comments: 
  (1) The symbol in the "entity" column is the SGML-like symbol used in the
      present Webster files; the symbol in the "ISO 8879" column is the
      symbol for the same character given in "The user's guide to ISO 8879"
      by Smith and Stutely.
      
  (2) An asterisk "*" in the "entity" column means that this symbol and code
value is not used in any form in GCIDE.

  (3) If no asterisk is in the "entity" column, and no other symbol is
there, this means that in the Webster, only the hexadecimal representation
was used (e.g. for \'d8, \'bd, and \'b8).

  (4) \'b6 and \'b7, the heavy and light "accents", are never above a letter
(these are not diacritical marks), but in-between letters, as the stress
accent used in the headwords and pronunciations.  The accent *follows* the
syllable accented.  The light accent \'b7 is also used as the "prime" in
mathematical expressions (e.g. a\'b7 = "a prime"), or as "minutes" in
degrees-minutes-seconds, and when doubled (\'b7\'b7) serves as "double
prime" in mathematical expressions, and as "seconds" in
degrees-minutes-seconds.  The character \'a9 (<sec/ or &Prime;) is also used
to represent the double prime.

  (5) Although the semilong vowels are in the table (e.g. the "asl" = "a
semilong", most of the entries in the ASCII version dictionary use the <xsl/
symbol coding.  If you know of any printers' names for these, do let me
know.

  (6) For some reason, the a breve and u breve have ISO codes (in the
Latin-2 table), but the other vowels don't, in the Smith & Stutely book.  Is
this a mistake?

  (7) The symbol <nsc/ is used for "N small capitals", used in
pronunciations to represent the soun fo the nasal N in French words.

  (8) A weak accent (when not in pronunciations) is symbolized by <prime/,
the "minutes" (of a degree) symbol.  A strong accent is symbolized by
<bprime/ ("bold prime", not an ISO entity).

  (9) If you find any exceptions to these usage assertions, please
let me know.

----------------------------------------------------------------------------
     webfont       ISO 8879   TeX             Uc  ASC   Description
------------------                               
oct dec hex entity             
----------------------------------------------------------------------------
025  21  15  *                 \S             §   *     section symbol

074  60  3c          lt        $<$            <   <     less than
076  62  3e          gt        $>$            >   >     greater than

200 128  80 <Cced/   Ccedil    \c{C}          Ç   C     C cedilla
201 129  81 <uum/    uuml      \"u            ü   ue    u umlaut (diaeresis)
202 130  82 <eacute/ eacute    \'e            é   e     e acute
203 131  83 <acir/   acirc     \^a            â   a     a circumflex
204 132  84 <aum/    auml      \"a            ä   ae    a umlaut (diaeresis)
205 133  85 <agrave/ agrave    \`a            à   a     a grave
206 134  86 <aring/  aring     \aa            å   aa    a ring above
207 135  87 <cced/   ccedil    \c{c}          ç   c     c cedilla
210 136  88 <ecir/   ecirc     \^e            ê   e     e circumflex
211 137  89 <eum/    euml      \"e            ë   e     e umlaut (diaeresis)
212 138  8a <egrave/ egrave    \`e            è   e     e grave
213 139  8b <ium/    iuml      \"i            ï   i     i umlaut (diaeresis)
214 140  8c <icir/   icirc     \^i            î   i     i circumflex
215 141  8d <igrave/ igrave    \`i            ì   i     i grave
216 142  8e <Aum/    Auml                     Ä   A     A umlaut
217 143  8f          Aring                    Å   Aa    A ring above

220 144  90 <Eacute/ Eacute    \'E            É   E     E acute
221 145  91 <ae/     aelig     \ae            æ   ae    ligature ae
222 146  92 <AE/     AElig     \AE            Æ   AE    ligature AE
223 147  93 <ocir/   ocirc     \^o            ô   o     o circumflex
224 148  94 <oum/    ouml      \"o            ö   oe    o umlaut (diaeresis)
225 149  95 <ograve/ ograve    \`o            ò   o     o grave
226 150  96 <ucir/   ucirc     \^u            û   u     u circumflex
227 151  97 <ugrave/ ugrave    \`u            ù   u     u grave
230 152  98 <yum/    yuml                     ÿ   y     y umlaut
231 153  99 <Oum/    Ouml                     Ö   O     O umlaut
232 154  9a <Uum/    Uuml      \"U            Ü   U     U umlaut (diaeresis)
233 155  9b
234 156  9c <pound/  pound     \pounds        £   *     pound sign (British)
235 157  9d  *
236 158  9e  *
237 159  9f  *
240 160  a0 <aacute/ aacute    \'a            á   a     a acute
241 161  a1 <iacute/ iacute    \'i            í   i     i acute
242 162  a2 <oacute/ oacute    \'o            ó   o     o acute
243 163  a3 <uacute/ uacute    \'u            ú   u     u acute
244 164  a4 <ntil/   ntilde    \~n            ñ   ny    n tilde
245 165  a5 <Ntil/   Ntilde                   Ñ   NY    N tilde
246 166  a6 <frac23/           $\frac{2}{3}$  ⅔   2/3   two-thirds
247 167  a7 <frac13/           $\frac{1}{3}$  ⅓   1/3   one-third
250 168  a8  *
251 169  a9 <sec/    Prime                    ˝   ''    seconds (of
                                                        degree or time)
                                                        Also, inches
							or double prime.
252 170  aa  *
253 171  ab <frac12/           $\frac{1}{2}$  ½   1/2   one-half
254 172  ac <frac14/           $\frac{1}{4}$  ¼   1/4   one-quarter
255 173  ad  *
256 174  ae  *
257 175  af  *
260 176  b0 <?/                                   (?)   Place-holder
                                                        for unknown or
							illegible
							character.   
261 177  b1  *
262 178  b2  *
263 179  b3  *
264 180  b4  *                 $\updownarrow$ ↑   *     vertical arrow
265 181  b5 <hand/                            ☞   *     pointing hand
                                                        (printer's "fist")
266 182  b6 <bprime/           \"{}           ˝   ''    bold accent 
                                                        (used in
							pronunciations) 
267 183  b7 <prime/  prime     \'{}           ´   '     light accent 
                                                        (used in
							pronunciations) 
                                                        also minutes
							(of arc or time) 
270 184  b8 <rdquo/  rdquo     ''             ”   "     close double quote
271 185  b9  *
272 186  ba  *                 $\parallel$    ‖   ||    vertical double bar
                                                        (l)
273 187  bb  *
274 188  bc <sect/  sect       \S             §   *     section mark
275 189  bd <ldquo/ ldquo      ``             “   "     open double quotes
276 190  be <amac/  amacr      \=a            ā   a     a macron
277 191  bf <lsquo/ lsquo      `              ‘   `     left single quote

300 192  c0 <nsm/                             ṉ   ng    "n sub-macron"
301 193  c1 <sharp/ sharp      $\sharp$       ♯   #     musical sharp
302 194  c2 <flat/  flat       $\flat$        ♭   *     musical flat
303 195  c3  *                 --             –   --    long dash (en-dash? )
304 196  c4  *                 $-$            ―   -     horizontal line
305 197  c5 <th/ (part 1)                         t     first part of
                                                        th ligature 
                                                        see 231 = e7 for part 2
306 198  c6 <imac/  imacr      \=i            ī   i     i macron
307 199  c7 <emac/  emacr      \=e            ē   e     e macron
310 200  c8 <dsdot/                           ḍ   d     Sanskrit/Tamil d dot 
311 201  c9 <nsdot/                           ṇ   n     Sanskrit/Tamil n dot
312 202  ca <tsdot/                           ṭ   t     Sanskrit/Tamil t dot
313 203  cb <ecr/              \u{e}          ĕ   e     e breve
314 204  cc <icr/              \u{i}          ĭ   i     i breve
315 205  cd  *
316 206  ce <ocr/              \u{o}          ŏ   o     o breve
317 207  cf  -                 --             ‐   -     short dash

320 208  d0  --      mdash     ---            —   ---   long (em) dash
321 209  d1 <OE/     OElig     \OE            Œ   OE    OE ligature
322 210  d2 <oe/     oelig     \oe            œ   oe    oe ligature
323 211  d3 <omac/   omacr     \=o            ō   o     o macron
324 212  d4 <umac/   umacr     \=u            ū   u     u macron
325 213  d5 <ocar/             \v{o}          ǒ   o     o hacek
326 214  d6 <aemac/            \=\ae          ǣ   ae    ae ligature macron
327 215  d7 <oemac/            \=\oe          ōē  oe    oe ligature macron
330 216  d8          par       $\parallel$    ‖   ||    double vertical bar
                                                        (s)
331 217  d9  *
332 218  da  *
333 219  db  *
334 220  dc <ucr/   ubreve     \u{u}          ŭ   u     u breve
335 221  dd <acr/   abreve     \u{a}          ă   a     a breve
336 222  de <cre/   ssmile     \u{}           ˘   ~     crescent
                                                        (like a breve,
							but vertically
							centered -- 
                                                        represents the
							short accent
							in poetic
							meter) 
337 223  df <ymac/             \=y            ȳ   y     y macron

340 224  e0  <asl/                                a     a "semilong"
                                                        (has a macron
                                                        above with a
                                                        short vertical
                                                        bar on top the
                                                        center of the
                                                        macron) 
                                                        Used in
                                                        pronunciations. 
341 225  e1  <esl/                                      e "semilong"
342 226  e2  <isl/                                      i "semilong"
343 227  e3  <osl/                                      o "semilong"
344 228  e4  <usl/                                      u "semilong"
345 229  e5  <adot/                           ȧ   a     a with dot above
346 230  e6  *                                μ   mu    small Greek mu
347 231  e7  <th/ (part 2)                              second part of
                                                        th ligature; 
                                                        see 197 = c5
                                                        for part 1
350 232  e8  *
351 233  e9  *
352 234  ea  *
353 235  eb <edh/   edh                       ð   th    small eth
354 236  ec  *
355 237  ed <thorn/ thorn                     þ   th    small thorn
356 238  ee <atil/  atilde     \~a            ã   a     a tilde
357 239  ef <ndot/                            ṅ   n     n with dot above

360 240  f0 <rsdot/            \d{r}          ṛ   r     r with a dot below
361 241  f1   *
362 242  f2   *
363 243  f3   *
364 244  f4 <yogh/                            ȝ   y     small yogh
365 245  f5 <mdash/ mdash      ---            —   ---   em dash
366 246  f6 <divide/ divide    $\div$         ÷   /     division sign
367 247  f7         ap         $\approx$      ≈   ~=    "double tilde"
370 248  f8 <deg/   deg        ${}^\circ$     °   *     degree sign
371 249  f9 <middot/           $\bullet$      •   *     bold middle dot
372 250  fa   *                $\cdot$        ·   *     light middle dot
373 251  fb <root/  radic      $\surd$        √   *     root sign
374 252  fc   *
375 253  fd   *
376 254  fe   *
377 255  ff   *                 
----------------------------------------------------------------------------

The table below gives some additional information about some of the more
commonly used entities:

-------------------------------------------------------------------
Frequently used:
decimal  hex    char  definition
  128    80     Ç     <Cced/ c cedilla (uppercase)
  129    81     ü     <uum/ u umlaut
  130    82     é     e acute
  131    83     â     a circumflex
  132    84     ä     <aum/ a umlaut
  133    85     à     a grave
  134    86     å     <aring/ a with "ring" (circle) above (Swedish!)
  135    87     ç     <cced/ c cedilla
  136    88     ê     <ecir/ e circumflex
  137    89     ë     <eum/ e umlaut (or e with dieresis above)
  138    8a     è     e grave
  145    91     æ     <ae/ = "ae" fused ligature
  146    92     Æ     <AE/ = upper-case "ae" fused ligature
  147    93     ô     <ocir/ o circumflex
  148    94     ö     <oum/ o "umlaut", used mostly in "coöperation,
                      Zoöl." and in pronunciations
  164    a4     ñ     <ntil/ Spanish "enye"
  166    a6     ⅔     <frac23/ two-thirds (fraction)
  167    a7     ⅓     <frac13/ one-third (fraction)
  169    a9     ˝     <sec/  seconds of degree or time, or double-prime
  171    ab     ½     <frac12/ one-half, as in the original IBM set
  172    ac     ¼     <frac14/ one-fourth (fraction)
  176    b0           <?/ = (reverse-video question mark), used
                      to represent an uncodable or illegible character
  180    b4     ´     long verticle double-headed arrow (a reference mark)
  181    b5     ☞     <hand/ = (the typographer's "fist")
                      Appearing as a "pointing hand" character
                      (for explanatory notes)
  182    b6     ˝     bold accent in headwords
                      replaced in full ASCII version by double quote = "
  183    b7     ´     light accent in headwords
                      replaced within headwords in the full ASCII version
                      by an open-single-quote (` = ASCII 96, not the same
                      as 191, \'bf).   This mark is used also for
		      minutes of a degree, and for "prime" to modify
		      variables in mathematical expressions.
		      -- two of these in sequence represent seconds of
		      a degree, or double prime.  The seconds symbol
		      is also represented by <sec/ (hex a9).
  184    b8     ”     close double quotes (used with 189 [= \'bd],
                      open quote) 
  186    ba     ‖     vertical double bar - represents the symbol used
                      in the printed dictionary before a headword to
		      signify that the word was adopted without
		      anglicization from a foreign language but in the
		      full-ASCII version this function uses \'d8 -- see 216
  188    bc     §     <sect/ section mark
  189    bd     “     open double quotes (used with 184, close quote)
  190    be     ā     <amac/ a macron
  191    bf     ‘     <lsquo/ "left single quote"
                      single open quote mark (not same as ASCII 96)
  192    c0     ṉ     <nsm/ "n sub-macron", an n with a macron below --
                      represents the "ng" sound in pronunciations
  193    c1     ♯     <sharp/ sharp - music notation
  194    c2     ♭     <flat/  flat  - music notation
  195    c3     –     long dash, one pixel removed from left
                      will fuse with left long dash, char 208
  196    c4     ―     graphic horizontal line
  195+208       –—    combination for a very long dash.  In the
                      original typing, the dash char 208 was used for
		      both non-breaking hyphen (in hyphenated words),
		      and for the em-dash used as an introductory mark
		      for various segments.  The em-dash should be
		      distinguished from the hyphen, but that
		      conversion hasn't yet been done.  In the full
		      ASCII version, a double hypen "--" represent the
		      m-dash.
  197    c5      t    <th/ (part 1) first of a pair of characters
  197+231 =      th   used to represent the th ligature --
                      <th/ represents the "th" sound of "mother"
                      see 231 (e7) for part 2
  198    c6      ī   <imac/ = i macron
  199    c7      ē   <emac/ = e macron
  200    c8      ḍ   <dsdot/ Sanskrit/Tamil d with dot underneath
  201    c9      ṇ   <nsdot/ Sanskrit/Tamil n with dot underneath
  202    ca      ṭ   <tsdot/ Sanskrit/Tamil t with dot underneath
  203    cb      ĕ   <ecr/ = e with crescent (breve) above.  Used
                     - in some etymologies and pronunciation
  204    cc      ĭ   <icr/ = i with crescent (breve) above - used
                     - in some etymologies and pronunciation
  206    ce      ŏ   <ocr/ = o with crescent (breve) above - used
                     - in some etymologies and pronunciation
  207    cf      ‐   short dash, used in hyphenated words, and in
                     breaking syllables where no accent is used. But
		     sometimes the typists used the normal hyphen
		     [45], or the long dash (decimal 208) for that
		     purpose.  The normal hyphen is the same length
		     as the long dash, but one pixel higher in the
		     character box. 
                     # In headwords, in the full ASCII version, this
                     short dash is represented by the asterisk "*".
  208    d0      —   <mdash/ = represents the long dash, used for the em 
                     dash which often precedes certain sections within a
                     definition, and which separates some sections,
                     such as wordforms or collocations within a
                     collocation segment.  This is replaced in the
                     full ASCII version by a double hyphen, "--".
  210    d2      œ   <oe/ = "oe" fused ligature
  211    d3      ō   <omac/ = o macron
  212    d4      ū   <umac/ = u macron
  213    d5      ǒ   <ocar/ o with caron (hacek) (inverted
                      circumflex) above 
  214    d6      ǣ   <aemac/ = "ae" ligature with a macron
  215    d7      ōē  <oemac/ = "oe" ligature with a macron
  216    d8      ‖   <par/ double vertical bar (short length; the long
                     length is the graphics character 186)
                     This precedes words marked with a double
		     vertical bar in the original dictionary,
		     signifying that the word was adopted directly
		     into English without modification of the spelling.
  220    dc      ŭ   <ucr/ = u with crescent above - used in some etymologies
  221    dd      ă   <acr/ = a with crescent above - used in some etymologies
  222    de      ˘   <cre/ = "crescent", an upward-curving crescent
                     used as a poetic meter mark
  223    df      ȳ   <ymac/ = y macron (used in Anglo-Saxon?) 
  229    e5      ȧ   <adot/ = a with a dot above (for pronunciations)
  231    e7      h   <th/ (part 2) second of a two-character combination
  197+231 = th       used to represent the th ligature in pronunciations
                     <th/ represents the "th" sound of "mother"
  235    eb      ð   <edh/ = Old English and Icelandic "edh", (or "eth")
                     like a Greek delta with a hatch mark through
		     the ascender. Used to represent the
		     Anglo-Saxon/Icelandic/Gothic character, in
		     etymologies, pronounced like "th"
  237    ed      þ   <thorn/ "thorn", an Old English and Icelandic
                     character, appears like a "p" with an extended
                     ascender.
                     Used to represent the
		     Anglo-Saxon/Icelandic/Gothic character, in
		     etymologies, pronounced like "th" in "thorn"
		     and also as in "brother"
  238    ee      ã   <atil/ a with tilde above - in some etymologies
  244    f4      ȝ   <yogh/ like a script "3" or "z". Used in Old English
                     etymologies, analogous to "y"
  247    f7      ≈   double tilde ("approximately equals").
                     used by typists as a place-holder in word
                     combinations where the capitalized headword
                     should be.
  248    f8      °   <deg/ degrees (temperature or angle).  Note: some
                     typists used a superscript "o" to signify
                     degrees.  This must be corrected!
  249    f9      •   middle dot (bold)
  250    fa      ·   middle dot (light)
  251    fb      √   <root/ "root" sign used in etymologies, as in original

* Additional entities

  <segno/        Represents sheet music sign "Segno" (U+1D109).

* Greek transliteration

Stand-alone Greek letters are represented by entities <alpha/, <beta/,
<gamma/, <lambda/ etc.  Capitalized letters are <ALPHA/, etc.

Text appearing within the markers <grk></grk>, is a Greek transliteration
written in roman letters.  The following rules are used:

** Aspirants

Aspirants are represented by ' (apostrophe) and " (double quote) placed in
front of the letter modified.  Apostrophe stands for ψιλὸν πνεῦμα (ψιλή or
spiritus lenis), and double quote stands for δασὺ πνεῦμα (δασεία or spiritus
asper).

   'a  -- ἀ
   "a  -- ἁ

** Accents

Accents are placed after the accented letter.  The acute accent (ὀξεῖα) is
represented by ` (gravis).  The grave accent (βαρεῖα) is represented by ~
(tilde), and circumflex (περισπωμένη) is represented by circumflex.  Thus:

   a`  -- ά
   a~  -- ὰ
   a^  -- ᾶ
   
Some examples of the combined forms (aspirant + accent):

   'a` -- ἄ
   'a~ -- ἂ
   'a^ -- ἆ
   "a` -- ἄ
   "a~ -- ἂ,
   "a~ -- ἃ

   
** Iota subscriptum

Iota subscript is represented by comma placed after the affected vowel.  If
the vowel is accented, the comma is placed after the accent mark.  For
example:

    a`,  --  ᾴ
    'a`, --  ᾄ

** Diaeresis
    
Diaeresis is represented by a colon immediately after the affected vowel.
If the vowel is accented, the accent is placed after the colon, e.g.:

    i:  --  ϊ
    i:^ --  ῗ
    i:` --  ῒ
    
** Letters

The table below shows, for each Greek letter, the corresponding markup
entity and transliteration.  The capitalized Greek letters are represented
by the capitalized versions of the letters shown here.

-----------------------------------------
  Greek letter    transliteration
------------    ---------------
α   alpha           a
β   beta            b
γ   gamma           g
δ   delta           d
ε   epsilon         e
ζ   zeta            z
η   eta             h
θ   theta           q    [1]
ι   iota            i
κ   kappa           k
λ   lambda          l
μ   mu              m
ν   nu              n
ξ   xi              x
ο   omicron         o
π   pi              p
ρ   rho             r
σ,ς sigma           s    [2]
τ   tau             t
υ   upsilon         y,u  [3]
φ   phi             f
χ   chi             ch   [4]
ψ   psi             ps   [5]
ω   omega           w

---
[1] "th" was used in some earier sections, but was changed due to potential
confusion with the tau+eta combination, as in λυτήριος
(<grk>lyth`rios</grk>, at "lyterian") or ποιητής (<grk>poihth`s</grk>, at
"maker").
[2] Final sigma is not distinguished here from middle sigma, but when
isolated, use <sigmat/ ("terminal sigma") for the final form.
[3] Both y and u are used interchangeably in this edition.
[4] "c" is always followed by "h", so the "h" component is not confusable
with eta.  Applications must first convert "ch" before converting "h", or at
least verify that an "h" to be converted has no preceding "c".
[5] This usage is theoretically confusable with pi-sigma, but that
combination seems never to occur.

Roman "j" and "v" are unused.  Roman "u" is occasionally used instead of "y"
to represent upsilon.

Examples:

<grk>'archai:`zein</grk>   ἀρχαῒζειν
<grk>zw^,on</grk>          ζῷον
<grk>o'i^nos</grk>         οἶνος
<grk>"ydra`rgyros</grk>    ὑδράργυρος


Local Variables:
mode: Outline
coding: utf-8
fill-column: 76
End:

