
10. Character tables (character sets)

To represent the symbols of a language, the computer uses 1 byte = 8 bits, which gives 2^8 = 256 different characters. The ASCII code (American Standard Code for Information Interchange) strictly defines only the first 128 symbols (7 bits). The other half is used for special symbols of other languages and for graphical symbols. Unlike other European languages, Greek lies entirely in the 8-bit half. The obvious reason is that the Greek language has many symbols that differ from those of the rest.
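As a rough illustration of the 7-bit/8-bit split, the following Python sketch encodes a Greek word in ISO 8859-7 (the 928 code page discussed in 10.3) and checks that every letter lands in the upper half of the table, while plain ASCII text stays below 128. Python and the sample words are assumptions made for this example; they do not come from the original text.

```python
# Sketch: Greek letters occupy the 8-bit half (values 128-255) of the code page.
greek = "Ελληνικά".encode("iso8859_7")   # Python's codec name for ISO 8859-7
ascii_text = "Greek".encode("ascii")

greek_in_upper_half = all(b >= 128 for b in greek)
ascii_in_lower_half = all(b < 128 for b in ascii_text)

print(len(greek), "bytes, all >= 128:", greek_in_upper_half)
print(len(ascii_text), "bytes, all < 128:", ascii_in_lower_half)
```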

Additional information about Greek on the Internet can be found in RFC 1947, "Greek Character Encoding for Electronic Mail Messages". See http://andrew2.andrew.cmu.edu/rfc/rfc1947.html

10.1 Greek standards

Greek exists in many different standards. The most common are 737 and 928. Both are for monotonic Greek. 737 is used by DOS, while 928 is used by all UNIX systems and by Windows (with small variations). Linux uses 928 as its main code page. Having two or more standards for Greek is, naturally, a big problem, which is overcome with special converters that change text from one set to the other.
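What such a converter does can be sketched in a few lines of Python, using the `cp737` and `iso8859_7` codecs from Python's standard library. The function name `convert_737_to_928` is hypothetical and this is not the code of any converter listed in 10.5; it only shows the idea of decoding from one table and re-encoding into the other.

```python
def convert_737_to_928(data: bytes) -> bytes:
    """Re-encode bytes from DOS code page 737 to ISO 8859-7 (ELOT 928)."""
    text = data.decode("cp737")
    # Characters with no 928 equivalent (e.g. DOS box-drawing symbols)
    # are replaced with "?" instead of aborting the conversion.
    return text.encode("iso8859_7", errors="replace")

dos_bytes = "Καλημέρα".encode("cp737")
unix_bytes = convert_737_to_928(dos_bytes)
print(dos_bytes.hex(), "->", unix_bytes.hex())
```

The same word is stored as completely different byte values in the two code pages, which is why a 737 file looks like garbage when displayed as 928.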

From Oracle's documentation for Linux and from the server manuals, one can find the widespread Greek standards used in databases (and therefore in the most important computer systems) and their standardized (yet again?) codes:

In addition, OS/2 uses code pages 869 and 851 for Greek.

10.2 737

737 is also known as 437G (= 437 Greek), because it came from a modification of the American 437. It first appeared in the Greek EPROMs of the MDA and Hercules graphics cards of the first PCs, that is, it lived in the HARDWARE. It was used extensively under DOS, so all files that originate there are expected to be 737. Because 737 is now considered a DOS leftover, it is better to convert 737 files to 928; see convertgreek. On Linux, code page 737 is fully supported only on the console (text mode), although a few fonts for X-Windows also exist.

Kernel modification for 737 support

Cases have been reported where "d" (lowercase DELTA) cannot be typed on some kernels. This happens because it coincides with 128+ESC (128+27=155=asc("d")). Go to /usr/src/linux/drivers/char/console.c, where it says:

              && (c != 127 || disp_ctrl)
              && (c != 128+27);
change it to
              && (c != 127 || disp_ctrl)
              /*      && (c != 128+27)*/;
and compile a new kernel.
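The arithmetic behind the collision can be checked with a short Python sketch (Python is an assumption of this example, not something the original uses). It relies on the standard `cp737` codec to look up the position of lowercase delta and compares it with ESC plus 128, which is the 8-bit CSI control code.

```python
ESC = 27
csi = 128 + ESC                    # 155 = 0x9B, the 8-bit control sequence introducer
delta = "δ".encode("cp737")[0]     # position of lowercase delta in code page 737

print(csi, hex(csi), delta == csi)
```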

737 under X-Windows

737 is supported by some of the fixed fonts in the Grafis package: graphis.

[ah@computer.org]'s report for names (from xlsfonts):
-misc-grfixed-medium-r-normal--0-0-75-75-c-0-grpc-737
-misc-grfixed-medium-r-normal--0-0-85-85-m-0-grpc-737
-misc-grfixed-medium-r-normal--14-110-75-75-c-75-grpc-737
-misc-grfixed-medium-r-normal--16-120-75-75-c-75-grpc-737
-misc-grfixed-medium-r-normal--23-179-85-85-m-120-grpc-737
-misc-grfixed-medium-r-semicondensed--0-0-75-75-c-0-grpc-737
-misc-grfixed-medium-r-semicondensed--10-100-75-75-c-60-grpc-737
-misc-grfixed-medium-r-semicondensed--13-120-75-75-c-60-grpc-737
-misc-grvga-medium-r-normal--0-0-75-75-c-0-grpc-737
-misc-grvga-medium-r-normal--13-120-75-75-c-60-grpc-737
  (I think some of them have bugs, and I intend to fix them in the next release.)

10.3 928

Greek 928 is the most modern and most widespread standard, and was originally established by ELOT. It was later accepted by ISO as ISO 8859-7 (Latin/Greek), often loosely called Latin7; even UNICODE's Greek support is based on it. 928 is used in all UNIX applications and on the Internet, and it is today's standard for Linux as well. The 928 standard is supported both on the console (text mode) and in the graphical environment (X-Windows).
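The relationship between 928 and the Unicode Greek block can be observed directly. In the sketch below (Python, with its standard `iso8859_7` codec; the constant name `OFFSET` is introduced here for illustration), every letter from byte 0xC1 to 0xFE maps to a Unicode code point at the same fixed distance. Position 0xD2 is unassigned in ISO 8859-7 and is skipped.

```python
OFFSET = 0x2D0  # observed distance between a 928 byte value and its Unicode code point

mismatches = []
for byte in range(0xC1, 0xFF):
    if byte == 0xD2:              # unassigned position in ISO 8859-7
        continue
    char = bytes([byte]).decode("iso8859_7")
    if ord(char) != byte + OFFSET:
        mismatches.append(hex(byte))

print("mismatches:", mismatches)
```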

Windows-1253

The main deviation of Windows Greek (Windows-1253) from the ELOT 928 standard is the character "A" (accented capital alpha) of 928, whose position corresponds to the paragraph mark under Windows. Windows-1253 is also said to lack the ano teleia (raised dot) and the Greek quotation marks << and >>. Since we inevitably have to accept this limitation imposed by MS-Windows, and since quite a few users work on the wintel platform, it is best to avoid < accented A > when sending e-mails, postings, and so on. Alternatively you can use 'A ( ' = SHIFT+" ). Similar problems exist with 'E and 'O. For your convenience, these are all the accented capitals in 928: AEHIOUW.
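The accented-alpha clash is easy to reproduce. The following Python sketch decodes the same byte with the standard `iso8859_7` and `cp1253` codecs; it illustrates only this one position and is not a full comparison of the two tables.

```python
byte = b"\xb6"
in_928 = byte.decode("iso8859_7")       # accented capital alpha in 928
in_1253 = byte.decode("cp1253")         # paragraph mark (pilcrow) in Windows-1253
alpha_in_1253 = "Ά".encode("cp1253")    # where Windows-1253 keeps the letter instead

print(repr(in_928), repr(in_1253), alpha_in_1253.hex())
```

A message written as 928 but displayed as Windows-1253 therefore shows a paragraph mark wherever the writer typed an accented capital alpha.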

10.4 Unicode

UNICODE (ISO 10646) is 16-bit (i.e. 65536 combinations) and includes many languages, among them modern Greek, which starts at offset 0x370 (U+0370), and ancient Greek at offset 0x1F00 (U+1F00). Everything from modern to ancient (polytonic) Greek, and even Linear B, is supported! Linux supports UNICODE internally, but its use is not yet widespread, because that also depends on applications adopting it. For more, see: http://linuxdoc.org/HOWTO/Unicode-HOWTO.html
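The two offsets can be explored with a small Python sketch. The helper `greek_block` and its labels are hypothetical names chosen for this example; the ranges U+0370 to U+03FF and U+1F00 to U+1FFF are the Unicode blocks that begin at the offsets mentioned above.

```python
import unicodedata

def greek_block(ch: str) -> str:
    """Classify a character by the Unicode Greek block it falls in."""
    cp = ord(ch)
    if 0x0370 <= cp <= 0x03FF:
        return "modern"       # block starting at U+0370
    if 0x1F00 <= cp <= 0x1FFF:
        return "polytonic"    # block starting at U+1F00
    return "other"

for ch in ("\u03b1", "\u03ac", "\u1f00"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), "->", greek_block(ch))
```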

====================================================================
Vasilis Vasaitis <vas@hal.csd.auth.gr>:
 Although I have not dealt with the subject extensively, I can contribute
some knowledge I have on the matter. So, here goes:

  At some point, a long time ago, I downloaded a Unicode fixed font for
X Windows. Since I rarely delete what I download, I found it still sitting
on my disk. This font does not contain the full Unicode, since that consists
of more than 38000 characters, most of which are Chinese/Japanese/Korean
and would not fit in the 6x13 cell of fixed anyway. However, with about 2800
characters (in the version I have, at least) it fully covers the Latin,
Greek, Cyrillic, Armenian, Georgian and Hebrew scripts, plus some technical
and mathematical symbols. This font can serve as a model for anyone
interested in designing fonts with many characters; for more practical
applications, see below. The page of the person who made it, if it is still
the same, is:

  http://www.cl.cam.ac.uk/~mgk25/

        Console support:

  The console has supported Unicode for ages, through UTF8 of course (for
those who do not know, UTF8 is a variable-length representation of Unicode
which, for the first 128 characters, has the same form as ASCII). The catch
is that VGA support for characters displayed at the same time is very
limited in any case (256, or 512 without blinking).

        X support:

  The font I mention above works just fine, and the last time I tried it
was a long time ago. I also happen to have an X server with built-in
TrueType font support (I do not load a font server), and I see that TrueType
fonts work just fine too. For those who do not know, XFree86 4.0 will come
with built-in TrueType support. Microsoft (I have no fonts from any other
company) uses the Windows Glyph List 4 (WGL4) in its fonts, which is a
subset of ISO 10646-1 (more or less what the font I described at the start
contains).

        Applications:

  This is where everything falls apart. At the moment there are a couple of
programs that convert from/to UTF8, yudit and Netscape, which gather glyphs
from here and there in order to find enough Unicode symbols, and beyond
that, chaos. In any case it would be good for an effort on fonts to begin,
and I imagine the applications will try to follow.

---------------

Report from Panagiotis Vrionis:

I know that Giannis Gyftomitros <yang@hellug.gr> has already started
working on the possibility of creating Unicode fonts that also contain
Greek (Project Grafis, see GRArial etc.); he may have progressed even
further...

Since version 6.0, the XFS shipped with Red Hat carries a patch that lets
it display TrueType fonts. See the relevant "White Paper" in the "support"
section of http://www.redhat.com/ . If you install Unicode TrueType fonts
(e.g. those of M$), they work, in the sense that the fonts show up as
available under a thousand and one different encodings. I do not know,
however, whether they also work as a Unicode font, e.g. for viewing a text
with Greek, English and Chinese at the same time in Netscape.

=====================================================================

Unicode Links

There is a fixed font for X Windows; see: http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html

There is also a text editor for Unicode, named Yudit: ftp://metalab.unc.edu/pub/Linux/apps/editors/X/yudit-1.1.tar.gz

UTF-8 is now a standard on the Internet; see the relevant RFC: http://andrew2.andrew.cmu.edu/rfc/rfc2279.html

More about modern Greek in Unicode here: http://charts.unicode.org/Unicode.charts/normal/U0370.html
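UTF-8, described in the report above as a variable-length representation of Unicode that matches ASCII for the first 128 characters, can be seen at work in the following Python sketch. The sample characters are arbitrary choices for this example.

```python
# One ASCII letter, one modern Greek letter, one polytonic letter, one CJK ideograph.
for ch in ("A", "\u03b1", "\u1f00", "\u4e2d"):
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")

# Pure ASCII text is byte-for-byte identical in UTF-8.
ascii_same = "Linux".encode("utf-8") == "Linux".encode("ascii")
print("ASCII unchanged:", ascii_same)
```

Greek text in UTF-8 therefore takes two bytes per modern letter and three per polytonic letter, compared with one byte per letter in 737 or 928.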

10.5 Greek converters

gr2gr

Aggelos Xaritshs <ah@computer.org> wrote this converter: ftp://ftp.hri.org/pub/greek/programs/gr2gr.prl It runs under perl (5 or 4), so it works on any operating system where perl is installed (unix, dos, win32, os2, mac, vms ...).

It supports many different Greek encodings, such as:

grfilter

The Computer Technology Institute hosts grfilter: ftp://ftp.cti.gr/pub/src/grfilter.tar

greek2lat

The directory ftp://corfu.forthnet.gr/pub/greek2lat contains a converter from 928 to Greeklish, which is also suitable for WEB sites.
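A conversion from 928 to Greeklish can be sketched as follows. This is not the code or the mapping table of greek2lat: the function name `to_greeklish` and the table `GREEKLISH` are hypothetical, and the letter choices (h for eta, w for omega, 8 for theta, j for xi, c for psi) are only one of several Greeklish conventions.

```python
import unicodedata

# Hypothetical mapping table; NOT the one used by greek2lat.
GREEKLISH = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "h", "θ": "8", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "j", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "y", "φ": "f", "χ": "x", "ψ": "c",
    "ω": "w",
}

def to_greeklish(data: bytes) -> str:
    """Decode 928 (ISO 8859-7) bytes and transliterate them to Latin letters."""
    text = data.decode("iso8859_7")
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue
        latin = GREEKLISH.get(ch.lower())
        if latin is None:
            out.append(ch)                 # not a Greek letter: keep as-is
        elif ch.isupper():
            out.append(latin.upper())
        else:
            out.append(latin)
    return "".join(out)

print(to_greeklish("Καλημέρα".encode("iso8859_7")))
```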

trans120.tar.gz

Kwstas Kwsths <kosta@kostis.net> has also written this converter, which supports many Greek encodings as well as other languages: http://www.kostis.net/freeware/trans120.tar.gz

gkconv

There is also a program by Giwrgos Sphliwths, which converts between 437, Win95 and X Windows Greek. Its address is unknown.

recode

This is a small general-purpose program from the GNU project that supports conversions for many different languages (Greek included). Perhaps all the other programs should at some point be merged into it. See http://www.delorie.com/gnu/docs/recode/recode_toc.html

10.6 File types and their conversion

.txt, .doc

Depending on the case; see convertgreek.

.dbf

Usually 737. Conversion needs care, so leave it to a guru.

.diz

Usually 737; see convertgreek.

.html

These must be 928, and then they display normally.

.mov, .avi

If it has Greek subtitles, it will be OK :-)

.exe, .com

Throw them away.
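Since the list above only says which code page a file type "usually" uses, a quick guess before converting can help. The Python sketch below is a crude heuristic and an assumption of this example, not a method from the original text or from convertgreek: it relies on 737 keeping most letters in bytes 0x80 to 0xAF, while 928 keeps its letters in 0xB6 to 0xFE and leaves 0x80 to 0x9F to control codes. The function name `guess_greek_codepage` is hypothetical.

```python
def guess_greek_codepage(data: bytes) -> str:
    """Very rough guess: returns '737', '928' or 'unknown'."""
    low = sum(1 for b in data if 0x80 <= b <= 0xAF)    # typical of 737 letters
    high = sum(1 for b in data if 0xB6 <= b <= 0xFE)   # typical of 928 letters
    if low == 0 and high == 0:
        return "unknown"                               # no 8-bit text at all
    return "737" if low > high else "928"

print(guess_greek_codepage("Καλημέρα".encode("cp737")))
print(guess_greek_codepage("Καλημέρα".encode("iso8859_7")))
```

The guess can be wrong for short files or files full of DOS box-drawing characters, so it is worth checking the result by eye before converting anything in bulk.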

10.7 What else is there on the Internet about Greek?

Useful links:

Search for whatever you need with this search engine: http://www.google.org

