Changeset 2594

Show
Ignore:
Timestamp:
06/18/08 02:03:47 (17 months ago)
Author:
fumiakiy
Message:

Updating I18N::LangTags to version 0.35 and Locale::MakeText to version 1.13 for better support of Perl 5.10. BugId:80126

Location:
branches/release-40/extlib
Files:
3 added
5 modified

Legend:

Unmodified
Added
Removed
  • branches/release-40/extlib/I18N/LangTags.pm

    r1098 r2594  
    11 
    2 # Time-stamp: "2002-02-02 20:43:03 MST" 
     2# Time-stamp: "2004-10-06 23:26:33 ADT" 
    33# Sean M. Burke <sburke@cpan.org> 
    44 
     
    1515                locale2language_tag alternate_language_tags 
    1616                encode_language_tag panic_languages 
     17                implicate_supers 
     18                implicate_supers_strictly 
    1719               ); 
    1820%EXPORT_TAGS = ('ALL' => \@EXPORT_OK); 
    1921 
    20 $VERSION = "0.27"; 
     22$VERSION = "0.35"; 
     23 
     24sub uniq { my %seen; return grep(!($seen{$_}++), @_); } # a util function 
     25 
    2126 
    2227=head1 NAME 
     
    2631=head1 SYNOPSIS 
    2732 
    28     use I18N::LangTags qw(is_language_tag same_language_tag 
    29                           extract_language_tags super_languages 
    30                           similarity_language_tag is_dialect_of 
    31                           locale2language_tag alternate_language_tags 
    32                           encode_language_tag panic_languages 
    33                          ); 
    34  
    35 ...or whatever of those functions you want to import.  Those are 
    36 all the exportable functions -- you're free to import only some, 
    37 or none at all.  By default, none are imported.  If you say: 
     33  use I18N::LangTags(); 
     34 
     35...or specify whichever of those functions you want to import, like so: 
     36 
     37  use I18N::LangTags qw(implicate_supers similarity_language_tag); 
     38 
     39All the exportable functions are listed below -- you're free to import 
     40only some, or none at all.  By default, none are imported.  If you 
     41say: 
    3842 
    3943    use I18N::LangTags qw(:ALL) 
     
    400404 
    401405  $lang =~ tr<_><->;  # "en_US" -> en-US 
    402   $lang =~ s<\.[-_a-zA-Z0-9\.]*><>s;  # "en_US.ISO8859-1" -> en-US 
     406  $lang =~ s<(?:[\.\@][-_a-zA-Z0-9]+)+$><>s;  # "en_US.ISO8859-1" -> en-US 
     407   # it_IT.utf8@euro => it-IT 
    403408 
    404409  return $lang if &is_language_tag($lang); 
     
    530535  $tag =~ s/^iw\b/he/i; # Hebrew 
    531536  $tag =~ s/^in\b/id/i; # Indonesian 
     537  $tag =~ s/^cre\b/cr/i; # Cree 
     538  $tag =~ s/^jw\b/jv/i; # Javanese 
    532539  $tag =~ s/^[ix]-lux\b/lb/i;  # Luxemburger 
    533540  $tag =~ s/^[ix]-navajo\b/nv/i;  # Navajo 
    534541  $tag =~ s/^ji\b/yi/i;  # Yiddish 
     542  # SMB 2003 -- Hm.  There's a bunch of new XXX->YY variances now, 
     543  #  but maybe they're all so obscure I can ignore them.   "Obscure" 
     544  #  meaning either that the language is obscure, and/or that the 
     545  #  XXX form was extant so briefly that it's unlikely it was ever 
     546  #  used.  I hope. 
    535547  # 
    536548  # These go FROM the simplex to complex form, to get 
     
    731743} 
    732744 
     745#--------------------------------------------------------------------------- 
     746#--------------------------------------------------------------------------- 
     747 
     748=item * the function implicate_supers( ...languages... ) 
     749 
     750This takes a list of strings (which are presumed to be language-tags; 
     751strings that aren't, are ignored); and after each one, this function 
     752inserts super-ordinate forms that don't already appear in the list. 
     753The original list, plus these insertions, is returned. 
     754 
     755In other words, it takes this: 
     756 
     757  pt-br de-DE en-US fr pt-br-janeiro 
     758 
     759and returns this: 
     760 
     761  pt-br pt de-DE de en-US en fr pt-br-janeiro 
     762 
     763This function is most useful in the idiom 
     764 
     765  implicate_supers( I18N::LangTags::Detect::detect() ); 
     766 
     767(See L<I18N::LangTags::Detect>.) 
     768 
     769 
     770=item * the function implicate_supers_strictly( ...languages... ) 
     771 
     772This works like C<implicate_supers> except that the implicated 
     773forms are added to the end of the return list. 
     774 
     775In other words, implicate_supers_strictly takes a list of strings 
     776(which are presumed to be language-tags; strings that aren't, are 
     777ignored) and after the whole given list, it inserts the super-ordinate forms  
     778of all given tags, minus any tags that already appear in the input list. 
     779 
     780In other words, it takes this: 
     781 
     782  pt-br de-DE en-US fr pt-br-janeiro 
     783 
     784and returns this: 
     785 
     786  pt-br de-DE en-US fr pt-br-janeiro pt de en 
     787 
     788The reason this function has "_strictly" in its name is that when 
     789you're processing an Accept-Language list according to the RFCs, if 
     790you interpret the RFCs quite strictly, then you would use 
     791implicate_supers_strictly, but for normal use (i.e., common-sense use, 
     792as far as I'm concerned) you'd use implicate_supers. 
     793 
     794=cut 
     795 
     796sub implicate_supers { 
     797  my @languages = grep is_language_tag($_), @_; 
     798  my %seen_encoded; 
     799  foreach my $lang (@languages) { 
     800    $seen_encoded{ I18N::LangTags::encode_language_tag($lang) } = 1 
     801  } 
     802 
     803  my(@output_languages); 
     804  foreach my $lang (@languages) { 
     805    push @output_languages, $lang; 
     806    foreach my $s ( I18N::LangTags::super_languages($lang) ) { 
     807      # Note that super_languages returns the longest first. 
     808      last if $seen_encoded{ I18N::LangTags::encode_language_tag($s) }; 
     809      push @output_languages, $s; 
     810    } 
     811  } 
     812  return uniq( @output_languages ); 
     813 
     814} 
     815 
     816sub implicate_supers_strictly { 
     817  my @tags = grep is_language_tag($_), @_; 
     818  return uniq( @_,   map super_languages($_), @_ ); 
     819} 
     820 
     821 
     822 
    733823########################################################################### 
    7348241; 
     
    771861C<http://www.perl.com/CPAN/modules/by-module/Locale/> 
    772862 
    773 * ISO 639, "Code for the representation of names of languages", 
    774 C<http://www.indigo.ie/egt/standards/iso639/iso639-1-en.html> 
    775  
    776863* ISO 639-2, "Codes for the representation of names of languages", 
    777 including three-letter codes, 
    778 C<http://lcweb.loc.gov/standards/iso639-2/bibcodes.html> 
     864including two-letter and three-letter codes, 
     865C<http://www.loc.gov/standards/iso639-2/langcodes.html> 
    779866 
    780867* The IANA list of registered languages (hopefully up-to-date), 
    781 C<ftp://ftp.isi.edu/in-notes/iana/assignments/languages/> 
     868C<http://www.iana.org/assignments/language-tags> 
    782869 
    783870=head1 COPYRIGHT 
    784871 
    785 Copyright (c) 1998-2001 Sean M. Burke. All rights reserved. 
     872Copyright (c) 1998+ Sean M. Burke. All rights reserved. 
    786873 
    787874This library is free software; you can redistribute it and/or 
  • branches/release-40/extlib/I18N/LangTags/List.pm

    r1098 r2594  
    22require 5; 
    33package I18N::LangTags::List; 
    4 #  Time-stamp: "2002-02-02 20:13:58 MST" 
     4#  Time-stamp: "2004-10-06 23:26:21 ADT" 
    55use strict; 
    6 use vars qw(%Name $Debug $VERSION); 
    7 $VERSION = '0.25'; 
     6use vars qw(%Name %Is_Disrec $Debug $VERSION); 
     7$VERSION = '0.35'; 
    88# POD at the end. 
    99 
     
    1313  my $seeking = 1; 
    1414  my $count = 0; 
    15   my($tag,$name); 
     15  my($disrec,$tag,$name); 
     16  my $last_name = ''; 
    1617  while(<I18N::LangTags::List::DATA>) { 
    1718    if($seeking) { 
    1819      $seeking = 0 if m/=for woohah/; 
    19     } else { 
    20       next unless ($tag, $name) = 
    21        m/\{([-0-9a-zA-Z]+)\}(?:\s*:)?\s*([^\[\]]+)/; 
     20    } elsif( ($disrec, $tag, $name) = 
     21          m/(\[?)\{([-0-9a-zA-Z]+)\}(?:\s*:)?\s*([^\[\]]+)/ 
     22    ) { 
    2223      $name =~ s/\s*[;\.]*\s*$//g; 
    2324      next unless $name; 
    2425      ++$count; 
    2526      print "<$tag> <$name>\n" if $Debug; 
    26       $Name{$tag} = $name; 
     27      $last_name = $Name{$tag} = $name; 
     28      $Is_Disrec{$tag} = 1 if $disrec; 
     29    } elsif (m/[Ff]ormerly \"([-a-z0-9]+)\"/) { 
     30      $Name{$1} = "$last_name (old tag)" if $last_name; 
     31      $Is_Disrec{$1} = 1; 
    2732    } 
    2833  } 
     
    7277} 
    7378 
     79#-------------------------------------------------------------------------- 
     80 
     81sub is_decent { 
     82  my $tag = lc($_[0] || return 0); 
     83  #require I18N::LangTags; 
     84 
     85  return 0 unless 
     86    $tag =~  
     87    /^(?:  # First subtag 
     88         [xi] | [a-z]{2,3} 
     89      ) 
     90      (?:  # Subtags thereafter 
     91         -           # separator 
     92         [a-z0-9]{1,8}  # subtag   
     93      )* 
     94    $/xs; 
     95 
     96  my @supers = (); 
     97  foreach my $bit (split('-', $tag)) { 
     98    push @supers,  
     99      scalar(@supers) ? ($supers[-1] . '-' . $bit) : $bit; 
     100  } 
     101  return 0 unless @supers; 
     102  shift @supers if $supers[0] =~ m<^(i|x|sgn)$>s; 
     103  return 0 unless @supers; 
     104 
     105  foreach my $f ($tag, @supers) { 
     106    return 0 if $Is_Disrec{$f}; 
     107    return 2 if $Name{$f}; 
     108     # so that decent subforms of indecent tags are decent 
     109  } 
     110  return 2 if $Name{$tag}; # not only is it decent, it's known! 
     111  return 1; 
     112} 
     113 
     114#-------------------------------------------------------------------------- 
    741151; 
    75116 
     
    103144 
    104145The function I18N::LangTags::List::name(...) is not exported. 
     146 
     147This module also provides a function 
     148C<I18N::LangTags::List::is_decent( I<langtag> )> that returns true iff 
     149the language tag is syntactically valid and is for general use (like 
     150"fr" or "fr-ca", below).  That is, it returns false for tags that are 
     151syntactically invalid and for tags, like "aus", that are listed in 
     152brackets below.  This function is not exported. 
    105153 
    106154The map of tags-to-names that it uses is accessable as 
     
    196244=item {ada} : Adangme 
    197245 
     246=item {ady} : Adyghe 
     247 
     248eq Adygei 
     249 
    198250=item {aa} : Afar 
    199251 
     
    206258=item [{afa} : Afro-Asiatic (Other)] 
    207259 
    208 =item {aka} : Akan 
     260=item {ak} : Akan 
     261 
     262(Formerly "aka".) 
    209263 
    210264=item {akk} : Akkadian 
     
    265319=item {hy} : Armenian 
    266320 
     321=item {an} : Aragonese 
     322 
    267323=item [{art} : Artificial (Other)] 
    268324 
     325=item {ast} : Asturian 
     326 
     327eq Bable. 
     328 
    269329=item {as} : Assamese 
    270330 
     
    277337=item [{map} : Austronesian (Other)] 
    278338 
    279 =item {ava} : Avaric 
     339=item {av} : Avaric 
     340 
     341(Formerly "ava".) 
    280342 
    281343=item {ae} : Avestan 
     
    291353eq Azeri 
    292354 
     355Notable forms: 
     356{az-Arab} Azerbaijani in Arabic script; 
     357{az-Cyrl} Azerbaijani in Cyrillic script; 
     358{az-Latn} Azerbaijani in Latin script. 
     359 
    293360=item {ban} : Balinese 
    294361 
     
    297364=item {bal} : Baluchi 
    298365 
    299 =item {bam} : Bambara 
     366=item {bm} : Bambara 
     367 
     368(Formerly "bam".) 
    300369 
    301370=item [{bai} : Bamileke languages] 
     
    404473 
    405474Many forms are mutually un-intelligible in spoken media. 
    406 Notable subforms: 
     475Notable forms: 
     476{zh-Hans} Chinese, in simplified script; 
     477{zh-Hant} Chinese, in traditional script; 
     478{zh-tw} Taiwan Chinese; 
    407479{zh-cn} PRC Chinese; 
     480{zh-sg} Singapore Chinese; 
     481{zh-mo} Macau Chinese; 
    408482{zh-hk} Hong Kong Chinese; 
    409 {zh-mo} Macau Chinese; 
    410 {zh-sg} Singapore Chinese; 
    411 {zh-tw} Taiwan Chinese; 
    412483{zh-guoyu} Mandarin [Putonghua/Guoyu]; 
    413 {zh-hakka} Hakka [formerly i-hakka]; 
     484{zh-hakka} Hakka [formerly "i-hakka"]; 
    414485{zh-min} Hokkien; 
    415486{zh-min-nan} Southern Hokkien; 
     
    448519eq Corse. 
    449520 
    450 =item {cre} : Cree 
    451  
    452 NOT Creek! 
     521=item {cr} : Cree 
     522 
     523NOT Creek!  (Formerly "cre".) 
    453524 
    454525=item {mus} : Creek 
     
    477548 
    478549=item {da} : Danish 
     550 
     551=item {dar} : Dargwa 
    479552 
    480553=item {day} : Dayak 
     
    492565=item {din} : Dinka 
    493566 
    494 =item {div} : Divehi 
     567=item {dv} : Divehi 
     568 
     569eq Maldivian.  (Formerly "div".) 
    495570 
    496571=item {doi} : Dogri 
     
    556631eq Anglo-Saxon.  (Historical) 
    557632 
     633=item {i-enochian} : Enochian (Artificial) 
     634 
     635=item {myv} : Erzya 
     636 
    558637=item {eo} : Esperanto 
    559638 
     
    562641=item {et} : Estonian 
    563642 
    564 =item {ewe} : Ewe 
     643=item {ee} : Ewe 
     644 
     645(Formerly "ewe".) 
    565646 
    566647=item {ewo} : Ewondo 
     
    604685=item {fur} : Friulian 
    605686 
    606 =item {ful} : Fulah 
     687=item {ff} : Fulah 
     688 
     689(Formerly "ful".) 
    607690 
    608691=item {gaa} : Ga 
     
    616699eq Galician 
    617700 
    618 =item {lug} : Ganda 
     701=item {lg} : Ganda 
     702 
     703(Formerly "lug".) 
    619704 
    620705=item {gay} : Gayo 
     
    680765=item {hai} : Haida 
    681766 
     767=item {ht} : Haitian 
     768 
     769eq Haitian Creole 
     770 
    682771=item {ha} : Hausa 
    683772 
     
    717806=item {is} : Icelandic 
    718807 
    719 =item {ibo} : Igbo 
     808=item {io} : Ido 
     809 
     810(Artificial) 
     811 
     812=item {ig} : Igbo 
     813 
     814(Formerly "ibo".) 
    720815 
    721816=item {ijo} : Ijo 
     
    733828=for etc 
    734829{in} Indonesian (old tag) 
     830 
     831=item {inh} : Ingush 
    735832 
    736833=item {ia} : Interlingua (International Auxiliary Language Association) 
     
    774871(NOT "jp"!) 
    775872 
    776 =item {jw} : Javanese 
     873=item {jv} : Javanese 
     874 
     875(Formerly "jw" because of a typo.) 
    777876 
    778877=item {jrb} : Judeo-Arabic 
     
    780879=item {jpr} : Judeo-Persian 
    781880 
     881=item {kbd} : Kabardian 
     882 
    782883=item {kab} : Kabyle 
    783884 
     
    788889eq Greenlandic "Eskimo" 
    789890 
     891=item {xal} : Kalmyk 
     892 
    790893=item {kam} : Kamba 
    791894 
     
    794897eq Kanarese.  NOT Canadian! 
    795898 
    796 =item {kau} : Kanuri 
     899=item {kr} : Kanuri 
     900 
     901(Formerly "kau".) 
     902 
     903=item {krc} : Karachay-Balkar 
    797904 
    798905=item {kaa} : Kara-Kalpak 
     
    802909=item {ks} : Kashmiri 
    803910 
     911=item {csb} : Kashubian 
     912 
     913eq Kashub 
     914 
    804915=item {kaw} : Kawi 
    805916 
     
    830941=item {kv} : Komi 
    831942 
    832 =item {kon} : Kongo 
     943=item {kg} : Kongo 
     944 
     945(Formerly "kon".) 
    833946 
    834947=item {kok} : Konkani 
     
    878991=item {lb} : Letzeburgesch 
    879992 
    880 eq Luxemburgian, eq Luxemburger.  (Formerly i-lux.) 
     993eq Luxemburgian, eq Luxemburger.  (Formerly "i-lux".) 
    881994 
    882995=for etc 
     
    885998=item {lez} : Lezghian 
    886999 
     1000=item {li} : Limburgish 
     1001 
     1002eq Limburger, eq Limburgan.  NOT Letzeburgesch! 
     1003 
    8871004=item {ln} : Lingala 
    8881005 
     
    8931010eq Low Saxon.  eq Low German.  eq Low Saxon. 
    8941011 
     1012=item {art-lojban} : Lojban (Artificial) 
     1013 
    8951014=item {loz} : Lozi 
    8961015 
    897 =item {lub} : Luba-Katanga 
     1016=item {lu} : Luba-Katanga 
     1017 
     1018(Formerly "lub".) 
    8981019 
    8991020=item {lua} : Luba-Lulua 
     
    9861107=item {moh} : Mohawk 
    9871108 
     1109=item {mdf} : Moksha 
     1110 
    9881111=item {mo} : Moldavian 
    9891112 
     
    10081131=item {nah} : Nahuatl 
    10091132 
     1133=item {nap} : Neapolitan 
     1134 
    10101135=item {na} : Nauru 
    10111136 
    10121137=item {nv} : Navajo 
    10131138 
    1014 eq Navaho.  (Formerly i-navajo.) 
     1139eq Navaho.  (Formerly "i-navajo".) 
    10151140 
    10161141=for etc 
     
    10391164=item {niu} : Niuean 
    10401165 
     1166=item {nog} : Nogai 
     1167 
    10411168=item {non} : Old Norse 
    10421169 
     
    10471174Do not use this. 
    10481175 
    1049 =item {se} : Northern Sami 
    1050  
    1051 eq Lappish.  eq Lapp.  eq (Northern) Saami. 
    1052  
    10531176=item {no} : Norwegian 
    10541177 
     
    10571180=item {nb} : Norwegian Bokmal 
    10581181 
    1059 eq BokmE<aring>l, (A form of Norwegian.)  (Formerly no-bok.) 
     1182eq BokmE<aring>l, (A form of Norwegian.)  (Formerly "no-bok".) 
    10601183 
    10611184=for etc 
     
    10641187=item {nn} : Norwegian Nynorsk 
    10651188 
    1066 (A form of Norwegian.)  (Formerly no-nyn.) 
     1189(A form of Norwegian.)  (Formerly "no-nyn".) 
    10671190 
    10681191=for etc 
     
    10831206eq ProvenE<ccedil>al, eq Provencal 
    10841207 
    1085 =item {oji} : Ojibwa 
    1086  
    1087 eq Ojibwe. 
     1208=item {oj} : Ojibwa 
     1209 
     1210eq Ojibwe.  (Formerly "oji".) 
    10881211 
    10891212=item {or} : Oriya 
     
    12031326NOT Aramaic! 
    12041327 
     1328=item {se} : Northern Sami 
     1329 
     1330eq Lappish.  eq Lapp.  eq (Northern) Saami. 
     1331 
     1332=item {sma} : Southern Sami 
     1333 
     1334=item {smn} : Inari Sami 
     1335 
     1336=item {smj} : Lule Sami 
     1337 
     1338=item {sms} : Skolt Sami 
     1339 
    12051340=item [{smi} : Sami languages (Other)] 
    12061341 
     
    12341369 
    12351370eq Serb.  NOT Sorbian. 
     1371 
     1372Notable forms: 
     1373{sr-Cyrl} : Serbian in Cyrillic script; 
     1374{sr-Latn} : Serbian in Latin script. 
    12361375 
    12371376=item {srr} : Serer 
     
    12501389{sgn-ni} Nicaraguan Sign Language (ISN); 
    12511390{sgn-us} American Sign Language (ASL). 
     1391 
     1392(And so on with other country codes as the subtag.) 
    12521393 
    12531394=item {bla} : Siksika 
     
    14231564=item {tum} : Tumbuka 
    14241565 
     1566=item [{tup} : Tupi languages] 
     1567 
    14251568=item {tr} : Turkish 
    14261569 
     
    14311574(Typically in Arabic script)  (Historical) 
    14321575 
     1576=item {crh} : Crimean Turkish 
     1577 
     1578eq Crimean Tatar 
     1579 
    14331580=item {tk} : Turkmen 
    14341581 
     
    14431590=item {tw} : Twi 
    14441591 
     1592=item {udm} : Udmurt 
     1593 
    14451594=item {uga} : Ugaritic 
    14461595 
     
    14631612eq E<Ouml>zbek 
    14641613 
     1614Notable forms: 
     1615{uz-Cyrl} Uzbek in Cyrillic script; 
     1616{uz-Latn} Uzbek in Latin script. 
     1617 
    14651618=item {vai} : Vai 
    14661619 
    1467 =item {ven} : Venda 
    1468  
    1469 NOT Wendish!  NOT Wend!  NOT Avestan! 
     1620=item {ve} : Venda 
     1621 
     1622NOT Wendish!  NOT Wend!  NOT Avestan!  (Formerly "ven".) 
    14701623 
    14711624=item {vi} : Vietnamese 
     
    14821635 
    14831636=item [{wak} : Wakashan languages] 
     1637 
     1638=item {wa} : Walloon 
    14841639 
    14851640=item {wal} : Walamo 
     
    15181673eq Yap 
    15191674 
     1675=item {ii} : Sichuan Yi 
     1676 
    15201677=item {yi} : Yiddish 
    15211678 
    1522 Formerly "ji".  Sometimes in Roman script, sometimes in Hebrew script. 
    1523  
    1524 =for etc 
    1525 {ji} Yiddish (old tag) 
     1679Formerly "ji".  Usually in Hebrew script. 
     1680 
     1681Notable forms: 
     1682{yi-latn} Yiddish in Latin script 
    15261683 
    15271684=item {yo} : Yoruba 
     
    15591716=head1 COPYRIGHT AND DISCLAIMER 
    15601717 
    1561 Copyright (c) 2001,2002 Sean M. Burke. All rights reserved. 
     1718Copyright (c) 2001+ Sean M. Burke. All rights reserved. 
    15621719 
    15631720You can redistribute and/or 
  • branches/release-40/extlib/Locale/Maketext.pm

    r1098 r2594  
    1  
    2 # Time-stamp: "2001-06-21 23:09:33 MDT" 
    3  
    4 require 5; 
    51package Locale::Maketext; 
    62use strict; 
    73use vars qw( @ISA $VERSION $MATCH_SUPERS $USING_LANGUAGE_TAGS 
    8              $USE_LITERALS); 
     4$USE_LITERALS $MATCH_SUPERS_TIGHTLY); 
    95use Carp (); 
    10 use I18N::LangTags 0.21 (); 
     6use I18N::LangTags 0.30 (); 
    117 
    128#-------------------------------------------------------------------------- 
    139 
    1410BEGIN { unless(defined &DEBUG) { *DEBUG = sub () {0} } } 
    15  # define the constant 'DEBUG' at compile-time 
    16  
    17 $VERSION = "1.03"; 
     11# define the constant 'DEBUG' at compile-time 
     12 
     13$VERSION = '1.13'; 
    1814@ISA = (); 
    1915 
    2016$MATCH_SUPERS = 1; 
    21 $USING_LANGUAGE_TAGS = 1; 
    22  # Turning this off is somewhat of a security risk in that little or no 
    23  # checking will be done on the legality of tokens passed to the 
    24  # eval("use $module_name") in _try_use.  If you turn this off, you have 
    25  # to do your own taint checking. 
     17$MATCH_SUPERS_TIGHTLY = 1; 
     18$USING_LANGUAGE_TAGS  = 1; 
     19# Turning this off is somewhat of a security risk in that little or no 
     20# checking will be done on the legality of tokens passed to the 
     21# eval("use $module_name") in _try_use.  If you turn this off, you have 
     22# to do your own taint checking. 
    2623 
    2724$USE_LITERALS = 1 unless defined $USE_LITERALS; 
    28  # a hint for compiling bracket-notation things. 
     25# a hint for compiling bracket-notation things. 
    2926 
    3027my %isa_scan = (); 
     
    3330 
    3431sub quant { 
    35   my($handle, $num, @forms) = @_; 
    36  
    37   return $num if @forms == 0; # what should this mean? 
    38   return $forms[2] if @forms > 2 and $num == 0; # special zeroth case 
    39  
    40   # Normal case: 
    41   # Note that the formatting of $num is preserved. 
    42   return( $handle->numf($num) . ' ' . $handle->numerate($num, @forms) ); 
    43    # Most human languages put the number phrase before the qualified phrase. 
     32    my($handle, $num, @forms) = @_; 
     33 
     34    return $num if @forms == 0; # what should this mean? 
     35    return $forms[2] if @forms > 2 and $num == 0; # special zeroth case 
     36 
     37    # Normal case: 
     38    # Note that the formatting of $num is preserved. 
     39    return( $handle->numf($num) . ' ' . $handle->numerate($num, @forms) ); 
     40    # Most human languages put the number phrase before the qualified phrase. 
    4441} 
    4542 
    4643 
    4744sub numerate { 
    48  # return this lexical item in a form appropriate to this number 
    49   my($handle, $num, @forms) = @_; 
    50   my $s = ($num == 1); 
    51  
    52   return '' unless @forms; 
    53   if(@forms == 1) { # only the headword form specified 
    54     return $s ? $forms[0] : ($forms[0] . 's'); # very cheap hack. 
    55   } else { # sing and plural were specified 
    56     return $s ? $forms[0] : $forms[1]; 
    57   } 
     45    # return this lexical item in a form appropriate to this number 
     46    my($handle, $num, @forms) = @_; 
     47    my $s = ($num == 1); 
     48 
     49    return '' unless @forms; 
     50    if(@forms == 1) { # only the headword form specified 
     51        return $s ? $forms[0] : ($forms[0] . 's'); # very cheap hack. 
     52    } 
     53    else { # sing and plural were specified 
     54        return $s ? $forms[0] : $forms[1]; 
     55    } 
    5856} 
    5957 
     
    6159 
    6260sub numf { 
    63   my($handle, $num) = @_[0,1]; 
    64   if($num < 10_000_000_000 and $num > -10_000_000_000 and $num == int($num)) { 
    65     $num += 0;  # Just use normal integer stringification. 
    66          # Specifically, don't let %G turn ten million into 1E+007 
    67   } else { 
    68     $num = CORE::sprintf("%G", $num); 
    69      # "CORE::" is there to avoid confusion with the above sub sprintf. 
    70   } 
    71   while( $num =~ s/^([-+]?\d+)(\d{3})/$1,$2/s ) {1}  # right from perlfaq5 
    72    # The initial \d+ gobbles as many digits as it can, and then we 
    73    #  backtrack so it un-eats the rightmost three, and then we 
    74    #  insert the comma there. 
    75  
    76   $num =~ tr<.,><,.> if ref($handle) and $handle->{'numf_comma'}; 
    77    # This is just a lame hack instead of using Number::Format 
    78   return $num; 
     61    my($handle, $num) = @_[0,1]; 
     62    if($num < 10_000_000_000 and $num > -10_000_000_000 and $num == int($num)) { 
     63        $num += 0;  # Just use normal integer stringification. 
     64        # Specifically, don't let %G turn ten million into 1E+007 
     65    } 
     66    else { 
     67        $num = CORE::sprintf('%G', $num); 
     68        # "CORE::" is there to avoid confusion with the above sub sprintf. 
     69    } 
     70    while( $num =~ s/^([-+]?\d+)(\d{3})/$1,$2/s ) {1}  # right from perlfaq5 
     71    # The initial \d+ gobbles as many digits as it can, and then we 
     72    #  backtrack so it un-eats the rightmost three, and then we 
     73    #  insert the comma there. 
     74 
     75    $num =~ tr<.,><,.> if ref($handle) and $handle->{'numf_comma'}; 
     76    # This is just a lame hack instead of using Number::Format 
     77    return $num; 
    7978} 
    8079 
    8180sub sprintf { 
    82   no integer; 
    83   my($handle, $format, @params) = @_; 
    84   return CORE::sprintf($format, @params); 
     81    no integer; 
     82    my($handle, $format, @params) = @_; 
     83    return CORE::sprintf($format, @params); 
    8584    # "CORE::" is there to avoid confusion with myself! 
    8685} 
     
    9190 
    9291sub language_tag { 
    93   my $it = ref($_[0]) || $_[0]; 
    94   return undef unless $it =~ m/([^':]+)(?:::)?$/s; 
    95   $it = lc($1); 
    96   $it =~ tr<_><->; 
    97   return $it; 
     92    my $it = ref($_[0]) || $_[0]; 
     93    return undef unless $it =~ m/([^':]+)(?:::)?$/s; 
     94    $it = lc($1); 
     95    $it =~ tr<_><->; 
     96    return $it; 
    9897} 
    9998 
    10099sub encoding { 
    101   my $it = $_[0]; 
    102   return( 
    103    (ref($it) && $it->{'encoding'}) 
    104    || "iso-8859-1"   # Latin-1 
    105   ); 
    106 }  
     100    my $it = $_[0]; 
     101    return( 
     102        (ref($it) && $it->{'encoding'}) 
     103        || 'iso-8859-1'   # Latin-1 
     104    ); 
     105} 
    107106 
    108107#-------------------------------------------------------------------------- 
     
    115114 
    116115sub fail_with { # an actual attribute method! 
    117   my($handle, @params) = @_; 
    118   return unless ref($handle); 
    119   $handle->{'fail'} = $params[0] if @params; 
    120   return $handle->{'fail'}; 
     116    my($handle, @params) = @_; 
     117    return unless ref($handle); 
     118    $handle->{'fail'} = $params[0] if @params; 
     119    return $handle->{'fail'}; 
    121120} 
    122121 
     
    124123 
    125124sub failure_handler_auto { 
    126   # Meant to be used like: 
    127   #  $handle->fail_with('failure_handler_auto') 
    128  
    129   my($handle, $phrase, @params) = @_; 
    130   $handle->{'failure_lex'} ||= {}; 
    131   my $lex = $handle->{'failure_lex'}; 
    132  
    133   my $value; 
    134   $lex->{$phrase} ||= ($value = $handle->_compile($phrase)); 
    135  
    136   # Dumbly copied from sub maketext: 
    137   { 
    138     local $SIG{'__DIE__'}; 
    139     eval { $value = &$value($handle, @_) }; 
    140   } 
    141   # If we make it here, there was an exception thrown in the 
    142   #  call to $value, and so scream: 
    143   if($@) { 
    144     my $err = $@; 
    145     # pretty up the error message 
    146     $err =~ s<\s+at\s+\(eval\s+\d+\)\s+line\s+(\d+)\.?\n?> 
    147              <\n in bracket code [compiled line $1],>s; 
    148     #$err =~ s/\n?$/\n/s; 
    149     Carp::croak "Error in maketexting \"$phrase\":\n$err as used"; 
    150     # Rather unexpected, but suppose that the sub tried calling 
    151     # a method that didn't exist. 
    152   } else { 
    153     return $value; 
    154   } 
     125    # Meant to be used like: 
     126    #  $handle->fail_with('failure_handler_auto') 
     127 
     128    my $handle = shift; 
     129    my $phrase = shift; 
     130 
     131    $handle->{'failure_lex'} ||= {}; 
     132    my $lex = $handle->{'failure_lex'}; 
     133 
     134    my $value; 
     135    $lex->{$phrase} ||= ($value = $handle->_compile($phrase)); 
     136 
     137    # Dumbly copied from sub maketext: 
     138    return ${$value} if ref($value) eq 'SCALAR'; 
     139    return $value    if ref($value) ne 'CODE'; 
     140    { 
     141        local $SIG{'__DIE__'}; 
     142        eval { $value = &$value($handle, @_) }; 
     143    } 
     144    # If we make it here, there was an exception thrown in the 
     145    #  call to $value, and so scream: 
     146    if($@) { 
     147        my $err = $@; 
     148        # pretty up the error message 
     149        $err =~ s{\s+at\s+\(eval\s+\d+\)\s+line\s+(\d+)\.?\n?} 
     150                 {\n in bracket code [compiled line $1],}s; 
     151        #$err =~ s/\n?$/\n/s; 
     152        Carp::croak "Error in maketexting \"$phrase\":\n$err as used"; 
     153        # Rather unexpected, but suppose that the sub tried calling 
     154        # a method that didn't exist. 
     155    } 
     156    else { 
     157        return $value; 
     158    } 
    155159} 
    156160 
     
    158162 
    159163sub new { 
    160   # Nothing fancy! 
    161   my $class = ref($_[0]) || $_[0]; 
    162   my $handle = bless {}, $class; 
    163   $handle->init; 
    164   return $handle; 
     164    # Nothing fancy! 
     165    my $class = ref($_[0]) || $_[0]; 
     166    my $handle = bless {}, $class; 
     167    $handle->init; 
     168    return $handle; 
    165169} 
    166170 
     
    170174 
    171175sub maketext { 
    172   # Remember, this can fail.  Failure is controllable many ways. 
    173   Carp::croak "maketext requires at least one parameter" unless @_ > 1; 
    174  
    175   my($handle, $phrase) = splice(@_,0,2); 
    176  
    177   # Look up the value: 
    178  
    179   my $value; 
    180   foreach my $h_r ( 
    181     @{  $isa_scan{ref($handle) || $handle} || $handle->_lex_refs  } 
    182   ) { 
    183     print "* Looking up \"$phrase\" in $h_r\n" if DEBUG; 
    184     if(exists $h_r->{$phrase}) { 
    185       print "  Found \"$phrase\" in $h_r\n" if DEBUG; 
    186       unless(ref($value = $h_r->{$phrase})) { 
    187         # Nonref means it's not yet compiled.  Compile and replace. 
    188         $value = $h_r->{$phrase} = $handle->_compile($value); 
    189       } 
    190       last; 
    191     } elsif($phrase !~ m/^_/s and $h_r->{'_AUTO'}) { 
    192       # it's an auto lex, and this is an autoable key! 
    193       print "  Automaking \"$phrase\" into $h_r\n" if DEBUG; 
    194        
    195       $value = $h_r->{$phrase} = $handle->_compile($phrase); 
    196       last; 
    197     } 
    198     print "  Not found in $h_r, nor automakable\n" if DEBUG > 1; 
    199     # else keep looking 
    200   } 
    201  
    202   unless(defined($value)) { 
    203     print "! Lookup of \"$phrase\" in/under ", ref($handle) || $handle, 
    204       " fails.\n" if DEBUG; 
    205     if(ref($handle) and $handle->{'fail'}) { 
    206       print "WARNING0: maketext fails looking for <$phrase>\n" if DEBUG; 
    207       my $fail; 
    208       if(ref($fail = $handle->{'fail'}) eq 'CODE') { # it's a sub reference 
    209         return &{$fail}($handle, $phrase, @_); 
    210          # If it ever returns, it should return a good value. 
    211       } else { # It's a method name 
    212         return $handle->$fail($phrase, @_); 
    213          # If it ever returns, it should return a good value. 
    214       } 
    215     } else { 
    216       # All we know how to do is this; 
    217       Carp::croak("maketext doesn't know how to say:\n$phrase\nas needed"); 
    218     } 
    219   } 
    220  
    221   return $$value if ref($value) eq 'SCALAR'; 
    222   return $value unless ref($value) eq 'CODE'; 
    223    
    224   { 
    225     local $SIG{'__DIE__'}; 
    226     eval { $value = &$value($handle, @_) }; 
    227   } 
    228   # If we make it here, there was an exception thrown in the 
    229   #  call to $value, and so scream: 
    230   if($@) { 
    231     my $err = $@; 
    232     # pretty up the error message 
    233     $err =~ s<\s+at\s+\(eval\s+\d+\)\s+line\s+(\d+)\.?\n?> 
    234              <\n in bracket code [compiled line $1],>s; 
    235     #$err =~ s/\n?$/\n/s; 
    236     Carp::croak "Error in maketexting \"$phrase\":\n$err as used"; 
    237     # Rather unexpected, but suppose that the sub tried calling 
    238     # a method that didn't exist. 
    239   } else { 
    240     return $value; 
    241   } 
     176    # Remember, this can fail.  Failure is controllable many ways. 
     177    Carp::croak 'maketext requires at least one parameter' unless @_ > 1; 
     178 
     179    my($handle, $phrase) = splice(@_,0,2); 
     180    Carp::confess('No handle/phrase') unless (defined($handle) && defined($phrase)); 
     181 
     182 
     183    # Don't interefere with $@ in case that's being interpolated into the msg. 
     184    local $@; 
     185 
     186    # Look up the value: 
     187 
     188    my $value; 
     189    foreach my $h_r ( 
     190        @{  $isa_scan{ref($handle) || $handle} || $handle->_lex_refs  } 
     191    ) { 
     192        DEBUG and warn "* Looking up \"$phrase\" in $h_r\n"; 
     193        if(exists $h_r->{$phrase}) { 
     194            DEBUG and warn "  Found \"$phrase\" in $h_r\n"; 
     195            unless(ref($value = $h_r->{$phrase})) { 
     196                # Nonref means it's not yet compiled.  Compile and replace. 
     197                $value = $h_r->{$phrase} = $handle->_compile($value); 
     198            } 
     199            last; 
     200        } 
     201        elsif($phrase !~ m/^_/s and $h_r->{'_AUTO'}) { 
     202            # it's an auto lex, and this is an autoable key! 
     203            DEBUG and warn "  Automaking \"$phrase\" into $h_r\n"; 
     204 
     205            $value = $h_r->{$phrase} = $handle->_compile($phrase); 
     206            last; 
     207        } 
     208        DEBUG>1 and print "  Not found in $h_r, nor automakable\n"; 
     209        # else keep looking 
     210    } 
     211 
     212    unless(defined($value)) { 
     213        DEBUG and warn "! Lookup of \"$phrase\" in/under ", ref($handle) || $handle, " fails.\n"; 
     214        if(ref($handle) and $handle->{'fail'}) { 
     215            DEBUG and warn "WARNING0: maketext fails looking for <$phrase>\n"; 
     216            my $fail; 
     217            if(ref($fail = $handle->{'fail'}) eq 'CODE') { # it's a sub reference 
     218                return &{$fail}($handle, $phrase, @_); 
     219                # If it ever returns, it should return a good value. 
     220            } 
     221            else { # It's a method name 
     222                return $handle->$fail($phrase, @_); 
     223                # If it ever returns, it should return a good value. 
     224            } 
     225        } 
     226        else { 
     227            # All we know how to do is this; 
     228            Carp::croak("maketext doesn't know how to say:\n$phrase\nas needed"); 
     229        } 
     230    } 
     231 
     232    return $$value if ref($value) eq 'SCALAR'; 
     233    return $value unless ref($value) eq 'CODE'; 
     234 
     235    { 
     236        local $SIG{'__DIE__'}; 
     237        eval { $value = &$value($handle, @_) }; 
     238    } 
     239    # If we make it here, there was an exception thrown in the 
     240    #  call to $value, and so scream: 
     241    if ($@) { 
     242        my $err = $@; 
     243        # pretty up the error message 
     244        $err =~ s{\s+at\s+\(eval\s+\d+\)\s+line\s+(\d+)\.?\n?} 
     245                 {\n in bracket code [compiled line $1],}s; 
     246        #$err =~ s/\n?$/\n/s; 
     247        Carp::croak "Error in maketexting \"$phrase\":\n$err as used"; 
     248        # Rather unexpected, but suppose that the sub tried calling 
     249        # a method that didn't exist. 
     250    } 
     251    else { 
     252        return $value; 
     253    } 
    242254} 
    243255 
     
    245257 
    246258sub get_handle {  # This is a constructor and, yes, it CAN FAIL. 
    247   # Its class argument has to be the base class for the current 
    248   # application's l10n files. 
    249   my($base_class, @languages) = @_; 
    250   $base_class = ref($base_class) || $base_class; 
    251    # Complain if they use __PACKAGE__ as a project base class? 
    252  
    253   unless(@languages) {  # Calling with no args is magical!  wooo, magic! 
    254     if(length( $ENV{'REQUEST_METHOD'} || '' )) { # I'm a CGI 
    255       my $in = $ENV{'HTTP_ACCEPT_LANGUAGE'} || ''; 
    256         # supposedly that works under mod_perl, too. 
    257       $in =~ s<\([\)]*\)><>g; # Kill parens'd things -- just a hack. 
    258       @languages = &I18N::LangTags::extract_language_tags($in) if length $in; 
    259         # ...which untaints, incidentally. 
    260        
    261     } else { # Not running as a CGI: try to puzzle out from the environment 
    262       if(length( $ENV{'LANG'} || '' )) { 
    263         push @languages, split m/[,:]/, $ENV{'LANG'}; 
    264          # LANG can be only /one/ locale as far as I know, but what the hey. 
    265       } 
    266       if(length( $ENV{'LANGUAGE'} || '' )) { 
    267         push @languages, split m/[,:]/, $ENV{'LANGUAGE'}; 
    268       } 
    269       print "Noting ENV LANG ", join(',', @languages),"\n" if DEBUG; 
    270       # Those are really locale IDs, but they get xlated a few lines down. 
    271        
    272       if(&_try_use('Win32::Locale')) { 
    273         # If we have that module installed... 
    274         push @languages, Win32::Locale::get_language() 
    275          if defined &Win32::Locale::get_language; 
    276       } 
    277     } 
    278   } 
    279  
    280   #------------------------------------------------------------------------ 
    281   print "Lgs1: ", map("<$_>", @languages), "\n" if DEBUG; 
    282  
    283   if($USING_LANGUAGE_TAGS) { 
    284     @languages = map &I18N::LangTags::locale2language_tag($_), @languages; 
    285      # if it's a lg tag, fine, pass thru (untainted) 
    286      # if it's a locale ID, try converting to a lg tag (untainted), 
    287      # otherwise nix it. 
    288  
    289     push @languages, map I18N::LangTags::super_languages($_), @languages 
    290      if $MATCH_SUPERS; 
    291  
    292     @languages =  map { $_, I18N::LangTags::alternate_language_tags($_) } 
    293                       @languages;    # catch alternation 
    294  
    295     push @languages, I18N::LangTags::panic_languages(@languages) 
    296       if defined &I18N::LangTags::panic_languages; 
    297      
    298     push @languages, $base_class->fallback_languages; 
    299      # You are free to override fallback_languages to return empty-list! 
    300  
    301     @languages =  # final bit of processing: 
    302       map { 
    303         my $it = $_;  # copy 
    304         $it =~ tr<-A-Z><_a-z>; # lc, and turn - to _ 
    305         $it =~ tr<_a-z0-9><>cd;  # remove all but a-z0-9_ 
    306         $it; 
    307       } @languages 
    308     ; 
    309   } 
    310   print "Lgs2: ", map("<$_>", @languages), "\n" if DEBUG > 1; 
    311  
    312   push @languages, $base_class->fallback_language_classes; 
    313    # You are free to override that to return whatever. 
    314  
    315  
    316   my %seen = (); 
    317   foreach my $module_name ( map { $base_class . "::" . $_ }  @languages ) 
    318   { 
    319     next unless length $module_name; # sanity 
    320     next if $seen{$module_name}++        # Already been here, and it was no-go 
    321             || !&_try_use($module_name); # Try to use() it, but can't it. 
    322     return($module_name->new); # Make it! 
    323   } 
    324  
    325   return undef; # Fail! 
     259    # Its class argument has to be the base class for the current 
     260    # application's l10n files. 
     261 
     262    my($base_class, @languages) = @_; 
     263    $base_class = ref($base_class) || $base_class; 
     264    # Complain if they use __PACKAGE__ as a project base class? 
     265 
     266    if( @languages ) { 
     267        DEBUG and warn 'Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     268        if($USING_LANGUAGE_TAGS) {   # An explicit language-list was given! 
     269            @languages = 
     270            map {; $_, I18N::LangTags::alternate_language_tags($_) } 
     271            # Catch alternation 
     272            map I18N::LangTags::locale2language_tag($_), 
     273            # If it's a lg tag, fine, pass thru (untainted) 
     274            # If it's a locale ID, try converting to a lg tag (untainted), 
     275            # otherwise nix it. 
     276            @languages; 
     277            DEBUG and warn 'Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     278        } 
     279    } 
     280    else { 
     281        @languages = $base_class->_ambient_langprefs; 
     282    } 
     283 
     284    @languages = $base_class->_langtag_munging(@languages); 
     285 
     286    my %seen; 
     287    foreach my $module_name ( map { $base_class . '::' . $_ }  @languages ) { 
     288        next unless length $module_name; # sanity 
     289        next if $seen{$module_name}++        # Already been here, and it was no-go 
     290        || !&_try_use($module_name); # Try to use() it, but can't it. 
     291        return($module_name->new); # Make it! 
     292    } 
     293 
     294    return undef; # Fail! 
     295} 
     296 
     297########################################################################### 
     298 
     299sub _langtag_munging { 
     300    my($base_class, @languages) = @_; 
     301 
     302    # We have all these DEBUG statements because otherwise it's hard as hell 
     303    # to diagnose ifwhen something goes wrong. 
     304 
     305    DEBUG and warn 'Lgs1: ', map("<$_>", @languages), "\n"; 
     306 
     307    if($USING_LANGUAGE_TAGS) { 
     308        DEBUG and warn 'Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     309        @languages     = $base_class->_add_supers( @languages ); 
     310 
     311        push @languages, I18N::LangTags::panic_languages(@languages); 
     312        DEBUG and warn "After adding panic languages:\n", 
     313        ' Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     314 
     315        push @languages, $base_class->fallback_languages; 
     316        # You are free to override fallback_languages to return empty-list! 
     317        DEBUG and warn 'Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     318 
     319        @languages =  # final bit of processing to turn them into classname things 
     320        map { 
     321            my $it = $_;  # copy 
     322            $it =~ tr<-A-Z><_a-z>; # lc, and turn - to _ 
     323            $it =~ tr<_a-z0-9><>cd;  # remove all but a-z0-9_ 
     324            $it; 
     325        } @languages 
     326        ; 
     327        DEBUG and warn "Nearing end of munging:\n", 
     328        ' Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     329    } 
     330    else { 
     331        DEBUG and warn "Bypassing language-tags.\n", 
     332        ' Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     333    } 
     334 
     335    DEBUG and warn "Before adding fallback classes:\n", 
     336    ' Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     337 
     338    push @languages, $base_class->fallback_language_classes; 
     339    # You are free to override that to return whatever. 
     340 
     341    DEBUG and warn "Finally:\n", 
     342    ' Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     343 
     344    return @languages; 
     345} 
     346 
     347########################################################################### 
     348 
     349sub _ambient_langprefs { 
     350    require I18N::LangTags::Detect; 
     351    return  I18N::LangTags::Detect::detect(); 
     352} 
     353 
     354########################################################################### 
     355 
     356sub _add_supers { 
     357    my($base_class, @languages) = @_; 
     358 
     359    if (!$MATCH_SUPERS) { 
     360        # Nothing 
     361        DEBUG and warn "Bypassing any super-matching.\n", 
     362        ' Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     363 
     364    } 
     365    elsif( $MATCH_SUPERS_TIGHTLY ) { 
     366        DEBUG and warn "Before adding new supers tightly:\n", 
     367        ' Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     368        @languages = I18N::LangTags::implicate_supers( @languages ); 
     369        DEBUG and warn "After adding new supers tightly:\n", 
     370        ' Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     371 
     372    } 
     373    else { 
     374        DEBUG and warn "Before adding supers to end:\n", 
     375        ' Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     376        @languages = I18N::LangTags::implicate_supers_strictly( @languages ); 
     377        DEBUG and warn "After adding supers to end:\n", 
     378        ' Lgs@', __LINE__, ': ', map("<$_>", @languages), "\n"; 
     379    } 
     380 
     381    return @languages; 
    326382} 
    327383 
     
    332388########################################################################### 
    333389 
    334 sub _compile { 
    335   # This big scarp routine compiles an entry. 
    336   # It returns either a coderef if there's brackety bits in this, or 
    337   #  otherwise a ref to a scalar. 
    338    
    339   my $target = ref($_[0]) || $_[0]; 
    340    
    341   my(@code); 
    342   my(@c) = (''); # "chunks" -- scratch. 
    343   my $call_count = 0; 
    344   my $big_pile = ''; 
    345   { 
    346     my $in_group = 0; # start out outside a group 
    347     my($m, @params); # scratch 
    348      
    349     while($_[1] =~  # Iterate over chunks. 
    350      m<\G( 
    351        [^\~\[\]]+  # non-~[] stuff 
    352        | 
    353        ~.       # ~[, ~], ~~, ~other 
    354        | 
    355        \[          # [ presumably opening a group 
    356        | 
    357        \]          # ] presumably closing a group 
    358        | 
    359        ~           # terminal ~ ? 
    360        | 
    361        $ 
    362      )>xgs 
    363     ) { 
    364       print "  \"$1\"\n" if DEBUG > 2; 
    365  
    366       if($1 eq '[' or $1 eq '') {       # "[" or end 
    367         # Whether this is "[" or end, force processing of any 
    368         #  preceding literal. 
    369         if($in_group) { 
    370           if($1 eq '') { 
    371             $target->_die_pointing($_[1], "Unterminated bracket group"); 
    372           } else { 
    373             $target->_die_pointing($_[1], "You can't nest bracket groups"); 
    374           } 
    375         } else { 
    376           if($1 eq '') { 
    377             print "   [end-string]\n" if DEBUG > 2; 
    378           } else { 
    379             $in_group = 1; 
    380           } 
    381           die "How come \@c is empty?? in <$_[1]>" unless @c; # sanity 
    382           if(length $c[-1]) { 
    383             # Now actually processing the preceding literal 
    384             $big_pile .= $c[-1]; 
    385             if($USE_LITERALS and ( 
    386               (ord('A') == 65) 
    387                ? $c[-1] !~ m<[^\x20-\x7E]>s 
    388                   # ASCII very safe chars 
    389                : $c[-1] !~ m/[^ !"\#\$%&'()*+,\-.\/0-9:;<=>?\@A-Z[\\\]^_`a-z{|}~\x07]/s 
    390                   # EBCDIC very safe chars 
    391             )) { 
    392               # normal case -- all very safe chars 
    393               $c[-1] =~ s/'/\\'/g; 
    394               push @code, q{ '} . $c[-1] . "',\n"; 
    395               $c[-1] = ''; # reuse this slot 
    396             } else { 
    397               push @code, ' $c[' . $#c . "],\n"; 
    398               push @c, ''; # new chunk 
    399             } 
    400           } 
    401            # else just ignore the empty string. 
    402         } 
    403  
    404       } elsif($1 eq ']') {  # "]" 
    405         # close group -- go back in-band 
    406         if($in_group) { 
    407           $in_group = 0; 
    408            
    409           print "   --Closing group [$c[-1]]\n" if DEBUG > 2; 
    410            
    411           # And now process the group... 
    412            
    413           if(!length($c[-1]) or $c[-1] =~ m/^\s+$/s) { 
    414             DEBUG > 2 and print "   -- (Ignoring)\n"; 
    415             $c[-1] = ''; # reset out chink 
    416             next; 
    417           } 
    418            
    419            #$c[-1] =~ s/^\s+//s; 
    420            #$c[-1] =~ s/\s+$//s; 
    421           ($m,@params) = split(",", $c[-1], -1);  # was /\s*,\s*/ 
    422            
    423           # A bit of a hack -- we've turned "~,"'s into DELs, so turn 
    424           #  'em into real commas here. 
    425           if (ord('A') == 65) { # ASCII, etc 
    426             foreach($m, @params) { tr/\x7F/,/ }  
    427           } else {              # EBCDIC (1047, 0037, POSIX-BC) 
    428             # Thanks to Peter Prymmer for the EBCDIC handling 
    429             foreach($m, @params) { tr/\x07/,/ }  
    430           } 
    431            
    432           # Special-case handling of some method names: 
    433           if($m eq '_*' or $m =~ m<^_(-?\d+)$>s) { 
    434             # Treat [_1,...] as [,_1,...], etc. 
    435             unshift @params, $m; 
    436             $m = ''; 
    437           } elsif($m eq '*') { 
    438             $m = 'quant'; # "*" for "times": "4 cars" is 4 times "cars" 
    439           } elsif($m eq '#') { 
    440             $m = 'numf';  # "#" for "number": [#,_1] for "the number _1" 
    441           } 
    442  
    443           # Most common case: a simple, legal-looking method name 
    444           if($m eq '') { 
    445             # 0-length method name means to just interpolate: 
    446             push @code, ' ('; 
    447           } elsif($m =~ m<^\w+(?:\:\:\w+)*$>s 
    448             and $m !~ m<(?:^|\:)\d>s 
    449              # exclude starting a (sub)package or symbol with a digit  
    450           ) { 
    451             # Yes, it even supports the demented (and undocumented?) 
    452             #  $obj->Foo::bar(...) syntax. 
    453             $target->_die_pointing( 
    454               $_[1], "Can't (yet?) use \"SUPER::\" in a bracket-group method", 
    455               2 + length($c[-1]) 
    456             ) 
    457              if $m =~ m/^SUPER::/s; 
    458               # Because for SUPER:: to work, we'd have to compile this into 
    459               #  the right package, and that seems just not worth the bother, 
    460               #  unless someone convinces me otherwise. 
    461              
    462             push @code, ' $_[0]->' . $m . '('; 
    463           } else { 
    464             # TODO: implement something?  or just too icky to consider? 
    465             $target->_die_pointing( 
    466              $_[1], 
    467              "Can't use \"$m\" as a method name in bracket group", 
    468              2 + length($c[-1]) 
    469             ); 
    470           } 
    471            
    472           pop @c; # we don't need that chunk anymore 
    473           ++$call_count; 
    474            
    475           foreach my $p (@params) { 
    476             if($p eq '_*') { 
    477               # Meaning: all parameters except $_[0] 
    478               $code[-1] .= ' @_[1 .. $#_], '; 
    479                # and yes, that does the right thing for all @_ < 3 
    480             } elsif($p =~ m<^_(-?\d+)$>s) { 
    481               # _3 meaning $_[3] 
    482               $code[-1] .= '$_[' . (0 + $1) . '], '; 
    483             } elsif($USE_LITERALS and ( 
    484               (ord('A') == 65) 
    485                ? $p !~ m<[^\x20-\x7E]>s 
    486                   # ASCII very safe chars 
    487                : $p !~ m/[^ !"\#\$%&'()*+,\-.\/0-9:;<=>?\@A-Z[\\\]^_`a-z{|}~\x07]/s 
    488                   # EBCDIC very safe chars             
    489             )) { 
    490               # Normal case: a literal containing only safe characters 
    491               $p =~ s/'/\\'/g; 
    492               $code[-1] .= q{'} . $p . q{', }; 
    493             } else { 
    494               # Stow it on the chunk-stack, and just refer to that. 
    495               push @c, $p; 
    496               push @code, ' $c[' . $#c . "], "; 
    497             } 
    498           } 
    499           $code[-1] .= "),\n"; 
    500  
    501           push @c, ''; 
    502         } else { 
    503           $target->_die_pointing($_[1], "Unbalanced ']'"); 
    504         } 
    505          
    506       } elsif(substr($1,0,1) ne '~') { 
    507         # it's stuff not containing "~" or "[" or "]" 
    508         # i.e., a literal blob 
    509         $c[-1] .= $1; 
    510          
    511       } elsif($1 eq '~~') { # "~~" 
    512         $c[-1] .= '~'; 
    513          
    514       } elsif($1 eq '~[') { # "~[" 
    515         $c[-1] .= '['; 
    516          
    517       } elsif($1 eq '~]') { # "~]" 
    518         $c[-1] .= ']'; 
    519  
    520       } elsif($1 eq '~,') { # "~," 
    521         if($in_group) { 
    522           # This is a hack, based on the assumption that no-one will actually 
    523           # want a DEL inside a bracket group.  Let's hope that's it's true. 
    524           if (ord('A') == 65) { # ASCII etc 
    525             $c[-1] .= "\x7F"; 
    526           } else {              # EBCDIC (cp 1047, 0037, POSIX-BC) 
    527             $c[-1] .= "\x07"; 
    528           } 
    529         } else { 
    530           $c[-1] .= '~,'; 
    531         } 
    532          
    533       } elsif($1 eq '~') { # possible only at string-end, it seems. 
    534         $c[-1] .= '~'; 
    535          
    536       } else { 
    537         # It's a "~X" where X is not a special character. 
    538         # Consider it a literal ~ and X. 
    539         $c[-1] .= $1; 
    540       } 
    541     } 
    542   } 
    543  
    544   if($call_count) { 
    545     undef $big_pile; # Well, nevermind that. 
    546   } else { 
    547     # It's all literals!  Ahwell, that can happen. 
    548     # So don't bother with the eval.  Return a SCALAR reference. 
    549     return \$big_pile; 
    550   } 
    551  
    552   die "Last chunk isn't null??" if @c and length $c[-1]; # sanity 
    553   print scalar(@c), " chunks under closure\n" if DEBUG; 
    554   if(@code == 0) { # not possible? 
    555     print "Empty code\n" if DEBUG; 
    556     return \''; 
    557   } elsif(@code > 1) { # most cases, presumably! 
    558     unshift @code, "join '',\n"; 
    559   } 
    560   unshift @code, "use strict; sub {\n"; 
    561   push @code, "}\n"; 
    562  
    563   print @code if DEBUG; 
    564   my $sub = eval(join '', @code); 
    565   die "$@ while evalling" . join('', @code) if $@; # Should be impossible. 
    566   return $sub; 
    567 } 
    568  
    569 # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
    570  
    571 sub _die_pointing { 
    572   # This is used by _compile to throw a fatal error 
    573   my $target = shift; # class name 
    574   # ...leaving $_[0] the error-causing text, and $_[1] the error message 
    575    
    576   my $i = index($_[0], "\n"); 
    577  
    578   my $pointy; 
    579   my $pos = pos($_[0]) - (defined($_[2]) ? $_[2] : 0) - 1; 
    580   if($pos < 1) { 
    581     $pointy = "^=== near there\n"; 
    582   } else { # we need to space over 
    583     my $first_tab = index($_[0], "\t"); 
    584     if($pos > 2 and ( -1 == $first_tab  or  $first_tab > pos($_[0]))) { 
    585       # No tabs, or the first tab is harmlessly after where we will point to, 
    586       # AND we're far enough from the margin that we can draw a proper arrow. 
    587       $pointy = ('=' x $pos) . "^ near there\n"; 
    588     } else { 
    589       # tabs screw everything up! 
    590       $pointy = substr($_[0],0,$pos); 
    591       $pointy =~ tr/\t //cd; 
    592        # make everything into whitespace, but preseving tabs 
    593       $pointy .= "^=== near there\n"; 
    594     } 
    595   } 
    596    
    597   my $errmsg = "$_[1], in\:\n$_[0]"; 
    598    
    599   if($i == -1) { 
    600     # No newline. 
    601     $errmsg .= "\n" . $pointy; 
    602   } elsif($i == (length($_[0]) - 1)  ) { 
    603     # Already has a newline at end. 
    604     $errmsg .= $pointy; 
    605   } else { 
    606     # don't bother with the pointy bit, I guess. 
    607   } 
    608   Carp::croak( "$errmsg via $target, as used" ); 
    609 } 
     390use Locale::Maketext::GutsLoader; 
    610391 
    611392########################################################################### 
    612393 
    613394my %tried = (); 
    614   # memoization of whether we've used this module, or found it unusable. 
     395# memoization of whether we've used this module, or found it unusable. 
    615396 
    616397sub _try_use {   # Basically a wrapper around "require Modulename" 
    617   # "Many men have tried..."  "They tried and failed?"  "They tried and died." 
    618   return $tried{$_[0]} if exists $tried{$_[0]};  # memoization 
    619  
    620   my $module = $_[0];   # ASSUME sane module name! 
    621   { no strict 'refs'; 
    622     return($tried{$module} = 1) 
    623      if defined(%{$module . "::Lexicon"}) or defined(@{$module . "::ISA"}); 
    624     # weird case: we never use'd it, but there it is! 
    625   } 
    626  
    627   print " About to use $module ...\n" if DEBUG; 
    628   { 
    629     local $SIG{'__DIE__'}; 
    630     eval "require $module"; # used to be "use $module", but no point in that. 
    631   } 
    632   if($@) { 
    633     print "Error using $module \: $@\n" if DEBUG > 1; 
    634     return $tried{$module} = 0; 
    635   } else { 
    636     print " OK, $module is used\n" if DEBUG; 
    637     return $tried{$module} = 1; 
    638   } 
     398    # "Many men have tried..."  "They tried and failed?"  "They tried and died." 
     399    return $tried{$_[0]} if exists $tried{$_[0]};  # memoization 
     400 
     401    my $module = $_[0];   # ASSUME sane module name! 
     402    { no strict 'refs'; 
     403        return($tried{$module} = 1) 
     404        if defined(%{$module . '::Lexicon'}) or defined(@{$module . '::ISA'}); 
     405        # weird case: we never use'd it, but there it is! 
     406    } 
     407 
     408    DEBUG and warn " About to use $module ...\n"; 
     409    { 
     410        local $SIG{'__DIE__'}; 
     411        eval "require $module"; # used to be "use $module", but no point in that. 
     412    } 
     413    if($@) { 
     414        DEBUG and warn "Error using $module \: $@\n"; 
     415        return $tried{$module} = 0; 
     416    } 
     417    else { 
     418        DEBUG and warn " OK, $module is used\n"; 
     419        return $tried{$module} = 1; 
     420    } 
    639421} 
    640422 
     
    642424 
    643425sub _lex_refs {  # report the lexicon references for this handle's class 
    644   # returns an arrayREF! 
    645   no strict 'refs'; 
    646   my $class = ref($_[0]) || $_[0]; 
    647   print "Lex refs lookup on $class\n" if DEBUG > 1; 
    648   return $isa_scan{$class} if exists $isa_scan{$class};  # memoization! 
    649  
    650   my @lex_refs; 
    651   my $seen_r = ref($_[1]) ? $_[1] : {}; 
    652  
    653   if( defined( *{$class . '::Lexicon'}{'HASH'} )) { 
    654     push @lex_refs, *{$class . '::Lexicon'}{'HASH'}; 
    655     print "%" . $class . "::Lexicon contains ", 
    656          scalar(keys %{$class . '::Lexicon'}), " entries\n" if DEBUG; 
    657   } 
    658  
    659   # Implements depth(height?)-first recursive searching of superclasses. 
    660   # In hindsight, I suppose I could have just used Class::ISA! 
    661   foreach my $superclass (@{$class . "::ISA"}) { 
    662     print " Super-class search into $superclass\n" if DEBUG; 
    663     next if $seen_r->{$superclass}++; 
    664     push @lex_refs, @{&_lex_refs($superclass, $seen_r)};  # call myself 
    665   } 
    666  
    667   $isa_scan{$class} = \@lex_refs; # save for next time 
    668   return \@lex_refs; 
     426    # returns an arrayREF! 
     427    no strict 'refs'; 
     428    no warnings 'once'; 
     429    my $class = ref($_[0]) || $_[0]; 
     430    DEBUG and warn "Lex refs lookup on $class\n"; 
     431    return $isa_scan{$class} if exists $isa_scan{$class};  # memoization! 
     432 
     433    my @lex_refs; 
     434    my $seen_r = ref($_[1]) ? $_[1] : {}; 
     435 
     436    if( defined( *{$class . '::Lexicon'}{'HASH'} )) { 
     437        push @lex_refs, *{$class . '::Lexicon'}{'HASH'}; 
     438        DEBUG and warn '%' . $class . '::Lexicon contains ', 
     439            scalar(keys %{$class . '::Lexicon'}), " entries\n"; 
     440    } 
     441 
     442    # Implements depth(height?)-first recursive searching of superclasses. 
     443    # In hindsight, I suppose I could have just used Class::ISA! 
     444    foreach my $superclass (@{$class . '::ISA'}) { 
     445        DEBUG and warn " Super-class search into $superclass\n"; 
     446        next if $seen_r->{$superclass}++; 
     447        push @lex_refs, @{&_lex_refs($superclass, $seen_r)};  # call myself 
     448    } 
     449 
     450    $isa_scan{$class} = \@lex_refs; # save for next time 
     451    return \@lex_refs; 
    669452} 
    670453 
    671454sub clear_isa_scan { %isa_scan = (); return; } # end on a note of simplicity! 
    672455 
    673 ########################################################################### 
    6744561; 
    675  
  • branches/release-40/extlib/Locale/Maketext.pod

    r1098 r2594  
    11 
    2 # Time-stamp: "2001-06-21 23:12:39 MDT" 
     2# Time-stamp: "2004-01-11 18:35:34 AST" 
    33 
    44=head1 NAME 
    55 
    6 Locale::Maketext -- framework for localization 
     6Locale::Maketext - framework for localization 
    77 
    88=head1 SYNOPSIS 
     
    119119that succeeds, returns YourProjClass::I<language>->new(). 
    120120 
    121 It runs thru the entire given list of language-tags, and finds no classes 
     121If it runs thru the entire given list of language-tags, and finds no classes 
    122122for those exact terms, it then tries "superordinate" language classes. 
    123123So if no "en-US" class (i.e., YourProjClass::en_us) 
     
    147147Otherwise (i.e., if not a CGI), this tries various OS-specific ways 
    148148to get the language-tags for the current locale/language, and then 
    149 pretends that those were the value(s) passed to C<cet_handle>. 
     149pretends that those were the value(s) passed to C<get_handle>. 
    150150 
    151151Currently this OS-specific stuff consists of looking in the environment 
     
    163163  sub get_handle_via_config { 
    164164    my $class = $_[0]; 
    165     my $preferred_language = $Config_settings{'language'}; 
     165    my $chosen_language = $Config_settings{'language'}; 
    166166    my $lh; 
    167     if($preferred_language) { 
     167    if($chosen_language) { 
    168168      $lh = $class->get_handle($chosen_language) 
    169169       || die "No language handle for \"$chosen_language\" or the like"; 
     
    232232This is the most important method in Locale::Maketext: 
    233233 
    234 $text = $lh->maketext(I<key>, ...parameters for this phrase...); 
     234    $text = $lh->maketext(I<key>, ...parameters for this phrase...); 
    235235 
    236236This looks in the %Lexicon of the language handle 
     
    242242If the value is a scalarref, the scalar is dereferenced and returned 
    243243(and any parameters are ignored). 
     244 
    244245If the value is a coderef, we return &$value($lh, ...parameters...). 
     246 
    245247If the value is a string that I<doesn't> look like it's in Bracket Notation, 
    246248we return it (after replacing it with a scalarref, in its %Lexicon). 
     249 
    247250If the value I<does> look like it's in Bracket Notation, then we compile 
    248251it into a sub, replace the string in the %Lexicon with the new coderef, 
     
    326329 
    327330It's for I<quantifying> a noun (i.e., saying how much of it there is, 
    328 while giving the currect form of it).  The behavior of this method is 
     331while giving the correct form of it).  The behavior of this method is 
    329332handy for English and a few other Western European languages, and you 
    330333should override it for languages where it's not suitable.  You can feel 
     
    348351for 1 it returns "1 file", and for more it returns "2 files", etc.) 
    349352 
    350 But for "directory", you'd want C<"[quant,_1,direcory,directories]"> 
     353But for "directory", you'd want C<"[quant,_1,directory,directories]"> 
    351354so that our elementary C<quant> method doesn't think that the 
    352355plural of "directory" is "directorys".  And you might find that the 
     
    449452the call to YourProjClass->get_handle(...).  It should derive 
    450453(whether directly or indirectly) from Locale::Maketext. 
    451 It B<doesn't matter> how you name this class, altho assuming this 
     454It B<doesn't matter> how you name this class, although assuming this 
    452455is the localization component of your Super Mega Program, 
    453456good names for your project class might be 
     
    461464It will look for them by taking each language-tag (B<skipping> it 
    462465if it doesn't look like a language-tag or locale-tag!), turning it to 
    463 all lowercase, turning and dashes to underscores, and appending it 
     466all lowercase, turning dashes to underscores, and appending it 
    464467to YourProjClass . "::".  So this: 
    465468 
     
    484487=item * 
    485488 
    486 Language classes may derive from other language classes (altho they 
     489Language classes may derive from other language classes (although they 
    487490should have "use I<Thatclassname>" or "use base qw(I<...classes...>)"). 
    488491They may derive from the project 
     
    512515While the key must be a string value (since that's a basic 
    513516restriction that Perl places on hash keys), the value in 
    514 the lexicon can currenly be of several types: 
     517the lexicon can currently be of several types: 
    515518a defined scalar, scalarref, or coderef.  The use of these is 
    516519explained above, in the section 'The "maketext" Method', and 
     
    567570valid lexicon values.  One notable exception is when the value is 
    568571quite long.  For example, to get the screenful of data that 
    569 a command-line program might returns when given an unknown switch, 
    570 I often just use a key "_USAGE_MESSAGE".  At that point I then go 
     572a command-line program might return when given an unknown switch, 
     573I often just use a brief, self-explanatory key such as "_USAGE_MESSAGE".  At that point I then go 
    571574and immediately to define that lexicon entry in the 
    572575ProjectClass::L10N::en lexicon (since English is always my "project 
    573 lanuage"): 
     576language"): 
    574577 
    575578  '_USAGE_MESSAGE' => <<'EOSTUFF', 
     
    586589special hashes I<per se>, but because you access them via the 
    587590C<maketext> method, which looks for entries across all the 
    588 C<%Lexicon>'s in a language class I<and> all its ancestor classes. 
     591C<%Lexicon> hashes in a language class I<and> all its ancestor classes. 
    589592(This is because the idea of "class data" isn't directly implemented 
    590593in Perl, but is instead left to individual class-systems to implement 
     
    594597besides just phrases for output:  for example, if your program 
    595598takes input from the keyboard, asking a "(Y/N)" question, 
    596 you probably need to know what equivalent of "Y[es]/N[o]" is 
     599you probably need to know what the equivalent of "Y[es]/N[o]" is 
    597600in whatever language.  You probably also need to know what 
    598601the equivalents of the answers "y" and "n" are.  You can 
     
    605608Or instead of storing this in the language class's lexicon, 
    606609you can (and, in some cases, really should) represent the same bit 
    607 of knowledge as code is a method in the language class.  (That 
     610of knowledge as code in a method in the language class.  (That 
    608611leaves a tidy distinction between the lexicon as the things we 
    609612know how to I<say>, and the rest of the things in the lexicon class 
     
    658661 
    659662Bracket Notation is a crucial feature of Locale::Maketext.  I mean 
    660 Bracket Notation to provide a replacement for sprintf formatting. 
     663Bracket Notation to provide a replacement for the use of sprintf formatting. 
    661664Everything you do with Bracket Notation could be done with a sub block, 
    662665but bracket notation is meant to be much more concise. 
     
    664667Bracket Notation is a like a miniature "template" system (in the sense 
    665668of L<Text::Template|Text::Template>, not in the sense of C++ templates), 
    666 where normal text is passed thru basically as is, but text is special 
    667 regions is specially interpreted.  In Bracket Notation, you use brackets 
    668 ("[...]" -- not "{...}"!) to note sections that are specially interpreted. 
     669where normal text is passed thru basically as is, but text in special 
     670regions is specially interpreted.  In Bracket Notation, you use square brackets ("[...]"), 
     671not curly braces ("{...}") to note sections that are specially interpreted. 
    669672 
    670673For example, here all the areas that are taken literally are underlined with 
     
    707710 
    708711An item that is "_I<digits>" or "_-I<digits>" is interpreted as 
    709 $_[I<value>].  I.e., "_1" is becomes with $_[1], and "_-3" is interpreted 
     712$_[I<value>].  I.e., "_1" becomes with $_[1], and "_-3" is interpreted 
    710713as $_[-3] (in which case @_ should have at least three elements in it). 
    711714Note that $_[0] is the language handle, and is typically not named 
     
    747750=item * 
    748751 
    749 If the first item in a bracket group is empty-string, or "_*" 
     752If the first item in a bracket group is the empty-string, or "_*" 
    750753or "_I<digits>" or "_-I<digits>", then that group is interpreted 
    751754as just the interpolation of all its items: 
     
    756759 
    757760Examples:  "[_1]" and "[,_1]", which are synonymous; and 
    758 "[,ID-(,_4,-,_2,)]", which compiles as 
     761"C<[,ID-(,_4,-,_2,)]>", which compiles as 
    759762C<join "", "ID-(", $_[4], "-", $_[2], ")">. 
    760763 
     
    762765 
    763766Otherwise this bracket group is invalid.  For example, in the group 
    764 "[!@#,whatever]", the first item C<"!@#"> is neither empty-string, 
     767"[!@#,whatever]", the first item C<"!@#"> is neither the empty-string, 
    765768"_I<number>", "_-I<number>", "_*", nor a valid method name; and so 
    766769Locale::Maketext will throw an exception of you try compiling an 
     
    791794    return join '', 
    792795      "Hoohah ", 
    793       $lh->foo(" _1 ", " bar ", "baz"),  #!!! 
     796      $lh->foo(" _1 ", " bar ", "baz"),  # note the <space> in " bar " 
    794797      "!", 
    795798  } 
     
    812815Currently, an unescaped "~" before a character 
    813816other than a bracket or a comma is taken to mean just a "~" and that 
    814 charecter.  I.e., "~X" means the same as "~~X" -- i.e., one literal tilde, 
     817character.  I.e., "~X" means the same as "~~X" -- i.e., one literal tilde, 
    815818and then one literal "X".  However, by using "~X", you are assuming that 
    816819no future version of Maketext will use "~X" as a magic escape sequence. 
     
    869872I can picture all sorts of circumstances where you just 
    870873do not want lookup to be able to fail (since failing 
    871 normally means that maketext throws a C<die>, altho 
     874normally means that maketext throws a C<die>, although 
    872875see the next section for greater control over that).  But 
    873876here's one circumstance where _AUTO lexicons are meant to 
     
    880883    go_process_file($filename) 
    881884  } else { 
    882     print "Couldn't find file \"$filename\"!\n"; 
     885    print qq{Couldn't find file "$filename"!\n}; 
    883886  } 
    884887 
     
    895898  } else { 
    896899    print $lh->maketext( 
    897       "Couldn't find file \"[_1]\"!\n", $filename 
     900      qq{Couldn't find file "[_1]"!\n}, $filename 
    898901    ); 
    899902  } 
     
    943946of its lexicons have C<_AUTO =E<gt> 1,>), then we have 
    944947failed to find a normal way to maketext I<key>.  What then 
    945 happens in these failure conditions, depends on the $lh object 
     948happens in these failure conditions, depends on the $lh object's 
    946949"fail" attribute. 
    947950 
     
    954957coderef, then $lh->maketext(I<key>,...params...) gives up and calls: 
    955958 
    956   return &{$that_subref}($lh, $key, @params); 
     959  return $that_subref->($lh, $key, @params); 
    957960 
    958961Otherwise, the "fail" attribute's value should be a string denoting 
     
    973976  $lh->fail_with( undef ); 
    974977   
    975   # Simply read: 
     978  # Get the current value 
    976979  $handler = $lh->fail_with(); 
    977980 
     
    985988 
    986989  # Make all lookups fall back onto an English value, 
    987   #  but after we log it for later fingerpointing. 
     990  #  but only after we log it for later fingerpointing. 
    988991  my $lh_backup = ThisProject->get_handle('en'); 
    989992  open(LEX_FAIL_LOG, ">>wherever/lex.log") || die "GNAARGH $!"; 
     
    10051008the "fail" attribute) to treat lookup failure as something other than 
    10061009an exception of the same level of severity as a config file being 
    1007 unreadable, or some essential resource being inaccessable. 
     1010unreadable, or some essential resource being inaccessible. 
    10081011 
    10091012One possibly useful value for the "fail" attribute is the method name 
    1010 "failure_handler_auto".  This is a method defined in class 
     1013"failure_handler_auto".  This is a method defined in the class 
    10111014Locale::Maketext itself.  You set it with: 
    10121015 
     
    10191022 
    10201023But failure_handler_auto, instead of dying or anything, compiles 
    1021 $key, caching it in $lh->{'failure_lex'}{$key} = $complied, 
     1024$key, caching it in 
     1025 
     1026    $lh->{'failure_lex'}{$key} = $complied 
     1027 
    10221028and then calls the compiled value, and returns that.  (I.e., if 
    10231029$key looks like bracket notation, $compiled is a sub, and we return 
     
    10761082  my $lh = Projname::L10N->get_handle(...) || die "Language?"; 
    10771083 
    1078 Assuming your call your class Projname::L10N, create a class 
     1084Assuming you call your class Projname::L10N, create a class 
    10791085consisting minimally of: 
    10801086 
     
    11531159=item * 
    11541160 
    1155 You may at this point want to consider whether the your base class  
    1156 (Projname::L10N) that all lexicons inherit from (Projname::L10N::en, 
    1157 Projname::L10N::es, etc.) should be an _AUTO lexicon.  It may be true 
     1161You may at this point want to consider whether your base class  
     1162(Projname::L10N), from which all lexicons inherit from (Projname::L10N::en, 
     1163Projname::L10N::es, etc.), should be an _AUTO lexicon.  It may be true 
    11581164that in theory, all needed messages will be in each language class; 
    11591165but in the presumably unlikely or "impossible" case of lookup failure, 
     
    11681174 
    11691175(You may, in fact, want to start with localizing to I<one> other language 
    1170 at first, if you're not sure that you've property abstracted the 
     1176at first, if you're not sure that you've properly abstracted the 
    11711177language-dependent parts of your code.) 
    11721178 
     
    12001206appropriate.  Typical variables in number formatting are:  what to 
    12011207use as a decimal point (comma? period?); what to use as a thousands 
    1202 separator (space? nonbreakinng space? comma? period? small 
     1208separator (space? nonbreaking space? comma? period? small 
    12031209middot? prime? apostrophe?); and even whether the so-called "thousands 
    12041210separator" is actually for every third digit -- I've heard reports of 
    1205 two hundred thousand being expressable as "2,00,000" for some Indian 
     1211two hundred thousand being expressible as "2,00,000" for some Indian 
    12061212(Subcontinental) languages, besides the less surprising "S<200 000>", 
    12071213"200.000", "200,000", and "200'000".  Also, using a set of numeral 
     
    12721278Journal> article about Maketext.  It explains many important concepts 
    12731279underlying Locale::Maketext's design, and some insight into why 
    1274 Maketext is better than the plain old approach of just having  
     1280Maketext is better than the plain old approach of having  
    12751281message catalogs that are just databases of sprintf formats. 
    12761282 
    12771283L<File::Findgrep|File::Findgrep> is a sample application/module 
    1278 that uses Locale::Maketext to localize its messages. 
     1284that uses Locale::Maketext to localize its messages.  For a larger 
     1285internationalized system, see also L<Apache::MP3>. 
    12791286 
    12801287L<I18N::LangTags|I18N::LangTags>. 
     
    13041311=head1 COPYRIGHT AND DISCLAIMER 
    13051312 
    1306 Copyright (c) 1999-2001 Sean M. Burke.  All rights reserved. 
     1313Copyright (c) 1999-2004 Sean M. Burke.  All rights reserved. 
    13071314 
    13081315This library is free software; you can redistribute it and/or modify 
     
    13181325 
    13191326=cut 
    1320  
    1321 # Zing! 
  • branches/release-40/extlib/Locale/Maketext/TPJ13.pod

    r1098 r2594  
    1  
    21# This document contains text in Perl "POD" format. 
    32# Use a POD viewer like perldoc or perlman to render it. 
     
    1413 
    1514The following article by Sean M. Burke and Jordan Lachler 
    16 first appeared in I<The Perl 
    17 Journal> #13 and is copyright 1999 The Perl Journal. It appears 
     15first appeared in I<The Perl Journal> #13 
     16and is copyright 1999 The Perl Journal. It appears 
    1817courtesy of Jon Orwant and The Perl Journal.  This document may be 
    1918distributed under the same terms as Perl itself. 
     
    4948  Your query matched 10 files in 4 directories. 
    5049 
    51 So how hard could that be?  You look at the code that produces 
     50So how hard could that be?  You look at the code that 
    5251produces the first item, and it reads: 
    5352 
     
    8281         $dir_scan_count, 
    8382         $dir_scan_count == 1 ? 
    84            gettext("directory") : gettext("directory"), 
     83           gettext("directory") : gettext("directories"), 
    8584  ); 
    8685 
     
    150149Chinese guy replies with the one phrase that these all translate to in 
    151150Chinese, and that phrase has two "%g"s in it, as it should -- but 
    152 there's a problem.  He translates it word-for-word back: "To your 
    153 question, in %g directories you would find %g answers."  The "%g" 
     151there's a problem.  He translates it word-for-word back: "In %g 
     152directories contains %g files match your query."  The %g 
    154153slots are in an order reverse to what they are in English.  You wonder 
    155154how you'll get gettext to handle that. 
     
    207206He elaborates:  In "I scanned %g directories", you'd I<expect> 
    208207"directories" to be in the accusative case (since it is the direct 
    209 object in the sentnce) and the plural number, 
     208object in the sentence) and the plural number, 
    210209except where $directory_count is 1, then you'd expect the singular, of 
    211210course.  Just like Latin or German.  I<But!>  Where $directory_count % 
     
    223222noun, when preceded by a number and in the nominative or accusative 
    224223cases (as it is here, just your luck!), it does stay plural, but it is 
    225 forced into the genitive case -- yet another another ending...  And 
     224forced into the genitive case -- yet another ending...  And 
    226225you never hear him get to the part about how you're going to run into 
    227226similar (but maybe subtly different) problems with other Slavic