source: www/manuals/PHP_manual/ref.mbstring.html@ 1

1<HTML
2><HEAD
3><TITLE
4>Multi-Byte String Functions</TITLE
5><META
6NAME="GENERATOR"
7CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
8REL="HOME"
TITLE="PHP Manual"
10HREF="index.html"><LINK
11REL="UP"
TITLE="Function Reference"
13HREF="funcref.html"><LINK
14REL="PREVIOUS"
15TITLE="tanh"
16HREF="function.tanh.html"><LINK
17REL="NEXT"
18TITLE="mb_convert_case"
19HREF="function.mb-convert-case.html"><META
20HTTP-EQUIV="Content-type"
21CONTENT="text/html; charset=ISO-8859-2"></HEAD
22><BODY
23CLASS="reference"
24BGCOLOR="#FFFFFF"
25TEXT="#000000"
26LINK="#0000FF"
27VLINK="#840084"
28ALINK="#0000FF"
29><DIV
30CLASS="NAVHEADER"
31><TABLE
32SUMMARY="Header navigation table"
33WIDTH="100%"
34BORDER="0"
35CELLPADDING="0"
36CELLSPACING="0"
37><TR
38><TH
39COLSPAN="3"
40ALIGN="center"
>PHP Manual</TH
42></TR
43><TR
44><TD
45WIDTH="10%"
46ALIGN="left"
47VALIGN="bottom"
48><A
49HREF="function.tanh.html"
50ACCESSKEY="P"
>Previous</A
52></TD
53><TD
54WIDTH="80%"
55ALIGN="center"
56VALIGN="bottom"
57></TD
58><TD
59WIDTH="10%"
60ALIGN="right"
61VALIGN="bottom"
62><A
63HREF="function.mb-convert-case.html"
64ACCESSKEY="N"
>Next</A
66></TD
67></TR
68></TABLE
69><HR
70ALIGN="LEFT"
71WIDTH="100%"></DIV
72><DIV
73CLASS="reference"
74><A
75NAME="ref.mbstring"
76></A
77><DIV
78CLASS="TITLEPAGE"
79><H1
80CLASS="title"
81>LII. Multi-Byte String Functions</H1
82><DIV
83CLASS="PARTINTRO"
84><A
85NAME="AEN41368"
86></A
87><DIV
88CLASS="section"
89><H1
90CLASS="section"
91><A
92NAME="mbstring.intro"
93></A
>Introduction</H1
95><P
> Many languages can express every character in a single byte, but
 others need multi-byte character encodings to cover their full
 character set. <TT
CLASS="literal"
>mbstring</TT
>
 was developed to handle Japanese characters; however, many
 <TT
CLASS="literal"
>mbstring</TT
> functions can also handle
 character encodings other than Japanese.
108 </P
109><P
> A multi-byte character encoding represents a single character with
 two or more consecutive bytes. Some encodings also use shift (escape)
 sequences to start and end runs of multi-byte characters. A
 multi-byte string can therefore be corrupted when it is split
 or counted byte by byte instead of with an encoding-aware method.
 This module provides multi-byte safe string
 functions and utility functions such as encoding conversion
 functions.
118 </P
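>
<P>As a rough illustration of the paragraph above, the following sketch compares byte-oriented and multi-byte safe functions on a short UTF-8 string. The sample text is an arbitrary choice, and the Unicode escapes assume PHP 7.0 or later.</P>

```php
<?php
// Three Japanese characters written as Unicode escapes, so the example
// does not depend on the encoding this file is saved in.
$text = "\u{65E5}\u{672C}\u{8A9E}";

// Byte-oriented functions count and cut bytes, not characters.
$byteLength = strlen($text);                    // 9 bytes in UTF-8
$brokenCut  = substr($text, 0, 4);              // stops inside the second character

// The mbstring equivalents work on whole characters.
$charLength = mb_strlen($text, 'UTF-8');        // 3 characters
$safeCut    = mb_substr($text, 0, 2, 'UTF-8');  // first two characters

// The byte-wise cut is no longer valid UTF-8; the mbstring cut is.
$brokenIsValid = mb_check_encoding($brokenCut, 'UTF-8');
$safeIsValid   = mb_check_encoding($safeCut, 'UTF-8');

var_dump($byteLength, $charLength, $brokenIsValid, $safeIsValid);
```

<P>Passing the encoding explicitly, as done here, keeps the result independent of the configured internal encoding.</P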
119><P
> Because PHP was designed primarily for ISO-8859-1, some multi-byte
 character encodings do not work well with PHP. It is therefore
 important to set <TT
CLASS="literal"
>mbstring.internal_encoding</TT
> to
 a character encoding that works with PHP.
127 </P
128><P
> PHP 4 Character Encoding Requirements
130 </P
131><P
132>&#13; <P
133></P
134><UL
135><LI
136><P
137>&#13; Per byte encoding
138 </P
139></LI
140><LI
141><P
> Single-byte characters in the range <TT
CLASS="literal"
>00h-7fh</TT
>
 that are compatible with <TT
CLASS="literal"
>ASCII</TT
>
150 </P
151></LI
152><LI
153><P
> Multi-byte characters that contain no bytes in the range <TT
CLASS="literal"
>00h-7fh</TT
>
158 </P
159></LI
160></UL
161>
162 </P
163><P
> The following lists give examples of internal character encodings
 that work with PHP and of encodings that do NOT.
166 <DIV
167CLASS="informalexample"
168><A
169NAME="AEN41390"
170></A
171><P
172></P
173><TABLE
174BORDER="0"
175BGCOLOR="#E0E0E0"
176CELLPADDING="5"
177><TR
178><TD
179><PRE
180CLASS="programlisting"
>Character encodings that work with PHP:
ISO-8859-*, EUC-JP, UTF-8

Character encodings that do NOT work with PHP:
JIS, SJIS</PRE
186></TD
187></TR
188></TABLE
189><P
190></P
191></DIV
192>
193 </P
194><P
> A character encoding that does not work with PHP as an internal
 encoding can still be converted
 with <TT
CLASS="literal"
>mbstring</TT
>'s HTTP input/output conversion
 feature.
201 </P
202><DIV
203CLASS="note"
204><BLOCKQUOTE
205CLASS="note"
206><P
207><B
>Note: </B
>
 SJIS should not be used as the internal encoding unless you
 are familiar with the PHP parser/compiler and with
 character encoding issues.
213 </P
214></BLOCKQUOTE
215></DIV
216><DIV
217CLASS="note"
218><BLOCKQUOTE
219CLASS="note"
220><P
221><B
>Note: </B
>
 If you use a database with PHP, it is recommended that you use the
 same character encoding for both the database and the <TT
CLASS="literal"
>internal
 encoding</TT
> for ease of use and better performance.
230 </P
231><P
> PostgreSQL supports a client character
 encoding that differs from the backend character encoding. See
 the PostgreSQL manual for details.
235 </P
236></BLOCKQUOTE
237></DIV
238></DIV
239><DIV
240CLASS="section"
241><H1
242CLASS="section"
243><A
244NAME="mbstring.installation"
245></A
>Installation</H1
247><P
248>&#13; <TT
249CLASS="literal"
250>mbstring</TT
> is an extension module. You must
252 enable the module with the <TT
253CLASS="literal"
254>configure</TT
255> script.
256 Refer to the <A
257HREF="installation.html"
258>Install</A
259> section for
260 details.
261 </P
262><P
263>&#13; The following configure options are related to the
264 <TT
265CLASS="literal"
266>mbstring</TT
267> module.
268 </P
269><P
270>&#13; <P
271></P
272><UL
273><LI
274><P
275>&#13; <TT
276CLASS="option"
277>--enable-mbstring</TT
278> : Enable
279 <TT
280CLASS="literal"
281>mbstring</TT
282> functions. This option is
283 required to use <TT
284CLASS="literal"
285>mbstring</TT
286> functions.
287 </P
288><DIV
289CLASS="note"
290><BLOCKQUOTE
291CLASS="note"
292><P
293><B
>Note: </B
>
 As of PHP 4.3.0, the option
 <TT
CLASS="option"
>--enable-mbstring</TT
>
 is enabled by default and replaced with
 <TT
CLASS="option"
>--with-mbstring[=LANG]</TT
>
 to add support for Chinese, Korean and Russian.
 Japanese character encoding is supported by default.
 If <TT
CLASS="option"
>--with-mbstring=cn</TT
>
 is used, Simplified Chinese encoding is supported.
 If <TT
CLASS="option"
>--with-mbstring=tw</TT
>
 is used, Traditional Chinese encoding is supported.
 If <TT
CLASS="option"
>--with-mbstring=kr</TT
>
 is used, Korean encoding is supported.
 If <TT
CLASS="option"
>--with-mbstring=ru</TT
>
 is used, Russian encoding is supported.
 If <TT
CLASS="option"
>--with-mbstring=all</TT
>
 is used, every character encoding supported by mbstring
 is enabled, but the PHP binary becomes considerably larger
 because of the huge Unicode character maps.
 Note that Chinese, Korean and Russian encoding support is
 experimental in PHP 4.3.0.
337 </P
338></BLOCKQUOTE
339></DIV
340></LI
341><LI
342><P
343>&#13; <TT
344CLASS="option"
345>--enable-mbstr-enc-trans</TT
346> :
347 Enable HTTP input character encoding conversion using
348 <TT
349CLASS="literal"
350>mbstring</TT
351> conversion engine. If this
352 feature is enabled, HTTP input character encoding may be
353 converted to <TT
354CLASS="literal"
355>mbstring.internal_encoding</TT
356>
357 automatically.
358 </P
359><DIV
360CLASS="note"
361><BLOCKQUOTE
362CLASS="note"
363><P
364><B
>Note: </B
366>
367 As of PHP 4.3.0, the option
368 <TT
369CLASS="option"
370>--enable-mbstr-enc-trans</TT
371>
372 will be eliminated and replaced with
373 <TT
374CLASS="literal"
375>mbstring.encoding_translation</TT
376>.
377 HTTP input character encoding conversion is enabled
378 when this is set to <TT
379CLASS="literal"
380>On</TT
381>
382 (the default is <TT
383CLASS="literal"
384>Off</TT
385>).
386 </P
387></BLOCKQUOTE
388></DIV
389></LI
390><LI
391><P
392>&#13; <TT
393CLASS="option"
394>--enable-mbregex</TT
395> : Enable
396 regular expression functions with multibyte character support.
397 </P
398></LI
399></UL
400>
401 </P
402></DIV
403><DIV
404CLASS="section"
405><H1
406CLASS="section"
407><A
408NAME="mbstring.configuration"
409></A
>Runtime Configuration</H1
411><P
>The behaviour of these functions is affected by settings in <TT
413CLASS="filename"
414>php.ini</TT
415>.
416</P
417><P
418>&#13; <DIV
419CLASS="table"
420><A
421NAME="AEN41443"
422></A
423><P
424><B
>Table 1. Multi-Byte String configuration options</B
426></P
427><TABLE
428BORDER="1"
429CLASS="CALSTABLE"
430><THEAD
431><TR
432><TH
433ALIGN="LEFT"
434VALIGN="MIDDLE"
435>Name</TH
436><TH
437ALIGN="LEFT"
438VALIGN="MIDDLE"
439>Default</TH
440><TH
441ALIGN="LEFT"
442VALIGN="MIDDLE"
443>Changeable</TH
444></TR
445></THEAD
446><TBODY
447><TR
448><TD
449ALIGN="LEFT"
450VALIGN="MIDDLE"
451>mbstring.language</TD
452><TD
453ALIGN="LEFT"
454VALIGN="MIDDLE"
455>NULL</TD
456><TD
457ALIGN="LEFT"
458VALIGN="MIDDLE"
459>PHP_INI_ALL</TD
460></TR
461><TR
462><TD
463ALIGN="LEFT"
464VALIGN="MIDDLE"
465>mbstring.detect_order</TD
466><TD
467ALIGN="LEFT"
468VALIGN="MIDDLE"
469>NULL</TD
470><TD
471ALIGN="LEFT"
472VALIGN="MIDDLE"
473>PHP_INI_ALL</TD
474></TR
475><TR
476><TD
477ALIGN="LEFT"
478VALIGN="MIDDLE"
479>mbstring.http_input</TD
480><TD
481ALIGN="LEFT"
482VALIGN="MIDDLE"
483>NULL</TD
484><TD
485ALIGN="LEFT"
486VALIGN="MIDDLE"
487>PHP_INI_ALL</TD
488></TR
489><TR
490><TD
491ALIGN="LEFT"
492VALIGN="MIDDLE"
493>mbstring.http_output</TD
494><TD
495ALIGN="LEFT"
496VALIGN="MIDDLE"
497>NULL</TD
498><TD
499ALIGN="LEFT"
500VALIGN="MIDDLE"
501>PHP_INI_ALL</TD
502></TR
503><TR
504><TD
505ALIGN="LEFT"
506VALIGN="MIDDLE"
507>mbstring.internal_encoding</TD
508><TD
509ALIGN="LEFT"
510VALIGN="MIDDLE"
511>NULL</TD
512><TD
513ALIGN="LEFT"
514VALIGN="MIDDLE"
515>PHP_INI_ALL</TD
516></TR
517><TR
518><TD
519ALIGN="LEFT"
520VALIGN="MIDDLE"
521>mbstring.script_encoding</TD
522><TD
523ALIGN="LEFT"
524VALIGN="MIDDLE"
525>NULL</TD
526><TD
527ALIGN="LEFT"
528VALIGN="MIDDLE"
529>PHP_INI_ALL</TD
530></TR
531><TR
532><TD
533ALIGN="LEFT"
534VALIGN="MIDDLE"
535>mbstring.substitute_character</TD
536><TD
537ALIGN="LEFT"
538VALIGN="MIDDLE"
539>NULL</TD
540><TD
541ALIGN="LEFT"
542VALIGN="MIDDLE"
543>PHP_INI_ALL</TD
544></TR
545><TR
546><TD
547ALIGN="LEFT"
548VALIGN="MIDDLE"
549>mbstring.func_overload</TD
550><TD
551ALIGN="LEFT"
552VALIGN="MIDDLE"
553>"0"</TD
554><TD
555ALIGN="LEFT"
556VALIGN="MIDDLE"
557>PHP_INI_SYSTEM</TD
558></TR
559><TR
560><TD
561ALIGN="LEFT"
562VALIGN="MIDDLE"
563>mbstring.encoding_translation</TD
564><TD
565ALIGN="LEFT"
566VALIGN="MIDDLE"
567>"0"</TD
568><TD
569ALIGN="LEFT"
570VALIGN="MIDDLE"
571>PHP_INI_ALL</TD
572></TR
573></TBODY
574></TABLE
575></DIV
576>
577 For further details and definition of the PHP_INI_* constants see
578 <A
579HREF="function.ini-set.html"
580><B
581CLASS="function"
582>ini_set()</B
583></A
584>.
585 </P
586><P
587>&#13; Here is a short explanation of the configuration directives.
588 <P
589></P
590><UL
591><LI
592><A
593NAME="ini.mbstring.language"
594></A
595><P
596>&#13; <TT
597CLASS="literal"
598>mbstring.language</TT
> defines the
 default language used in mbstring.
 Note that this option also sets the default
 <TT
CLASS="literal"
>mbstring.internal_encoding</TT
>,
 so <TT
CLASS="literal"
>mbstring.internal_encoding</TT
>
 should be placed after <TT
CLASS="literal"
>mbstring.language</TT
>
 in <TT
CLASS="filename"
>php.ini</TT
>.
618 </P
619></LI
620><LI
621><A
622NAME="ini.mbstring.encoding-translation"
623></A
624><P
625>&#13; <TT
626CLASS="literal"
627>mbstring.encoding_translation</TT
628> enables
629 HTTP input character encoding detection and translation into
 internal character encoding.
631 </P
632></LI
633><LI
634><A
635NAME="ini.mbstring.internal-encoding"
636></A
637><P
638>&#13; <TT
639CLASS="literal"
640>mbstring.internal_encoding</TT
641> defines default
642 internal character encoding.
643 </P
644></LI
645><LI
646><A
647NAME="ini.mbstring.http-input"
648></A
649><P
650>&#13; <TT
651CLASS="literal"
652>mbstring.http_input</TT
653> defines default HTTP
654 input character encoding.
655 </P
656></LI
657><LI
658><A
659NAME="ini.mbstring.http-output"
660></A
661><P
662>&#13; <TT
663CLASS="literal"
664>mbstring.http_output</TT
665> defines default HTTP
666 output character encoding.
667 </P
668></LI
669><LI
670><A
671NAME="ini.mbstring.detect-order"
672></A
673><P
674>&#13; <TT
675CLASS="literal"
676>mbstring.detect_order</TT
677> defines default
678 character code detection order. See also
679 <A
680HREF="function.mb-detect-order.html"
681><B
682CLASS="function"
683>mb_detect_order()</B
684></A
685>.
686 </P
687></LI
688><LI
689><A
690NAME="ini.mbstring.substitute-character"
691></A
692><P
693>&#13; <TT
694CLASS="literal"
695>mbstring.substitute_character</TT
696> defines
697 character to substitute for invalid character encoding.
698 </P
699></LI
700><LI
701><A
702NAME="ini.mbstring.func-overload"
703></A
704><P
705>&#13; <TT
706CLASS="literal"
707>mbstring.func_overload</TT
> overloads (replaces) single-byte
 functions with their mbstring counterparts. <A
HREF="function.mail.html"
><B
CLASS="function"
>mail()</B
></A
>,
 <A
HREF="function.ereg.html"
><B
CLASS="function"
>ereg()</B
></A
>, etc. are overloaded by
 <A
HREF="function.mb-send-mail.html"
><B
CLASS="function"
>mb_send_mail()</B
></A
>, <A
HREF="function.mb-ereg.html"
><B
CLASS="function"
>mb_ereg()</B
></A
>, etc.
 Possible values are 0, 1, 2, 4 or a sum of them;
 for example, 7 overloads everything.
 0: No overload, 1: Overload the <A
HREF="function.mail.html"
><B
CLASS="function"
>mail()</B
></A
> function,
 2: Overload str*() functions, 4: Overload ereg*() functions.
746 </P
747></LI
748></UL
749>
750 </P
751><P
> Web browsers are supposed to submit a form in the same character
 encoding as the page that contains it. However, browsers do not always
 do so. See <A
HREF="function.mb-http-input.html"
><B
CLASS="function"
>mb_http_input()</B
></A
> to
 detect the character encoding used by the browser.
762 </P
763><P
764>&#13; If <TT
765CLASS="literal"
766>enctype</TT
767> is set to
768 <TT
769CLASS="literal"
770>multipart/form-data</TT
771> in HTML forms,
772 <TT
773CLASS="literal"
774>mbstring</TT
775> does not convert character encoding
 in POST data. The script must convert the data itself if
 conversion is needed.
778 </P
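>
<P>One way a script might perform that conversion itself is sketched below. The helper name <TT CLASS="literal">to_internal_encoding()</TT>, the candidate list and the UTF-8 target are assumptions made for illustration, not part of mbstring, and the nullable return type assumes PHP 7.1 or later.</P>

```php
<?php
// Hypothetical helper (not part of mbstring): converts one submitted value
// to the internal encoding, trying the candidate encodings in order.
function to_internal_encoding(string $value, array $candidates, string $internal = 'UTF-8'): ?string
{
    // Strict detection returns false when no candidate fits the bytes.
    $detected = mb_detect_encoding($value, $candidates, true);
    if ($detected === false) {
        return null;
    }
    return mb_convert_encoding($value, $internal, $detected);
}

// Stand-in for a form field that arrived as Shift_JIS in multipart/form-data.
$original  = "\u{65E5}\u{672C}\u{8A9E}";
$submitted = mb_convert_encoding($original, 'SJIS', 'UTF-8');

$converted = to_internal_encoding($submitted, ['ASCII', 'UTF-8', 'SJIS']);
var_dump($converted === $original);
```

<P>Detection is a heuristic and can misjudge short or ambiguous input, so when the encoding of the form is known it is probably safer to pass it to <TT CLASS="literal">mb_convert_encoding()</TT> directly.</P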
779><P
> Although browsers are usually able to detect the character encoding
 of an HTML page, it is better to set <TT
CLASS="literal"
>charset</TT
> in the HTTP
 header. Change <TT
CLASS="literal"
>default_charset</TT
> to match the
 character encoding you output.
790 </P
791><P
792>&#13; <TABLE
793WIDTH="100%"
794BORDER="0"
795CELLPADDING="0"
796CELLSPACING="0"
797CLASS="EXAMPLE"
798><TR
799><TD
800><DIV
801CLASS="example"
802><A
803NAME="AEN41535"
804></A
805><P
806><B
>Example 1. <TT
808CLASS="filename"
809>php.ini</TT
810> setting example</B
811></P
812><TABLE
813BORDER="0"
814BGCOLOR="#E0E0E0"
815CELLPADDING="5"
816><TR
817><TD
818><PRE
819CLASS="programlisting"
>;; Set default language
mbstring.language = English ; Set default language to English (default)
mbstring.language = Japanese ; Set default language to Japanese

;; Set default internal encoding
;; Note: Make sure to use a character encoding that works with PHP
mbstring.internal_encoding = UTF-8 ; Set internal encoding to UTF-8

;; HTTP input encoding translation is enabled.
mbstring.encoding_translation = On

;; Set default HTTP input character encoding
;; Note: Script cannot change http_input setting.
mbstring.http_input = pass ; No conversion.
mbstring.http_input = auto ; Set HTTP input to auto
 ; "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS"
mbstring.http_input = SJIS ; Set HTTP input to SJIS
mbstring.http_input = UTF-8,SJIS,EUC-JP ; Specify order

;; Set default HTTP output character encoding
mbstring.http_output = pass ; No conversion
mbstring.http_output = UTF-8 ; Set HTTP output encoding to UTF-8

;; Set default character encoding detection order
mbstring.detect_order = auto ; Set detect order to auto
mbstring.detect_order = ASCII,JIS,UTF-8,SJIS,EUC-JP ; Specify order

;; Set default substitute character
mbstring.substitute_character = 12307 ; Specify Unicode value
mbstring.substitute_character = none ; Do not print character
mbstring.substitute_character = long ; Long Example: U+3000,JIS+7E7E</PRE
851></TD
852></TR
853></TABLE
854></DIV
855></TD
856></TR
857></TABLE
858>
859 </P
860><P
861>&#13; <TABLE
862WIDTH="100%"
863BORDER="0"
864CELLPADDING="0"
865CELLSPACING="0"
866CLASS="EXAMPLE"
867><TR
868><TD
869><DIV
870CLASS="example"
871><A
872NAME="AEN41540"
873></A
874><P
875><B
>Example 2. <TT
877CLASS="filename"
878>php.ini</TT
879> setting for <TT
880CLASS="literal"
881>EUC-JP</TT
882> users</B
883></P
884><TABLE
885BORDER="0"
886BGCOLOR="#E0E0E0"
887CELLPADDING="5"
888><TR
889><TD
890><PRE
891CLASS="programlisting"
>;; Disable Output Buffering
output_buffering = Off

;; Set HTTP header charset
default_charset = EUC-JP

;; Set default language to Japanese
mbstring.language = Japanese

;; HTTP input encoding translation is enabled.
mbstring.encoding_translation = On

;; Set HTTP input encoding conversion to auto
mbstring.http_input = auto

;; Convert HTTP output to EUC-JP
mbstring.http_output = EUC-JP

;; Set internal encoding to EUC-JP
mbstring.internal_encoding = EUC-JP

;; Do not print invalid characters
mbstring.substitute_character = none</PRE
915></TD
916></TR
917></TABLE
918></DIV
919></TD
920></TR
921></TABLE
922>
923 </P
924><P
925>&#13; <TABLE
926WIDTH="100%"
927BORDER="0"
928CELLPADDING="0"
929CELLSPACING="0"
930CLASS="EXAMPLE"
931><TR
932><TD
933><DIV
934CLASS="example"
935><A
936NAME="AEN41546"
937></A
938><P
939><B
>Example 3. <TT
941CLASS="filename"
942>php.ini</TT
943> setting for <TT
944CLASS="literal"
945>SJIS</TT
946> users</B
947></P
948><TABLE
949BORDER="0"
950BGCOLOR="#E0E0E0"
951CELLPADDING="5"
952><TR
953><TD
954><PRE
955CLASS="programlisting"
>;; Enable Output Buffering
output_buffering = On

;; Set mb_output_handler to enable output conversion
output_handler = mb_output_handler

;; Set HTTP header charset
default_charset = Shift_JIS

;; Set default language to Japanese
mbstring.language = Japanese

;; Set HTTP input encoding conversion to auto
mbstring.http_input = auto

;; Convert to SJIS
mbstring.http_output = SJIS

;; Set internal encoding to EUC-JP
mbstring.internal_encoding = EUC-JP

;; Do not print invalid characters
mbstring.substitute_character = none</PRE
979></TD
980></TR
981></TABLE
982></DIV
983></TD
984></TR
985></TABLE
986>
987 </P
988></DIV
989><DIV
990CLASS="section"
991><H1
992CLASS="section"
993><A
994NAME="mbstring.resources"
995></A
>Resource Types</H1
997><P
>This extension has no resource types defined.</P
1000></DIV
1001><DIV
1002CLASS="section"
1003><H1
1004CLASS="section"
1005><A
1006NAME="mbstring.constants"
1007></A
>Predefined Constants</H1
1009><P
>The constants below are defined by this extension and are available only
when the extension has either been compiled into PHP or dynamically loaded
at runtime.
1013</P
1014><P
1015></P
1016><DIV
1017CLASS="variablelist"
1018><DL
1019><DT
1020><TT
1021CLASS="constant"
1022><B
1023>MB_OVERLOAD_MAIL</B
1024></TT
1025>
1026 (<A
1027HREF="language.types.integer.html"
1028>integer</A
1029>)</DT
1030><DD
1031><P
1032>&#13;
1033 </P
1034></DD
1035><DT
1036><TT
1037CLASS="constant"
1038><B
1039>MB_OVERLOAD_STRING</B
1040></TT
1041>
1042 (<A
1043HREF="language.types.integer.html"
1044>integer</A
1045>)</DT
1046><DD
1047><P
1048>&#13;
1049 </P
1050></DD
1051><DT
1052><TT
1053CLASS="constant"
1054><B
1055>MB_OVERLOAD_REGEX</B
1056></TT
1057>
1058 (<A
1059HREF="language.types.integer.html"
1060>integer</A
1061>)</DT
1062><DD
1063><P
1064>&#13;
1065 </P
1066></DD
1067></DL
1068></DIV
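>
<P>The values 1, 2 and 4 described for <TT CLASS="literal">mbstring.func_overload</TT> combine as bit flags. The sketch below decodes such a value; it uses locally defined stand-in constants on the assumption that the three constants above correspond to 1, 2 and 4 in that order, and the function name <TT CLASS="literal">overload_categories()</TT> is hypothetical.</P>

```php
<?php
// Local stand-ins for the extension's MB_OVERLOAD_* constants; the values
// are assumed to follow the 1 / 2 / 4 categories of mbstring.func_overload.
const OVERLOAD_MAIL   = 1;
const OVERLOAD_STRING = 2;
const OVERLOAD_REGEX  = 4;

// Hypothetical helper: lists the categories selected by a func_overload value.
function overload_categories(int $value): array
{
    $names = [];
    if ($value & OVERLOAD_MAIL) {
        $names[] = 'mail';
    }
    if ($value & OVERLOAD_STRING) {
        $names[] = 'string';
    }
    if ($value & OVERLOAD_REGEX) {
        $names[] = 'regex';
    }
    return $names;
}

print_r(overload_categories(7)); // all three categories
print_r(overload_categories(6)); // string and regex only
```

<P>Because each category occupies its own bit, any sum of 1, 2 and 4 selects an unambiguous set of categories.</P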
1069></DIV
1070><DIV
1071CLASS="section"
1072><H1
1073CLASS="section"
1074><A
1075NAME="mbstring.http"
1076></A
1077>HTTP Input and Output</H1
1078><P
> HTTP input/output character encoding conversion may also convert
 binary data. Scripts that send or receive binary data over HTTP
 must therefore control character encoding conversion
 themselves.
1083 </P
1084><P
1085>&#13; If <TT
1086CLASS="literal"
1087>enctype</TT
1088> for HTML form is set to
1089 <TT
1090CLASS="literal"
1091>multipart/form-data</TT
1092>,
1093 <TT
1094CLASS="literal"
1095>mbstring</TT
1096> does not convert character encoding
 in POST data. In that case, the script must convert the strings
 to the internal character encoding itself.
1099 </P
1100><P
1101>&#13; <P
1102></P
1103><UL
1104><LI
1105><P
1106>&#13; HTTP Input
1107 </P
1108><P
1109>
 HTTP input character conversion cannot be controlled
 from a PHP script. To disable it,
 change the setting in <TT
1113CLASS="filename"
1114>php.ini</TT
1115>.
1116 <TABLE
1117WIDTH="100%"
1118BORDER="0"
1119CELLPADDING="0"
1120CELLSPACING="0"
1121CLASS="EXAMPLE"
1122><TR
1123><TD
1124><DIV
1125CLASS="example"
1126><A
1127NAME="AEN41589"
1128></A
1129><P
1130><B
>Example 4.
1132 Disable HTTP input conversion in <TT
1133CLASS="filename"
1134>php.ini</TT
1135>
1136 </B
1137></P
1138><TABLE
1139BORDER="0"
1140BGCOLOR="#E0E0E0"
1141CELLPADDING="5"
1142><TR
1143><TD
1144><PRE
1145CLASS="php"
>;; Disable HTTP Input conversion
mbstring.http_input = pass
;; Disable HTTP Input conversion (PHP 4.3.0 or higher)
mbstring.encoding_translation = Off</PRE
1150></TD
1151></TR
1152></TABLE
1153></DIV
1154></TD
1155></TR
1156></TABLE
1157>
1158 </P
1159><P
1160>&#13; When using PHP as an Apache module, it is possible to
1161 override PHP ini setting per Virtual Host in
1162 <TT
1163CLASS="literal"
1164>httpd.conf</TT
1165> or per directory with
1166 <TT
1167CLASS="literal"
1168>.htaccess</TT
1169>. Refer to the <A
1170HREF="configuration.html"
1171>Configuration</A
1172> section and
1173 Apache Manual for details.
1174 </P
1175></LI
1176><LI
1177><P
1178>&#13; HTTP Output
1179 </P
1180><P
1181>&#13; There are several ways to enable output character encoding
1182 conversion. One is using <TT
1183CLASS="filename"
1184>php.ini</TT
1185>, another
1186 is using <A
1187HREF="function.ob-start.html"
1188><B
1189CLASS="function"
1190>ob_start()</B
1191></A
1192> with
1193 <A
1194HREF="function.mb-output-handler.html"
1195><B
1196CLASS="function"
1197>mb_output_handler()</B
1198></A
1199> as
1200 <TT
1201CLASS="literal"
1202>ob_start</TT
1203> callback function.
1204 </P
1205><DIV
1206CLASS="note"
1207><BLOCKQUOTE
1208CLASS="note"
1209><P
1210><B
>Note: </B
1212>
1213 For PHP3-i18n users, <TT
1214CLASS="literal"
1215>mbstring</TT
1216>'s output
1217 conversion differs from PHP3-i18n. Character encoding is
1218 converted using output buffer.
1219 </P
1220></BLOCKQUOTE
1221></DIV
1222></LI
1223></UL
1224>
1225 </P
1226><P
1227>&#13; <TABLE
1228WIDTH="100%"
1229BORDER="0"
1230CELLPADDING="0"
1231CELLSPACING="0"
1232CLASS="EXAMPLE"
1233><TR
1234><TD
1235><DIV
1236CLASS="example"
1237><A
1238NAME="AEN41608"
1239></A
1240><P
1241><B
>Example 5. <TT
1243CLASS="filename"
1244>php.ini</TT
1245> setting example</B
1246></P
1247><TABLE
1248BORDER="0"
1249BGCOLOR="#E0E0E0"
1250CELLPADDING="5"
1251><TR
1252><TD
1253><PRE
1254CLASS="programlisting"
>;; Enable output character encoding conversion for all PHP pages

;; Enable Output Buffering
output_buffering = On

;; Set mb_output_handler to enable output conversion
output_handler = mb_output_handler</PRE
1262></TD
1263></TR
1264></TABLE
1265></DIV
1266></TD
1267></TR
1268></TABLE
1269>
1270 </P
1271><P
1272>&#13; <TABLE
1273WIDTH="100%"
1274BORDER="0"
1275CELLPADDING="0"
1276CELLSPACING="0"
1277CLASS="EXAMPLE"
1278><TR
1279><TD
1280><DIV
1281CLASS="example"
1282><A
1283NAME="AEN41613"
1284></A
1285><P
1286><B
>Example 6. Script example</B
1288></P
1289><TABLE
1290BORDER="0"
1291BGCOLOR="#E0E0E0"
1292CELLPADDING="5"
1293><TR
1294><TD
1295><PRE
1296CLASS="php"
>&#60;?php

// Enable output character encoding conversion only for this page

// Set HTTP output character encoding to SJIS
mb_http_output('SJIS');

// Start buffering and specify "mb_output_handler" as
// callback function
ob_start('mb_output_handler');

?&#62;</PRE
1309></TD
1310></TR
1311></TABLE
1312></DIV
1313></TD
1314></TR
1315></TABLE
1316>
1317 </P
1318></DIV
1319><DIV
1320CLASS="section"
1321><H1
1322CLASS="section"
1323><A
1324NAME="mbstring.encodings"
1325></A
1326>Supported Character Encodings</H1
1327><P
> The character encodings listed below are currently supported by the
 <TT
CLASS="literal"
>mbstring</TT
> module. Any of them may
 be passed as the <TT
CLASS="literal"
>encoding</TT
> parameter of
 <TT
CLASS="literal"
>mbstring</TT
> functions.
1341 </P
1342><P
> The following character encodings are supported in this PHP
 extension:
1345 </P
1346><P
1347>&#13; <TT
1348CLASS="literal"
1349>UCS-4</TT
1350>, <TT
1351CLASS="literal"
1352>UCS-4BE</TT
1353>,
1354 <TT
1355CLASS="literal"
1356>UCS-4LE</TT
1357>, <TT
1358CLASS="literal"
1359>UCS-2</TT
1360>,
1361 <TT
1362CLASS="literal"
1363>UCS-2BE</TT
1364>, <TT
1365CLASS="literal"
1366>UCS-2LE</TT
1367>,
1368 <TT
1369CLASS="literal"
1370>UTF-32</TT
1371>, <TT
1372CLASS="literal"
1373>UTF-32BE</TT
1374>,
1375 <TT
1376CLASS="literal"
1377>UTF-32LE</TT
1378>, <TT
1379CLASS="literal"
1380>UCS-2LE</TT
1381>,
1382 <TT
1383CLASS="literal"
1384>UTF-16</TT
1385>, <TT
1386CLASS="literal"
1387>UTF-16BE</TT
1388>,
1389 <TT
1390CLASS="literal"
1391>UTF-16LE</TT
1392>, <TT
1393CLASS="literal"
1394>UTF-8</TT
1395>,
1396 <TT
1397CLASS="literal"
1398>UTF-7</TT
1399>, <TT
1400CLASS="literal"
1401>ASCII</TT
1402>,
1403 <TT
1404CLASS="literal"
1405>EUC-JP</TT
1406>, <TT
1407CLASS="literal"
1408>SJIS</TT
1409>,
1410 <TT
1411CLASS="literal"
1412>eucJP-win</TT
1413>, <TT
1414CLASS="literal"
1415>SJIS-win</TT
1416>,
1417 <TT
1418CLASS="literal"
1419>ISO-2022-JP</TT
1420>, <TT
1421CLASS="literal"
1422>JIS</TT
1423>,
1424 <TT
1425CLASS="literal"
1426>ISO-8859-1</TT
1427>, <TT
1428CLASS="literal"
1429>ISO-8859-2</TT
1430>,
1431 <TT
1432CLASS="literal"
1433>ISO-8859-3</TT
1434>, <TT
1435CLASS="literal"
1436>ISO-8859-4</TT
1437>,
1438 <TT
1439CLASS="literal"
1440>ISO-8859-5</TT
1441>, <TT
1442CLASS="literal"
1443>ISO-8859-6</TT
1444>,
1445 <TT
1446CLASS="literal"
1447>ISO-8859-7</TT
1448>, <TT
1449CLASS="literal"
1450>ISO-8859-8</TT
1451>,
1452 <TT
1453CLASS="literal"
1454>ISO-8859-9</TT
1455>, <TT
1456CLASS="literal"
1457>ISO-8859-10</TT
1458>,
1459 <TT
1460CLASS="literal"
1461>ISO-8859-13</TT
1462>, <TT
1463CLASS="literal"
1464>ISO-8859-14</TT
1465>,
1466 <TT
1467CLASS="literal"
1468>ISO-8859-15</TT
1469>, <TT
1470CLASS="literal"
1471>byte2be</TT
1472>,
1473 <TT
1474CLASS="literal"
1475>byte2le</TT
1476>, <TT
1477CLASS="literal"
1478>byte4be</TT
1479>,
1480 <TT
1481CLASS="literal"
1482>byte4le</TT
1483>, <TT
1484CLASS="literal"
1485>BASE64</TT
1486>,
1487 <TT
1488CLASS="literal"
1489>7bit</TT
1490>, <TT
1491CLASS="literal"
1492>8bit</TT
1493> and
1494 <TT
1495CLASS="literal"
1496>UTF7-IMAP</TT
1497>.
1498 </P
1499><P
> As of PHP 4.3.0, support for the following character encodings is added
 experimentally:
1502 <TT
1503CLASS="literal"
1504>EUC-CN</TT
1505>, <TT
1506CLASS="literal"
1507>CP936</TT
1508>, <TT
1509CLASS="literal"
1510>HZ</TT
1511>,
1512 <TT
1513CLASS="literal"
1514>EUC-TW</TT
1515>, <TT
1516CLASS="literal"
1517>CP950</TT
1518>, <TT
1519CLASS="literal"
1520>BIG-5</TT
1521>,
1522 <TT
1523CLASS="literal"
1524>EUC-KR</TT
1525>, <TT
1526CLASS="literal"
1527>UHC</TT
1528> (<TT
1529CLASS="literal"
1530>CP949</TT
1531>),
1532 <TT
1533CLASS="literal"
1534>ISO-2022-KR</TT
1535>,
1536 <TT
1537CLASS="literal"
1538>Windows-1251</TT
1539> (<TT
1540CLASS="literal"
1541>CP1251</TT
1542>),
1543 <TT
1544CLASS="literal"
1545>Windows-1252</TT
1546> (<TT
1547CLASS="literal"
1548>CP1252</TT
1549>),
1550 <TT
1551CLASS="literal"
1552>CP866</TT
1553>,
1554 <TT
1555CLASS="literal"
1556>KOI8-R</TT
1557>.
1558 </P
1559><P
> Any <TT
CLASS="filename"
>php.ini</TT
> entry that accepts an encoding name
 also accepts "<TT
CLASS="literal"
>auto</TT
>" and
 "<TT
CLASS="literal"
>pass</TT
>".
 <TT
CLASS="literal"
>mbstring</TT
> functions that accept an encoding
 name also accept "<TT
CLASS="literal"
>auto</TT
>".
1580 </P
1581><P
1582>&#13; If "<TT
1583CLASS="literal"
1584>pass</TT
1585>" is set, no character
1586 encoding conversion is performed.
1587 </P
1588><P
1589>&#13; If "<TT
1590CLASS="literal"
1591>auto</TT
1592>" is set, it is expanded to
1593 "<TT
1594CLASS="literal"
1595>ASCII,JIS,UTF-8,EUC-JP,SJIS</TT
1596>".
1597 </P
1598><P
1599>&#13; See also <A
1600HREF="function.mb-detect-order.html"
1601><B
1602CLASS="function"
1603>mb_detect_order()</B
1604></A
1605>
1606 </P
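>
<P>As a hedged sketch of how a script might set up an explicit detection order, the following checks the names that "auto" is described as expanding to against the encodings the running PHP build reports, then installs them. The variable names are illustrative only.</P>

```php
<?php
// The detection order that "auto" is described as expanding to.
$order = ['ASCII', 'JIS', 'UTF-8', 'EUC-JP', 'SJIS'];

// Keep only the names this PHP build actually reports as supported.
$supported = array_values(array_intersect($order, mb_list_encodings()));

// Install the order for later detection calls; returns true on success.
$applied = mb_detect_order($supported);

// Calling mb_detect_order() without arguments returns the current order.
var_dump($supported, $applied, mb_detect_order());
```

<P>Filtering through <TT CLASS="literal">mb_list_encodings()</TT> first avoids passing a name that a particular build does not know.</P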
1607><DIV
1608CLASS="note"
1609><BLOCKQUOTE
1610CLASS="note"
1611><P
1612><B
>Note: </B
>
 A "supported character encoding" is not necessarily
 usable as the internal character encoding.
1617 </P
1618></BLOCKQUOTE
1619></DIV
1620></DIV
1621><DIV
1622CLASS="section"
1623><H1
1624CLASS="section"
1625><A
1626NAME="mbstring.overload"
1627></A
1628>Overloading PHP string functions with multi byte string functions</H1
1629><P
> Because most PHP applications are written for languages that use a
 single-byte character encoding, handling multibyte strings,
 including Japanese, causes some difficulties. Most PHP string
 functions, such as <A
HREF="function.substr.html"
><B
CLASS="function"
>substr()</B
></A
>, do not support
 multibyte strings.
1641 </P
1642><P
> The multibyte extension (mbstring) provides multibyte-aware
 counterparts of some PHP string functions (e.g. <A
HREF="function.mb-substr.html"
><B
CLASS="function"
>mb_substr()</B
></A
> for
 <A
HREF="function.substr.html"
><B
CLASS="function"
>substr()</B
></A
>).
1658 </P
1659><P
> The multibyte extension (mbstring) also supports 'function
 overloading', which adds multibyte string functionality without
 code modification. With function overloading enabled, some PHP string
 functions are replaced by their multibyte counterparts.
 For example, <A
HREF="function.mb-substr.html"
><B
CLASS="function"
>mb_substr()</B
></A
> is called
 instead of <A
HREF="function.substr.html"
><B
CLASS="function"
>substr()</B
></A
> if function overloading
 is enabled. Function overloading makes it easier to port an application
 that supports only single-byte encodings to multibyte encodings.
1680 </P
1681><P
1682>&#13; <TT
1683CLASS="literal"
1684>mbstring.func_overload</TT
1685> in <TT
1686CLASS="filename"
1687>php.ini</TT
> must be
 set to a positive value to use function overloading.
 The value selects the categories of functions to overload:
 1 enables mail function overloading, 2 enables
 string function overloading, and 4 enables regular expression
 function overloading. For
 example, if it is set to 7, the mail, string and regex functions are all
 overloaded. The overloaded functions are listed
 below.
1696 <DIV
1697CLASS="table"
1698><A
1699NAME="AEN41712"
1700></A
1701><P
1702><B
>Table 2. Functions to be overloaded</B
1704></P
1705><TABLE
1706BORDER="1"
1707CLASS="CALSTABLE"
1708><THEAD
1709><TR
1710><TH
1711ALIGN="LEFT"
1712VALIGN="MIDDLE"
1713>value of mbstring.func_overload</TH
1714><TH
1715ALIGN="LEFT"
1716VALIGN="MIDDLE"
1717>original function</TH
1718><TH
1719ALIGN="LEFT"
1720VALIGN="MIDDLE"
1721>overloaded function</TH
1722></TR
1723></THEAD
1724><TBODY
1725><TR
1726><TD
1727ALIGN="LEFT"
1728VALIGN="MIDDLE"
1729>1</TD
1730><TD
1731ALIGN="LEFT"
1732VALIGN="MIDDLE"
1733><A
1734HREF="function.mail.html"
1735><B
1736CLASS="function"
1737>mail()</B
1738></A
1739></TD
1740><TD
1741ALIGN="LEFT"
1742VALIGN="MIDDLE"
1743><A
1744HREF="function.mb-send-mail.html"
1745><B
1746CLASS="function"
1747>mb_send_mail()</B
1748></A
1749></TD
1750></TR
1751><TR
1752><TD
1753ALIGN="LEFT"
1754VALIGN="MIDDLE"
1755>2</TD
1756><TD
1757ALIGN="LEFT"
1758VALIGN="MIDDLE"
1759><A
1760HREF="function.strlen.html"
1761><B
1762CLASS="function"
1763>strlen()</B
1764></A
1765></TD
1766><TD
1767ALIGN="LEFT"
1768VALIGN="MIDDLE"
1769><A
1770HREF="function.mb-strlen.html"
1771><B
1772CLASS="function"
1773>mb_strlen()</B
1774></A
1775></TD
1776></TR
1777><TR
1778><TD
1779ALIGN="LEFT"
1780VALIGN="MIDDLE"
1781>2</TD
1782><TD
1783ALIGN="LEFT"
1784VALIGN="MIDDLE"
1785><A
1786HREF="function.strpos.html"
1787><B
1788CLASS="function"
1789>strpos()</B
1790></A
1791></TD
1792><TD
1793ALIGN="LEFT"
1794VALIGN="MIDDLE"
1795><A
1796HREF="function.mb-strpos.html"
1797><B
1798CLASS="function"
1799>mb_strpos()</B
1800></A
1801></TD
1802></TR
1803><TR
1804><TD
1805ALIGN="LEFT"
1806VALIGN="MIDDLE"
1807>2</TD
1808><TD
1809ALIGN="LEFT"
1810VALIGN="MIDDLE"
1811><A
1812HREF="function.strrpos.html"
1813><B
1814CLASS="function"
1815>strrpos()</B
1816></A
1817></TD
1818><TD
1819ALIGN="LEFT"
1820VALIGN="MIDDLE"
1821><A
1822HREF="function.mb-strrpos.html"
1823><B
1824CLASS="function"
1825>mb_strrpos()</B
1826></A
1827></TD
1828></TR
1829><TR
1830><TD
1831ALIGN="LEFT"
1832VALIGN="MIDDLE"
1833>2</TD
1834><TD
1835ALIGN="LEFT"
1836VALIGN="MIDDLE"
1837><A
1838HREF="function.substr.html"
1839><B
1840CLASS="function"
1841>substr()</B
1842></A
1843></TD
1844><TD
1845ALIGN="LEFT"
1846VALIGN="MIDDLE"
1847><A
1848HREF="function.mb-substr.html"
1849><B
1850CLASS="function"
1851>mb_substr()</B
1852></A
1853></TD
1854></TR
1855><TR
1856><TD
1857ALIGN="LEFT"
1858VALIGN="MIDDLE"
1859>2</TD
1860><TD
1861ALIGN="LEFT"
1862VALIGN="MIDDLE"
1863><A
1864HREF="function.strtolower.html"
1865><B
1866CLASS="function"
1867>strtolower()</B
1868></A
1869></TD
1870><TD
1871ALIGN="LEFT"
1872VALIGN="MIDDLE"
1873><A
1874HREF="function.mb-strtolower.html"
1875><B
1876CLASS="function"
1877>mb_strtolower()</B
1878></A
1879></TD
1880></TR
1881><TR
1882><TD
1883ALIGN="LEFT"
1884VALIGN="MIDDLE"
1885>2</TD
1886><TD
1887ALIGN="LEFT"
1888VALIGN="MIDDLE"
1889><A
1890HREF="function.strtoupper.html"
1891><B
1892CLASS="function"
1893>strtoupper()</B
1894></A
1895></TD
1896><TD
1897ALIGN="LEFT"
1898VALIGN="MIDDLE"
1899><A
1900HREF="function.mb-strtoupper.html"
1901><B
1902CLASS="function"
1903>mb_strtoupper()</B
1904></A
1905></TD
1906></TR
1907><TR
1908><TD
1909ALIGN="LEFT"
1910VALIGN="MIDDLE"
1911>2</TD
1912><TD
1913ALIGN="LEFT"
1914VALIGN="MIDDLE"
1915><A
1916HREF="function.substr-count.html"
1917><B
1918CLASS="function"
1919>substr_count()</B
1920></A
1921></TD
1922><TD
1923ALIGN="LEFT"
1924VALIGN="MIDDLE"
1925><A
1926HREF="function.mb-substr-count.html"
1927><B
1928CLASS="function"
1929>mb_substr_count()</B
1930></A
1931></TD
1932></TR
1933><TR
1934><TD
1935ALIGN="LEFT"
1936VALIGN="MIDDLE"
1937>4</TD
1938><TD
1939ALIGN="LEFT"
1940VALIGN="MIDDLE"
1941><A
1942HREF="function.ereg.html"
1943><B
1944CLASS="function"
1945>ereg()</B
1946></A
1947></TD
1948><TD
1949ALIGN="LEFT"
1950VALIGN="MIDDLE"
1951><A
1952HREF="function.mb-ereg.html"
1953><B
1954CLASS="function"
1955>mb_ereg()</B
1956></A
1957></TD
1958></TR
1959><TR
1960><TD
1961ALIGN="LEFT"
1962VALIGN="MIDDLE"
1963>4</TD
1964><TD
1965ALIGN="LEFT"
1966VALIGN="MIDDLE"
1967><A
1968HREF="function.eregi.html"
1969><B
1970CLASS="function"
1971>eregi()</B
1972></A
1973></TD
1974><TD
1975ALIGN="LEFT"
1976VALIGN="MIDDLE"
1977><A
1978HREF="function.mb-eregi.html"
1979><B
1980CLASS="function"
1981>mb_eregi()</B
1982></A
1983></TD
1984></TR
1985><TR
1986><TD
1987ALIGN="LEFT"
1988VALIGN="MIDDLE"
1989>4</TD
1990><TD
1991ALIGN="LEFT"
1992VALIGN="MIDDLE"
1993><A
1994HREF="function.ereg-replace.html"
1995><B
1996CLASS="function"
1997>ereg_replace()</B
1998></A
1999></TD
2000><TD
2001ALIGN="LEFT"
2002VALIGN="MIDDLE"
2003><A
2004HREF="function.mb-ereg-replace.html"
2005><B
2006CLASS="function"
2007>mb_ereg_replace()</B
2008></A
2009></TD
2010></TR
2011><TR
2012><TD
2013ALIGN="LEFT"
2014VALIGN="MIDDLE"
2015>4</TD
2016><TD
2017ALIGN="LEFT"
2018VALIGN="MIDDLE"
2019><A
2020HREF="function.eregi-replace.html"
2021><B
2022CLASS="function"
2023>eregi_replace()</B
2024></A
2025></TD
2026><TD
2027ALIGN="LEFT"
2028VALIGN="MIDDLE"
2029><A
2030HREF="function.mb-eregi-replace.html"
2031><B
2032CLASS="function"
2033>mb_eregi_replace()</B
2034></A
2035></TD
2036></TR
2037><TR
2038><TD
2039ALIGN="LEFT"
2040VALIGN="MIDDLE"
2041>4</TD
2042><TD
2043ALIGN="LEFT"
2044VALIGN="MIDDLE"
2045><A
2046HREF="function.split.html"
2047><B
2048CLASS="function"
2049>split()</B
2050></A
2051></TD
2052><TD
2053ALIGN="LEFT"
2054VALIGN="MIDDLE"
2055><A
2056HREF="function.mb-split.html"
2057><B
2058CLASS="function"
2059>mb_split()</B
2060></A
2061></TD
2062></TR
2063></TBODY
2064></TABLE
2065></DIV
2066>
2067 </P
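><P
>&#13; How the category values combine can be sketched as follows. The
 helper overload_categories() is hypothetical and is not part of
 mbstring; it only decodes a number using the values 1, 2 and 4 given
 above, and it does not read or change any setting.
 </P
><PRE
CLASS="programlisting"
>&#60;?php
// Hypothetical helper (not part of mbstring): list the categories
// selected by a given mbstring.func_overload value.
function overload_categories(int $value): array
{
    // Category values as described in the text above.
    $categories = [1 =&#62; 'mail', 2 =&#62; 'string', 4 =&#62; 'regex'];
    $enabled = [];
    foreach ($categories as $bit =&#62; $name) {
        // A category is selected when its bit is set in the value.
        if (($value &#38; $bit) !== 0) {
            $enabled[] = $name;
        }
    }
    return $enabled;
}

echo implode(', ', overload_categories(7)), "\n";  // mail, string, regex
echo implode(', ', overload_categories(6)), "\n";  // string, regex
</PRE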
2068></DIV
2069><DIV
2070CLASS="section"
2071><H1
2072CLASS="section"
2073><A
2074NAME="mbstring.ja-basic"
2075></A
2076>Basics of Japanese multi-byte characters</H1
2077><P
2078>&#13; Most Japanese characters need more than 1 byte per character. In
2079 addition, several character encoding schemes are used in a
2080 Japanese environment: EUC-JP, Shift_JIS (SJIS) and
2081 ISO-2022-JP (JIS). As Unicode becomes more popular,
2082 UTF-8 is also used. To develop Web applications for a Japanese
2083 environment, it is important to use the right character encoding for
2084 the task at hand, whether HTTP input/output, RDBMS or e-mail.
2085 </P
2086><P
2087>&#13; <P
2088></P
2089><UL
2090><LI
2091><P
2092>Storage for a character can be up to six
2093 bytes</P
2094></LI
2095><LI
2096><P
2097>&#13; A multi-byte character is usually twice as wide as a
2098 single-byte character. Wider characters are called
2099 "zen-kaku", meaning full width; narrower characters are
2100 called "han-kaku", meaning half width. "zen-kaku" characters
2101 are usually fixed width.
2102 </P
2103></LI
2104><LI
2105><P
2106>&#13; Some character encodings define shift (escape) sequences for
2107 entering/exiting multi-byte character strings.
2108 </P
2109></LI
2110><LI
2111><P
2112>&#13; ISO-2022-JP must be used for SMTP/NNTP.
2113 </P
2114></LI
2115><LI
2116><P
2117>&#13; "i-mode" web sites are supposed to use SJIS.
2118 </P
2119></LI
2120></UL
2121>
2122 </P
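><P
>&#13; Some of the points above can be observed with a short sketch. It
 assumes that the mbstring extension is loaded and that the sample
 text, which is illustrative only, is UTF-8 encoded. The byte counts
 printed by the loop depend on the target encoding.
 </P
><PRE
CLASS="programlisting"
>&#60;?php
// Assumption: UTF-8 data; the "\u{...}" escapes need PHP 7.0 or later.
$text = "\u{65E5}\u{672C}\u{8A9E}";  // three full-width characters

// Byte length of the same text in each encoding named above.
foreach (['UTF-8', 'EUC-JP', 'SJIS', 'ISO-2022-JP'] as $encoding) {
    $bytes = mb_convert_encoding($text, $encoding, 'UTF-8');
    echo $encoding, ': ', strlen($bytes), " bytes\n";
}

// Display width: each "zen-kaku" character counts as 2 columns.
echo mb_strwidth($text, 'UTF-8'), "\n";  // 6

// ISO-2022-JP enters multi-byte mode with an escape sequence (ESC, 0x1B).
$jis = mb_convert_encoding($text, 'ISO-2022-JP', 'UTF-8');
var_dump(strpos($jis, "\x1b") === 0);  // bool(true)
</PRE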
2123></DIV
2124><DIV
2125CLASS="section"
2126><H1
2127CLASS="section"
2128><A
2129NAME="mbstring.ref"
2130></A
2131>References</H1
2132><P
2133>&#13; Multi-byte character encoding and its related issues are very
2134 complex. It is impossible to cover them in sufficient detail
2135 here. Please refer to the following URLs and other resources for
2136 further reading.
2137 <P
2138></P
2139><UL
2140><LI
2141><P
2142>&#13; Unicode/UTF/UCS/etc
2143 </P
2144><P
2145>&#13; <TT
2146CLASS="literal"
2147>http://www.unicode.org/</TT
2148>
2149 </P
2150></LI
2151><LI
2152><P
2153>&#13; Japanese/Korean/Chinese character
2154 information
2155 </P
2156><P
2157>&#13; <TT
2158CLASS="literal"
2159>&#13; ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
2160 </TT
2161>
2162 </P
2163></LI
2164></UL
2165>
2166 </P
2167></DIV
2168></DIV
2169><DIV
2170CLASS="TOC"
2171><DL
2172><DT
2173><B
2174>Table of Contents</B
2175></DT
2176><DT
2177><A
2178HREF="function.mb-convert-case.html"
2179>mb_convert_case</A
2180>&nbsp;--&nbsp;Perform case folding on a string</DT
2181><DT
2182><A
2183HREF="function.mb-convert-encoding.html"
2184>mb_convert_encoding</A
2185>&nbsp;--&nbsp;Convert character encoding</DT
2186><DT
2187><A
2188HREF="function.mb-convert-kana.html"
2189>mb_convert_kana</A
2190>&nbsp;--&nbsp;
2191 Convert "kana" from one form to another ("zen-kaku", "han-kaku" and more)
2192 </DT
2193><DT
2194><A
2195HREF="function.mb-convert-variables.html"
2196>mb_convert_variables</A
2197>&nbsp;--&nbsp;Convert character code in variable(s)</DT
2198><DT
2199><A
2200HREF="function.mb-decode-mimeheader.html"
2201>mb_decode_mimeheader</A
2202>&nbsp;--&nbsp;Decode string in MIME header field</DT
2203><DT
2204><A
2205HREF="function.mb-decode-numericentity.html"
2206>mb_decode_numericentity</A
2207>&nbsp;--&nbsp;
2208 Decode HTML numeric string reference to character
2209 </DT
2210><DT
2211><A
2212HREF="function.mb-detect-encoding.html"
2213>mb_detect_encoding</A
2214>&nbsp;--&nbsp;Detect character encoding</DT
2215><DT
2216><A
2217HREF="function.mb-detect-order.html"
2218>mb_detect_order</A
2219>&nbsp;--&nbsp;
2220 Set/Get character encoding detection order
2221 </DT
2222><DT
2223><A
2224HREF="function.mb-encode-mimeheader.html"
2225>mb_encode_mimeheader</A
2226>&nbsp;--&nbsp;Encode string for MIME header</DT
2227><DT
2228><A
2229HREF="function.mb-encode-numericentity.html"
2230>mb_encode_numericentity</A
2231>&nbsp;--&nbsp;
2232 Encode character to HTML numeric string reference
2233 </DT
2234><DT
2235><A
2236HREF="function.mb-ereg-match.html"
2237>mb_ereg_match</A
2238>&nbsp;--&nbsp;
2239 Regular expression match for multibyte string
2240 </DT
2241><DT
2242><A
2243HREF="function.mb-ereg-replace.html"
2244>mb_ereg_replace</A
2245>&nbsp;--&nbsp;Replace regular expression with multibyte support</DT
2246><DT
2247><A
2248HREF="function.mb-ereg-search-getpos.html"
2249>mb_ereg_search_getpos</A
2250>&nbsp;--&nbsp;
2251 Returns start point for next regular expression match
2252 </DT
2253><DT
2254><A
2255HREF="function.mb-ereg-search-getregs.html"
2256>mb_ereg_search_getregs</A
2257>&nbsp;--&nbsp;
2258 Retrieve the result from the last multibyte regular expression
2259 match
2260 </DT
2261><DT
2262><A
2263HREF="function.mb-ereg-search-init.html"
2264>mb_ereg_search_init</A
2265>&nbsp;--&nbsp;
2266 Setup string and regular expression for multibyte regular
2267 expression match
2268 </DT
2269><DT
2270><A
2271HREF="function.mb-ereg-search-pos.html"
2272>mb_ereg_search_pos</A
2273>&nbsp;--&nbsp;
2274 Return position and length of matched part of multibyte regular
2275 expression for predefined multibyte string
2276 </DT
2277><DT
2278><A
2279HREF="function.mb-ereg-search-regs.html"
2280>mb_ereg_search_regs</A
2281>&nbsp;--&nbsp;
2282 Returns the matched part of multibyte regular expression
2283 </DT
2284><DT
2285><A
2286HREF="function.mb-ereg-search-setpos.html"
2287>mb_ereg_search_setpos</A
2288>&nbsp;--&nbsp;
2289 Set start point of next regular expression match
2290 </DT
2291><DT
2292><A
2293HREF="function.mb-ereg-search.html"
2294>mb_ereg_search</A
2295>&nbsp;--&nbsp;
2296 Multibyte regular expression match for predefined multibyte string
2297 </DT
2298><DT
2299><A
2300HREF="function.mb-ereg.html"
2301>mb_ereg</A
2302>&nbsp;--&nbsp;Regular expression match with multibyte support</DT
2303><DT
2304><A
2305HREF="function.mb-eregi-replace.html"
2306>mb_eregi_replace</A
2307>&nbsp;--&nbsp;
2308 Replace regular expression with multibyte support
2309 ignoring case
2310 </DT
2311><DT
2312><A
2313HREF="function.mb-eregi.html"
2314>mb_eregi</A
2315>&nbsp;--&nbsp;
2316 Regular expression match ignoring case with multibyte support
2317 </DT
2318><DT
2319><A
2320HREF="function.mb-get-info.html"
2321>mb_get_info</A
2322>&nbsp;--&nbsp;Get internal settings of mbstring</DT
2323><DT
2324><A
2325HREF="function.mb-http-input.html"
2326>mb_http_input</A
2327>&nbsp;--&nbsp;Detect HTTP input character encoding</DT
2328><DT
2329><A
2330HREF="function.mb-http-output.html"
2331>mb_http_output</A
2332>&nbsp;--&nbsp;Set/Get HTTP output character encoding</DT
2333><DT
2334><A
2335HREF="function.mb-internal-encoding.html"
2336>mb_internal_encoding</A
2337>&nbsp;--&nbsp;
2338 Set/Get internal character encoding
2339 </DT
2340><DT
2341><A
2342HREF="function.mb-language.html"
2343>mb_language</A
2344>&nbsp;--&nbsp;
2345 Set/Get current language
2346 </DT
2347><DT
2348><A
2349HREF="function.mb-output-handler.html"
2350>mb_output_handler</A
2351>&nbsp;--&nbsp;
2352 Callback function converts character encoding in output buffer
2353 </DT
2354><DT
2355><A
2356HREF="function.mb-parse-str.html"
2357>mb_parse_str</A
2358>&nbsp;--&nbsp;
2359 Parse GET/POST/COOKIE data and set global variable
2360 </DT
2361><DT
2362><A
2363HREF="function.mb-preferred-mime-name.html"
2364>mb_preferred_mime_name</A
2365>&nbsp;--&nbsp;Get MIME charset string</DT
2366><DT
2367><A
2368HREF="function.mb-regex-encoding.html"
2369>mb_regex_encoding</A
2370>&nbsp;--&nbsp;
2371 Returns current encoding for multibyte regex as string
2372 </DT
2373><DT
2374><A
2375HREF="function.mb-regex-set-options.html"
2376>mb_regex_set_options</A
2377>&nbsp;--&nbsp;
2378 Set/Get the default options for mbregex functions
2379 </DT
2380><DT
2381><A
2382HREF="function.mb-send-mail.html"
2383>mb_send_mail</A
2384>&nbsp;--&nbsp;
2385 Send encoded mail.
2386 </DT
2387><DT
2388><A
2389HREF="function.mb-split.html"
2390>mb_split</A
2391>&nbsp;--&nbsp;Split multibyte string using regular expression</DT
2392><DT
2393><A
2394HREF="function.mb-strcut.html"
2395>mb_strcut</A
2396>&nbsp;--&nbsp;Get part of string</DT
2397><DT
2398><A
2399HREF="function.mb-strimwidth.html"
2400>mb_strimwidth</A
2401>&nbsp;--&nbsp;Get truncated string with specified width</DT
2402><DT
2403><A
2404HREF="function.mb-strlen.html"
2405>mb_strlen</A
2406>&nbsp;--&nbsp;Get string length</DT
2407><DT
2408><A
2409HREF="function.mb-strpos.html"
2410>mb_strpos</A
2411>&nbsp;--&nbsp;
2412 Find position of first occurrence of string in a string
2413 </DT
2414><DT
2415><A
2416HREF="function.mb-strrpos.html"
2417>mb_strrpos</A
2418>&nbsp;--&nbsp;
2419 Find position of last occurrence of a string in a string
2420 </DT
2421><DT
2422><A
2423HREF="function.mb-strtolower.html"
2424>mb_strtolower</A
2425>&nbsp;--&nbsp;Make a string lowercase</DT
2426><DT
2427><A
2428HREF="function.mb-strtoupper.html"
2429>mb_strtoupper</A
2430>&nbsp;--&nbsp;Make a string uppercase</DT
2431><DT
2432><A
2433HREF="function.mb-strwidth.html"
2434>mb_strwidth</A
2435>&nbsp;--&nbsp;Return width of string</DT
2436><DT
2437><A
2438HREF="function.mb-substitute-character.html"
2439>mb_substitute_character</A
2440>&nbsp;--&nbsp;Set/Get substitution character</DT
2441><DT
2442><A
2443HREF="function.mb-substr-count.html"
2444>mb_substr_count</A
2445>&nbsp;--&nbsp;Count the number of substring occurrences</DT
2446><DT
2447><A
2448HREF="function.mb-substr.html"
2449>mb_substr</A
2450>&nbsp;--&nbsp;Get part of string</DT
2451></DL
2452></DIV
2453></DIV
2454></DIV
2455><DIV
2456CLASS="NAVFOOTER"
2457><HR
2458ALIGN="LEFT"
2459WIDTH="100%"><TABLE
2460SUMMARY="Footer navigation table"
2461WIDTH="100%"
2462BORDER="0"
2463CELLPADDING="0"
2464CELLSPACING="0"
2465><TR
2466><TD
2467WIDTH="33%"
2468ALIGN="left"
2469VALIGN="top"
2470><A
2471HREF="function.tanh.html"
2472ACCESSKEY="P"
2473>Previous</A
2474></TD
2475><TD
2476WIDTH="34%"
2477ALIGN="center"
2478VALIGN="top"
2479><A
2480HREF="index.html"
2481ACCESSKEY="H"
2482>Home</A
2483></TD
2484><TD
2485WIDTH="33%"
2486ALIGN="right"
2487VALIGN="top"
2488><A
2489HREF="function.mb-convert-case.html"
2490ACCESSKEY="N"
2491>Next</A
2492></TD
2493></TR
2494><TR
2495><TD
2496WIDTH="33%"
2497ALIGN="left"
2498VALIGN="top"
2499>tanh</TD
2500><TD
2501WIDTH="34%"
2502ALIGN="center"
2503VALIGN="top"
2504><A
2505HREF="funcref.html"
2506ACCESSKEY="U"
2507>Up</A
2508></TD
2509><TD
2510WIDTH="33%"
2511ALIGN="right"
2512VALIGN="top"
2513>mb_convert_case</TD
2514></TR
2515></TABLE
2516></DIV
2517></BODY
2518></HTML
2519>