The behaviour of these functions is affected by settings in php.ini.
| Name | Default | Changeable | Changelog |
|---|---|---|---|
| mbstring.language | "neutral" | INI_ALL | |
| mbstring.detect_order | NULL | INI_ALL | |
| mbstring.http_input | "pass" | INI_ALL | Deprecated |
| mbstring.http_output | "pass" | INI_ALL | Deprecated |
| mbstring.internal_encoding | NULL | INI_ALL | Deprecated |
| mbstring.substitute_character | NULL | INI_ALL | |
| mbstring.func_overload | "0" | INI_SYSTEM | Deprecated as of PHP 7.2.0; removed as of PHP 8.0.0. |
| mbstring.encoding_translation | "0" | INI_PERDIR | |
| mbstring.http_output_conv_mimetypes | "^(text/\|application/xhtml\+xml)" | INI_ALL | |
| mbstring.strict_detection | "0" | INI_ALL | |
| mbstring.regex_retry_limit | "1000000" | INI_ALL | Available as of PHP 7.4.0. |
| mbstring.regex_stack_limit | "100000" | INI_ALL | Available as of PHP 7.3.5. |
Here's a short explanation of the configuration directives.
mbstring.language
string
The default national language setting (NLS) used in mbstring. Note that this option automatically defines mbstring.internal_encoding, so mbstring.internal_encoding should be placed after mbstring.language in php.ini.
mbstring.encoding_translation
bool
Enables the transparent character encoding filter for incoming HTTP queries, which performs detection and conversion of the input encoding to the internal character encoding.
mbstring.internal_encoding
string
This deprecated feature will certainly be removed in the future.
Defines the default internal character encoding.
Users should leave this empty and set default_charset instead.
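To check which values are in effect, the settings can be inspected at runtime. This is only a sketch, assuming the mbstring extension is loaded; the printed values depend entirely on the local php.ini, so no output is shown.

```php
<?php
// Inspect the effective encodings at runtime (assumes ext/mbstring is loaded).
$defaultCharset = (string) ini_get('default_charset');            // value from php.ini
$legacySetting  = (string) ini_get('mbstring.internal_encoding'); // ideally empty
$effective      = mb_internal_encoding();                         // encoding mbstring currently uses

printf("default_charset:            %s\n", $defaultCharset);
printf("mbstring.internal_encoding: %s\n", $legacySetting === '' ? '(empty)' : $legacySetting);
printf("mb_internal_encoding():     %s\n", $effective);
```

If the deprecated directive shows a value here, removing it from php.ini and setting default_charset is the approach the text above recommends.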
mbstring.http_input
string
This deprecated feature will certainly be removed in the future.
Defines the default HTTP input character encoding.
Users should leave this empty and set default_charset instead.
mbstring.http_output
string
This deprecated feature will certainly be removed in the future.
Defines the default HTTP output character encoding (output will be converted from the internal encoding to the HTTP output encoding upon output).
Users should leave this empty and set default_charset instead.
mbstring.detect_order
string
Defines the default character encoding detection order. See also mb_detect_order().
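The same order can also be set per request with mb_detect_order(). The following is a minimal sketch, assuming the mbstring extension is loaded and that the script file itself is saved as UTF-8; the encoding list is only an illustration.

```php
<?php
// Override the detection order for this request only (assumes ext/mbstring).
mb_detect_order(['ASCII', 'UTF-8', 'EUC-JP']);

// This source file is assumed to be saved as UTF-8.
$text = '日本語';

// Pass the current order explicitly and enable strict detection.
$encoding = mb_detect_encoding($text, mb_detect_order(), true);

echo $encoding, "\n"; // UTF-8
```

ASCII is rejected because the string contains non-ASCII bytes, and the bytes are not valid EUC-JP, so UTF-8 is the only candidate left.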
mbstring.substitute_character
string
Defines the character to substitute for invalid character encodings. See mb_substitute_character() for supported values.
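The effect is easiest to see on a string that contains an invalid byte. This sketch assumes the mbstring extension is loaded and uses mb_convert_encoding() from UTF-8 to UTF-8 purely as a way to trigger substitution; the sample string is made up for illustration.

```php
<?php
// "\xFF" can never appear in valid UTF-8, so it triggers substitution.
$broken = "a\xFFb";

// Use a Unicode code point as the substitute: U+3013 (decimal 12307).
mb_substitute_character(0x3013);
$marked = mb_convert_encoding($broken, 'UTF-8', 'UTF-8');

// Use "none" to drop invalid sequences instead of marking them.
mb_substitute_character('none');
$dropped = mb_convert_encoding($broken, 'UTF-8', 'UTF-8');

echo $marked, "\n";  // a〓b
echo $dropped, "\n"; // ab
```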
mbstring.func_overload
string
This feature has been DEPRECATED as of PHP 7.2.0, and REMOVED as of PHP 8.0.0. Relying on this feature is highly discouraged.
Overloads a set of single byte functions with their mbstring counterparts. See Function overloading for more information.
This setting can only be changed from the php.ini file.
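Without overloading, the usual replacement is to call the mb_* function explicitly wherever character semantics are wanted. A minimal sketch, assuming the mbstring extension is loaded and the script file is saved as UTF-8:

```php
<?php
// This source file is assumed to be saved as UTF-8.
$text = '日本語';

$bytes = strlen($text);             // single byte function: counts bytes
$chars = mb_strlen($text, 'UTF-8'); // mbstring counterpart: counts characters

echo $bytes, "\n"; // 9
echo $chars, "\n"; // 3
```

Passing the encoding explicitly keeps the result independent of the ini settings described on this page.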
mbstring.http_output_conv_mimetypes
string
mbstring.strict_detection
bool
Enables strict encoding detection. See mb_detect_encoding() for a description and examples.
mbstring.regex_retry_limit
int
Limits the amount of backtracking that may be performed during one mbregex match.
This setting only takes effect when linking against oniguruma >= 6.8.0.
mbstring.regex_stack_limit
int
Limits the stack depth of mbstring regular expressions.
According to the » HTML 4.01 specification, Web browsers are allowed to encode a form being submitted with a character encoding different from the one used for the page. See mb_http_input() to detect the character encoding used by browsers.
Although popular browsers are capable of making a reasonably accurate guess at the character encoding of a given HTML document, it is better to set the charset parameter in the Content-Type HTTP header to the appropriate value with header() or the default_charset ini setting.
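Setting the charset parameter with header() could look like the sketch below. The helper content_type_header() is hypothetical (it is not part of PHP) and only builds the header line; text/html and UTF-8 are example values.

```php
<?php
// Hypothetical helper (not part of PHP): builds a Content-Type header line.
function content_type_header(string $mimeType, string $charset): string
{
    return sprintf('Content-Type: %s; charset=%s', $mimeType, $charset);
}

$line = content_type_header('text/html', 'UTF-8');

// header() must be called before any output is sent.
if (!headers_sent()) {
    header($line);
}
```

The charset sent here should match the encoding the script actually outputs; the header only declares the encoding and does not convert anything.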
Example #1 php.ini setting examples
; Set default language
mbstring.language = Neutral ; Set default language to Neutral (UTF-8) (default)
mbstring.language = English ; Set default language to English
mbstring.language = Japanese ; Set default language to Japanese
;; Set default internal encoding
;; Note: Make sure to use a character encoding that works with PHP
mbstring.internal_encoding = UTF-8 ; Set internal encoding to UTF-8
;; HTTP input encoding translation is enabled.
mbstring.encoding_translation = On
;; Set default HTTP input character encoding
;; Note: Script cannot change http_input setting.
mbstring.http_input = pass ; No conversion.
mbstring.http_input = auto ; Set HTTP input to auto
; "auto" is expanded according to mbstring.language
mbstring.http_input = SJIS ; Set HTTP input to SJIS
mbstring.http_input = UTF-8,SJIS,EUC-JP ; Specify order
;; Set default HTTP output character encoding
mbstring.http_output = pass ; No conversion
mbstring.http_output = UTF-8 ; Set HTTP output encoding to UTF-8
;; Set default character encoding detection order
mbstring.detect_order = auto ; Set detect order to auto
mbstring.detect_order = ASCII,JIS,UTF-8,SJIS,EUC-JP ; Specify order
;; Set default substitute character
mbstring.substitute_character = 12307 ; Specify Unicode value
mbstring.substitute_character = none ; Do not print character
mbstring.substitute_character = long ; Long Example: U+3000,JIS+7E7E
Example #2 php.ini setting for EUC-JP users
;; Disable Output Buffering
output_buffering = Off
;; Set HTTP header charset
default_charset = EUC-JP
;; Set default language to Japanese
mbstring.language = Japanese
;; HTTP input encoding translation is enabled.
mbstring.encoding_translation = On
;; Set HTTP input encoding conversion to auto
mbstring.http_input = auto
;; Convert HTTP output to EUC-JP
mbstring.http_output = EUC-JP
;; Set internal encoding to EUC-JP
mbstring.internal_encoding = EUC-JP
;; Do not print invalid characters
mbstring.substitute_character = none
Example #3 php.ini setting for SJIS users
;; Enable Output Buffering
output_buffering = On
;; Set mb_output_handler to enable output conversion
output_handler = mb_output_handler
;; Set HTTP header charset
default_charset = Shift_JIS
;; Set default language to Japanese
mbstring.language = Japanese
;; Set HTTP input encoding conversion to auto
mbstring.http_input = auto
;; Convert to SJIS
mbstring.http_output = SJIS
;; Set internal encoding to EUC-JP
mbstring.internal_encoding = EUC-JP
;; Do not print invalid characters
mbstring.substitute_character = none
String literals in the PHP script are encoded with the same encoding that the PHP file was saved with. This is not affected by default_charset or other .ini settings.
Scenario: The default_charset is KOI8-R, and there is a text file "input.txt" containing the string "Это текст для поиска." in KOI8-R encoding.
A PHP script is written:
<?php
// mb_internal_encoding('KOI8-R');
$string = 'текст';
$data = file_get_contents('input.txt');
echo mb_strpos($data, $string);
?>
But unfortunately it was saved as UTF-8.
It doesn't work; mb_strpos() returns false because it can't find the UTF-8-encoded "текст" inside the KOI8-R-encoded "Это текст для поиска.".
Adjusting the default_charset had no effect. Not even fiddling with mb_internal_encoding could fix it, simply because the strings involved had *different* encodings, and without actually changing one of them they just weren't going to match.
Either re-save the source file as KOI8-R to match the data file, or re-save the data file as UTF-8 to match the source code. Only then will the script properly echo '4'.
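Re-saving a file is one way to change an encoding. Another option, not mentioned in the note above, is to convert the data at runtime. The sketch below assumes the mbstring extension is loaded and that the script is saved as UTF-8; instead of reading input.txt it simulates the KOI8-R data by converting a UTF-8 literal.

```php
<?php
// This source file is assumed to be saved as UTF-8.
$needle = 'текст';

// Simulate the KOI8-R data file by converting a UTF-8 literal.
$data = mb_convert_encoding('Это текст для поиска.', 'KOI8-R', 'UTF-8');

// Mismatched encodings: the UTF-8 needle is not found in the KOI8-R bytes.
$before = mb_strpos($data, $needle, 0, 'UTF-8');

// Convert the data to the encoding of the source file, then search again.
$converted = mb_convert_encoding($data, 'UTF-8', 'KOI8-R');
$after = mb_strpos($converted, $needle, 0, 'UTF-8');

var_dump($before); // bool(false)
var_dump($after);  // int(4)
```

The conversion only works if the actual encoding of the data is known; converting from the wrong source encoding produces garbage rather than an error.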
The documentation is vague on WHAT precisely the valid "NLS" language strings are for "mbstring.language".
According to http://php.net/manual/en/function.mb-language.php the values are "Japanese", "ja", "English", "en", or "uni" for UTF-8.
On the other hand, the sample on this current page omits "uni" but introduces "Neutral" as an undocumented option, which is also the default value:
<?php
var_dump( mb_language() ); // "neutral" (default if not set)
var_dump( mb_language( 'uni' ) ); // TRUE, valid language string
var_dump( mb_language() ); // "uni"
var_dump( mb_language( 'neutral' ) ); // TRUE, valid language string
var_dump( mb_language() ); // "neutral"
?>