iconv_strlen

(PHP 5, PHP 7)

iconv_strlen — 返回字符串的字符数统计

说明

iconv_strlen ( string $str [, string $charset = ini_get("iconv.internal_encoding") ] ) : int

和 strlen() 不同的是，iconv_strlen() 统计了给定的字节序列 str 中出现字符数的统计，基于指定的字符集，其产生的结果不一定和字符字节数相等。

参数

str: 该字符串。
charset: 如果省略了 charset 参数，假设 str 的编码为 iconv.internal_encoding。

返回值

返回 str 字符数的统计，是整型。

参见

grapheme_strlen() - Get string length in grapheme units
mb_strlen() - 获取字符串的长度
strlen() - 获取字符串长度

User Contributed Notes

hfuecks @ nospam org 24-Feb-2006 04:58


If iconv_strlen is passed a UTF-8 string containing badly formed sequences, it will return FALSE. This is in contrast to mb_strlen of the behaviour of utf8_decode, which strip out any bad sequences;



<?php

# UTF-8 string containing bad sequence: \xe9

$str = "I?t?rn?ti?n\xe9?liz?ti?n";



print "mb_strlen: ".mb_strlen($str,'UTF-8')."\n";

print "strlen/utf8_decode: ".strlen(utf8_decode($str))."\n";

print "iconv_strlen: ".iconv_strlen($str,'UTF-8')."\n";

?>



Displays;



mb_strlen: 20

strlen/utf8_decode: 20

iconv_strlen:



(PHP 5.0.5)



As such it is being "stricter" than mb_strlen and it may mean you need to check for invalid sequences first. A quick way to check is to exploit the behaviour of the PCRE extension (see notes on pattern modifiers);



<?php

if (preg_match('/^.{1}/us',$str,$ar) != 1) {

    die("string contains invalid UTF-8");

}

?>



A slower but stricter check (regex) can be found at: http://www.w3.org/International/questions/qa-forms-utf-8



Similiar applies to iconv_substr, iconv_strpos and iconv_strrpos