(PHP 4, PHP 5, PHP 7, PHP 8)
substr_count — Count the number of substring occurrences
substr_count() returns the number of times the needle substring occurs in the haystack string. Please note that needle is case sensitive.
Note: This function doesn't count overlapped substrings. See the example below!
haystack
The string to search in
needle
The substring to search for
offset
The offset where to start counting. If the offset is negative, counting starts from the end of the string.
length
The maximum length after the specified offset to search for the substring. If the offset plus the length is greater than the haystack length, a warning is emitted (as of PHP 8.0.0, a ValueError is thrown instead). A negative length counts from the end of haystack.
This function returns an int.
| Version | Description |
|---|---|
| 8.0.0 | length is nullable now. |
| 7.1.0 | Support for negative offsets and lengths has been added. length may also be 0 now. |
Example #1 A substr_count() example
<?php
$text = 'This is a test';
echo strlen($text), PHP_EOL; // 14

echo substr_count($text, 'is'), PHP_EOL; // 2

// the string is reduced to 's is a test', so it prints 1
echo substr_count($text, 'is', 3), PHP_EOL;

// the text is reduced to 's i', so it prints 0
echo substr_count($text, 'is', 3, 3), PHP_EOL;

// prints only 1, because it doesn't count overlapped substrings
$text2 = 'gcdgcdgcd';
echo substr_count($text2, 'gcdgcd'), PHP_EOL;

// throws an exception because 5+10 > 14
echo substr_count($text, 'is', 5, 10), PHP_EOL;
?>
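The example above only uses positive values for offset and length. The sketch below, assuming PHP 7.1 or later, shows how negative values appear to behave: for valid ranges the result matches counting inside the corresponding substr() slice. The sample string is reused from the example; nothing else is taken from the manual.

```php
<?php
$text = 'This is a test';

// Positive offset and length: only 's i' is searched, same as slicing first.
var_dump(substr_count($text, 'is', 3, 3) === substr_count(substr($text, 3, 3), 'is')); // bool(true)

// Negative offset: counting starts 9 bytes from the end, i.e. in 'is a test'.
echo substr_count($text, 'is', -9), PHP_EOL; // 1

// Negative length: counting stops 8 bytes before the end, i.e. in 'This i'.
echo substr_count($text, 'is', 0, -8), PHP_EOL; // 1
```

Slicing with substr() first gives the same count but copies the string, so passing offset and length directly is usually the lighter option.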
It's worth noting this function is surprisingly fast. I first ran it against a ~500KB string on our web server. It found 6 occurrences of the needle I was looking for in 0.0000 seconds. Yes, it ran faster than microtime() could measure.
Looking to give it a challenge, I then ran it on a Mac laptop from 2010 against a 120.5MB string. For one test needle, it found 2385 occurrences in 0.0266 seconds. Another test needle found 290 occurrences in 0.114 seconds.
Long story short, if you're wondering whether this function is slowing down your script, the answer is probably not.
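If you want to repeat this kind of measurement yourself, here is a minimal sketch. It assumes PHP 7.3 or later for hrtime(), which has a finer resolution than microtime(). The haystack content and the needle are made up for illustration, and timings will vary by machine.

```php
<?php
// Build a haystack of roughly 500 KB (27 bytes repeated 20000 times).
$haystack = str_repeat('lorem ipsum dolor sit amet ', 20000);
$needle = 'dolor';

$start = hrtime(true); // monotonic clock in nanoseconds
$count = substr_count($haystack, $needle);
$elapsedMs = (hrtime(true) - $start) / 1e6;

printf("%d occurrences in %.4f ms\n", $count, $elapsedMs);
```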
Making this case insensitive is easy for anyone who needs it. Simply convert the haystack and the needle to the same case (upper or lower).
substr_count(strtoupper($haystack), strtoupper($needle))
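As a small sketch of the idea above, the call can be wrapped in a helper. The name substr_count_ci is hypothetical and not part of PHP. It uses strtolower(), which only handles single-byte characters; for multibyte text, mb_strtolower() from the mbstring extension may be a better fit.

```php
<?php
// Hypothetical helper: case-insensitive variant of substr_count().
function substr_count_ci(string $haystack, string $needle): int
{
    // Normalise both strings to lower case before counting.
    return substr_count(strtolower($haystack), strtolower($needle));
}

echo substr_count('This is a test', 'T'), PHP_EOL;    // 1
echo substr_count_ci('This is a test', 't'), PHP_EOL; // 3
```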
To account for the case that jrhodes has pointed out, we can change the line to:
substr_count ( implode( ',', $haystackArray ), $needle );
This way:
array (
0 => "mystringth",
1 => "atislong"
);
Becomes
mystringth,atislong
Which brings the count for $needle = "that" to 0 again.
It was suggested to use
substr_count ( implode( $haystackArray ), $needle );
instead of the function described previously; however, this has one flaw. For example, take this array:
array (
0 => "mystringth",
1 => "atislong"
);
If you are counting "that", the implode version will return 1, but the function previously described will return 0.
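The difference between the two implode() variants can be checked directly. The sketch below also shows a per-element count, which is only my assumption of what a function that counts inside each array element would look like; it needs PHP 7.4 or later for the arrow function.

```php
<?php
$haystackArray = ['mystringth', 'atislong'];
$needle = 'that';

// No separator: joining creates a false match across the element boundary.
echo substr_count(implode($haystackArray), $needle), PHP_EOL; // 1

// A separator that does not occur in the needle removes the false match.
echo substr_count(implode(',', $haystackArray), $needle), PHP_EOL; // 0

// Counting each element on its own avoids having to pick a separator.
$perElement = array_sum(array_map(
    fn (string $item): int => substr_count($item, $needle),
    $haystackArray
));
echo $perElement, PHP_EOL; // 0
```

Note that the separator variant can still miscount if the needle itself contains the separator character, which is why the per-element count may be the safer choice.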
A simple version for an array needle (multiple substrings):
<?php
function substr_count_array($haystack, $needle) {
    $count = 0;
    // Add up the occurrences of each substring in the haystack.
    foreach ($needle as $substring) {
        $count += substr_count($haystack, $substring);
    }
    return $count;
}
?>
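A short usage sketch for the function above, repeated here with type declarations so the snippet runs on its own. The sample haystack and needles are made up for illustration.

```php
<?php
// Same logic as the note above, with parameter and return types added.
function substr_count_array(string $haystack, array $needle): int
{
    $count = 0;
    foreach ($needle as $substring) {
        $count += substr_count($haystack, $substring);
    }
    return $count;
}

// 'is' occurs twice and lowercase 't' occurs twice, so the total is 4.
echo substr_count_array('This is a test', ['is', 't']), PHP_EOL; // 4
```

Each substring is counted independently, so needles that overlap each other (for example 'is' and 's') are both counted.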
Yet another reference to the "cgcgcgcgcgcgc" example posted by "chris at pecoraro dot net":
Your request can be fulfilled with Perl-compatible regular expressions and their lookahead and lookbehind features.
The example
$number_of_full_pattern = preg_match_all('/(cgc)/', "cgcgcgcgcgcgcg", $chunks);
works like the substr_count function. The variable $number_of_full_pattern has the value 3, because the default behavior of Perl-compatible regular expressions is to consume the characters of the subject string that were matched by the (sub)pattern. That is, the pointer will be moved to the end of the matched substring.
But we can use the lookahead feature, which prevents the pointer from moving:
$number_of_full_pattern = preg_match_all('/(cg(?=c))/', "cgcgcgcgcgcgcg", $chunks);
In this case the variable $number_of_full_pattern has the value 6.
First, the string "cg" is matched and the pointer is moved to the end of this string. Then the regular expression looks ahead to check whether a 'c' can be matched. Although the character 'c' is present, the pointer is not moved past it.
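The same overlapping count can also be obtained without regular expressions. Below is a minimal sketch using a strpos() loop; the helper name substr_count_overlapping is hypothetical. A pure lookahead pattern is shown next to it for comparison.

```php
<?php
// Hypothetical helper: counts occurrences of $needle, including overlapping ones.
function substr_count_overlapping(string $haystack, string $needle): int
{
    if ($needle === '') {
        return 0; // an empty needle has no meaningful count
    }

    $count = 0;
    $offset = 0;
    // Advance by one byte after each hit so overlapping matches are found too.
    while (($pos = strpos($haystack, $needle, $offset)) !== false) {
        $count++;
        $offset = $pos + 1;
    }
    return $count;
}

$subject = 'cgcgcgcgcgcgcg';
echo substr_count($subject, 'cgc'), PHP_EOL;                   // 3
echo substr_count_overlapping($subject, 'cgc'), PHP_EOL;       // 6
echo preg_match_all('/(?=cgc)/', $subject), PHP_EOL;           // 6
echo substr_count_overlapping('gcdgcdgcd', 'gcdgcd'), PHP_EOL; // 2
```

When the needle comes from user input, pass it through preg_quote() before building the lookahead pattern.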
This will handle a string where it is unknown whether a comma or a period is used as the thousands or decimal separator. The only case that leads to a conflict is a single comma or period followed by exactly three digits (123.456 or 123,456). An optional parameter is passed to handle this case (assume thousands, assume decimal, decimal when period, decimal when comma). It assumes an input string in any of the formats listed below.
function toFloat($pString, $seperatorOnConflict="f")
{
    $decSeperator=".";
    $thSeperator="";
    $pString=str_replace(" ", $thSeperator, $pString);
    $firstPeriod=strpos($pString, ".");
    $firstComma=strpos($pString, ",");
    if($firstPeriod!==FALSE && $firstComma!==FALSE) {
        if($firstPeriod<$firstComma) {
            $pString=str_replace(".", $thSeperator, $pString);
            $pString=str_replace(",", $decSeperator, $pString);
        }
        else {
            $pString=str_replace(",", $thSeperator, $pString);
        }
    }
    else if($firstPeriod!==FALSE || $firstComma!==FALSE) {
        $seperator=$firstPeriod!==FALSE?".":",";
        if(substr_count($pString, $seperator)==1) {
            $lastPeriodOrComma=strpos($pString, $seperator);
            if($lastPeriodOrComma==(strlen($pString)-4) && ($seperatorOnConflict!=$seperator && $seperatorOnConflict!="f")) {
                $pString=str_replace($seperator, $thSeperator, $pString);
            }
            else {
                $pString=str_replace($seperator, $decSeperator, $pString);
            }
        }
        else {
            $pString=str_replace($seperator, $thSeperator, $pString);
        }
    }
    return (float)$pString;
}
function testFloatParsing() {
    $floatvals = array(
        "22 000",
        "22,000",
        "22.000",
        "123 456",
        "123,456",
        "123.456",
        "22 000,76",
        "22.000,76",
        "22,000.76",
        "22000.76",
        "22000,76",
        "1.022.000,76",
        "1,022,000.76",
        "1,000,000",
        "1.000.000",
        "1022000.76",
        "1022000,76",
        "1022000",
        "0.76",
        "0,76",
        "0.00",
        "0,00",
        "1.00",
        "1,00",
        "-22 000,76",
        "-22.000,76",
        "-22,000.76",
        "-22 000",
        "-22,000",
        "-22.000",
        "-22000.76",
        "-22000,76",
        "-1.022.000,76",
        "-1,022,000.76",
        "-1,000,000",
        "-1.000.000",
        "-1022000.76",
        "-1022000,76",
        "-1022000",
        "-0.76",
        "-0,76",
        "-0.00",
        "-0,00",
        "-1.00",
        "-1,00"
    );
    echo "<table>
<tr>
<th>String</th>
<th>thousands</th>
<th>fraction</th>
<th>dec. if period</th>
<th>dec. if comma</th>
</tr>";
    foreach ($floatvals as $fval) {
        echo "<tr>";
        echo "<td>" . (string) $fval . "</td>";
        echo "<td>" . (float) toFloat($fval, "") . "</td>";
        echo "<td>" . (float) toFloat($fval, "f") . "</td>";
        echo "<td>" . (float) toFloat($fval, ".") . "</td>";
        echo "<td>" . (float) toFloat($fval, ",") . "</td>";
        echo "</tr>";
    }
    echo "</table>";
}
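The conflict case described in this note (a single separator followed by exactly three characters) can be isolated in a few lines. This is only a sketch of that one check; hasAmbiguousSeparator is a hypothetical name and not part of the code above.

```php
<?php
// Hypothetical helper: reports whether a numeric string has the ambiguous
// shape from the note, e.g. "123.456" or "22,000".
function hasAmbiguousSeparator(string $number): bool
{
    $number = str_replace(' ', '', $number);
    $periods = substr_count($number, '.');
    $commas = substr_count($number, ',');

    // Ambiguity needs exactly one separator in total.
    if ($periods + $commas !== 1) {
        return false;
    }

    $separator = $periods === 1 ? '.' : ',';
    // Exactly three characters must follow the separator.
    return strpos($number, $separator) === strlen($number) - 4;
}

var_dump(hasAmbiguousSeparator('123.456'));   // bool(true)
var_dump(hasAmbiguousSeparator('22,000'));    // bool(true)
var_dump(hasAmbiguousSeparator('22.000,76')); // bool(false)
var_dump(hasAmbiguousSeparator('0.76'));      // bool(false)
```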