Chapter 1. Strings
Introduction
Strings in PHP are sequences of bytes, such as “We hold these truths to be self-evident” or “Once upon a time” or even “111211211.” When you read data from a file or output it to a web browser, your data is represented as strings.
PHP strings are binary-safe (i.e., they can contain null bytes) and can grow and shrink on demand. Their size is limited only by the amount of memory that is available to PHP.
Warning
Usually, PHP strings are ASCII strings. You must do extra work to handle non-ASCII data like UTF-8 or other multibyte character encodings (see Chapter 19).
As in Perl and the Unix shell, strings in PHP can be initialized in three ways: with single quotes, with double quotes, and with the “here document” (heredoc) format. With single-quoted strings, the only special characters you need to escape inside a string are the backslash and the single quote itself. This example shows four single-quoted strings:
print 'I have gone to the store.';
print 'I\'ve gone to the store.';
print 'Would you pay $1.75 for 8 ounces of tap water?';
print 'In double-quoted strings, newline is represented by \n';
It prints:
I have gone to the store.
I've gone to the store.
Would you pay $1.75 for 8 ounces of tap water?
In double-quoted strings, newline is represented by \n
Caution
The preceding output shows what the raw output looks like. If you view it in a web browser, you will see all the sentences on the same line because HTML requires additional markup to insert line breaks.
Because PHP doesn’t check for variable interpolation or almost any escape sequences in single-quoted strings, defining strings this way is straightforward and fast.
Double-quoted strings don’t recognize escaped single quotes, but they do recognize interpolated variables and the escape sequences shown in Table 1-1.
Escape sequence | Character |
\n | Newline (ASCII 10) |
\r | Carriage return (ASCII 13) |
\t | Tab (ASCII 9) |
\\ | Backslash |
\$ | Dollar sign |
\" | Double quote |
\0 through \777 | Octal value |
\x0 through \xFF | Hex value |
Example 1-1 shows some double-quoted strings.
print "I've gone to the store.";
print "The sauce cost \$10.25.";
$cost = '$10.25';
print "The sauce cost $cost.";
print "The sauce cost \$\061\060.\x32\x35.";
Example 1-1 prints:
I've gone to the store.
The sauce cost $10.25.
The sauce cost $10.25.
The sauce cost $10.25.
The last line of Example 1-1 prints the price of sauce correctly because the character 1 is ASCII code 49 decimal and 061 octal. Character 0 is ASCII 48 decimal and 060 octal; 2 is ASCII 50 decimal and 32 hex; and 5 is ASCII 53 decimal and 35 hex.
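The octal and hex codes quoted above can be checked with ord(). This is a minimal sketch; the variable name $price is only illustrative.

```php
<?php
// Each escape names one byte by its octal (\061) or hex (\x32) code.
$price = "\$\061\060.\x32\x35";
print $price . "\n";

// ord() gives the decimal code; printf() shows it in octal and hex.
printf("%o %o\n", ord('1'), ord('0'));   // octal codes for "1" and "0"
printf("%x %x\n", ord('2'), ord('5'));   // hex codes for "2" and "5"
```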
Heredoc-specified strings recognize all the interpolations and escapes of double-quoted strings, but they don’t require double quotes to be escaped. Heredocs start with <<< and a token. That token (with no leading or trailing whitespace), followed by a semicolon to end the statement (if necessary), ends the heredoc. Example 1-2 shows how to define a heredoc.
print <<<END
It's funny when signs say things like:
Original "Root" Beer
"Free" Gift
Shoes cleaned while "you" wait
or have other misquoted words.
END;
Example 1-2 prints:
It's funny when signs say things like:
Original "Root" Beer
"Free" Gift
Shoes cleaned while "you" wait
or have other misquoted words.
Newlines, spacing, and quotes are all preserved in a heredoc. By convention, the end-of-string identifier is usually all caps, and it is case sensitive. Example 1-3 shows two more valid heredocs.
print <<<PARSLEY
It's easy to grow fresh:
Parsley
Chives
on your windowsill
PARSLEY;

// Valid in PHP versions before 7.3 only: from 7.3 on, a body line that
// begins with the delimiter (DOGS here) ends the heredoc early.
print <<<DOGS
If you like pets, yell out:
DOGS AND CATS ARE GREAT!
DOGS;
Heredocs are especially useful for printing out HTML with interpolated variables because you don’t have to escape the double quotes that appear in the HTML elements. Example 1-4 uses a heredoc to print HTML.
if ($remaining_cards > 0) {
    $url = '/deal.php';
    $text = 'Deal More Cards';
} else {
    $url = '/new-game.php';
    $text = 'Start a New Game';
}

print <<<HTML
There are <b>$remaining_cards</b> left.
<p>
<a href="$url">$text</a>
HTML;
In Example 1-4, the semicolon needs to go after the end-of-string delimiter to tell PHP the statement is ended. In some cases, however, you shouldn’t use the semicolon. One of these cases is shown in Example 1-5, which uses a heredoc with the string concatenation operator.
$html = <<<END
<div class="$divClass">
<ul class="$ulClass">
<li>
END
. $listItem . '</li></div>';

print $html;
Assuming some reasonable values for the $divClass, $ulClass, and $listItem variables, Example 1-5 prints:
<div class="class1">
<ul class="class2">
<li> The List Item </li></div>
In Example 1-5, the expression needs to continue on the next line, so you don’t use a semicolon. Note also that in order for PHP to recognize the end-of-string delimiter, the . string concatenation operator needs to go on a separate line from the end-of-string delimiter. (As of PHP 7.3 the closing delimiter may be indented and followed by more code on the same line, but keeping it on its own line works in every version.)
Nowdocs are similar to heredocs, but there is no variable interpolation. So, nowdocs are to heredocs as single-quoted strings are to double-quoted strings. They’re best when you have a block of non-PHP code, such as JavaScript, that you want to print as part of an HTML page or send to another program.
For example, if you’re using jQuery:
$js = <<<'__JS__'
$.ajax({
  'url': '/api/getStock',
  'data': {
    'ticker': 'LNKD'
  },
  'success': function( data ) {
    $( "#stock-price" ).html( "<strong>$" + data + "</strong>" );
  }
});
__JS__;

print $js;
Individual bytes in strings can be referenced with square brackets. The first byte in the string is at index 0. Example 1-6 grabs one byte from a string.
Example 1-6 prints:
d
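A minimal sketch of square-bracket access that yields the output above, assuming a hypothetical sample string $neighbor:

```php
<?php
// $neighbor is an illustrative value chosen so that index 3 holds "d".
$neighbor = 'Hildegard';

// Indexes start at 0: H=0, i=1, l=2, d=3
$byte = $neighbor[3];
print $byte;
```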
Accessing Substrings
Problem
You want to know if a string contains a particular substring. For example, you want to find out if an email address contains an @.
Solution
Use strpos(), as in Example 1-7.
Discussion
The return value from strpos() is the first position in the string (the “haystack”) at which the substring (the “needle”) was found. If the needle wasn’t found at all in the haystack, strpos() returns false. If the needle is at the beginning of the haystack, strpos() returns 0 because position 0 represents the beginning of the string. To differentiate between return values of 0 and false, you must use the identity operator (===) or the not-identity operator (!==) instead of regular equals (==) or not-equals (!=). Example 1-7 compares the return value from strpos() to false using ===. This test only succeeds if strpos() returns false, not if it returns 0 or any other number.
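The comparison described above might be sketched as follows. The address in $email and the other variable names are hypothetical and serve only as illustration.

```php
<?php
// $email is a made-up sample address.
$email = 'user@example.com';

if (strpos($email, '@') === false) {
    $result = 'There was no @ in the e-mail address!';
} else {
    $result = 'Found an @ at position ' . strpos($email, '@');
}
print $result . "\n";

// A needle at position 0 is found, but a loose comparison treats 0 as false.
$at_start = strpos('@home', '@');   // integer 0
$loose    = ($at_start == false);   // true: 0 loosely equals false
$strict   = ($at_start === false);  // false: 0 is not identical to false
```

The loose comparison reports a missing needle even though the @ sits at the very start of the string, which is why the identity operator is the safe choice.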
See Also
Documentation on strpos().
Extracting Substrings
Problem
You want to extract part of a string, starting at a particular place in the string. For example, you want the first eight characters of a username entered into a form.
Solution
Use substr() to select your substring, as in Example 1-8.
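As a minimal sketch of the username case from the problem statement, assuming a literal string in place of real form input (the names $input and $username are illustrative):

```php
<?php
// In a real form handler the value would come from something like
// $_GET['username']; a literal stands in here so the sketch is runnable.
$input = 'bartholomew_cubbins';

// Start at position 0 and take eight characters.
$username = substr($input, 0, 8);
print $username;
```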
Discussion
If $start and $length are positive, substr() returns $length characters in the string, starting at $start. The first character in the string is at position 0. Example 1-9 has positive $start and $length.
print substr('watch out for that tree',6,5);
Example 1-9 prints:
out f
If you leave out $length, substr() returns the string from $start to the end of the original string, as shown in Example 1-10.
print substr('watch out for that tree',17);
Example 1-10 prints:
t tree
If $start is bigger than the length of the string, substr() returns false (or, as of PHP 8.0, an empty string).
If $start plus $length goes past the end of the string, substr() returns all of the string from $start forward, as shown in Example 1-11.
print substr('watch out for that tree',20,5);
Example 1-11 prints:
ree
If $start is negative, substr() counts back from the end of the string to determine where your substring starts, as shown in Example 1-12.
print substr('watch out for that tree',-6);
print substr('watch out for that tree',-17,5);
Example 1-12 prints:
t tree
out f
With a negative $start value that goes past the beginning of the string (for example, if $start is −27 with a 20-character string), substr() behaves as if $start is 0.
If $length is negative, substr() counts back from the end of the string to determine where your substring ends, as shown in Example 1-13.
print substr('watch out for that tree',15,-2);
print substr('watch out for that tree',-4,-1);
Example 1-13 prints:
hat tr
tre
See Also
Documentation on substr().
Replacing Substrings
Problem
You want to replace a substring with a different string. For example, you want to obscure all but the last four digits of a credit card number before printing it.
Solution
Use substr_replace(), as in Example 1-14.
// Everything from position $start to the end of $old_string
// becomes $new_substring
$new_string = substr_replace($old_string,$new_substring,$start);

// $length characters, starting at position $start, become $new_substring
$new_string = substr_replace($old_string,$new_substring,$start,$length);
Discussion
Without the $length argument, substr_replace() replaces everything from $start to the end of the string. If $length is specified, only that many characters are replaced:
print substr_replace('My pet is a blue dog.','fish.',12);
print substr_replace('My pet is a blue dog.','green',12,4);
$credit_card = '4111 1111 1111 1111';
print substr_replace($credit_card,'xxxx ',0,strlen($credit_card)-4);
This prints:
My pet is a fish.
My pet is a green dog.
xxxx 1111
If $start is negative, the new substring is placed by counting $start characters from the end of $old_string, not from the beginning:
print substr_replace('My pet is a blue dog.','fish.',-9);
print substr_replace('My pet is a blue dog.','green',-9,4);
This prints:
My pet is a fish.
My pet is a green dog.
If $start and $length are 0, the new substring is inserted at the start of $old_string:
print substr_replace('My pet is a blue dog.','Title: ',0,0);
This prints:
Title: My pet is a blue dog.
The function substr_replace() is useful when you’ve got text that’s too big to display all at once, and you want to display some of the text with a link to the rest. Example 1-15 displays the first 25 characters of a message with an ellipsis after it as a link to a page that displays more text.
// $db is assumed to be an open PDO connection; the mysql_* functions this
// example once relied on were removed in PHP 7.0
$st = $db->prepare('SELECT id,message FROM messages WHERE id = ?');
$st->execute(array($id));
$ob = $st->fetch(PDO::FETCH_OBJ);
printf('<a href="more-text.php?id=%d">%s</a>',
       $ob->id, substr_replace($ob->message,' ...',25));
The more-text.php page referenced in Example 1-15 can use the message ID passed in the query string to retrieve the full message and display it.
See Also
Documentation on substr_replace().
Processing a String One Byte at a Time
Solution
Loop through each byte in the string with for. Example 1-16 counts the vowels in a string.
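A minimal sketch of such a vowel-counting loop, assuming a sample sentence in $string (the sentence and variable names are illustrative):

```php
<?php
$string = "This weekend, I'm going shopping for a pet chicken.";
$vowels = 0;

// Compute strlen() once, then visit every byte by index.
for ($i = 0, $j = strlen($string); $i < $j; $i++) {
    // strpos() returns false when the byte is not in the vowel list.
    if (strpos('aeiouAEIOU', $string[$i]) !== false) {
        $vowels++;
    }
}
print $vowels;
```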
Discussion
Processing a string a character at a time is an easy way to calculate the “Look and Say” sequence, as shown in Example 1-17.
function lookandsay($s) {
    // initialize the return value to the empty string
    $r = '';
    // $m holds the character we're counting, initialize to the first
    // character in the string
    $m = $s[0];
    // $n is the number of $m's we've seen, initialize to 1
    $n = 1;
    for ($i = 1, $j = strlen($s); $i < $j; $i++) {
        // if this character is the same as the last one
        if ($s[$i] == $m) {
            // increment the count of this character
            $n++;
        } else {
            // otherwise, add the count and character to the return value
            $r .= $n.$m;
            // set the character we're looking for to the current one
            $m = $s[$i];
            // and reset the count to 1
            $n = 1;
        }
    }
    // return the built up string as well as the last count and character
    return $r.$n.$m;
}

// start from the string '1' (not the integer 1, which can't be indexed)
// and print each element before computing the next one
for ($i = 0, $s = '1'; $i < 10; $i++) {
    print "$s\n";
    $s = lookandsay($s);
}
Example 1-17 prints:
1
11
21
1211
111221
312211
13112221
1113213211
31131211131221
13211311123113112211
It’s called the “Look and Say” sequence because each element is what you get by looking at the previous element and saying what’s in it. For example, looking at the first element, 1, you say “one one.” So the second element is “11.” That’s two ones, so the third element is “21.” Similarly, that’s one two and one one, so the fourth element is “1211,” and so on.
See Also
Documentation on for; more about the “Look and Say” sequence.
Reversing a String by Word or Byte
Solution
Use strrev() to reverse by byte, as in Example 1-18.
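A minimal sketch of byte-wise reversal, using a sample sentence chosen to match the output shown for Example 1-18 (the variable names are illustrative):

```php
<?php
$original = 'This is not a palindrome.';

// strrev() reverses byte by byte, so it suits single-byte text only.
$reversed = strrev($original);
print $reversed;
```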
Example 1-18 prints:
.emordnilap a ton si sihT
To reverse by words, explode the string by word boundary, reverse the words, and then rejoin, as in Example 1-19.
$s = "Once upon a time there was a turtle.";
// break the string up into words
$words = explode(' ',$s);
// reverse the array of words
$words = array_reverse($words);
// rebuild the string
$s = implode(' ',$words);
print $s;
Example 1-19 prints:
turtle. a was there time a upon Once
Discussion
Reversing a string by words can also be done all in one line with the code in Example 1-20.
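One way to collapse the three steps into a single expression is to nest the calls. This is a sketch of the idea rather than the exact listing, and $reversed_s is an illustrative name:

```php
<?php
// explode() splits on spaces, array_reverse() flips the word order,
// and implode() glues the words back together.
$reversed_s = implode(' ', array_reverse(explode(' ', 'Once upon a time there was a turtle.')));
print $reversed_s;
```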
See Also
Processing Every Word in a File discusses the implications of using something other than a space character as your word boundary; documentation on strrev() and array_reverse().
Generating a Random String
Solution
Use str_rand():
function str_rand($length = 32,
                  $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ') {
    if (!is_int($length) || $length < 0) {
        return false;
    }
    $characters_length = strlen($characters) - 1;
    $string = '';

    for ($i = $length; $i > 0; $i--) {
        $string .= $characters[mt_rand(0, $characters_length)];
    }
    return $string;
}
Discussion
PHP has native functions for generating random numbers, but nothing for random strings. The str_rand() function returns a 32-character string constructed from letters and numbers.
Pass in an integer to change the length of the returned string. To use an alternative set of characters, pass them as a string as the second argument. For example, to get a 16-digit Morse Code:
print str_rand(16,'.-');
This prints something like:
.--..-.-.--.----
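Because mt_rand() is not cryptographically secure, strings meant for passwords or tokens are better built from random_int(), which is available as of PHP 7.0. The sketch below is one possible variant under that assumption; the name str_rand_secure() is hypothetical, and the extra empty-string check is an addition that str_rand() does not have.

```php
<?php
// str_rand_secure() is a hypothetical name for a random_int()-based variant.
function str_rand_secure($length = 32,
    $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ') {
    // Reject bad lengths and an empty character set.
    if (!is_int($length) || $length < 0 || $characters === '') {
        return false;
    }
    $max = strlen($characters) - 1;
    $string = '';
    for ($i = 0; $i < $length; $i++) {
        // random_int() draws from the system's secure random source.
        $string .= $characters[random_int(0, $max)];
    }
    return $string;
}

$token = str_rand_secure(16, '.-');
print $token;
```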
See Also
Generating Random Numbers Within a Range for generating random numbers.
Expanding and Compressing Tabs
Problem
You want to change spaces to tabs (or tabs to spaces) in a string while keeping text aligned with tab stops. For example, you want to display formatted text to users in a standardized way.
Solution
Use str_replace() to switch spaces to tabs or tabs to spaces, as shown in Example 1-21.
$rows = $db->query('SELECT message FROM messages WHERE id = 1');
$obj = $rows->fetch(PDO::FETCH_OBJ);

$tabbed = str_replace(' ' , "\t", $obj->message);
$spaced = str_replace("\t", ' ' , $obj->message);

print "With Tabs: <pre>$tabbed</pre>";
print "With Spaces: <pre>$spaced</pre>";
Using str_replace() for conversion, however, doesn’t respect tab stops. If you want tab stops every eight characters, a line beginning with a five-letter word and a tab should have that tab replaced with three spaces, not one. Use the tab_expand() function shown in Example 1-22 to turn tabs to spaces in a way that respects tab stops.
function tab_expand($text) {
    while (strstr($text,"\t")) {
        $text = preg_replace_callback('/^([^\t\n]*)(\t+)/m',
                                      'tab_expand_helper', $text);
    }
    return $text;
}

function tab_expand_helper($matches) {
    $tab_stop = 8;

    return $matches[1] .
        str_repeat(' ', strlen($matches[2]) *
                        $tab_stop - (strlen($matches[1]) % $tab_stop));
}

$spaced = tab_expand($obj->message);
You can use the tab_unexpand() function shown in Example 1-23 to turn spaces back to tabs.
function tab_unexpand($text) {
    $tab_stop = 8;
    $lines = explode("\n",$text);
    foreach ($lines as $i => $line) {
        // Expand any tabs to spaces
        $line = tab_expand($line);
        $chunks = str_split($line, $tab_stop);
        $chunkCount = count($chunks);
        // Scan all but the last chunk
        for ($j = 0; $j < $chunkCount - 1; $j++) {
            $chunks[$j] = preg_replace('/ {2,}$/',"\t",$chunks[$j]);
        }
        // If the last chunk is a tab-stop's worth of spaces
        // convert it to a tab; Otherwise, leave it alone
        if ($chunks[$chunkCount-1] == str_repeat(' ', $tab_stop)) {
            $chunks[$chunkCount-1] = "\t";
        }
        // Recombine the chunks
        $lines[$i] = implode('',$chunks);
    }
    // Recombine the lines
    return implode("\n",$lines);
}

$tabbed = tab_unexpand($obj->message);
Both functions take a string as an argument and return the string appropriately modified.
Discussion
Each function assumes tab stops are every eight spaces, but that can be modified by changing the setting of the $tab_stop variable.
The regular expression in tab_expand() matches both a group of tabs and all the text in a line before that group of tabs. It needs to match the text before the tabs because the length of that text affects how many spaces the tabs should be replaced with so that subsequent text is aligned with the next tab stop. The function doesn’t just replace each tab with eight spaces; it adjusts text after tabs to line up with tab stops.
Similarly, tab_unexpand() doesn’t just look for eight consecutive spaces and then replace them with one tab character. It divides up each line into eight-character chunks and then substitutes ending whitespace in those chunks (at least two spaces) with tabs. This not only preserves text alignment with tab stops; it also saves space in the string.
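The tab-stop arithmetic can be seen in isolation with a single tab. This is a minimal sketch of the calculation only; the variable names and the sample words are illustrative.

```php
<?php
$tab_stop = 8;
$prefix = 'hello';   // five characters sit before the tab

// Spaces needed to reach the next multiple of $tab_stop.
$spaces = $tab_stop - (strlen($prefix) % $tab_stop);

// Text after the tab now starts exactly on the tab stop.
$expanded = $prefix . str_repeat(' ', $spaces) . 'world';
print $expanded;
```

With a five-letter word the tab becomes three spaces, so the following text begins at column 8.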
See Also
Documentation on str_replace(), on preg_replace_callback(), and on str_split(). Using a PHP Function in a Regular Expression has more information on preg_replace_callback().
Controlling Case
Problem
You need to capitalize, lowercase, or otherwise modify the case of letters in a string. For example, you want to capitalize the initial letters of names but lowercase the rest.
Solution
Use ucfirst() or ucwords() to capitalize the first letter of one or more words, as shown in Example 1-24.
print ucfirst("how do you do today?");
print ucwords("the prince of wales");
Example 1-24 prints:
How do you do today?
The Prince Of Wales
Use strtolower() or strtoupper() to modify the case of entire strings, as in Example 1-25.
print strtoupper("i'm not yelling!");
print strtolower('<A HREF="one.php">one</A>');
Example 1-25 prints:
I'M NOT YELLING!
<a href="one.php">one</a>
Discussion
Use ucfirst() to capitalize the first character in a string:
print ucfirst('monkey face');
print ucfirst('1 monkey face');
This prints:
Monkey face
1 monkey face
Note that the second phrase is not “1 Monkey face.”
Use ucwords() to capitalize the first character of each word in a string:
print ucwords('1 monkey face');
print ucwords("don't play zone defense against the philadelphia 76-ers");
This prints:
1 Monkey Face
Don't Play Zone Defense Against The Philadelphia 76-ers
As expected, ucwords() doesn’t capitalize the “t” in “don’t.” But it also doesn’t capitalize the “e” in “76-ers.” For ucwords(), a word is any sequence of nonwhitespace characters that follows one or more whitespace characters. Because neither ' nor - is a whitespace character, ucwords() doesn’t consider the “t” in “don’t” or the “e” in “76-ers” to be word-starting characters.
Neither ucfirst() nor ucwords() changes the case of non-first letters:
print ucfirst('macWorld says I should get an iBook');
print ucwords('eTunaFish.com might buy itunaFish.Com!');
This prints:
MacWorld says I should get an iBook
ETunaFish.com Might Buy ItunaFish.Com!
The functions strtolower() and strtoupper() work on entire strings, not just individual characters. All alphabetic characters are changed to lowercase by strtolower(), and strtoupper() changes all alphabetic characters to uppercase:
print strtolower("I programmed the WOPR and the TRS-80.");
print strtoupper('"since feeling is first" is a poem by e. e. cummings.');
This prints:
i programmed the wopr and the trs-80.
"SINCE FEELING IS FIRST" IS A POEM BY E. E. CUMMINGS.
When determining upper- and lowercase, these functions respect your locale settings in PHP versions before 8.2. From PHP 8.2 on, they convert only ASCII letters, regardless of locale.
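For the case in the problem statement (capital initials, lowercase elsewhere), the two kinds of function can be combined. This is a minimal sketch, assuming ASCII names; the sample names and variable names are made up.

```php
<?php
$names = array('JOHN SMITH', "mary o'brien", 'aLiCe cooper');
$fixed = array();

foreach ($names as $name) {
    // Lowercase everything first, then capitalize each word's first letter.
    $fixed[] = ucwords(strtolower($name));
}
print implode("\n", $fixed);
```

Note that the letter after the apostrophe stays lowercase, for the same reason ucwords() leaves the “t” in “don’t” alone.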
See Also
For more information about locale settings, see Chapter 19; documentation on ucfirst(), ucwords(), strtolower(), and strtoupper().
Interpolating Functions and Expressions Within Strings
Solution
Use the string concatenation operator (.), as shown in Example 1-26, when the value you want to include can’t be inside the string.
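A minimal sketch of concatenating function results and expressions into a string; the values and variable names are illustrative only.

```php
<?php
// A function call can't be written directly inside the quotes,
// so its result is joined to the surrounding text with the . operator.
$line1 = 'You owe ' . number_format(1234.5, 2) . ' immediately.';
$line2 = "The word 'turtle' is " . strlen('turtle') . ' characters long.';

// Parentheses keep arithmetic separate from the concatenation.
$line3 = 'Two dozen is ' . (2 * 12) . ' items.';

print "$line1\n$line2\n$line3\n";
```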
Discussion
You can put variables, object properties, and array elements (if the subscript is unquoted) directly in double-quoted strings:
print "I have $children children.";
print "You owe $amounts[payment] immediately.";
print "My circle's diameter is $circle->diameter inches.";
Interpolation with double-quoted strings places some limitations on the syntax of what can be interpolated. In the previous example, $amounts['payment'] had to be written as $amounts[payment] so it would be interpolated properly. Use curly braces around more complicated expressions to interpolate them into a string. For example:
print "I have {$children} children.";
print "You owe {$amounts['payment']} immediately.";
print "My circle's diameter is {$circle->getDiameter()} inches.";
Direct interpolation or using string concatenation also works with heredocs. Interpolating with string concatenation in heredocs can look a little strange because the closing heredoc delimiter and the string concatenation operator have to be on separate lines:
// Note: strftime() is deprecated as of PHP 8.1
print <<<END
Right now, the time is 
END
. strftime('%c') . <<<END
 but tomorrow it will be 
END
. strftime('%c',time() + 86400);
Also, if you’re interpolating with heredocs, make sure to include appropriate spacing for the whole string to appear properly. In the previous example, “Right now, the time is” has to include a trailing space, and “but tomorrow it will be” has to include leading and trailing spaces.
See Also
For the syntax to interpolate variable variables (such as ${"amount_$i"}), see Creating a Dynamic Variable Name; documentation on the string concatenation operator.
Trimming Blanks from a String
Problem
You want to remove whitespace from the beginning or end of a string. For example, you want to clean up user input before validating it.
Solution
Use ltrim(), rtrim(), or trim(). The ltrim() function removes whitespace from the beginning of a string, rtrim() from the end of a string, and trim() from both the beginning and end of a string:
$zipcode = trim($_GET['zipcode']);
$no_linefeed = rtrim($_GET['text']);
$name = ltrim($_GET['name']);
Discussion
For these functions, whitespace is defined as the following characters: newline, carriage return, space, horizontal and vertical tab, and null.
Trimming whitespace off of strings saves storage space and can make for more precise display of formatted data or text within <pre> tags, for example. If you are doing comparisons with user input, you should trim the data first, so that someone who mistakenly enters 98052 followed by a few spaces as their zip code isn’t forced to fix an error that really isn’t one. Trimming before exact text comparisons also ensures that, for example, “salami\n” equals “salami.” It’s also a good idea to normalize string data by trimming it before storing it in a database.
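The effect on comparisons can be sketched like this, with a literal standing in for user input (the variable names are illustrative):

```php
<?php
// A zip code typed with trailing spaces.
$input = "98052   ";

$raw_match     = ($input === '98052');         // compares the untrimmed value
$trimmed_match = (trim($input) === '98052');   // compares after trimming

// rtrim() also removes a trailing newline before an exact comparison.
$salami_match = (rtrim("salami\n") === 'salami');

var_dump($raw_match, $trimmed_match, $salami_match);
```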
The trim() functions can also remove user-specified characters from strings. Pass the characters you want to remove as a second argument. You can indicate a range of characters with two dots between the first and last characters in the range:
// Remove numerals and space from the beginning of the line
print ltrim('10 PRINT A$',' 0..9');
// Remove semicolon from the end of the line
print rtrim('SELECT * FROM turtles;',';');
This prints:
PRINT A$
SELECT * FROM turtles
PHP also provides chop() as an alias for rtrim(). However, you’re best off using rtrim() instead because PHP’s chop() behaves differently than Perl’s chop() (which is deprecated in favor of chomp(), anyway), and using it can confuse others when they read your code.
Generating Comma-Separated Data
Problem
You want to format data as comma-separated values (CSV) so that it can be imported by a spreadsheet or database.
Solution
Use the fputcsv() function to generate a CSV-formatted line from an array of data. Example 1-27 writes the data in $sales into a file.
$sales = array( array('Northeast','2005-01-01','2005-02-01',12.54),
                array('Northwest','2005-01-01','2005-02-01',546.33),
                array('Southeast','2005-01-01','2005-02-01',93.26),
                array('Southwest','2005-01-01','2005-02-01',945.21),
                array('All Regions','--','--',1597.34) );

$filename = './sales.csv';
$fh = fopen($filename,'w') or die("Can't open $filename");
foreach ($sales as $sales_line) {
    if (fputcsv($fh, $sales_line) === false) {
        die("Can't write CSV line");
    }
}
fclose($fh) or die("Can't close $filename");
Discussion
To print the CSV-formatted data instead of writing it to a file, use the special output stream php://output, as shown in Example 1-28.
$sales = array( array('Northeast','2005-01-01','2005-02-01',12.54),
                array('Northwest','2005-01-01','2005-02-01',546.33),
                array('Southeast','2005-01-01','2005-02-01',93.26),
                array('Southwest','2005-01-01','2005-02-01',945.21),
                array('All Regions','--','--',1597.34) );

$fh = fopen('php://output','w');
foreach ($sales as $sales_line) {
    if (fputcsv($fh, $sales_line) === false) {
        die("Can't write CSV line");
    }
}
fclose($fh);
To put the CSV-formatted data into a string instead of printing it or writing it to a file, combine the technique in Example 1-28 with output buffering, as shown in Example 1-29.
$sales = array( array('Northeast','2005-01-01','2005-02-01',12.54),
                array('Northwest','2005-01-01','2005-02-01',546.33),
                array('Southeast','2005-01-01','2005-02-01',93.26),
                array('Southwest','2005-01-01','2005-02-01',945.21),
                array('All Regions','--','--',1597.34) );

ob_start();
$fh = fopen('php://output','w') or die("Can't open php://output");
foreach ($sales as $sales_line) {
    if (fputcsv($fh, $sales_line) === false) {
        die("Can't write CSV line");
    }
}
fclose($fh) or die("Can't close php://output");
$output = ob_get_contents();
ob_end_clean();
See Also
Documentation on fputcsv(); Buffering Output to the Browser has more information about output buffering.
Parsing Comma-Separated Data
Problem
You have data in comma-separated values (CSV) format—for example, a file exported from Excel or a database—and you want to extract the records and fields into a format you can manipulate in PHP.
Solution
If the CSV data is in a file (or available via a URL), open the file with fopen() and read in the data with fgetcsv(). Example 1-30 prints out CSV data in an HTML table.
$fp = fopen($filename,'r') or die("can't open file");
print "<table>\n";
while ($csv_line = fgetcsv($fp)) {
    print '<tr>';
    for ($i = 0, $j = count($csv_line); $i < $j; $i++) {
        print '<td>'.htmlentities($csv_line[$i]).'</td>';
    }
    print "</tr>\n";
}
print "</table>\n";
fclose($fp) or die("can't close file");
Discussion
By default, fgetcsv() reads in an entire line of data. If your average line length is more than 8,192 bytes, your program may run faster if you specify an explicit line length instead of letting PHP figure it out. Do this by providing a second argument to fgetcsv() that is a value larger than the maximum length of a line in your CSV file. (Don’t forget to count the end-of-line whitespace.) If you pass a line length of 0, PHP will use the default behavior.
You can pass fgetcsv() an optional third argument, a delimiter to use instead of a comma (,). However, using a different delimiter somewhat defeats the purpose of CSV as an easy way to exchange tabular data.
Don’t be tempted to bypass fgetcsv() and just read a line in and explode() on the commas. CSV is more complicated than that so that it can deal with field values that have, for example, literal commas in them that should not be treated as field delimiters. Using fgetcsv() protects you and your code from subtle errors.
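The difference shows up as soon as a field contains a quoted comma. The sketch below uses str_getcsv(), which parses a CSV line held in a string, as a stand-in for fgetcsv() so that no file is needed; the sample line and variable names are illustrative.

```php
<?php
// One CSV line whose first field contains a literal comma.
$line = '"Lewis, Sinclair",Elmer Gantry,1927';

// Naive splitting breaks the quoted field in two and keeps the quotes.
$naive = explode(',', $line);

// A real CSV parser honors the quotes; delimiter, enclosure, and escape
// characters are passed explicitly here.
$parsed = str_getcsv($line, ',', '"', '\\');

print count($naive) . ' naive fields, ' . count($parsed) . " parsed fields\n";
```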
See Also
Documentation on fgetcsv().
Generating Fixed-Width Field Data Records
Solution
Use pack() with a format string that specifies a sequence of space-padded strings. Example 1-31 transforms an array of data into fixed-width records.
$books = array( array('Elmer Gantry', 'Sinclair Lewis', 1927),
                array('The Scarlatti Inheritance','Robert Ludlum', 1971),
                array('The Parsifal Mosaic','Robert Ludlum', 1982) );

foreach ($books as $book) {
    print pack('A25A15A4', $book[0], $book[1], $book[2]) . "\n";
}
Discussion
The format string A25A15A4 tells pack() to transform its subsequent arguments into a 25-character space-padded string, a 15-character space-padded string, and a 4-character space-padded string. For space-padded fields in fixed-width records, pack() provides a concise solution.
To pad fields with something other than a space, however, use substr() to ensure that the field values aren’t too long and str_pad() to ensure that the field values aren’t too short. Example 1-32 transforms an array of records into fixed-width records with .-padded fields.
$books = array( array('Elmer Gantry', 'Sinclair Lewis', 1927),
                array('The Scarlatti Inheritance','Robert Ludlum', 1971),
                array('The Parsifal Mosaic','Robert Ludlum', 1982) );

foreach ($books as $book) {
    $title  = str_pad(substr($book[0], 0, 25), 25, '.');
    $author = str_pad(substr($book[1], 0, 15), 15, '.');
    $year   = str_pad(substr($book[2], 0, 4), 4, '.');
    print "$title$author$year\n";
}
See Also
Documentation on pack() and on str_pad(). Storing Binary Data in Strings discusses pack() format strings in more detail.
Parsing Fixed-Width Field Data Records
Solution
Use substr() as shown in Example 1-33.
$fp = fopen('fixed-width-records.txt','r',true) or die ("can't open file");
while ($s = fgets($fp,1024)) {
    $fields[1] = substr($s,0,25);  // first field: first 25 characters of the line
    $fields[2] = substr($s,25,15); // second field: next 15 characters of the line
    $fields[3] = substr($s,40,4);  // third field: next 4 characters of the line
    $fields = array_map('rtrim', $fields); // strip the trailing whitespace
    // a function to do something with the fields
    process_fields($fields);
}
fclose($fp) or die("can't close file");
Or unpack(), as shown in Example 1-34.
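A minimal sketch of the unpack() approach on a single record. To stay self-contained it builds the record in memory with pack() instead of reading fixed-width-records.txt; the field names in the format string are assumptions made for illustration.

```php
<?php
// One 44-byte record: 25-byte title, 15-byte author, 4-byte year.
$record = pack('A25A15A4', 'Elmer Gantry', 'Sinclair Lewis', 1927);

// Each "A<width><name>" part extracts one field into a named key and
// strips the trailing padding; the parts are separated by slashes.
$fields = unpack('A25title/A15author/A4publication_year', $record);

print_r($fields);
```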
Discussion
Data in which each field is allotted a fixed number of characters per line may look like this list of book titles, authors, and publication dates:
$booklist=<<<END
Elmer Gantry             Sinclair Lewis 1927
The Scarlatti InheritanceRobert Ludlum  1971
The Parsifal Mosaic      Robert Ludlum  1982
Sophie's Choice          William Styron 1979
END;
In each line, the title occupies the first 25 characters, the author’s name the next 15 characters, and the publication year the next 4 characters. Knowing those field widths, you can easily use substr() to parse the fields into an array:
$books = explode("\n",$booklist);

for ($i = 0, $j = count($books); $i < $j; $i++) {
    $book_array[$i]['title'] = substr($books[$i],0,25);
    $book_array[$i]['author'] = substr($books[$i],25,15);
    $book_array[$i]['publication_year'] = substr($books[$i],40,4);
}
Exploding $booklist into an array of lines makes the looping code the same whether it’s operating over a string or a series of lines read in from a file.
The loop can be made more flexible by specifying the field names and widths in a separate array that can be passed to a parsing function, as shown in the fixed_width_substr() function in Example 1-35.
function fixed_width_substr($fields,$data) {
    $r = array();
    for ($i = 0, $j = count($data); $i < $j; $i++) {
        $line_pos = 0;
        foreach ($fields as $field_name => $field_length) {
            $r[$i][$field_name] = rtrim(substr($data[$i],$line_pos,$field_length));
            $line_pos += $field_length;
        }
    }
    return $r;
}

$book_fields = array('title' => 25,
                     'author' => 15,
                     'publication_year' => 4);

// pass the array of lines ($books), not the original $booklist string
$book_array = fixed_width_substr($book_fields,$books);
The variable $line_pos keeps track of the start of each field and is advanced by the previous field’s width as the code moves through each line. Use rtrim() to remove trailing whitespace from each field.
You can use unpack() as a substitute for substr() to extract fields. Instead of specifying the field names and widths as an associative array, create a format string for unpack(). A fixed-width field extractor using unpack() looks like the fixed_width_unpack() function shown in Example 1-36.
function fixed_width_unpack($format_string, $data) {
    $r = array();
    for ($i = 0, $j = count($data); $i < $j; $i++) {
        $r[$i] = unpack($format_string, $data[$i]);
    }
    return $r;
}
Because the A format to unpack() means space-padded string, there’s no need to rtrim() off the trailing spaces.
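To see the difference the format character makes, here is a minimal sketch that unpacks the same line twice, once with A and once with a. The sample line is built with str_pad() purely for illustration; the variable names are assumptions, not part of the examples above.

```php
<?php
// Hypothetical sample line: 25-character title, 15-character author, 4-character year.
$line = str_pad('Elmer Gantry', 25) . str_pad('Sinclair Lewis', 15) . '1927';

// 'A' treats the field as space-padded and strips the trailing padding.
$trimmed = unpack('A25title/A15author/A4publication_year', $line);

// 'a' returns the field exactly as stored, trailing spaces included.
$raw = unpack('a25title/a15author/a4publication_year', $line);

var_dump($trimmed['title']);       // the title without padding
var_dump(strlen($raw['title']));   // the full field width
```

With A, the title comes back as the 12-character string "Elmer Gantry"; with a, it keeps all 25 characters of the field.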
Once the fields have been parsed into $book_array by either function, the data can be printed as an HTML table, for example:
$book_array = fixed_width_unpack('A25title/A15author/A4publication_year',
                                 $books);
print "<table>\n";
// print a header row
print '<tr><td>';
print join('</td><td>', array_keys($book_array[0]));
print "</td></tr>\n";
// print each data row
foreach ($book_array as $row) {
    print '<tr><td>';
    print join('</td><td>', array_values($row));
    print "</td></tr>\n";
}
print "</table>\n";
Joining data on </td><td> produces a table row that is missing its first <td> and last </td>. We produce a complete table row by printing out <tr><td> before the joined data and </td></tr> after the joined data.
Both substr() and unpack() work equally well when the fixed-width fields are strings, but unpack() is the better solution when the fields aren’t just strings.
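As a rough illustration of that last point, the sketch below unpacks a record that mixes a text field with binary-encoded integers. The record layout (a 10-byte name, a 16-bit quantity, and a 32-bit price in cents, both little-endian) is an assumption made up for this example.

```php
<?php
// Hypothetical record layout, for illustration only:
// 10-byte space-padded name, 16-bit quantity, 32-bit price in cents,
// with both integers stored little-endian.
$record = pack('A10vV', 'widget', 12, 70000);

// One format string describes both the string and the numeric fields.
$fields = unpack('A10name/vquantity/Vprice_cents', $record);
print_r($fields);
```

Doing the same with substr() would require a second decoding step for each numeric field, which is what unpack() saves you here.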
If all of your fields are the same size, str_split() is a handy shortcut for chopping up incoming data. It returns an array made up of sections of a string. Example 1-37 uses str_split() to break apart a string into 32-byte pieces.
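A minimal sketch of that kind of call follows. The input string is fabricated with str_repeat() so that its length is obvious; it stands in for whatever fixed-size data you are reading.

```php
<?php
// Hypothetical input: two full 32-byte fields followed by a short remainder.
$data = str_repeat('a', 32) . str_repeat('b', 32) . 'tail';

// Each element holds 32 bytes, except possibly the last one.
$chunks = str_split($data, 32);

foreach ($chunks as $i => $chunk) {
    print $i . ': ' . strlen($chunk) . " bytes\n";
}
```

Note that the final element is shorter than 32 bytes when the input length is not an exact multiple of the chunk size, so check its length if every field must be complete.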
See Also
For more information about unpack(), see Storing Binary Data in Strings and the PHP website; documentation on str_split(); Turning an Array into a String discusses join().
Taking Strings Apart
Problem
You need to break a string into pieces. For example, you want to access each line that a user enters in a <textarea> form field.
Solution
Use explode() if what separates the pieces is a constant string:
$words = explode(' ', 'My sentence is not very complicated');
Use preg_split() if you need a Perl-compatible regular expression to describe the separator:
$words = preg_split('/\d\. /', 'my day: 1. get up 2. get dressed 3. eat toast');
$lines = preg_split('/[\n\r]+/', $_POST['textarea']);
Use the /i flag to preg_split() for case-insensitive separator matching:
$words = preg_split('/ x /i', '31 inches x 22 inches X 9 inches');
Discussion
The simplest solution of the bunch is explode(). Pass it your separator string, the string to be separated, and an optional limit on how many elements should be returned:
$dwarves = 'dopey,sleepy,happy,grumpy,sneezy,bashful,doc';
$dwarf_array = explode(',', $dwarves);
This makes $dwarf_array a seven-element array, so print_r($dwarf_array) prints:
Array
(
[0] => dopey
[1] => sleepy
[2] => happy
[3] => grumpy
[4] => sneezy
[5] => bashful
[6] => doc
)
If the specified limit is less than the number of possible chunks, the last chunk contains the remainder:
$dwarf_array = explode(',', $dwarves, 5);
print_r($dwarf_array);
This prints:
Array
(
[0] => dopey
[1] => sleepy
[2] => happy
[3] => grumpy
[4] => sneezy,bashful,doc
)
The separator is treated literally by explode(). If you specify a comma and a space as a separator, it breaks the string only on a comma followed by a space, not on a comma or a space.
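A short sketch makes the literal matching visible. The input string is made up for the example and deliberately mixes ", " with a bare comma and a bare space.

```php
<?php
// Hypothetical input mixing ", " with a lone comma and a lone space.
$list = 'red, green,blue, dark grey';

// Only the exact two-character sequence ", " counts as a separator.
$pieces = explode(', ', $list);
print_r($pieces);
```

The bare comma in "green,blue" and the bare space in "dark grey" are left alone, so the result has three elements rather than five.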
With preg_split(), you have more flexibility. Instead of a string literal as a separator, it uses a Perl-compatible regular expression engine. With preg_split(), you can take advantage of various Perl-ish regular expression extensions, as well as tricks such as including the separator text in the returned array of strings:
$math = "3 + 2 / 7 - 9";
$stack = preg_split('/ *([+\-\/*]) */', $math, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($stack);
This prints:
Array
(
[0] => 3
[1] => +
[2] => 2
[3] => /
[4] => 7
[5] => -
[6] => 9
)
The separator regular expression looks for the four mathematical operators (+, -, /, *), surrounded by optional leading or trailing spaces. The PREG_SPLIT_DELIM_CAPTURE flag tells preg_split() to include in the returned array the text matched by the parenthesized part of the separator regular expression. Only the mathematical operator character class is in parentheses, so the returned array doesn’t have any spaces in it.
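To show how the flag and the placement of the parentheses interact, this sketch splits the same expression two more ways. The variable names $without_flag and $with_spaces are introduced here only for illustration.

```php
<?php
$math = "3 + 2 / 7 - 9";

// Without PREG_SPLIT_DELIM_CAPTURE, the operators are discarded entirely.
$without_flag = preg_split('/ *([+\-\/*]) */', $math);

// With the parentheses widened to cover the optional spaces,
// the captured separators keep their surrounding whitespace.
$with_spaces = preg_split('/( *[+\-\/*] *)/', $math, -1, PREG_SPLIT_DELIM_CAPTURE);

print_r($without_flag);
print_r($with_spaces);
```

The first call returns only the four numbers; the second returns the numbers interleaved with " + ", " / ", and " - ", spaces included.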
See Also
Regular expressions are discussed in more detail in Chapter 23; documentation on explode() and preg_split().
Wrapping Text at a Certain Line Length
Problem
You need to wrap lines in a string. For example, you want to display text by using <pre> and </pre> tags but have it stay within a regularly sized browser window.
Solution
Use wordwrap():
$s = "Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal.";
print "<pre>\n".wordwrap($s)."\n</pre>";
This prints:
<pre>
Four score and seven years ago our fathers brought forth on this continent
a new nation, conceived in liberty and dedicated to the proposition that
all men are created equal.
</pre>
Discussion
By default, wordwrap() wraps text at 75 characters per line. An optional second argument specifies a different line length:
print wordwrap($s, 50);
This prints:
Four score and seven years ago our fathers brought
forth on this continent a new nation, conceived in
liberty and dedicated to the proposition that all
men are created equal.
Strings other than \n can be used for line breaks. For double spacing, use "\n\n":
print wordwrap($s, 50, "\n\n");
This prints:
Four score and seven years ago our fathers brought

forth on this continent a new nation, conceived in

liberty and dedicated to the proposition that all

men are created equal.
There is an optional fourth argument to wordwrap() that controls the treatment of words that are longer than the specified line length. If this argument is 1 (true), these words are cut to fit. Otherwise, they extend past the specified line length:
print wordwrap('jabberwocky', 5) . "\n";
print wordwrap('jabberwocky', 5, "\n", 1);
This prints:
jabberwocky
jabbe
rwock
y
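The same behavior can be seen when a long word sits in the middle of ordinary text. The sketch below is only an illustration; the sample sentence and variable names are made up.

```php
<?php
// Hypothetical sample text containing one word much longer than the line width.
$text = 'A supercalifragilisticexpialidocious day';

// Fourth argument false: the long word is kept whole and overflows the width.
$kept = explode("\n", wordwrap($text, 10, "\n", false));

// Fourth argument true: the long word is cut into pieces of at most 10 characters.
$cut = explode("\n", wordwrap($text, 10, "\n", true));

print_r($kept);
print_r($cut);
```

With cutting enabled, no line is longer than 10 characters; without it, the 34-character word occupies a line of its own.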
See Also
Documentation on wordwrap().
Storing Binary Data in Strings
Problem
You want to parse a string that contains values encoded as a binary structure or encode values into a string. For example, you want to store numbers in their binary representation instead of as sequences of ASCII characters.
Solution
Use pack() to store binary data in a string:
$packed = pack('S4', 1974, 106, 28225, 32725);
Use unpack() to extract binary data from a string:
$nums = unpack('S4', $packed);
Discussion
The first argument to pack() is a format string that describes how to encode the data that’s passed in the rest of the arguments. The format string S4 tells pack() to produce four unsigned short 16-bit numbers in machine byte order from its input data. Given 1974, 106, 28225, and 32725 as input on a little-endian machine, this returns eight bytes: 182, 7, 106, 0, 65, 110, 213, and 127. Each two-byte pair corresponds to one of the input numbers: 7 * 256 + 182 = 1974; 0 * 256 + 106 = 106; 110 * 256 + 65 = 28225; and 127 * 256 + 213 = 32725.
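To check that byte layout on any machine, the sketch below uses the v format (unsigned 16-bit, always little-endian) in place of S, and n (always big-endian) for comparison. Swapping in v and n is a choice made for this illustration so that the result does not depend on the machine's byte order.

```php
<?php
// 'v' forces little-endian byte order regardless of the machine.
$little = pack('v4', 1974, 106, 28225, 32725);

// 'n' forces big-endian byte order for comparison.
$big = pack('n4', 1974, 106, 28225, 32725);

// 'C*' reads every byte back as an unsigned value;
// array_values() renumbers unpack()'s 1-based keys from 0.
$little_bytes = array_values(unpack('C*', $little));
$big_bytes = array_values(unpack('C*', $big));

print implode(', ', $little_bytes) . "\n";
print implode(', ', $big_bytes) . "\n";
```

The little-endian bytes match the sequence described above; the big-endian version holds the same pairs with the two bytes of each number swapped.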
The first argument to unpack() is also a format string, and the second argument is the data to decode. Given a format string of S4 and the eight-byte sequence that pack() produced, unpack() returns a four-element array of the original numbers. print_r($nums) prints:
Array
(
[1] => 1974
[2] => 106
[3] => 28225
[4] => 32725
)
In unpack(), format characters and their count can be followed by a string to be used as an array key. For example:
$nums = unpack('S4num', $packed);
print_r($nums);
This prints:
Array
(
[num1] => 1974
[num2] => 106
[num3] => 28225
[num4] => 32725
)
Multiple format characters must be separated with / in unpack():
$nums = unpack('S1a/S1b/S1c/S1d', $packed);
print_r($nums);
This prints:
Array
(
[a] => 1974
[b] => 106
[c] => 28225
[d] => 32725
)
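The format characters that can be used with pack() and unpack() are listed in Table 1-2.
Format character | Data type |
a | NUL-padded string |
A | Space-padded string |
h | Hex string, low nibble first |
H | Hex string, high nibble first |
c | signed char |
C | unsigned char |
s | signed short (16 bit, machine byte order) |
S | unsigned short (16 bit, machine byte order) |
n | unsigned short (16 bit, big endian byte order) |
v | unsigned short (16 bit, little endian byte order) |
i | signed int (machine-dependent size and byte order) |
I | unsigned int (machine-dependent size and byte order) |
l | signed long (32 bit, machine byte order) |
L | unsigned long (32 bit, machine byte order) |
N | unsigned long (32 bit, big endian byte order) |
V | unsigned long (32 bit, little endian byte order) |
f | float (machine-dependent size and representation) |
d | double (machine-dependent size and representation) |
x | NUL byte |
X | Back up one byte |
@ | NUL-fill to absolute position |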
For a, A, h, and H, a number after the format character indicates how long the string is. For example, A25 means a 25-character space-padded string. For other format characters, a following number means how many of that type appear consecutively in a string. Use * to take the rest of the available data.
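The two meanings of the number, and the effect of *, can be seen side by side in a small sketch. The data and the key names word and byte are assumptions chosen for the example.

```php
<?php
// 'A5' means one string field five characters long;
// 'C3' means three consecutive unsigned bytes.
$data = pack('A5C3', 'hi', 1, 2, 3);

// 'C*' takes however many bytes remain after the 5-character string.
$fields = unpack('A5word/C*byte', $data);
print_r($fields);
```

Because the byte key covers several values, unpack() appends a counter to it, giving byte1, byte2, and byte3.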
You can convert between data types with unpack(). This example fills the array $ascii with the ASCII values of each character in $s:
$s = 'platypus';
$ascii = unpack('c*', $s);
print_r($ascii);
This prints:
Array
(
[1] => 112
[2] => 108
[3] => 97
[4] => 116
[5] => 121
[6] => 112
[7] => 117
[8] => 115
)
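Note that the keys unpack() returns start at 1, not 0. As a rough comparison, the sketch below produces the same values with ord() and converts them back to a string with pack(); the variable names are introduced only for this illustration.

```php
<?php
$s = 'platypus';

// unpack() numbers its results from 1.
$from_unpack = unpack('c*', $s);

// ord() over str_split() gives the same values, numbered from 0.
$from_ord = array_map('ord', str_split($s));

// pack() with the same format reverses the conversion.
$round_trip = pack('c*', ...$from_unpack);

print_r($from_ord);
print $round_trip . "\n";
```

Use array_values() on the unpack() result if the rest of your code expects a zero-based array.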
Program: Downloadable CSV File
Combining the header() function, which changes the content type of what your PHP program outputs, with the fputcsv() function for data formatting lets you send CSV files to browsers. The browser then hands each file off to a spreadsheet program (or whatever application is configured on a particular client system to handle CSV files). Example 1-38 formats the results of an SQL SELECT query as CSV data and provides the correct headers so that it is properly handled by the browser.
$db = new PDO('sqlite:/usr/local/data/sales.db');
$query = $db->query('SELECT region, start, end, amount FROM sales', PDO::FETCH_NUM);
// fetchAll() belongs to the PDOStatement that query() returns, not to the PDO object
$sales_data = $query->fetchAll();
// Open filehandle for fputcsv()
$output = fopen('php://output', 'w') or die("Can't open php://output");
$total = 0;
// Tell browser to expect a CSV file
header('Content-Type: application/csv');
header('Content-Disposition: attachment; filename="sales.csv"');
// Print header row
fputcsv($output, array('Region', 'Start Date', 'End Date', 'Amount'));
// Print each data row and increment $total
foreach ($sales_data as $sales_line) {
    fputcsv($output, $sales_line);
    $total += $sales_line[3];
}
// Print total row and close file handle
fputcsv($output, array('All Regions', '--', '--', $total));
fclose($output) or die("Can't close php://output");
Example 1-38 sends two headers to ensure that the browser handles the CSV output properly. The first header, Content-Type, tells the browser that the output is not HTML, but CSV. The second header, Content-Disposition, tells the browser not to display the output but to attempt to load an external program to handle it. The filename attribute of this header supplies a default filename for the browser to use for the downloaded file.
If you want to provide different views of the same data, you can combine the formatting code in one page and use a query string variable to determine which kind of data formatting to do. In Example 1-39, the format query string variable controls whether the results of an SQL SELECT query are returned as an HTML table or CSV.
$db = new PDO('sqlite:/usr/local/data/sales.db');
$query = $db->query('SELECT region, start, end, amount FROM sales', PDO::FETCH_NUM);
// fetchAll() belongs to the PDOStatement that query() returns, not to the PDO object
$sales_data = $query->fetchAll();
$total = 0;
$column_headers = array('Region', 'Start Date', 'End Date', 'Amount');
// Decide what format to use; a missing "format" parameter falls back to HTML
$format = (isset($_GET['format']) && $_GET['format'] == 'csv') ? 'csv' : 'html';
// Print format-appropriate beginning
if ($format == 'csv') {
    $output = fopen('php://output', 'w') or die("Can't open php://output");
    header('Content-Type: application/csv');
    header('Content-Disposition: attachment; filename="sales.csv"');
    fputcsv($output, $column_headers);
} else {
    echo '<table><tr><th>';
    echo implode('</th><th>', $column_headers);
    echo '</th></tr>';
}
foreach ($sales_data as $sales_line) {
    // Print format-appropriate line
    if ($format == 'csv') {
        fputcsv($output, $sales_line);
    } else {
        echo '<tr><td>' . implode('</td><td>', $sales_line) . '</td></tr>';
    }
    $total += $sales_line[3];
}
$total_line = array('All Regions', '--', '--', $total);
// Print format-appropriate footer
if ($format == 'csv') {
    fputcsv($output, $total_line);
    fclose($output) or die("Can't close php://output");
} else {
    echo '<tr><td>' . implode('</td><td>', $total_line) . '</td></tr>';
    echo '</table>';
}
Accessing the program in Example 1-39 with format=csv in the query string causes it to return CSV-formatted output. Any other format value in the query string causes it to return HTML output. The logic that sets $format to CSV or HTML could easily be extended to other output formats such as JSON. If you want to offer the same data for download in multiple formats in many places, package the code in Example 1-39 into a function that accepts an array of data and a format specifier and then displays the right results.
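One way such a function might look is sketched below. The name format_rows() and its signature are hypothetical, and unlike Example 1-39 this version returns the formatted text as a string (building the CSV in a php://temp stream) and escapes HTML cell values, so the caller remains responsible for sending headers and printing the result.

```php
<?php
/**
 * Hypothetical helper: format rows of data as CSV or as an HTML table.
 * Returns the result as a string instead of printing it.
 */
function format_rows(array $headers, array $rows, string $format = 'html'): string
{
    if ($format === 'csv') {
        // Let fputcsv() do the quoting by writing to an in-memory stream.
        $fh = fopen('php://temp', 'w+');
        // Delimiter, enclosure, and escape are passed explicitly (the long-standing defaults).
        fputcsv($fh, $headers, ',', '"', '\\');
        foreach ($rows as $row) {
            fputcsv($fh, $row, ',', '"', '\\');
        }
        rewind($fh);
        $csv = stream_get_contents($fh);
        fclose($fh);
        return $csv;
    }

    // Anything other than 'csv' falls back to an HTML table.
    $html = '<table><tr><th>'
          . implode('</th><th>', array_map('htmlspecialchars', $headers))
          . '</th></tr>';
    foreach ($rows as $row) {
        $cells = array_map(fn($v) => htmlspecialchars((string) $v), $row);
        $html .= '<tr><td>' . implode('</td><td>', $cells) . '</td></tr>';
    }
    return $html . '</table>';
}

// Usage with made-up sample data:
$sample = array(array('East', 10), array('West', 20));
$csv  = format_rows(array('Region', 'Amount'), $sample, 'csv');
$html = format_rows(array('Region', 'Amount'), $sample, 'html');
print $csv;
```

Returning a string keeps the function easy to test and leaves the choice of Content-Type and Content-Disposition headers to the page that calls it.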