Chapter 4. Strings
The string is one of the most widely used forms of web output. A string is simply a collection of text—letters, numbers, special characters, or a combination thereof. Strings can be manipulated, cut, trimmed, truncated, spliced, and concatenated with ease in PHP. We have already seen some examples of strings being sent out to the web browser in Chapters 1 and 2. In this chapter, we will spend a lot more time on the good parts of string manipulation.
String manipulation is important; think of all the websites you have visited this week and try to imagine how much of the content was text-based as opposed to image- or video-based. Even sites like YouTube and CNN are heavily dependent on text to ease communication with the visitor. So let’s first see what a string actually consists of and how we can get that content onto a web browser.
What Is a String?
As mentioned above, a string is simply a collection of characters. These collections can be sent to the browser with either the echo or print PHP statements, but they have to be contained within defining markers (usually single or double quotation marks) for PHP to know which collection of characters you want displayed.
Note
Although there is very little difference between echo and print (print returns 1 when it has finished sending its output and takes only one parameter, whereas echo can take multiple parameters), I have made the choice to always use the echo command. You can also execute the echo command with the short PHP tag and an equals sign (=) combination (before PHP 5.4 this requires short_open_tag to be turned on in the php.ini file, where it is generally off by default; from PHP 5.4 onward, <?= is always available), like this:
<?= "sending out some text" ; ?>
It’s really a personal choice as to which one to use, and I recommend that once you make that choice, stick with it so that your code remains consistent in this regard.
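To make the difference described in the note concrete, here is a minimal sketch. The variable name $result is only for illustration.

```php
<?php
// echo accepts several comma-separated arguments and returns nothing.
echo "Hello", ", ", "world", "\n";

// print takes exactly one argument and always returns 1,
// so it can be used inside an expression.
$result = print "Hello, world\n";
echo "print returned: ", $result, "\n";
```

Both statements send the same text; only the argument handling and the return value differ.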
You Can Quote Me
Strings can be contained within either single or double quotation marks or a combination of the two, and in a HEREDOC or NOWDOC (more on these later). If you are building a string that will incorporate the contents of a variable, you are best served by using double quotes. Consider the following short code sample:
$fname = "Peter" ;
$lname = "MacIntyre" ;
$string = "The first name of the author of this book is $fname and his last name is $lname";
echo $string ;
Here, in the creation of the $string variable, the code makes reference to two other variable names, and PHP interpolates (inserts) their contents. We can then echo out the $string variable. We can also accomplish this using single quotes, but we would have to do some concatenation with the period operator, because single quotes do not allow for variable expansion (variable content insertion). The following code uses the single quote approach:
$fname = "Peter" ;
$lname = "MacIntyre" ;
$string = 'The first name of the author of this book is ' . $fname . ' and his last name is ' . $lname ;
echo $string ;
Again, you can also build strings with a combination of both single and double quotes; just be aware of the interpolative characteristics of double quotes that single quotes do not have. Also be aware of the newline escape sequence within a string and how it is interpreted by double quotes and single quotes. When the newline sequence (\n) is encased within a single-quoted string, it is not interpreted and is output literally as a backslash followed by the letter n, yet within double quotes it produces a newline. The following code snippet demonstrates this:
echo 'This sentence will not produce a new line \n';
echo "But this one will \n";
There is room for flexibility in the combinations of quotes that you can use, so be brave and experiment with them to see what you can accomplish.
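As a small experiment along these lines, the sketch below builds the same greeting three ways and uses strlen to show what each kind of quote actually stored. The variable names are illustrative only.

```php
<?php
$fname = "Peter";

// Double quotes: $fname is interpolated and \n becomes one newline character.
$double = "Hello $fname\n";

// Single quotes: everything is literal, including the dollar sign and \n.
$single = 'Hello $fname\n';

// Mixing the two: single-quoted text joined to a variable and a
// double-quoted newline with the period (concatenation) operator.
$mixed = 'Hello ' . $fname . "\n";

var_dump(strlen($double));     // int(12)
var_dump(strlen($single));     // int(14)
var_dump($double === $mixed);  // bool(true)
```

The single-quoted version is longer because the six characters of $fname and the two characters of \n are stored as typed.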
Another way to build a string is to use a construct called HEREDOC. This construct is very similar to using double quotes in the sense that it interpolates variables, but it also lends itself to building longer strings, and therefore makes them more readable to the programmer. Begin the HEREDOC with three less-than (<) signs followed by a name designation for the string being built. After the string is complete, repeat the name designation on its own line with a terminating semicolon. Here is an example:
$string = <<< RightHERE
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce eget nisl a
metus rhoncus placerat ac ac nisl. Fusce consectetur tempus "tincidunt. Proin
congue dapibus neque", at congue lectus volutpat in. Duis commodo, est tempor
aliquam molestie, odio dolor fringilla arcu, nec iaculis est libero vitae erat.
RightHERE;
echo $string ;
The output of the above HEREDOC code sample is shown in Figure 4-1.
Note
You will find that the use of the HEREDOC construct lends itself very well to building Structured Query Language (SQL) statements. This technique will be used extensively in Chapter 7, where we discuss databases.
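The NOWDOC construct mentioned earlier is the single-quote counterpart of HEREDOC: the opening name is wrapped in single quotes and nothing inside the body is interpolated. The following is a minimal sketch comparing the two; the table and column names are made up for illustration, and NOWDOC requires PHP 5.3 or later.

```php
<?php
$table = "authors"; // hypothetical table name, for illustration only

// HEREDOC: behaves like a double-quoted string, so $table is interpolated.
$heredoc = <<<SQL
SELECT fname, lname
FROM $table
SQL;

// NOWDOC: the opening name is in single quotes and the body is taken
// literally, like a single-quoted string.
$nowdoc = <<<'SQL'
SELECT fname, lname
FROM $table
SQL;

echo $heredoc, "\n";
echo $nowdoc, "\n";
```

The first echo shows FROM authors, while the second shows FROM $table exactly as typed.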
As you can see, there are many ways in which to define strings, and there are equally as many ways in which to manipulate them. Keep in mind that strings can also be handed to your code as opposed to you building them manually. Strings of alphanumeric text (arrays, first names, last names, phone numbers, part codes, email addresses, and so on) can be passed into a code file for processing by way of a form field, via the $_POST or $_GET superglobal arrays (see Chapter 2).
String Functions (Best of)
For the rest of this chapter, let’s look at the best and most useful string functions in PHP so that you will have the tools to manage most of what you will have to deal with. I have grouped these functions into semilogical categories and given code samples for most of them. In many of the examples, we will be looking for the proverbial needle in a haystack—the needle being the string we are looking for with each function, and the haystack being the overall content in which we are performing each operation.
Note
You will notice that many of the functions we are about to look at in the next few chapters do not necessarily follow a common style or naming convention. This is mostly a result of PHP being an open source product that many people and many years have affected, but is also caused by things like some functions being named after the C library equivalents they are based upon. Sometimes you will see functions defined with underscores, like strip_tags, while a similar function is named without them, such as stripslashes. This flexibility is simultaneously a strength and a weakness of PHP.
String Trimmings
Strings are often passed around in code with either leading or trailing whitespace. To make sure your strings are not carrying this extra content, simply use the ltrim or rtrim functions (if you know which end of the string has the extra content). If you want to be sure to remove the whitespace from both ends of the string at the same time, use the trim function. Here is a sample:
$string = "     The quick brown fox jumps over the lazy dog     " ;
var_dump(ltrim($string));
echo "<br/>";
var_dump(rtrim($string));
echo "<br/>";
var_dump(trim($string));
The output of this code is:
string(48) "The quick brown fox jumps over the lazy dog     "
string(48) "     The quick brown fox jumps over the lazy dog"
string(43) "The quick brown fox jumps over the lazy dog"
Note
We are using the function var_dump here for producing output, because there is more information returned with this function than simply echoing or printing the output to the browser.
There are five spaces on either side of the text string, so you can see that the first two trimmings are being reported as having the same length, 48 characters, yet there is space remaining on the end of the first output and space remaining on the front of the second output. When we use trim, the space is removed from both the front and the end of this sample string, yielding a result with only 43 characters.
When you are truly hunting a needle in a haystack, you can also pass the trim function a second argument: a “needle” list of characters that are to be trimmed from the string, as in the following:
$string = "The quick brown fox jumps over the lazy dog" ;
var_dump(trim($string, "Thedog"));
With the following output:
string(37) " quick brown fox jumps over the lazy "
The trim function looks at both ends of the string for the supplied characters and strips them out. Notice that the spaces remain at the beginning and the end of this string: when the second argument, specifying the characters to be stripped, is supplied, trim strips only those characters and no longer strips whitespace by default. This behavior is also true for the ltrim and the rtrim functions. If you want the spaces trimmed as well, you will need to specify them.
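The sketch below illustrates that last point, using a sample string padded with two spaces on each side (the padding is an assumption made for this example).

```php
<?php
$string = "  The quick brown fox jumps over the lazy dog  ";

// With a character list, only the listed characters are stripped.
// The string starts and ends with spaces, which are not in the list,
// so nothing is removed here.
var_dump(trim($string, "Thedog"));

// Adding a space to the list lets trim work through the spaces and
// then the listed letters at each end.
var_dump(trim($string, " Thedog"));
```

The first call returns the string unchanged, and the second returns "quick brown fox jumps over the lazy". Trimming stops at the first character on each end that is not in the list.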
Character Case Management
The next grouping of string functions can manipulate the capitalization of portions of a supplied string. Using the same sample string of text, we can affect the initial case of each word within the string with the ucwords function, as shown here:
$string = "The quick brown fox jumps over the lazy dog" ;
var_dump( ucwords($string)) ;
The expected output is:
string(43) "The Quick Brown Fox Jumps Over The Lazy Dog"
You can manipulate the case of an entire string with the strtoupper and the strtolower functions. These functions turn the entire string to uppercase and lowercase characters, respectively. Look at this code and the resulting output:
$string = "The quick brown fox jumps over the lazy dog" ;
var_dump( strtoupper($string)) ;
echo "<br/>" ;
var_dump( strtolower($string)) ;
string(43) "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"
string(43) "the quick brown fox jumps over the lazy dog"
The next two functions are probably not as widely used as the previous functions, yet they do have their place, especially in manipulating content being saved from web form pages. The ucfirst and lcfirst functions change just the first character of a string to uppercase and lowercase, respectively. This can be very useful in handling data like a last name that you want to ensure has a leading uppercase letter. Here is some sample code:
$string = "smith" ;
var_dump( ucfirst ($string)) ;
echo "<br/>" ;
$string = "SMITH" ;
var_dump( lcfirst($string)) ;
The expected output is:
string(5) "Smith"
string(5) "sMITH"
Note
lcfirst is available only in PHP 5.3 and later.
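These case functions are often combined. As a rough sketch, the helper below (normalize_name is a hypothetical name, not a PHP built-in) tidies a name typed into a form by trimming it, lowercasing it, and then capitalizing each word.

```php
<?php
// A hypothetical helper (not a PHP built-in): lowercase the whole
// string first, then capitalize the first letter of each word.
function normalize_name($name)
{
    return ucwords(strtolower(trim($name)));
}

var_dump(normalize_name("  sMITH "));        // string(5) "Smith"
var_dump(normalize_name("mary ANN jones"));  // string(14) "Mary Ann Jones"
```

This approach is deliberately simple: a name with an internal capital, such as MacIntyre, would come back as Macintyre, so it may not suit every data set.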
String Content Searching
You will almost certainly be doing more content manipulation than just playing with the cases of your text strings, so here we will look at additional ways to alter the contents of a string and to look for that proverbial “needle.”
The first thing we’ll look at here allows us to count the size of a string. This can come in handy if you are trying to enter data into a database field that only takes a set number of characters, for instance. There are two functions in this group: first is str_word_count, which, as expected, counts the number of words in a given string. Second, strlen returns the length of the provided string. Careful, though: strlen counts spaces as part of the length of the string as well, so you may want to trim a string before you ask for its length. Here is some sample code:
$string = "  The quick brown fox jumps over the lazy dog" ;
echo "Word count: " . str_word_count($string) ;
echo "<br/>" ;
echo "String length: " . strlen($string) ;
echo "<br/>" ;
echo "String length trimmed: " . strlen(trim($string)) ;
The expected output is:
Word count: 9
String length: 45
String length trimmed: 43
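Returning to the database field scenario, here is a minimal sketch of checking a trimmed value against a maximum length before saving it. The limit of 10 and the variable names are assumptions made for illustration.

```php
<?php
$max_length = 10; // hypothetical column width, e.g. a 10-character field

$input = "   MacIntyre   ";
$clean = trim($input);

// Compare the trimmed length, not the raw length, against the limit.
if (strlen($clean) <= $max_length) {
    echo "'$clean' fits (", strlen($clean), " characters)\n";
} else {
    echo "'$clean' is too long\n";
}
```

The raw input is 15 characters long and would have been rejected, while the trimmed value is 9 characters and fits. Note that strlen counts bytes, so text containing multibyte characters may need different handling.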
We can also ask PHP to query the provided string to see if a specific portion of text (subset) is contained within it. There are two functions for doing this. The first, strstr, is case-sensitive, while the second, stristr, will search irrespective of case. Both of these functions will look through the haystack for the specified needle and, if they find it, will return the portion of the string from the beginning of the needle to the end of the haystack. If the needle is not found, false is returned. Here is some code that demonstrates this:
$string = "The quick brown fox jumps over the lazy dog" ;
$needle = "BROWN fox";
echo "strstr: " ;
var_dump( strstr($string, $needle) );
echo "<br/>" ;
echo "stristr: " ;
var_dump(stristr($string, $needle) );
echo "<br/>" ;
$needle = "the" ;
echo "strstr: " ;
var_dump( strstr($string, $needle) );
echo "<br/>" ;
echo "stristr: " ;
var_dump(stristr($string, $needle) );
strstr: bool(false)
stristr: string(33) "brown fox jumps over the lazy dog"
strstr: string(12) "the lazy dog"
stristr: string(43) "The quick brown fox jumps over the lazy dog"
The first attempt returns false since the capitalized word “BROWN” is not in the provided string. But when we search for it irrespective of case by using stristr, we get the expected result. In the second grouping, we change the needle to “the” and the resulting output is also as expected: with case sensitivity, the output begins at the first lowercase “the” and, without it, the output begins at the beginning.
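Since PHP 5.3, strstr and stristr also accept an optional third argument that returns the part of the haystack before the needle instead. A short sketch, using a made-up email address:

```php
<?php
$email = "peter@example.com"; // made-up address for illustration

// Default behavior: return from the needle to the end of the haystack.
$domain_part = strstr($email, "@");      // "@example.com"

// With the optional third argument set to true (PHP 5.3 and later),
// return the part of the haystack before the needle instead.
$user_part = strstr($email, "@", true);  // "peter"

var_dump($domain_part, $user_part);
```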
Next is a collection of functions that can find positions, manipulate content, and extract needles from the haystack. You can pinpoint the location of a needle (the content you are looking for) within the haystack (a string) by using the strpos function. If the specified string is not found at all, strpos will return false. This is not the same as a returned 0 (which means the needle was found at the very start of the haystack), so be sure to test your returned values with === (the triple equals test) to ensure accuracy within your results. You can replace a subset of text within a string with the str_replace function. Finally, you can extract a subset of text from within the haystack into another variable with the substr function. Some of these functions work best together. For example, you might find the starting position of a needle with strpos and then, in the same line of code, extract the contents for a set number of characters to another variable with the substr function. Consider this sample code and its subsequent output:
$string = "The quick brown fox jumps over the lazy dog" ;
$position = strpos($string, "fox");
echo "position of 'fox' $position <br/>" ;
$result = substr($string, strpos($string, "fox"), 8);
echo "8 characters after finding the position of 'fox': $result <br/>" ;
$new_string = str_replace("the", "black", $string);
echo $new_string;
position of 'fox' 16
8 characters after finding the position of 'fox': fox jump
The quick brown fox jumps over black lazy dog
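The warning about testing strpos with the triple equals operator is easy to overlook, so here is a small sketch of what goes wrong with a loose comparison. The variable names are illustrative only.

```php
<?php
$string = "The quick brown fox jumps over the lazy dog";

$position = strpos($string, "The"); // found at the very start: int(0)

// Wrong: 0 is treated as false in a loose comparison, so this reports a miss.
$loose_says_found = ($position != false);

// Right: the identity operator also compares types, so 0 and false differ.
$strict_says_found = ($position !== false);

var_dump($position);
var_dump($loose_says_found);   // bool(false)
var_dump($strict_says_found);  // bool(true)
var_dump(strpos($string, "cat") === false); // bool(true), "cat" is absent
```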
String Modification
Another valuable collection of functions includes those that can alter a string of HTML content. The strip_tags function removes embedded HTML tags from within a string. The function also takes an optional second argument that lists the tags we want to retain. Here is an example:
$string = "The <strong>quick</strong> brown fox <a href='jumping.php'>jumps</a> over the lazy dog" ;
echo $string . "<br/>" ;
echo strip_tags($string) . "<br/>" ;
echo strip_tags($string, '<strong>') ;
The browser output will look like this:
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
And if you reveal the source of the displayed browser page, you will see this:
The <strong>quick</strong> brown fox <a href='jumping.php'>jumps</a> over the lazy dog<br/> The quick brown fox jumps over the lazy dog<br/> The <strong>quick</strong> brown fox jumps over the lazy dog
Note that the tags are completely removed from the string in the second display and all tags except <strong> are removed from the third display.
The next two string functions can be thought of as a pair of opposites, in that one reverses what the other accomplishes, depending on how they are used. They are addslashes and stripslashes. If you read The Great Escape, you’ll remember that you can escape some special characters like the double quote or the backslash by using a preceding backslash. The addslashes function looks for those special characters in the provided string and escapes them with an added backslash. The reversal of that is accomplished with stripslashes. Sample code follows:
$web_path = "I'm Irish and my name is O'Mally" ;
$escaped = addslashes($web_path) ;
echo $escaped . "<br/>" ;
echo stripslashes($escaped) ;
I\'m Irish and my name is O\'Mally
I'm Irish and my name is O'Mally
Note
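The stripslashes function removes every single backslash and collapses each doubled backslash into one. If a string already contains backslashes that were not added by addslashes (a Windows file path, for example) and you run stripslashes on it, those backslashes will be stripped out. This may not be what you want.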
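A minimal sketch of how the pair treats backslashes that are already in a string; the file path is invented for illustration.

```php
<?php
$path = 'C:\temp\notes.txt'; // single quotes keep the backslashes literal

// addslashes doubles each existing backslash.
$escaped = addslashes($path);        // C:\\temp\\notes.txt

// stripslashes on the escaped copy restores the original.
$restored = stripslashes($escaped);  // C:\temp\notes.txt

// stripslashes on the original, never-escaped string removes its
// backslashes altogether.
$mangled = stripslashes($path);      // C:tempnotes.txt

echo $escaped, "\n", $restored, "\n", $mangled, "\n";
```

The round trip is safe only when stripslashes is applied to a string that addslashes actually produced.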
HTML, as you probably know, is heavily dependent on markup tags for displaying items, and sometimes the characters that make up these tags are better served in what I like to call their “raw” state. For example, the less-than sign (<) can be represented in HTML as &lt;, a greater-than sign (>) as &gt;, the ampersand (&) as &amp;, and so on. With the use of the htmlentities function, we can convert the contents of a supplied string containing these characters to their “raw” state. This is often used for security reasons when accepting data from an outside source into a web system. If desired, we can reverse the effect with the html_entity_decode function. Here is a sample:
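$string = "The <strong>quick</strong> brown fox <a href='jumping.php'>jumps</a> over the lazy dog" ;
$encoded = htmlentities($string) ;
echo $encoded . "<br/>" ;
echo html_entity_decode($encoded) ;
The source of the resulting page looks like this (on PHP 8.1 and later the single quotes in the first line also appear as &#039;, because the default flags changed):
The &lt;strong&gt;quick&lt;/strong&gt; brown fox &lt;a href='jumping.php'&gt;jumps&lt;/a&gt; over the lazy dog<br/>
The <strong>quick</strong> brown fox <a href='jumping.php'>jumps</a> over the lazy dog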
This can be very useful in the case of someone commenting on a blog entry or signing a website guest book, for example. The supplied text can be intercepted, preventing it from containing any potentially actionable HTML markup, as all HTML is converted to “raw” nonworking entities.
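As a sketch of that blog comment scenario, the example below neutralizes a made-up comment. Passing ENT_QUOTES and the character set explicitly is a choice made here so that the result does not depend on the defaults of the PHP version in use.

```php
<?php
// A made-up visitor comment containing markup we do not want to run.
$comment = "<script>alert('hi')</script> Nice & useful post!";

// ENT_QUOTES also converts single and double quotes; the charset is
// given explicitly so the result does not depend on PHP's defaults.
$safe = htmlentities($comment, ENT_QUOTES, 'UTF-8');
echo $safe, "\n";

// html_entity_decode with the same flags reverses the conversion.
$back = html_entity_decode($safe, ENT_QUOTES, 'UTF-8');
var_dump($back === $comment); // bool(true)
```

The value in $safe displays in the browser as the text the visitor typed, rather than being treated as markup.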
There are two more string functions that I want to bring to your attention, and these have application in the security aspect of web development when dealing with passwords. The first is str_shuffle, which makes a random reorganization of a supplied string. You can use this function if you want to have PHP generate a randomly arranged string from a supplied string (to make a password a little more difficult to guess, for example). Alternately, you can use the md5 function to really scramble up a supplied string. The md5 function returns a hash of the supplied string as a 32-character hexadecimal number (a 128-bit value).
Note
md5 always returns the same hash result for a given string, while str_shuffle randomly reorganizes the string contents each time, so for extra security you could randomize the string and then perform md5 on it.
Here is some code with these functions in action:
$string = "The quick brown fox jumps over the lazy dog" ;
echo str_shuffle($string) . "<br/>" ;
echo md5($string) . "<br/>" ;
echo md5(str_shuffle($string)) ;
Initial display in the browser produces this output:
dhuo p qr xnus hzeyveloftaiewbTojrg mock
9e107d9d372bb6826bd81d3542a419d6
f71d7b9a5880c06163ed8adbdee5b55e
Refreshing the browser gives this output:
ugn uiferlwckvxT thzrbh o mqo doeaoesjp y
9e107d9d372bb6826bd81d3542a419d6
1356809b12da9a25482891606ccfaa8f
The second line of output, the single use of md5, does not change on a page refresh, while the other content does.
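The properties described here can be checked directly. The sketch below confirms that md5 is repeatable and 32 characters long, and that str_shuffle only reorders the characters it is given. The variables $a and $b are illustrative only.

```php
<?php
$string = "The quick brown fox jumps over the lazy dog";

$hash = md5($string);
var_dump(strlen($hash));           // int(32): 32 hexadecimal characters
var_dump($hash === md5($string));  // bool(true): same input, same hash

// str_shuffle only reorders: the result has the same length and the
// same characters as the input. Sorting both confirms this.
$shuffled = str_shuffle($string);
$a = str_split($string);
$b = str_split($shuffled);
sort($a);
sort($b);
var_dump($a === $b);               // bool(true)
```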
Note
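There is more detailed discussion on the md5 function (and its more secure cousin sha1) in Chapter 9 on security. Neither is considered strong enough for storing passwords today; PHP 5.5 and later provide password_hash and password_verify for that purpose.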
PHP provides many more string functions, and over time you may choose to become familiar with many of them. The string functions we have covered here are those that you are likely to find the most beneficial right away. In the next chapter, we will follow a similar pattern with a discussion of arrays.