
PHP: The Good Parts by Peter MacIntyre


Chapter 4. Strings

The string is one of the most widely used forms of web output. A string is simply a collection of text—letters, numbers, special characters, or a combination thereof. Strings can be manipulated, cut, trimmed, truncated, spliced, and concatenated with ease in PHP. We have already seen some examples of strings being sent out to the web browser in Chapters 1 and 2. In this chapter, we will spend a lot more time on the good parts of string manipulation.

String manipulation is important; think of all the websites you have visited this week and try to imagine how much of the content was text-based as opposed to image- or video-based. Even sites like YouTube and CNN are heavily dependent on text to ease communication with the visitor. So let’s first see what a string actually consists of and how we can get that content onto a web browser.

What Is a String?

As mentioned above, a string is simply a collection of characters. These collections can be sent to the browser with either the echo or print PHP statements, but they have to be contained within defining markers (usually single or double quotation marks) so that PHP knows which collection of characters you want displayed.

Note

Although there is very little difference between echo and print (print returns 1 when it has finished sending its output and takes only one parameter, whereas echo can take multiple parameters), I have made the choice to always use the echo command. You can execute the echo command with the short PHP tag and an equals sign (=) combination (before PHP 5.4, this works only if short_open_tag is turned on in the php.ini file, and it is generally off by default; from PHP 5.4 onward, the <?= tag is always available), like this:

<?= "sending out some text" ; ?>

It’s really a personal choice as to which one to use, and I recommend that once you make that choice, you stick with it so that your code remains consistent in this regard.
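
To make the difference described in the note concrete, here is a minimal sketch. The strings are only illustrative; the point is that echo accepts several comma-separated arguments, while print takes one argument and returns 1.

```php
<?php
// echo accepts several comma-separated arguments and returns nothing.
echo "Hello", ", ", "world", "\n";

// print accepts exactly one argument and always returns 1,
// so it can appear on the right-hand side of an assignment.
$result = print "Hello, world\n";

echo "print returned: ", $result, "\n";
```

Both statements send the same text; only the return value and the argument handling differ.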

You Can Quote Me

Strings can be contained within either single or double quotation marks or a combination of the two, and in a HEREDOC or NOWDOC (more on these later). If you are building a string that will incorporate the contents of a variable, you are best served by using double quotes. Consider the following short code sample:

$fname = "Peter" ;
$lname = "MacIntyre" ;
$string = "The first name of the author of this book
is $fname and his last name is $lname";
echo $string ;

Here, in the creation of the $string variable, the code makes reference to two other variable names, and PHP interpolates (inserts) their contents. We can then echo out the $string variable. We can also accomplish this using single quotes, but we would have to do some concatenation with the period operator, because single quotes do not allow for variable expansion (variable content insertion). The following code uses the single quote approach:

$fname = "Peter" ;
$lname = "MacIntyre" ;
$string = 'The first name of the author of this book is '
 . $fname . ' and his last name is ' . $lname ;
echo $string ;

Again, you can also build strings with a combination of both single and double quotes; just be aware of the interpolative characteristics of double quotes that single quotes do not have. Also be aware of how the newline escape sequence (\n) is interpreted by each. Within a single-quoted string it is not converted and is output literally as a backslash followed by the letter n, whereas within double quotes it becomes an actual newline character. The following code snippet demonstrates this:

echo 'This sentence will not produce a new line \n';
echo "But this one will \n";

There is room for flexibility in the combinations of quotes that you can use, so be brave and experiment with them to see what you can accomplish.
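
One way to see the difference for yourself is to compare string lengths. The following is a small sketch (the variable names are only for illustration) showing that double quotes expand both the variable and the \n sequence, while single quotes keep them as literal text. It also shows the curly-brace form, which marks where a variable name ends inside a double-quoted string.

```php
<?php
$fname = "Peter";

// Double quotes: $fname is interpolated and \n becomes one newline character.
$double = "Hello $fname\n";

// Single quotes: both $fname and \n are kept exactly as typed.
$single = 'Hello $fname\n';

// Curly braces mark the end of the variable name inside double quotes.
$braces = "{$fname}'s book";

echo strlen($double), "\n";  // 12
echo strlen($single), "\n";  // 14
echo $braces, "\n";          // Peter's book
```

Keep in mind that a newline character in the page source does not by itself produce a line break in the rendered browser page.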

Another way to build a string is to use a construct called HEREDOC. This construct is very similar to using double quotes in the sense that it interpolates variables, but it also lends itself to building longer strings, and therefore makes them more readable to the programmer. Begin the HEREDOC with three less-than signs (<<<) followed by a name designation for the string being built. After the string is complete, repeat the name designation on its own line with a terminating semicolon (before PHP 7.3, the closing name must start in the first column of that line). Here is an example:

$string = <<< RightHERE
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Fusce eget nisl a metus rhoncus placerat ac ac nisl.
Fusce consectetur tempus "tincidunt. Proin congue
dapibus neque", at congue lectus volutpat in.
Duis commodo, est tempor aliquam molestie, odio dolor fringilla arcu,
nec iaculis est libero vitae erat.
RightHERE;

echo $string ;

The output of the above HEREDOC code sample is shown in Figure 4-1.

Figure 4-1. Sample HEREDOC browser output

Note

You will find that the use of the HEREDOC construct lends itself very well to building Structured Query Language (SQL) statements. This technique will be used extensively in Chapter 7, where we discuss databases.
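
The NOWDOC construct mentioned earlier is the single-quoted counterpart of HEREDOC. The sketch below places the two side by side; the table and column names are hypothetical and only serve to show that HEREDOC interpolates variables while NOWDOC (available from PHP 5.3) does not.

```php
<?php
$table = "authors";  // hypothetical table name

// HEREDOC: behaves like a double-quoted string, so $table is interpolated.
$heredoc = <<<SQL
SELECT fname, lname
FROM $table
SQL;

// NOWDOC: the opening label is wrapped in single quotes
// and the body is taken literally, like a single-quoted string.
$nowdoc = <<<'SQL'
SELECT fname, lname
FROM $table
SQL;

echo $heredoc, "\n";  // ends with: FROM authors
echo $nowdoc, "\n";   // ends with: FROM $table
```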

As you can see, there are many ways in which to define strings, and there are just as many ways in which to manipulate them. Keep in mind that strings can also be handed to your code as opposed to you building them manually. Strings of alphanumeric text (first names, last names, phone numbers, part codes, email addresses, and so on) can be passed into a code file for processing by way of a form field, via the $_POST or $_GET superglobal arrays (see Chapter 2).

String Functions (Best of)

For the rest of this chapter, let’s look at the best and most useful string functions in PHP so that you will have the tools to manage most of what you will have to deal with. I have grouped these functions into semilogical categories and given code samples for most of them. In many of the examples, we will be looking for the proverbial needle in a haystack—the needle being the string we are looking for with each function, and the haystack being the overall content in which we are performing each operation.

Note

You will notice that many of the functions we are about to look at in the next few chapters do not necessarily follow a common style or naming convention. This is mostly a result of PHP being an open source product that many people and many years have affected, but is also caused by things like some functions being named after the C equivalents they are based upon. Sometimes you will see functions defined with underscores, like strip_tags, while a similar function is named without them, such as stripslashes. This flexibility is simultaneously a strength and a weakness of PHP.

String Trimmings

Strings are often passed around in code with either leading or trailing whitespace. To make sure your strings are not carrying this extra content, simply use the ltrim or rtrim functions (if you know which end of the string has the extra content). If you want to be sure to get the whitespace content from both ends of the string at the same time, use the trim function. Here is a sample:

$string = "     The quick brown fox jumps over the lazy dog     " ;
var_dump(ltrim($string));
echo "<br/>";
var_dump(rtrim($string));
echo "<br/>";
var_dump(trim($string));

The output of this code is:

string(48) "The quick brown fox jumps over the lazy dog     "
string(48) "     The quick brown fox jumps over the lazy dog"
string(43) "The quick brown fox jumps over the lazy dog"

Note

We are using the function var_dump here for producing output, because there is more information returned with this function than simply echoing or printing the output to the browser.

There are five spaces on either side of the text string, so you can see that the first two trimmings are being reported as having the same length, 48 characters, yet there is space remaining on the end of the first output and space remaining on the front of the second output. When we use trim, the space is removed from both the front and the end of this sample string, yielding a result with only 43 characters.

When you are truly hunting a needle in a haystack, you can also give the trim function a second argument: a "needle" made up of the characters that are to be trimmed from the ends of the string, as in the following:

$string = "The quick brown fox jumps over the lazy dog" ;
var_dump(trim($string, "Thedog"));

With the following output:

string(37) " quick brown fox jumps over the lazy "

The trim function looks at both ends of the string for the supplied characters and strips them out. Notice that the spaces remain at the beginning and the end of this string, which is a slight variation in functionality when the second argument, specifying the characters to be stripped, is supplied to the trim function. This behavior is also true for the ltrim and the rtrim functions. If you want the spaces trimmed as well, you will need to specify them.
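
As a brief sketch of that last point, the code below applies the same character list with ltrim and rtrim, and then adds a space to the list so that the leftover whitespace is stripped too. The character list is treated as a set of individual characters, not as a word to match.

```php
<?php
$string = "The quick brown fox jumps over the lazy dog";

// The same character list, applied to one end only.
$left  = ltrim($string, "Thedog");
$right = rtrim($string, "Thedog");

// Adding a space to the list removes the leftover whitespace as well.
$both = trim($string, "Thedog ");

var_dump($left);   // string(40) " quick brown fox jumps over the lazy dog"
var_dump($right);  // string(40) "The quick brown fox jumps over the lazy "
var_dump($both);   // string(35) "quick brown fox jumps over the lazy"
```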

Character Case Management

The next grouping of string functions can manipulate the capitalization of portions of a supplied string. Using the same sample string of text, we can affect the initial case of each word within the string with the ucwords function, as shown here:

$string = "The quick brown fox jumps over the lazy dog" ;
var_dump(ucwords($string)) ;

The expected output is:

string(43) "The Quick Brown Fox Jumps Over The Lazy Dog"

You can manipulate the case of an entire string with the strtoupper and the strtolower functions. These functions turn the entire string to uppercase and lowercase characters, respectively. Look at this code and the resulting output:

$string = "The quick brown fox jumps over the lazy dog" ;
var_dump( strtoupper($string)) ;
echo "<br/>" ;
var_dump( strtolower($string)) ;

string(43) "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"
string(43) "the quick brown fox jumps over the lazy dog"

The next two functions are probably not as widely used as the previous functions, yet they do have their place, especially in manipulating content being saved from web form pages. The ucfirst and lcfirst functions change just the first character of a string to uppercase and lowercase, respectively. This can be very useful in handling data like a last name that you want to ensure has a leading uppercase letter. Here is some sample code:

$string = "smith" ;
var_dump( ucfirst ($string)) ;
echo "<br/>" ;
$string = "SMITH" ;
var_dump( lcfirst($string)) ;

The expected output is:

string(5) "Smith"
string(5) "sMITH"

Note

lcfirst is available only in PHP 5.3 and later.
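
These case functions are often combined. Here is a minimal sketch, assuming some hypothetical form input with inconsistent capitalization, that lowercases the value first and then capitalizes the leading letters.

```php
<?php
// Hypothetical form input with inconsistent capitalization.
$last_name = "sMITH";
$full_name = "jOHN smith";

// ucfirst alone only touches the first character ...
$partial = ucfirst($last_name);                 // "SMITH"

// ... so lowercase everything first, then capitalize.
$clean_last = ucfirst(strtolower($last_name));  // "Smith"
$clean_full = ucwords(strtolower($full_name));  // "John Smith"

var_dump($partial, $clean_last, $clean_full);
```

Note that this approach also flattens any capital letters inside a name (MacIntyre would become Macintyre), so it may not suit every kind of data.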

String Content Searching

You will almost certainly be doing more content manipulation than just playing with the cases of your text strings, so here we will look at additional ways to alter the contents of a string and to look for that proverbial “needle.”

The first thing we’ll look at here allows us to count the size of a string. This can come in handy if you are trying to enter data into a database field that only takes a set number of characters, for instance. There are two functions in this group: first is str_word_count, which, as expected, counts the number of words in a given string. Second, strlen returns the length of the provided string. Careful, though—strlen counts spaces as part of the length of the string as well, so you may want to trim a string before you ask for its length. Here is some sample code:

$string = "  The quick brown fox jumps over the lazy dog" ;
echo "Word count: " . str_word_count($string) ;
echo "<br/>" ;
echo "String length: " . strlen($string) ;
echo "<br/>" ;
echo "String length trimmed: " . strlen(trim($string)) ;

The expected output is:

Word count: 9
String length: 45
String length trimmed: 43
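
For the database scenario mentioned above, a length check might look something like the following sketch. The function name fits_in_field and the limits are hypothetical, and the check assumes single-byte text, since strlen counts bytes.

```php
<?php
// Hypothetical helper: does $value fit in a field of $max_length characters
// once the surrounding whitespace has been trimmed away?
// Assumes single-byte text, because strlen counts bytes.
function fits_in_field($value, $max_length)
{
    return strlen(trim($value)) <= $max_length;
}

$string = "  The quick brown fox jumps over the lazy dog";

var_dump(fits_in_field($string, 43));  // bool(true)
var_dump(fits_in_field($string, 40));  // bool(false)
```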

We can also ask PHP to query the provided string to see if a specific portion of text (subset) is contained within it. There are two functions for doing this. The first, strstr, is case-sensitive, while the second, stristr, will search irrespective of case. Both of these functions will look through the haystack for the specified needle and, if they find it, will return the portion of the string from the beginning of the needle to the end of the haystack. If the needle is not found, false is returned. Here is some code that demonstrates this:

$string = "The quick brown fox jumps over the lazy dog" ;
$needle = "BROWN fox";
echo "strstr: " ;
var_dump( strstr($string, $needle) );
echo "<br/>" ;
echo "stristr: " ;
var_dump(stristr($string, $needle) );
echo "<br/>" ;
$needle = "the" ;
echo "strstr: " ;
var_dump( strstr($string, $needle) );
echo "<br/>" ;
echo "stristr: " ;
var_dump(stristr($string, $needle) );

strstr: bool(false)
stristr: string(33) "brown fox jumps over the lazy dog"
strstr: string(12) "the lazy dog"
stristr: string(43) "The quick brown fox jumps over the lazy dog"

The first attempt returns false since the capitalized word “BROWN” is not in the provided string. But when we search for it irrespective of case by using stristr, we get the expected result. In the second grouping, we change the needle to “the” and the resulting output is also as expected: with case sensitivity, the output begins at the first lowercase “the” and, without it, the output begins at the beginning.
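
From PHP 5.3 onward, strstr and stristr also accept an optional third argument that returns the part of the haystack before the needle instead of the part from the needle onward. The following sketch uses a made-up email address to show both directions, along with the false result for a missing needle.

```php
<?php
$email = "peter@example.com";  // hypothetical address

$domain  = strstr($email, "@");        // from the needle to the end
$user    = strstr($email, "@", true);  // everything before the needle
$missing = strstr($email, "#");        // needle not present

var_dump($domain);   // string(12) "@example.com"
var_dump($user);     // string(5) "peter"
var_dump($missing);  // bool(false)
```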

Next is a collection of functions that can find positions, manipulate content, and extract needles from the haystack. You can pinpoint the location of a needle (the content you are looking for) within the haystack (a string) by using the strpos function, which returns the zero-based position of the first match. If the specified string is not found at all, strpos will return false. This is not the same as a returned 0 (a match at the very start of the string), so be sure to test your returned values with === (the triple equals test) to ensure accuracy within your results. You can replace a subset of text within a string with the str_replace function, which is case-sensitive. Finally, you can extract a subset of text from within the haystack into another variable with the substr function. Some of these functions work best together. For example, you might find the starting position of a needle with strpos and then, in the same line of code, extract the contents for a set number of characters to another variable with the substr function. Consider this sample code and its subsequent output:

$string = "The quick brown fox jumps over the lazy dog" ;
$position = strpos($string, "fox");
echo "position of 'fox' $position <br/>" ;
$result = substr($string, strpos($string, "fox"), 8);
echo "8 characters after finding the position of 'fox':  $result <br/>" ;
$new_string = str_replace("the", "black", $string);
echo $new_string;

position of 'fox' 16
8 characters after finding the position of 'fox': fox jump
The quick brown fox jumps over black lazy dog
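
The difference between a position of 0 and a result of false is easy to get wrong, so here is a short sketch of the loose and strict tests side by side. It also uses str_ireplace, the case-insensitive sibling of str_replace, to show why only the lowercase "the" was replaced in the output above.

```php
<?php
$string = "The quick brown fox jumps over the lazy dog";

$position = strpos($string, "The");  // int(0): found at the very start

// Loose test: 0 compares equal to false, so this wrongly reports a miss.
$loose = ($position == false) ? "not found" : "found";

// Strict test: only a real false counts as a miss.
$strict = ($position === false) ? "not found" : "found";

echo "loose: $loose, strict: $strict\n";  // loose: not found, strict: found

// str_ireplace ignores case, so both "The" and "the" are replaced.
echo str_ireplace("the", "black", $string), "\n";
// black quick brown fox jumps over black lazy dog
```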

String Modification

Another valuable collection of functions includes those that can alter a string of HTML content. The strip_tags function removes embedded HTML tags from within a string. It also takes an optional second argument that lists the tags we want to retain. Here is an example:

$string = "The <strong>quick</strong> brown fox <a href='jumping.php'>jumps</a> 
over the lazy dog" ;
echo $string . "<br/>" ;
echo strip_tags($string) . "<br/>" ;
echo strip_tags($string, '<strong>') ;

The browser output will look like this:

The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog

And if you reveal the source of the displayed browser page, you will see this:

The <strong>quick</strong> brown fox <a href='jumping.php'>jumps</a> 
over the lazy dog<br/>

The quick brown fox jumps over the lazy dog<br/>

The <strong>quick</strong> brown fox jumps over the lazy dog

Note that the tags are completely removed from the string in the second display and all tags except <strong> are removed from the third display.

The next two string functions can be thought of as a pair of opposites, in that one reverses what the other accomplishes, depending on how they are used. They are addslashes and stripslashes. If you read The Great Escape, you’ll remember that you can escape some special characters like the double quote or the backslash by using a preceding backslash. The addslashes function looks for those special characters in the provided string and escapes them with an added backslash. The reversal of that is accomplished with stripslashes. Sample code follows:

$web_path = "I'm Irish and my name is O'Mally" ;
$escaped = addslashes($web_path) ;
echo $escaped . "<br/>" ;
echo stripslashes($escaped) ;

I\'m Irish and my name is O\'Mally
I'm Irish and my name is O'Mally

Note

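The stripslashes function removes backslashes whether or not addslashes put them there. A string that was escaped with addslashes comes back unchanged, but if you run stripslashes on a string that was never escaped, any backslashes it legitimately contains will be stripped out (a doubled backslash is reduced to a single one). This may not be what you want.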
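
To see how the two functions treat a backslash that is already part of the data, consider this small sketch. The Windows-style path is just an illustrative value; it is written in single quotes so that the backslash is stored literally.

```php
<?php
$path = 'C:\temp';  // single quotes: one literal backslash, 7 characters

$escaped   = addslashes($path);       // the backslash is doubled
$roundtrip = stripslashes($escaped);  // back to the original string
$stripped  = stripslashes($path);     // never escaped: the backslash is lost

echo $escaped, "\n";    // C:\\temp
echo $roundtrip, "\n";  // C:\temp
echo $stripped, "\n";   // C:temp
```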

HTML, as you probably know, is heavily dependent on markup tags for displaying items, and sometimes these tags are better served in what I like to call their “raw” state. For example, the less-than sign (<) can be represented in HTML as &lt;, a greater-than sign (>) as &gt;, the ampersand (&) as &amp;, and so on. With the use of the htmlentities function, we can convert the contents of a supplied string containing these characters to their “raw” state. This is often used for security reasons when accepting data from an outside source into a web system. If desired, we can reverse the effect with the html_entity_decode function. Here is a sample:

$string = "The <strong>quick</strong> brown fox <a href='jumping.php'>jumps
</a> over the lazy dog" ;
$encoded = htmlentities($string) ;
echo $encoded . "<br/>" ;
echo html_entity_decode($encoded) ;

The &lt;strong&gt;quick&lt;/strong&gt; brown fox &lt;a
href='jumping.php'&gt;jumps&lt;/a&gt; over the lazy dog<br/>
The <strong>quick</strong> brown fox <a href='jumping.php'>jumps</a> over the lazy
dog

This can be very useful in the case of someone commenting on a blog entry or signing a website guest book, for example. The supplied text can be intercepted, preventing it from containing any potentially actionable HTML markup, as all HTML is converted to “raw” nonworking entities.
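
The blog comment scenario can be sketched as follows. The comment text is a made-up example; the code converts it with htmlentities before output and then shows that html_entity_decode restores the original.

```php
<?php
// Hypothetical visitor comment containing markup.
$comment = '<script>alert("hi")</script> Nice post & thanks!';

// Convert the markup characters to entities before displaying the comment.
$safe = htmlentities($comment);
echo $safe, "\n";
// &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt; Nice post &amp; thanks!

// The conversion is reversible if the original text is ever needed.
$restored = html_entity_decode($safe);
var_dump($restored === $comment);  // bool(true)
```

The browser displays the converted comment as plain text, tags and all, rather than acting on the script element.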

There are two more string functions that I want to bring to your attention, and these have some application in the security aspect of web development when dealing with passwords. The first is str_shuffle, which makes a random reorganization of a supplied string. You can use this function if you want to have PHP generate a randomly arranged string from a supplied string (to make a password a little more difficult to guess, for example). Alternatively, you can use the md5 function to really scramble up a supplied string. The md5 function returns the MD5 hash of the supplied string as a 32-character hexadecimal number (a 128-bit value). Be aware that the PHP manual describes str_shuffle as not cryptographically secure and recommends against using md5 to secure passwords.

Note

MD5 always returns the same hash result for a given string, while str_shuffle randomly reorganizes the string contents each time, so for extra security you could randomize the string and then perform MD5 on it.

Here is some code with these functions in action:

$string = "The quick brown fox jumps over the lazy dog" ;
echo str_shuffle($string) . "<br/>" ;
echo md5($string) . "<br/>" ;
echo md5(str_shuffle($string)) ;

Initial display in the browser produces this output:

dhuo p qr xnus hzeyveloftaiewbTojrg mock
9e107d9d372bb6826bd81d3542a419d6
f71d7b9a5880c06163ed8adbdee5b55e

Refreshing the browser gives this output:

ugn uiferlwckvxT thzrbh o mqo doeaoesjp y
9e107d9d372bb6826bd81d3542a419d6
1356809b12da9a25482891606ccfaa8f

The second line of output, the single use of MD5, does not change on a page refresh, while the other content does.
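
The properties described above can also be checked in code. This sketch confirms that md5 returns the same 32 hexadecimal characters every time for a given string, and that str_shuffle only rearranges the characters it was given (count_chars is used here to compare the character counts).

```php
<?php
$string = "The quick brown fox jumps over the lazy dog";

// md5 is deterministic: the same input always gives the same hash.
$hash = md5($string);
var_dump($hash === md5($string));                       // bool(true)
var_dump(strlen($hash));                                // int(32)
var_dump(preg_match('/^[0-9a-f]{32}$/', $hash) === 1);  // bool(true)

// str_shuffle keeps the same characters and only changes their order.
$shuffled = str_shuffle($string);
var_dump(strlen($shuffled) === strlen($string));                 // bool(true)
var_dump(count_chars($shuffled, 1) == count_chars($string, 1));  // bool(true)
```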

Note

There is more detailed discussion on the MD5 function (and its more secure cousin sha1) in Chapter 9 on security.

PHP provides many more string functions, and over time you may choose to become familiar with many of them. The string functions we have covered here are those that you are likely to find the most beneficial right away. In the next chapter, we will follow a similar pattern with a discussion of arrays.
