13.9. Reading Records with a Pattern Separator
Problem
You want to read records from a file in which the records are separated by a pattern you can match with a regular expression.
Solution
Read the entire file into a string and then split on the regular expression:
$filename = '/path/to/your/file.txt';
$fh = fopen($filename, 'r') or die("Can't open $filename");
$contents = fread($fh, filesize($filename));
fclose($fh);
$records = preg_split('/[0-9]+\) /', $contents);
Discussion
This breaks apart a numbered list and places the individual list items into array elements. So, if you have a list like this:
1) Gödel 2) Escher 3) Bach
You end up with a four-element array, with an empty opening element. That’s because preg_split() assumes the delimiters are between items, but in this case, the numbers are before items:
Array
(
    [0] =>
    [1] => Gödel
    [2] => Escher
    [3] => Bach
)
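A minimal sketch of where the empty element comes from, using the sample list above as a literal string instead of file contents (the variable values are illustrative only). The first match of the pattern, "1) ", begins at offset 0, so the text in front of it is the empty string:

```php
<?php
// Sample input taken from the list above.
$contents = '1) Gödel 2) Escher 3) Bach';

// The first delimiter match, "1) ", starts at offset 0, so the text
// "before" it is the empty string, which becomes element 0.
$records = preg_split('/[0-9]+\) /', $contents);

echo count($records), "\n";    // 4
var_dump($records[0] === '');  // bool(true)
```

The default call always leaves that empty first element in place when the input begins with a delimiter.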
From one point of view, this can be a feature, not a bug, since the nth element holds the nth item. But, to compact the array, you can eliminate the first element:
$records = preg_split('/[0-9]+\) /', $contents);
array_shift($records);
Another modification you might want is to strip newlines from the elements, replacing each with the empty string:
$records = preg_split('/[0-9]+\) /', str_replace("\n",'',$contents));
array_shift($records);
PHP doesn’t allow you to change the input record separator to anything other than a newline, so this technique is also useful for breaking apart records divided by strings. However, if you ...
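The two modifications shown earlier in this Discussion can be combined in one place. The following is a sketch, not part of the recipe: split_records() is a hypothetical helper name, and it swaps array_shift() for preg_split()'s PREG_SPLIT_NO_EMPTY flag, which discards every empty piece rather than only the first. It also trims each record, since the space in front of each number otherwise stays attached to the preceding item. The type declarations assume PHP 7.0 or later.

```php
<?php
// Hypothetical helper (not from the recipe): split $contents on $pattern,
// dropping empty pieces and trimming whitespace from each record.
function split_records(string $contents, string $pattern): array
{
    // Remove newlines first, as in the last example above.
    $flat = str_replace("\n", '', $contents);

    // PREG_SPLIT_NO_EMPTY discards empty pieces, including the leading
    // one produced when the input starts with a delimiter.
    $records = preg_split($pattern, $flat, -1, PREG_SPLIT_NO_EMPTY);
    if ($records === false) {
        throw new RuntimeException('preg_split failed, code ' . preg_last_error());
    }

    // Strip the whitespace left between an item and the next number.
    return array_map('trim', $records);
}

$records = split_records("1) Gödel\n2) Escher\n3) Bach\n", '/[0-9]+\) /');
print_r($records);
```

Because PREG_SPLIT_NO_EMPTY also removes empty records in the middle of the input, the nth element no longer necessarily holds the nth item; keep the array_shift() version if that numbering matters.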