The Tidy extension “cleans up” messy HTML and XML files into valid and pretty-looking documents. This feature is particularly useful when you’re serving lots of externally generated content.
For example, you may want to let visitors enter HTML-enabled messages without letting them create an invalid page. Manually checking each post is laborious, but with Tidy you can automate this process.
Alternatively, Tidy can be used to reformat documents, either to reduce their file size or to make them easily understandable by humans. The first option saves you bandwidth, making your pages arrive more quickly and reducing your overall hosting costs. The second option simplifies your debugging process, as you’re not tracking down stray closing tags.
The Tidy extension is bundled with PHP but is not enabled by default, because it requires you to install the Tidy library. Download the Tidy library and configure PHP with --with-tidy=DIR to turn on Tidy support in PHP.
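After rebuilding PHP, a quick check along the following lines can confirm that the extension is actually available. This is only a minimal sketch: the $status variable and the message wording are illustrative and are not part of Tidy itself.

```php
<?php
// extension_loaded() reports whether a named extension is available
// in the running PHP build.
$status = extension_loaded('tidy')
    ? 'Tidy support is enabled.'
    : 'Tidy support is missing; rebuild PHP with --with-tidy=DIR.';

echo $status, "\n";
```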
Interacting with Tidy is a simple three-step process. You parse the file, then clean its contents, and finally print or save the repaired file.
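The three steps can be sketched end to end as follows. This is a hedged illustration rather than a definitive implementation: the repair_markup() helper and the sample input are hypothetical, the show-body-only option is an assumption chosen so that only the fragment (not a full page skeleton) comes back, and the helper returns null when the Tidy extension is unavailable.

```php
<?php
// Hypothetical helper: returns repaired markup, or null if Tidy
// is not loaded or the input cannot be parsed.
function repair_markup(string $html): ?string
{
    if (!extension_loaded('tidy')) {
        return null;
    }

    // Step 1: parse the markup. The second argument is a Tidy
    // configuration array; the third sets the character encoding.
    $tidy = tidy_parse_string($html, ['show-body-only' => true], 'utf8');
    if ($tidy === false) {
        return null;
    }

    // Step 2: clean and repair the parsed document.
    tidy_clean_repair($tidy);

    // Step 3: fetch the repaired markup so it can be printed or saved.
    return tidy_get_output($tidy);
}

// Sample input with an unclosed <i> tag.
$repaired = repair_markup('<p>I am <b>bold and <i>italic</b></p>');

echo $repaired ?? "Tidy extension is not loaded.\n";
```

Returning a string instead of echoing inside the helper keeps the final step flexible: the caller decides whether to print the result or write it to a file.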
Use tidy_parse_file() to read in a file for tidying:
$tidy = tidy_parse_file('index.html');
When your data is in a string, use tidy_parse_string() instead:
// This string is missing a closing </i> tag
$tidy = tidy_parse_string('I am <b>bold and I am <i>bold and italic</b>');
Transform the document using the tidy_clean_repair() function:
$tidy = ...