XML documents are just too rich in syntactic sugar to be processed by anything short of a full-blown XML parser. I've seen many hackish systems built on regular expressions, grep, sed, raw stream processing, and similar tools, all held together by string and baling wire. These are extremely brittle and rarely able to handle the full panoply of documents they encounter. Problems include:
Detecting the encoding, including handling multibyte character sets
Comments that contain tags
Processing instructions that contain tags
Unexpected placement of spaces and line breaks within tags
Default attribute values applied from the internal DTD subset
Character references like &#xA0; and &#160;
Predefined entity references such ...
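
Several of these problems can be shown in a few lines. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree as the parser; the sample document and the function names items_by_regex and items_by_parser are hypothetical, made up for illustration. It compares a naive regular expression with a real parser on a document that has a tag inside a comment, line breaks and spaces inside a start-tag, and character references in the content.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document: a commented-out element, whitespace
# inside the start-tag, and two character references in the text.
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<catalog>
  <!-- <item>retired product</item> -->
  <item
     id = "a1" >Caf&#xE9;&#160;au lait</item>
</catalog>"""


def items_by_regex(text):
    # Naive approach: look for literal <item>...</item> pairs.
    return re.findall(r"<item>(.*?)</item>", text)


def items_by_parser(text):
    # Parser approach: comments are skipped, whitespace inside tags is
    # handled, and character references are resolved to real characters.
    root = ET.fromstring(text.encode("utf-8"))
    return [item.text for item in root.iter("item")]


if __name__ == "__main__":
    print(items_by_regex(DOC))
    print(items_by_parser(DOC))
```

The regular expression reports the item that was commented out and misses the real one, because the real start-tag contains a line break and an attribute. The parser returns only the real item, with both character references already decoded. A more elaborate pattern could patch each of these cases one at a time, but every patch is another place for the next unexpected document to break it.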