3.4 Error Handling in a Scanner

As a rule, the scanner prints nothing itself and passes every error on to the parser. It does this by returning a special error token called ERROR. This is not the same as the lowercase token error, which is used by the parser and which you should ignore here. Lexical errors must be reported and recovered from as follows:

  • When an invalid character (one that cannot begin any token) is encountered, a string containing just that character is returned as the error string, and scanning resumes at the following character.
  • If a string contains an unescaped newline, the error is reported as “Unterminated string constant” and scanning resumes at the beginning of the next line – we assume that ...
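
The two recovery rules above can be sketched as a small hand-written scanner. This is only an illustration under stated assumptions, not the book's implementation: the token kinds ID, INT and STRING, the Token tuple, and the functions scan and scan_string are hypothetical names, escape sequences are kept as the raw character after the backslash, and the message used when the input ends inside a string is an assumption, since the text above does not cover that case.

```python
from typing import Iterator, NamedTuple, Tuple


class Token(NamedTuple):
    kind: str  # "ERROR" comes from the text; "ID", "INT", "STRING" are assumed names
    text: str  # the lexeme, or the error string when kind == "ERROR"
    line: int  # line on which the token (or error) was found


def scan_string(src: str, start: int, line: int) -> Tuple[int, int, Token]:
    """Scan a string constant whose opening quote is at src[start].

    Returns (index to resume at, line number to resume on, token).
    """
    j, n = start + 1, len(src)
    chars = []
    while j < n:
        c = src[j]
        if c == '"':
            return j + 1, line, Token("STRING", "".join(chars), line)
        if c == "\n":
            # Unescaped newline: report the error and resume scanning
            # at the beginning of the next line.
            return j + 1, line + 1, Token("ERROR", "Unterminated string constant", line)
        if c == "\\" and j + 1 < n:
            # Escaped character (including an escaped newline): keep it raw.
            if src[j + 1] == "\n":
                line += 1
            chars.append(src[j + 1])
            j += 2
        else:
            chars.append(c)
            j += 1
    # Assumption: the excerpt does not say what to report at end of input.
    return n, line, Token("ERROR", "EOF in string constant", line)


def scan(src: str) -> Iterator[Token]:
    """Yield tokens; errors are returned as ERROR tokens, never printed."""
    i, line, n = 0, 1, len(src)
    while i < n:
        c = src[i]
        if c == "\n":
            line += 1
            i += 1
        elif c in " \t\r":
            i += 1
        elif c.isalpha() or c == "_":
            j = i
            while j < n and (src[j].isalnum() or src[j] == "_"):
                j += 1
            yield Token("ID", src[i:j], line)
            i = j
        elif c.isdigit():
            j = i
            while j < n and src[j].isdigit():
                j += 1
            yield Token("INT", src[i:j], line)
            i = j
        elif c == '"':
            i, line, tok = scan_string(src, i, line)
            yield tok
        else:
            # Invalid character: the error string is just that character,
            # and scanning resumes at the following character.
            yield Token("ERROR", c, line)
            i += 1
```

For the input `x # y`, a newline, `"abc`, a newline, then `z "ok"`, this sketch yields ID, ERROR("#"), ID, ERROR("Unterminated string constant"), ID, STRING. The parser sees each error as an ordinary token in the stream, and the scanner itself prints nothing.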
