Recipe 5-3: Normalizing Unicode
This recipe demonstrates how to specify a Unicode code page mapping for use in decoding transaction data.
Ingredients
- OWASP AppSensor
- ModSecurity
  - SecUnicodeMapFile directive
  - SecUnicodeCodePage directive
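Before looking at the evasion technique itself, here is a minimal configuration sketch showing how these pieces fit together; the mapping-file path, code page number, rule ID, and regex are placeholder values to adapt to your own deployment, not settings taken from this recipe. SecUnicodeMapFile loads the best-fit mapping table that ships with ModSecurity, SecUnicodeCodePage selects the target code page, and the urlDecodeUni transformation then decodes %uXXXX sequences and applies that mapping before a rule's pattern is evaluated:

# Load the Unicode best-fit mapping table that ships with ModSecurity
# (the path is an assumption; adjust it to your installation)
SecUnicodeMapFile /etc/modsecurity/unicode.mapping

# Map decoded characters to code page 20127 (US-ASCII)
SecUnicodeCodePage 20127

# Illustrative rule: t:urlDecodeUni decodes %uXXXX sequences and, with
# the mapping loaded above, best-fit maps look-alike characters to
# their ASCII equivalents before the regex runs
SecRule ARGS "@rx <script" \
    "id:953100,phase:2,t:none,t:urlDecodeUni,t:lowercase,block,\
    msg:'XSS attempt using Unicode best-fit evasion'"

With this in place, the full-width and look-alike code points shown later in this recipe collapse to their ASCII equivalents during inspection, so a signature matches the normalized payload rather than the evasive original.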
Best-Fit Mapping
How should an application handle input that is Unicode-encoded using characters outside of the expected set (such as non-ASCII)? This brings up the issue of best-fit mapping, in which an application internally maps such characters to code points that look visually similar. Why is this a security concern? Let’s look at how it can be leveraged as part of a filter evasion technique. The following Unicode-encoded XSS payload uses various code points, including full-width characters:
%u3008scr%u0131pt%u3009%u212fval(%uFF07al%u212Frt(%22XSS%22)%u02C8)
%u2329/scr%u0131pt%u232A
This payload should be correctly Unicode-decoded to this:
〈scrıpt〉ℯval(＇alℯrt("XSS")ˈ)〈/scrıpt〉
This is simply text; a web browser would not treat it as executable code. If the target web application is running classic Microsoft ASP, however, the platform attempts best-fit mapping of these Unicode characters. Here is a short example of some of the mappings ASP makes for the left angle bracket and single quote (tick mark) characters:
〈(0x2329) ~= <(0x3c)
〈(0x3008) ~= <(0x3c)
＜(0xff1c) ~= <(0x3c)
ʹ(0x2b9) ~= '(0x27)
ʼ(0x2bc) ~= '(0x27)
ˈ(0x2c8) ~= '(0x27)
′(0x2032) ~= '(0x27)
＇(0xff07) ~= '(0x27)
With this mapping, ...