UNDERTOW-818

Shift-JIS encoding data is garbled


Details

    Steps to Reproduce

      1. Build and deploy the attached jboss-servlet-charset.war
      2. Access http://localhost:8080/jboss-servlet-charset/
      3. Click "submit" and check the result on the next page


    Description

      Shift_JIS-encoded POST form data is garbled because Undertow uses java.net.URLDecoder to decode POST data, and java.net.URLDecoder does not always decode Shift_JIS correctly. Query parameters have the same problem even when url-charset="Shift_JIS" is set, because Undertow's internal decoding logic for query parameters has a similar flaw.

      For example, take "テスト" (\u30C6\u30B9\u30C8), which means "test" in Japanese:

      • java.net.URLEncoder encodes it as %83%65%83%58%83%67
      • Browsers (IE, Firefox, Chrome) encode it as %83e%83X%83g

      java.net.URLDecoder correctly decodes a parameter encoded by java.net.URLEncoder (%83%65%83%58%83%67), but it cannot decode the same parameter as encoded by a browser (%83e%83X%83g).
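The behaviour described above can be checked with a small standalone program. This is only a sketch: the class name ShiftJisUrlDecoderDemo and its helper methods are hypothetical and not part of Undertow or the attached WAR, and it assumes the JRE ships the Shift_JIS charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; not part of Undertow or the attached WAR.
final class ShiftJisUrlDecoderDemo {
    // "test" in katakana, written with escapes so the source file encoding does not matter.
    static final String TEXT = "\u30C6\u30B9\u30C8";

    // The two wire forms quoted in this report.
    static final String JDK_FORM = "%83%65%83%58%83%67";
    static final String BROWSER_FORM = "%83e%83X%83g";

    // Percent-encode with the JDK encoder, using Shift_JIS for the byte conversion.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "Shift_JIS");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("Shift_JIS is not available in this JRE", e);
        }
    }

    // Percent-decode with the JDK decoder, using Shift_JIS for the byte conversion.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "Shift_JIS");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("Shift_JIS is not available in this JRE", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("URLEncoder output:        " + encode(TEXT));
        System.out.println("JDK form round-trips:     " + TEXT.equals(decode(JDK_FORM)));
        System.out.println("Browser form round-trips: " + TEXT.equals(decode(BROWSER_FORM)));
    }
}
```

If the report is accurate for the JDK in use, the JDK form should round-trip and the browser form should not.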

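      RFC 3986 does not require percent-encoding of unreserved characters (letters, digits, hyphen, period, underscore, and tilde), so when the second or a later byte of a multi-byte character falls in that range, a browser may send it as a literal character. Here the trail bytes 0x65, 0x58, and 0x67 are the ASCII codes for e, X, and g. However, java.net.URLDecoder (and Undertow's internal decoder) does not fully support this form of percent-encoding.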
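One way to handle the mixed form is to rebuild the whole byte sequence first, treating literal ASCII characters and %XX escapes alike as single bytes, and to apply the charset only once at the end. The following is a minimal sketch under that assumption, not Undertow's actual fix: the class ByteWiseUrlDecoder is hypothetical, and it treats '+' as a space on the assumption that the input is application/x-www-form-urlencoded.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Hypothetical helper; a sketch of byte-wise decoding, not Undertow's implementation.
final class ByteWiseUrlDecoder {

    // Returns the value of an ASCII hex digit, or -1 if c is not one.
    private static int hex(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    static String decode(String s, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%') {
                // An escape needs two more characters after the '%'.
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("Incomplete escape at index " + i);
                }
                int hi = hex(s.charAt(i + 1));
                int lo = hex(s.charAt(i + 2));
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid escape at index " + i);
                }
                bytes.write((hi << 4) | lo);
                i += 2;
            } else if (c == '+') {
                // Assumption: form encoding, where '+' stands for a space.
                bytes.write(' ');
            } else if (c < 0x80) {
                // A literal ASCII character is kept as a raw byte, so it can
                // serve as the trail byte of a multi-byte character.
                bytes.write(c);
            } else {
                throw new IllegalArgumentException("Non-ASCII character at index " + i);
            }
        }
        // Decode the complete byte sequence in one pass.
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) {
        Charset sjis = Charset.forName("Shift_JIS");
        String expected = "\u30C6\u30B9\u30C8";
        System.out.println(expected.equals(decode("%83e%83X%83g", sjis)));
        System.out.println(expected.equals(decode("%83%65%83%58%83%67", sjis)));
    }
}
```

Deferring the charset conversion until all bytes are collected means a lead byte such as 0x83 is never decoded without its trail byte, which is where the run-by-run approach appears to go wrong.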


            People

              sdouglas1@redhat.com Stuart Douglas
              rhn-support-mmiura Masafumi Miura
