Feature Request
Resolution: Unresolved
Major
None
2.1.0.GA, 3.0.10.Final
None
None
In theory (and by specification, e.g. RFCs 2045 and 2183), all HTTP request headers are US-ASCII only.
In practice, browsers commonly use 8-bit encodings in certain places, e.g. UTF-8 sequences in the "filename" parameter of Content-Disposition headers. This happens when a file is uploaded through a multipart/form-data form and the local filename on the client contains national characters. Chrome and Firefox both do this.
This non-ASCII information is lost in the normal process of decoding headers. However, the Apache mime4j library allows access to the raw undecoded header as a byte array through the org.apache.james.mime4j.parser.Field.getRaw() method.
Sadly, this information is inaccessible from within JAX-RS applications that use the org.jboss.resteasy.plugins.providers.multipart.MultipartFormDataInput interface, because InputPart only exposes the decoded String values (org.jboss.resteasy.plugins.providers.multipart.MultipartInputImpl, line 129, call to org.apache.james.mime4j.parser.Field.getBody()). In these decoded String values, the original UTF-8 sequences are replaced with Unicode replacement characters U+FFFD (by org.apache.james.mime4j.util.ContentUtil.decode(ByteSequence)).
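To illustrate the loss, here is a minimal sketch that uses only JDK classes. It does not call mime4j or RESTEasy; the class name RawHeaderDemo, its method names, and the sample filename are made up for this example. It assumes the decoded value behaves like a strict US-ASCII decode of the header bytes, as described above, and shows that the filename is still recoverable if the raw bytes are available.

```java
import java.nio.charset.StandardCharsets;

public class RawHeaderDemo {

    // Hypothetical helper: decode header bytes as US-ASCII.
    // Every byte >= 0x80 is malformed in US-ASCII and becomes U+FFFD.
    static String decodeAsAscii(byte[] raw) {
        return new String(raw, StandardCharsets.US_ASCII);
    }

    // Hypothetical helper: decode the same bytes as UTF-8,
    // which is what Chrome and Firefox are reported to send.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: naive extraction of the quoted filename parameter.
    // Returns null if no quoted filename parameter is present.
    static String extractFilename(String header) {
        String marker = "filename=\"";
        int start = header.indexOf(marker);
        if (start < 0) {
            return null;
        }
        start += marker.length();
        int end = header.indexOf('"', start);
        return end < 0 ? null : header.substring(start, end);
    }

    public static void main(String[] args) {
        // Sample header as a browser might send it; \u00e9 is "e" with acute accent.
        String sent = "Content-Disposition: form-data; name=\"file\"; "
                + "filename=\"r\u00e9sum\u00e9.txt\"";
        byte[] raw = sent.getBytes(StandardCharsets.UTF_8);

        // Lossy: each UTF-8 byte of the accented letters turns into U+FFFD.
        System.out.println(extractFilename(decodeAsAscii(raw)));
        // Lossless: the raw bytes still carry the original filename.
        System.out.println(extractFilename(decodeAsUtf8(raw)));
    }
}
```

With access to the raw header bytes, an application could apply the second decoding itself, which is what this request asks RESTEasy to make possible.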
Please provide access to the raw headers.
- is related to: RESTEASY-2148 Add the ability to disable Filename encoding in Content-Disposition (Closed)
- relates to: RESTEASY-1214 WildFly breaks utf-8 encoded strings in Content-Disposition header (Open)