2
votes

Currently I have final URL url = new URL(urlString); but the server I am talking to does not support non-ASCII characters in the path.

Using Java (on Android), I need to encode the URL from

http://acmeserver.com/download/agc/fcms/儿子去哪儿/儿子去哪儿.png

to

http://acmeserver.com/download/agc/fcms/%E5%84%BF%E5%AD%90%E5%8E%BB%E5%93%AA%E5%84%BF/%E5%84%BF%E5%AD%90%E5%8E%BB%E5%93%AA%E5%84%BF.png

just like browsers do.

I tried URLEncoder.encode(s, "UTF-8"); but it also encodes the colon and the slashes:

http%3A%2F%2Facmeserver.com%2Fdownload%2Fagc%2Ffcms%2F%E5%84%BF%E5%AD%90%E5%8E%BB%E5%93%AA%E5%84%BF%2F%E5%84%BF%E5%AD%90%E5%8E%BB%E5%93%AA%E5%84%BF.png

Is there a simple way to do this without manually parsing the string that the method receives?

from http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars

B.2.1 Non-ASCII characters in URI attribute values

Although URIs do not contain non-ASCII values (see [URI], section 2.1) authors sometimes specify them in attribute values expecting URIs (i.e., defined with %URI; in the DTD). For instance, the following href value is illegal:

<A href="http://foo.org/Håkon">...</A>

We recommend that user agents adopt the following convention for handling non-ASCII characters in such cases:

  1. Represent each character in UTF-8 (see [RFC2279]) as one or more bytes.
  2. Escape these bytes with the URI escaping mechanism (i.e., by converting each byte to %HH, where HH is the hexadecimal notation of the byte value).
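The two steps above can be sketched directly in Java. This is only an illustration of the convention, not code from the W3C note: the class name NonAsciiEscaper and the method escapeNonAscii are made up for this example, and the sketch assumes the rest of the URL is already legal ASCII (it deliberately leaves spaces and other ASCII characters untouched). StandardCharsets needs Android API level 19 or later.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: names are illustrative, not part of any library.
public class NonAsciiEscaper {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /** Replaces each non-ASCII character with %HH escapes of its UTF-8 bytes. */
    public static String escapeNonAscii(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int codePoint = s.codePointAt(i);
            int charCount = Character.charCount(codePoint);
            if (codePoint < 0x80) {
                // ASCII (including ':' and '/') is copied through unchanged.
                out.append((char) codePoint);
            } else {
                // Step 1: represent the character as UTF-8 bytes.
                byte[] utf8 = s.substring(i, i + charCount).getBytes(StandardCharsets.UTF_8);
                // Step 2: escape each byte as %HH.
                for (byte b : utf8) {
                    out.append('%').append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
                }
            }
            i += charCount;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00E5 is the letter a-with-ring from the href example above.
        System.out.println(escapeNonAscii("http://foo.org/H\u00E5kon"));
        // prints http://foo.org/H%C3%A5kon
    }
}
```

Because only characters outside ASCII are touched, the scheme, host and slashes survive, which is the difference from URLEncoder.encode.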

3 Answers

6
votes

You should encode only the special characters and then piece the parts together. If you try to encode the entire URI, you will run into problems.

Stick with:

String query = URLEncoder.encode("apples oranges", "utf-8");
String url = "http://stackoverflow.com/search?q=" + query;

Check out this great guide on URL encoding.

That being said, a little bit of searching suggests that there may be other ways to do what you want:

Give this a try:

String urlStr = "http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4";
URL url = new URL(urlStr);
URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
url = uri.toURL();

(You will need to have those spaces encoded so you can use it for a request.)

This takes advantage of a couple of features available to you in the Android classes. First, the URL class can break a URL into its proper components, so there is no need for you to do any string search/replace work. Second, the URI class properly escapes each component when you construct a URI from components rather than from a single string.

The beauty of this approach is that you can take any valid url string and have it work without needing any special knowledge of it yourself.
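For the non-ASCII path in the question, one detail matters: the multi-argument URI constructor escapes illegal characters such as spaces, but it leaves non-ASCII letters as they are, and uri.toURL() keeps them too. Calling toASCIIString() on the URI is what turns them into %HH escapes. A minimal sketch, assuming the input is not already percent-encoded (the class and method names are made up for this example):

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper: names are illustrative, not part of any library.
public class UrlAsciiEncoder {
    /** Splits the URL into components and returns an ASCII-only, escaped form. */
    public static String toAsciiUrl(String urlString) {
        try {
            // URL does the parsing, so no manual string handling is needed.
            URL url = new URL(urlString);
            // The component constructor quotes illegal characters (e.g. spaces).
            URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                    url.getPort(), url.getPath(), url.getQuery(), url.getRef());
            // toASCIIString() additionally escapes non-ASCII characters as UTF-8 bytes.
            return uri.toASCIIString();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot encode URL: " + urlString, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toAsciiUrl(
                "http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4"));
    }
}
```

The result can be passed to new URL(...) before opening a connection. One caveat: these constructors always quote the percent character, so a string that already contains %20 or similar would come out double-encoded.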

3
votes
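final URL url = new URL(new URI(urlString).toASCIIString());

This worked for me. Note that new URI(urlString) throws URISyntaxException if the string contains illegal characters such as spaces.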

2
votes

I did it as shown below, which is cumbersome:

        //was: final URL url = new URL(urlString);
        String asciiString;
        try {
            asciiString = new URL(urlString).toURI().toASCIIString();
        } catch (URISyntaxException e1) {
            Log.e(TAG, "Error new URL(urlString).toURI().toASCIIString() " + urlString + " : " + e1);
            return null;
        }
        Log.v(TAG, urlString+" -> "+ asciiString );
        final URL url = new URL(asciiString);

url is later used in

        connection = (HttpURLConnection) url.openConnection();