Practical URL Encoding

A quick and dirty rundown of URL Encoding.

Jump to the URL Encoding/Decoding Tool if you know what to do.

What is URL Encoding

URL Encoding, or Percent Encoding, is a way of encoding data so it can be safely used in a URL (in everyday terms, a web address such as twitter.com/intent/tweet?text=content).

This technique allows a link to carry more information than the end address alone. In this post, I will be using a Twitter Intent as the running example. This is a URL that gives us a way to pre-populate a tweet for a user.
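As an aside on the mechanics, the "percent" in Percent Encoding can be made concrete with a small sketch: each byte of a character's UTF-8 form is written as a percent sign followed by two hexadecimal digits. The helper name percentEncodeAll is made up for illustration, and unlike the built-in encodeURIComponent it encodes every character, including ones that are already safe.

```javascript
// Hypothetical helper for illustration only; not part of any library.
// Percent-encodes every UTF-8 byte of the input, even "safe" letters.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the string
  return Array.from(bytes)
    .map((b) => '%' + b.toString(16).toUpperCase().padStart(2, '0'))
    .join('');
}

console.log(percentEncodeAll(':')); // %3A
console.log(percentEncodeAll('/')); // %2F
console.log(percentEncodeAll(' ')); // %20
```

In practice encodeURIComponent is the function to reach for, since it leaves letters, digits, and a few safe symbols untouched.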

Structure

All links of this nature follow a pattern:

https://server.example.com?parameter1=500&parameter2=mittens

Break this down into two main chunks.

Address

The address is simple and never changes: it is merely the destination page for the user. In this Twitter example, the address is:

https://twitter.com/intent/tweet

Parameters

A parameter, or argument, is a chunk of data. The question mark (?) character separates the address from the parameters. Parameters follow in a name=value style, and any number of them can be chained with the ampersand (&) character. For standards-compliant URL parameters with distinct names, the order of the chunks is irrelevant.
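The split between address and parameters can be sketched with the URL class built into browsers and Node.js. This is only an illustration using the placeholder link from the Structure section; the variable names are my own.

```javascript
// Placeholder link from the Structure section.
const link = new URL('https://server.example.com?parameter1=500&parameter2=mittens');

// Address: everything before the question mark.
// The URL class normalizes an empty path to "/".
const address = link.origin + link.pathname;

// Parameters: name=value pairs chained with ampersands.
const parameters = Object.fromEntries(link.searchParams);

console.log(address);               // https://server.example.com/
console.log(parameters.parameter1); // 500
console.log(parameters.parameter2); // mittens

// Swapping the order of the chunks yields the same names and values.
const swapped = new URL('https://server.example.com?parameter2=mittens&parameter1=500');
console.log(swapped.searchParams.get('parameter1')); // 500
```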

Example

Let's build up the Twitter link. For simplicity, we will only set the text and url of the tweet.

Start with the simple URL.

https://twitter.com/intent/tweet

Add the question mark (?) character to signify parameters.

https://twitter.com/intent/tweet?

Attach the argument named 'text' with the value of Mittens.

https://twitter.com/intent/tweet?text=Mittens

Append the ampersand (&) character to chain another parameter.

https://twitter.com/intent/tweet?text=Mittens&

Now attach the argument named 'url' with the value of https://codex10.com

https://twitter.com/intent/tweet?text=Mittens&url=https%3A%2F%2Fcodex10.com
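The same steps can be sketched in JavaScript. This is a minimal illustration, not Twitter's own tooling; the helper name buildIntentLink is hypothetical, and it only handles the text and url parameters from this walkthrough.

```javascript
// Hypothetical helper for illustration: builds the tweet link step by step.
function buildIntentLink(text, url) {
  const address = 'https://twitter.com/intent/tweet';
  // Encode each value so reserved characters are not mistaken for delimiters.
  const parameters = [
    'text=' + encodeURIComponent(text),
    'url=' + encodeURIComponent(url),
  ];
  // "?" separates the address from the parameters; "&" chains them.
  return address + '?' + parameters.join('&');
}

console.log(buildIntentLink('Mittens', 'https://codex10.com'));
// https://twitter.com/intent/tweet?text=Mittens&url=https%3A%2F%2Fcodex10.com
```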

Note that the percent sequences are the encoded characters: %3A stands for the colon (:) and %2F for the forward slash (/). Encoding replaces characters that carry special meaning in a URL with safe ones. The page hosted by Twitter is aware of this and will decode the url value for appropriate use. If not encoded, the resulting URL would look like this:

https://twitter.com/intent/tweet?text=Mittens&url=https://codex10.com

Perhaps most unsettling, that link will work in most modern browsers. This stems from very forgiving parsers, some of which will even handle spaces. While this is convenient, it also fosters a false sense of security: a value that contains an ampersand (&) or an equals sign (=) can still be misread as extra parameters. To ensure proper operation, I recommend encoding the values in the URL for the foreseeable future.
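To see why forgiving parsers are not enough, here is a small sketch with a made-up text value, 'Mittens & Socks'. It uses the standard URL class to read each link back the way a receiving page might; how Twitter's page actually parses its parameters is an assumption on my part.

```javascript
// Made-up value for illustration; it contains spaces and an ampersand.
const tweetText = 'Mittens & Socks';

// Unencoded: the ampersand is read as the start of a new parameter.
const rawLink = new URL('https://twitter.com/intent/tweet?text=' + tweetText);
console.log(rawLink.searchParams.get('text')); // "Mittens " (value cut short)

// Encoded: the whole value survives the trip.
const safeLink = new URL(
  'https://twitter.com/intent/tweet?text=' + encodeURIComponent(tweetText)
);
console.log(safeLink.searchParams.get('text')); // "Mittens & Socks"
```

The unencoded link still loads, which is exactly the problem: nothing fails loudly, the tweet text is just quietly truncated.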

Tool

Codex10 has put up a simple URL Encoding/Decoding Tool for working with this text. An alternative tool can be found here.

For the JavaScript savvy, all browsers support the functions encodeURIComponent and decodeURIComponent, which accept a string and return its encoded or decoded form. Just open up the JavaScript console in your favorite browser and run:

encodeURIComponent('Mittens Mittens');   // returns 'Mittens%20Mittens'
decodeURIComponent('Mittens%20Mittens'); // returns 'Mittens Mittens'

Happy Linking!