URL Encoder / Decoder

About URL Encoder / Decoder

What Is the URL Encode and Decode Tool?

Use the online tool above to encode or decode a text string. URIs must be encoded uniformly for worldwide interoperability. A two-step process maps the wide range of characters used around the world onto the 60 or so characters allowed in a URI:

  • Convert the character string to a byte sequence using the UTF-8 encoding.
  • Convert every byte that isn't an ASCII letter or digit to %HH, where HH is the byte's hexadecimal value.

For example, the string François would be encoded as: Fran%C3%A7ois

(The "ç" is encoded in UTF-8 as the two bytes C3 and A7 (hex), each of which is then written as three characters, "%c3" and "%a7" respectively.) This can result in a rather long URI (up to 9 ASCII characters for a single character in the Basic Multilingual Plane, and 12 for characters beyond it, which need four UTF-8 bytes), but the intention is that browsers only need to display the decoded form, and many protocols can send UTF-8 without the %HH escaping.
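The two steps above can be sketched in a few lines of Python. The helper name `percent_encode` is hypothetical and only illustrates the rule as stated here (every byte other than an ASCII letter or digit is escaped); the standard library's `urllib.parse.quote` is shown alongside it for comparison and, unlike this sketch, also leaves a few punctuation characters unescaped.

```python
from urllib.parse import quote


def percent_encode(text: str) -> str:
    """Hypothetical helper: UTF-8 bytes first, then %HH for each non-alphanumeric byte."""
    out = []
    for byte in text.encode("utf-8"):            # step 1: string -> UTF-8 byte sequence
        if byte < 128 and chr(byte).isalnum():   # ASCII letter or digit: keep as-is
            out.append(chr(byte))
        else:                                    # step 2: any other byte -> %HH
            out.append(f"%{byte:02X}")
    return "".join(out)


print(percent_encode("François"))   # Fran%C3%A7ois
print(quote("François"))            # Fran%C3%A7ois (standard library)
print(percent_encode("€"))          # %E2%82%AC: three bytes, nine ASCII characters
```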

What exactly is URL encoding?

URL encoding is the process of replacing specific characters in a URL with one or more character triplets, each consisting of the percent character "%" followed by two hexadecimal digits. The two hexadecimal digits of the triplet represent the numeric value of the replaced character.
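Reading a triplet back is the reverse operation: the two hexadecimal digits give a byte value, and the resulting bytes are decoded as text. The following is a minimal sketch, assuming well-formed input and UTF-8 as the character encoding; `percent_decode` is a hypothetical name, and `urllib.parse.unquote` is the standard-library equivalent.

```python
from urllib.parse import unquote


def percent_decode(text: str) -> str:
    """Hypothetical helper: replace each %HH triplet with the byte HH, then decode as UTF-8."""
    raw = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "%" and i + 2 < len(text):
            # The two hex digits after "%" are the numeric value of one byte.
            raw.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            raw.extend(text[i].encode("utf-8"))
            i += 1
    return raw.decode("utf-8")


print(int("C3", 16))                      # 195
print(percent_decode("Fran%C3%A7ois"))    # François
print(unquote("Fran%C3%A7ois"))           # François (standard library)
```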

The term URL encoding is a bit misleading because the encoding procedure can be applied to any URI (Uniform Resource Identifier), including URNs (Uniform Resource Names). As a result, the term percent-encoding should be used instead.

Which Characters Can Be Used in a URL?

The characters in a URI are either reserved or unreserved (or a percent character that is part of a percent-encoding). Reserved characters are those that sometimes have a special meaning, whereas unreserved characters never do. Percent-encoding represents characters that would otherwise not be allowed by using allowed characters. The sets of reserved and unreserved characters, and the circumstances under which certain reserved characters have special meaning, have changed slightly with each revision of the specifications that govern URIs and URI schemes.

RFC 3986 requires that the characters in a URL be drawn from a predefined set of unreserved and reserved ASCII characters. Other characters are not permitted in a URL.

The unreserved characters can, but should not, be encoded. The unreserved characters are as follows:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 - _ . ~

Reserved characters must be encoded only under certain conditions. The reserved characters are as follows:

! * ' ( ) ; : @ & = + $ , / ? # [ ]
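The difference between the two sets can be checked with Python's `urllib.parse.quote`. This is a sketch, assuming Python 3.7 or later (where `~` is treated as unreserved); the `RESERVED` string below follows the RFC 3986 reserved set, and passing `safe=""` asks `quote` to escape every character that is not unreserved.

```python
import string
from urllib.parse import quote

UNRESERVED = string.ascii_letters + string.digits + "-_.~"
RESERVED = "!*'();:@&=+$,/?#[]"

# Unreserved characters pass through unchanged.
print(quote(UNRESERVED, safe="") == UNRESERVED)   # True

# With safe="", every reserved character becomes a %HH triplet.
print(quote("a/b?c=d", safe=""))                  # a%2Fb%3Fc%3Dd

# By default quote() keeps "/" so that path separators survive.
print(quote("/path/to file"))                     # /path/to%20file
```

Whether a reserved character should be escaped depends on where it appears: a "/" that separates path segments stays literal, while a "/" inside a query value is usually escaped.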

Encoding/Decoding a Piece of Text

RFC 3986 does not specify which character encoding table should be used to encode non-ASCII characters (such as the umlauts ä, ö, ü). Because URL encoding involves a pair of hexadecimal digits, and a pair of hexadecimal digits is equivalent to 8 bits, one of the 8-bit code pages could theoretically be used for non-ASCII characters (e.g. ISO-8859-1 for umlauts).

On the other hand, because many languages have their own 8-bit code page, managing all of these different code pages would be tedious. Some character sets are also too large to fit into an 8-bit code page (e.g. Chinese). As a result, UTF-8 (defined in RFC 3629) is the recommended character encoding for non-ASCII characters. This tool takes that into account by letting you select between the ASCII and the UTF-8 character encoding tables. If the ASCII character encoding table is selected, a warning message appears when the text to be URL encoded or decoded contains non-ASCII characters.
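The effect of the chosen character encoding can be seen with the `encoding` parameter of `urllib.parse.quote` and `unquote`. This is an illustrative sketch only; the final ASCII check merely imitates the kind of warning described above and is not the tool's actual implementation.

```python
from urllib.parse import quote, unquote

# The same umlaut yields different triplets under different encodings.
print(quote("ä", encoding="iso-8859-1"))     # %E4      (one byte)
print(quote("ä", encoding="utf-8"))          # %C3%A4   (two bytes)

# Decoding must use the encoding that was used for encoding.
print(unquote("%E4", encoding="iso-8859-1")) # ä

# An 8-bit code page cannot represent every character.
try:
    quote("中", encoding="iso-8859-1")
except UnicodeEncodeError:
    print("not representable in ISO-8859-1")
print(quote("中", encoding="utf-8"))         # %E4%B8%AD

# A simple ASCII-only check, similar in spirit to the warning mentioned above.
text = "François"
if not text.isascii():
    print("warning: input contains non-ASCII characters")
```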

When and why should URL encoding be used?

When data entered into an HTML form is submitted, the form field names and values are encoded and sent to the server in an HTTP request message using the GET or POST method or, historically, via email. The default encoding is based on an early version of the general URI percent-encoding rules, with a few modifications such as newline normalization and replacing spaces with "+" rather than "%20". The MIME type of data encoded in this manner is application/x-www-form-urlencoded, which is currently defined (albeit in an extremely outdated manner) in the HTML and XForms specifications.

Furthermore, the CGI specification includes rules for how web servers decode this type of data and make it available to applications.

When application/x-www-form-urlencoded data is sent in an HTTP GET request, it is included in the query component of the request URI. When it is sent via HTTP POST or email, it is placed in the body of the message, and the name of the media type is included in the message's Content-Type header.
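Form encoding as described here can be sketched with `urllib.parse.urlencode` and `parse_qs` from the Python standard library. The field names, values, and the example.com URL are made up for illustration.

```python
from urllib.parse import urlencode, parse_qs, quote, quote_plus

# Hypothetical form fields.
fields = {"name": "François", "comment": "a b&c"}

# urlencode() produces application/x-www-form-urlencoded data:
# spaces become "+", and "&" inside a value is escaped as %26.
body = urlencode(fields)
print(body)                      # name=Fran%C3%A7ois&comment=a+b%26c

# GET: the encoded data goes into the query component of the URI.
print("https://example.com/search?" + body)

# POST: the same string is the message body, sent with the header
# Content-Type: application/x-www-form-urlencoded

# Decoding on the receiving side restores the original values.
print(parse_qs(body))            # {'name': ['François'], 'comment': ['a b&c']}

# Generic percent-encoding versus the form variant for a space.
print(quote("a b"))              # a%20b
print(quote_plus("a b"))         # a+b
```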


