Type or paste text in any of the green or grey shaded boxes and click on the Convert button above it. Alternative representations will appear in all the other boxes. You can then cut & paste the results
into your document. selects all the text in a box. See the notes at the bottom of the page for other options.
The text in the Mixed input field below now contains a variety of escapes. Normally you would click on the Convert button just above the field to show the various escape formats below. Note, however, that this (unusually convoluted) text represents one character using just a hex code point number. You should therefore click instead on the button labelled Hex code points, which converts that number as well as the other escapes in this particular example.
To continue with this example, click on the Hex code points button.
You should now see the conversion results. You can use checkboxes with many fields to tweak the results. For more information see the notes in the lower part of the page.
Convert numbers as
Convert \x
Characters
HTML/XML: Escape invisible characters; Convert bidi controls to HTML markup
Hex NCRs: Show ascii; Latin1
Decimal NCRs: Show ascii; Latin1
JavaScript: C-style; ES6; \n etc
Rust: \n etc
CSS
Percent encoding for URIs
U+hex: Show ascii; Latin1
0x... notation: Show ascii; Latin1
UTF-8 code units
UTF-16 code units
Hexadecimal: Show ascii; Latin1
Decimal: Show ascii; Latin1
Base64
Notes:
Updated Sun 26 Jun 2016 • tags converter, scriptnotes
See release notes for version 8. See github commit list.
Standard use. Most of the time you will probably want to drop the text to be converted into the Mixed input field, and hit the associated Convert button. This will convert all escapes to characters, then convert the result into each of the forms listed against the boxes below.
If your text contains bare numbers that you also want to convert, use one of the convert buttons to the right. (Be aware, however, that in this case something like 'ab' could be interpreted as a hex number.)
Note, also, that escapes of the form \x, where x is one of a-zA-Z0-9, are not recognised by default. If you check the box next to Convert \x, only the special JavaScript escapes are recognised (eg. \b, \n, \t, \", etc.). For full CSS behaviour here, use the CSS input field.
Special use. If you only want to convert a specific type of escape and leave all others untouched, paste the text into one of the other boxes and hit its associated Convert button.
Checkboxes. Several of the output fields have checkboxes that allow you to slightly alter the results of a conversion. If an output field already contains a result when you click on a checkbox, you'll often see a change happen as you click. In some cases, however, this doesn't happen, since it is not possible to produce good results.
Invoking via URL. You can also pass a string to the page using the q parameter in the URI. For example, http://r12a.github.io/apps/conversion/?q=Crêpes. You can also pass a string with escapes in it, but you will need to be careful to percent escape characters such as &, + and # which affect the URI syntax. For example, http://r12a.github.io/apps/conversion/?q=CrU%2B00EApes.
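As a rough illustration of the percent escaping described above, the following Python sketch builds a link for the q parameter. The function name build_converter_url is hypothetical, and the sketch assumes that escaping every reserved character is acceptable. Note that it also percent-encodes non-ASCII characters as UTF-8 bytes, which is stricter than the Crêpes example.

```python
from urllib.parse import quote

BASE_URL = "http://r12a.github.io/apps/conversion/"

def build_converter_url(text: str) -> str:
    """Return a link that passes `text` to the page via the q parameter.

    Hypothetical helper, not part of the converter itself.
    """
    # safe="" makes quote() escape every reserved character,
    # including &, + and #, which would otherwise alter the URI syntax.
    return f"{BASE_URL}?q={quote(text, safe='')}"

print(build_converter_url("CrU+00EApes"))
print(build_converter_url("a&b#c"))
```

The first call should reproduce the escaped example link given above.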
The following notes describe how the various boxes work, including what happens if you paste or type text into the named field and hit Convert, and the output in the named field if you hit Convert elsewhere.
Characters. If you start a conversion from here: Everything is treated as characters, eg. U+1234 is not treated as an escape for the purposes of conversion.
When conversion puts something here: Everything is displayed as characters.
You can view more detail for each character by clicking on View in UniView.
HTML/XML. If you start a conversion from here: Use HTML or XML markup. Numeric character references and HTML character entities, other than those for <, >, " and &, are converted to ordinary characters during conversion.
When conversion puts something here: Ordinary characters will appear by default, except that <, >, " and & are converted to character entities. This is useful for preparing examples of sample code for HTML or XML.
By default the control Escape invisible characters is checked. This causes certain invisible characters (such as RLM) or ambiguous characters (such as NO-BREAK SPACE) to be converted to escaped form. The set of affected characters will be extended over time.
If Convert bidi controls to HTML markup is selected, RLE, LRE, RLI, LRI, FSI, PDF and PDI are converted to HTML markup based on a span element.
Hint: if you want to get the result into source code form, once the initial conversion has been done just click Convert above this text area, and then look in the Characters text area.
Note that if your text contains RLO or LRO plus PDF, the PDF will incorrectly be converted to </span> at the moment. I may fix this (and thereby allow RLO/LRO conversion too) at a later date.
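The default behaviour of the HTML/XML field in both directions can be sketched in Python as follows. This is only an approximation under stated assumptions: the function names are hypothetical, the invisible-character and bidi options are ignored, and the standard html module stands in for whatever entity table the converter actually uses.

```python
import html
import re

# References for the four syntax characters are left untouched on input.
KEEP = {"&lt;", "&gt;", "&quot;", "&amp;"}
REFERENCE = re.compile(r"&(?:#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);")

def escape_markup(text: str) -> str:
    """Output direction: only <, >, " and & become character entities."""
    # Replace & first so the entities produced below are not re-escaped.
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace('"', "&quot;"))

def unescape_except_syntax(text: str) -> str:
    """Input direction: resolve NCRs and entities except those in KEEP."""
    def resolve(match):
        ref = match.group(0)
        return ref if ref in KEEP else html.unescape(ref)
    return REFERENCE.sub(resolve, text)

print(escape_markup('<a href="x">R&D</a>'))
print(unescape_except_syntax("Cr&#xEA;pes &amp; caf&eacute; &lt;b&gt;"))
```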
Hex NCRs. If you start a conversion from here: It can be a mix of text and escapes. Only hexadecimal NCRs are converted.
When conversion puts something here: By default, everything except ASCII characters is converted.
You can use the checkboxes to specify whether ANSI (Latin1) characters remain unchanged, or whether all characters are converted.
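A minimal Python sketch of the output side of this field is shown below. The keep_below parameter is an assumption used to model the checkboxes (0x80 leaves ASCII alone, 0x100 leaves Latin1 alone, 0 converts everything), and the uppercase hex format is likewise assumed rather than taken from the converter.

```python
def to_hex_ncrs(text: str, keep_below: int = 0x80) -> str:
    """Replace every character at or above `keep_below` with a hex NCR.

    Hypothetical helper: keep_below=0x80 keeps ASCII, 0x100 keeps Latin1,
    and 0 converts all characters.
    """
    return "".join(
        ch if ord(ch) < keep_below else f"&#x{ord(ch):X};"
        for ch in text
    )

sample = "Cr\u00eapes \u20ac"          # "Crêpes €"
print(to_hex_ncrs(sample))             # default: only non-ASCII converted
print(to_hex_ncrs(sample, 0x100))      # Latin1 left unchanged
print(to_hex_ncrs("ab", 0))            # everything converted
```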
Decimal NCRs. If you start a conversion from here: It can be a mix of text and escapes. Only decimal NCRs are converted.
When conversion puts something here: By default, everything except ASCII characters is converted.
You can use the checkboxes to specify whether ANSI (Latin1) characters remain unchanged, or whether all characters are converted.
JavaScript. If you start a conversion from here: It can be a mix of text and escapes. Only JavaScript escapes are converted. Accepts escapes as used in JavaScript (old style and ES6), Java and C.
When conversion puts something here: By default, everything except visible ASCII characters is converted to numeric escapes, and the following escapes are substituted for ASCII characters: \0, \b, \t, \v, \f, \\.
The default output to this field is specifically JavaScript compliant, though this is valid Java code too (a small number of Java-only named escapes such as \e are rendered as numeric escapes).
If C-style is checked, supplementary characters are rendered by a single number, eight digits long, rather than two adjacent surrogate code point numbers.
If ES6 is checked, supplementary characters are also rendered as a single number, but using the new format described by EcmaScript 6.
If \n etc is checked, line feeds, tabs, and quotation marks are also escaped.
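The difference between the three ways of writing a supplementary character can be seen in the following Python sketch. It is a simplification, not the converter's code: the name js_escape and its style parameter are hypothetical, and apart from the backslash it ignores the named escapes (\0, \b and so on) and the \n etc option.

```python
def js_escape(text: str, style: str = "js") -> str:
    """Escape non-ASCII text; style is 'js' (surrogates), 'c' or 'es6'."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")                 # backslash is doubled
        elif 0x20 <= cp < 0x7F:
            out.append(ch)                     # visible ASCII is kept
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")         # BMP: four hex digits
        elif style == "c":
            out.append(f"\\U{cp:08X}")         # one number, eight digits
        elif style == "es6":
            out.append(f"\\u{{{cp:X}}}")       # ES6 code point escape
        else:
            offset = cp - 0x10000              # split into a surrogate pair
            high = 0xD800 + (offset >> 10)
            low = 0xDC00 + (offset & 0x3FF)
            out.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(out)

face = "\U0001F600"  # a supplementary character
print(js_escape(face))          # \uD83D\uDE00
print(js_escape(face, "c"))     # \U0001F600
print(js_escape(face, "es6"))   # \u{1F600}
```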
Rust. If you start a conversion from here: It can be a mix of text and escapes. Only Rust escapes are converted.
When conversion puts something here: By default, everything except visible ASCII characters is converted to numeric escapes, and the following escapes are substituted for ASCII characters: \0, \b, \t, \v, \f, \\. Output for other characters in the ranges U+0001-U+001F and U+0080-U+009F (ie. invisible control characters) uses the \x.. escape format.
If \n etc is checked, line feeds, tabs, and quotation marks are also escaped.
CSS. If you start a conversion from here: It can be a mix of text and escapes.
When conversion puts something here: It does not escape non-control ASCII characters. Output content uses 6-digit escape forms followed by a space for supplementary characters, and 4-digit escapes followed by a space for all other escaped characters.
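The output rule described above for CSS escapes can be sketched in Python like this. The function name css_escape is hypothetical, and the uppercase hex digits are an assumption; only the digit counts and the trailing space come from the description.

```python
def css_escape(text: str) -> str:
    """Escape everything except non-control ASCII, CSS style."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x20 <= cp < 0x7F:
            out.append(ch)                  # non-control ASCII is kept
        elif cp > 0xFFFF:
            out.append(f"\\{cp:06X} ")      # supplementary: 6 digits + space
        else:
            out.append(f"\\{cp:04X} ")      # everything else: 4 digits + space
    return "".join(out)

print(css_escape("\u00e9\U0001F600x"))      # \00E9 \01F600 x
```

The trailing space matters because it stops a following character such as x from being read as part of the escape.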
Percent encoding for URIs. If you start a conversion from here: It can be a mix of text and escapes. Only percent escapes are converted.
When conversion puts something here: Characters allowed in URI syntax are not converted.
U+hex. If you start a conversion from here: It can be a mix of text and escapes. Only U+hex escapes are converted.
When conversion puts something here: By default, everything except ASCII characters is converted.
You can use the checkboxes to specify whether ANSI (Latin1) characters remain unchanged, or whether all characters are converted. Adjacent escapes (only) are separated by a space.
Note: These checkboxes only work during conversions; they don't change text already in the output field.
Hint: to separate a sequence of characters by spaces, paste the characters into the Mixed field or Characters field and click Convert. Then click Convert immediately in the Unicode U+hex notation field and look in the Characters field for the result.
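The rule that only adjacent escapes are separated by a space is easy to misread, so here is a small Python sketch of it. The name to_u_plus and the keep_below parameter (standing in for the checkboxes) are assumptions, as is the minimum of four hex digits.

```python
def to_u_plus(text: str, keep_below: int = 0x80) -> str:
    """Write characters at or above `keep_below` as U+hex escapes."""
    out = []
    previous_was_escape = False
    for ch in text:
        cp = ord(ch)
        if cp < keep_below:
            out.append(ch)                   # left as an ordinary character
            previous_was_escape = False
        else:
            if previous_was_escape:
                out.append(" ")              # space only between escapes
            out.append(f"U+{cp:04X}")
            previous_was_escape = True
    return "".join(out)

print(to_u_plus("a\u00e9\u20acb"))           # aU+00E9 U+20ACb
print(to_u_plus("ab", keep_below=0))         # U+0061 U+0062
```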
0x... notation. If you start a conversion from here: It can be a mix of text and hexadecimal 0x... escapes. Only 0x... escapes are converted.
When conversion puts something here: By default, everything except ASCII characters is converted. You can use the checkboxes to specify whether ANSI (Latin1) characters remain unchanged, or whether all characters are converted. Adjacent escapes (only) are separated by a space.
Note: These checkboxes only work during conversions; they don't change text already in the output field.
Hint: to separate a sequence of characters by spaces, paste the characters into the Mixed field or Characters field and click Convert. Then click Convert immediately in the 0x... notation field and look in the Characters field for the result.
Hexadecimal. If you start a conversion from here: It can be a mix of text and hex numbers. Only hex numbers are converted.
Note that a sequence of two or more characters in the range a-f, such as cafe, will be treated as a hexadecimal number representing a character.
When conversion puts something here: By default, you'll see hex numbers only, all separated by spaces. If you use the checkboxes to specify that ASCII or Latin1 (ANSI) characters remain unchanged, a space is inserted before a code point if the character just before it is in the range [A-Za-z0-9].
Note: These checkboxes only work during conversions; they don't change text already in the output field.
Note: After sending output to this box you will get a different result in the other boxes if you immediately click on Convert above this box.
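The ambiguity noted above, where ordinary text such as cafe is read as a hex number, can be reproduced with a short Python sketch. The pattern used here (whole words of two to six hex digits) is an assumption for illustration only; the converter's actual matching rules may differ.

```python
import re

# Assumed pattern: a standalone run of 2 to 6 hexadecimal digits.
HEX_RUN = re.compile(r"\b[0-9A-Fa-f]{2,6}\b")

def convert_bare_hex(text: str) -> str:
    """Replace bare hex numbers with the characters they identify."""
    def to_char(match):
        cp = int(match.group(0), 16)
        # Leave values beyond the Unicode range untouched.
        return chr(cp) if cp <= 0x10FFFF else match.group(0)
    return HEX_RUN.sub(to_char, text)

print(convert_bare_hex("41 42"))            # A B
print(convert_bare_hex("cafe") == "cafe")   # False: read as U+CAFE
```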
Decimal. If you start a conversion from here: It can be a mix of text and decimal numbers. Only decimal numbers are converted.
When conversion puts something here: By default, you'll see decimal numbers only, all separated by spaces.
If you use the checkboxes to specify that ASCII or Latin1 (ANSI) characters remain unchanged, a space is inserted before a code point if the character just before it is in the range [A-Za-z0-9].
Note: These checkboxes only work during conversions; they don't change text already in the output field.
Note: After sending output to this box you will get a different result in the other boxes if you immediately click on Convert above this box.
UTF-8 code units. If you start a conversion from here: It must be hexadecimal byte codes only, separated by spaces.
When conversion puts something here: You'll see 2-digit hexadecimal numbers, separated by spaces, representing the bytes that make up the text when encoded in UTF-8.
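For reference, the same byte listing can be produced and reversed in Python as sketched below. The function names are hypothetical, and uppercase hex digits are assumed.

```python
def utf8_code_units(text: str) -> str:
    """List the UTF-8 bytes of `text` as 2-digit hex numbers."""
    return " ".join(f"{byte:02X}" for byte in text.encode("utf-8"))

def from_utf8_code_units(hex_bytes: str) -> str:
    """Decode space-separated hex byte codes back into text."""
    return bytes(int(code, 16) for code in hex_bytes.split()).decode("utf-8")

print(utf8_code_units("\u00e9\u20ac"))           # C3 A9 E2 82 AC
print(from_utf8_code_units("C3 A9") == "\u00e9") # True
```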
UTF-16 code units. If you start a conversion from here: It must be hexadecimal code units only, separated by spaces.
When conversion puts something here: You'll see hexadecimal numbers of 1 to 4 digits representing the UTF-16 code units for the text converted. Supplementary characters are represented by two code units.
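A comparable Python sketch for UTF-16 is given below. The function name is hypothetical; the code units are printed without zero padding to match the 1 to 4 digit description, and uppercase hex is assumed.

```python
def utf16_code_units(text: str) -> str:
    """List the UTF-16 code units of `text` as unpadded hex numbers."""
    data = text.encode("utf-16-be")          # big-endian, no byte order mark
    units = [int.from_bytes(data[i:i + 2], "big")
             for i in range(0, len(data), 2)]
    return " ".join(f"{unit:X}" for unit in units)

print(utf16_code_units("A\U0001F600"))       # 41 D83D DE00
```

The supplementary character in the example yields two code units, the high and low surrogates.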