HTML Encoder & Decoder
Convert special characters to HTML entities and decode them back. Named entities, numeric codes, live preview, 80+ entity support.
Entity Quick-Insert
Click any entity to append it to the input field. Useful for inserting special characters you don't have on your keyboard.
How to Use This Tool
Everything this HTML encoder and decoder does — and how to get the most out of each feature.
<, >, &, ", and ' are instantly converted to safe HTML entities.
Named (&amp;), decimal (&#38;), and hex (&#x26;) codes are all resolved.
… (&copy;), Encode all non-ASCII for maximum compatibility, and toggle Encode quotes when working with HTML attributes.
What's Built In
A complete HTML entity tool — no extensions, no accounts, no data ever leaving your browser.
Decimal (&#169;) and hexadecimal (&#xA9;) numeric entities: any Unicode character, not just those with names.
Frequently Asked Questions
Common questions about HTML encoding, entities, and how this tool works.
HTML encoding converts characters that have special meaning in HTML into entity sequences that are displayed as text instead of parsed as markup. For example, < becomes &lt; so the browser shows a literal less-than sign rather than treating it as the start of an HTML tag. This matters in two scenarios: security (preventing XSS attacks by neutralising injected markup) and correctness (displaying code snippets, math expressions, or special symbols in a web page).
The five characters with special HTML meaning are: & → &amp;, < → &lt;, > → &gt;, " → &quot;, and ' → &#39;. Of these, &, <, and > are critical everywhere. Quotes need encoding primarily inside HTML attribute values. Failing to encode any of these when inserting user-supplied content is a classic XSS vulnerability.
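As a rough illustration of the five-character mapping, here is a minimal Python sketch built on the standard library's html.escape. The function name encode_html and its encode_quotes parameter are hypothetical and only mirror the idea of an "Encode quotes" toggle; this is not the tool's own code. Note that Python writes the apostrophe as the hex entity &#x27;, which is equivalent to &#39;.

```python
import html


def encode_html(text: str, encode_quotes: bool = True) -> str:
    """Encode &, <, > and (optionally) both quote characters.

    encode_html and encode_quotes are illustrative names, not the tool's API.
    html.escape replaces & first, so entities it produces are not re-encoded.
    """
    return html.escape(text, quote=encode_quotes)


sample = '<a href="x">Tom & Jerry\'s</a>'

# With quotes encoded: suitable for text placed inside attribute values.
print(encode_html(sample))

# Without quotes encoded: enough for text placed between tags.
print(encode_html(sample, encode_quotes=False))
```

Encoding quotes by default is the more cautious choice, since the same string can then be placed in either element content or a quoted attribute value.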
Named entities use a memorable shorthand: &copy; for ©, &amp; for &, &mdash; for —. Numeric entities use the Unicode code point in decimal (&#169;) or hex (&#xA9;) for the same ©. Both formats are valid HTML and render identically in all browsers. Named entities are easier to read in source code; numeric entities work for any Unicode character, even those without a named equivalent.
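The equivalence of the three forms can be checked with Python's standard html.unescape, which resolves named, decimal, and hex references alike. The helper numeric_forms below is a hypothetical name used only for this sketch; it derives both numeric entities from a character's code point.

```python
import html


def numeric_forms(ch: str) -> tuple[str, str]:
    """Return the (decimal, hex) numeric entities for a single character."""
    code_point = ord(ch)
    return f"&#{code_point};", f"&#x{code_point:X};"


# Named, decimal, and hex references all decode to the same character.
for entity in ("&copy;", "&#169;", "&#xA9;"):
    print(entity, "->", html.unescape(entity))

# Numeric forms exist for every character, named or not.
print(numeric_forms("©"))
print(numeric_forms("—"))
```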
Yes, when applied correctly. HTML encoding neutralises injected HTML by converting < and > to &lt; and &gt;, preventing a browser from parsing attacker-controlled strings as tags or script elements. However, encoding alone is not sufficient in all contexts: JavaScript event handlers require JavaScript escaping, CSS values need CSS escaping, and URL parameters need URL encoding. The right encoding must match the context where the data appears.
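To make the "match the context" point concrete, the sketch below encodes one untrusted string three different ways using only the Python standard library. It is an illustration under stated assumptions, not a complete XSS defence: the variable names are hypothetical, CSS escaping is not covered, and the extra replacement of < in the JavaScript case is one common precaution for inline scripts rather than a full solution.

```python
import html
import json
from urllib.parse import quote

untrusted = '"><script>alert(1)</script>'

# HTML element content or quoted attribute value: HTML entity encoding.
html_text = html.escape(untrusted)

# URL query parameter value: percent-encoding (safe="" also encodes "/").
url_param = quote(untrusted, safe="")

# JavaScript string literal: JSON string escaping, plus \u003c for "<"
# so a literal "</script>" cannot close an inline script element.
js_literal = json.dumps(untrusted).replace("<", "\\u003c")

print(html_text)
print(url_param)
print(js_literal)
```

Each output is only safe in its own context; for example, the percent-encoded form offers no protection if it is pasted into a script block.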
Use &nbsp; (non-breaking space) when you need two words to stay on the same line and never wrap, for example "10 km", "Dr. Smith", or a number with its unit. The non-breaking space prevents the browser from inserting a line break at that position. For general paragraph spacing, use CSS margin or padding; adding multiple &nbsp; characters for visual indentation is bad practice that affects accessibility and screen readers.
Yes. All encoding and decoding happens entirely inside your browser using JavaScript. Your HTML is never sent to any server, never stored, and never logged. You can safely paste confidential markup, internal template code, or HTML containing personal data without any risk of it being intercepted or stored externally.
Checking "Encode all non-ASCII" converts every character with a Unicode code point above 127 to a numeric entity (&#[number];). This produces a fully ASCII-safe output string — useful when the target system doesn't support UTF-8, when embedding HTML inside another encoding context, or when you need to ensure the output contains only the 128 basic ASCII characters. For normal HTML documents served with a UTF-8 charset, this option is unnecessary.
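A behaviour like this option can be approximated in Python with the built-in xmlcharrefreplace error handler, which replaces every character that cannot be represented in ASCII with a decimal numeric entity. The function name encode_all_non_ascii is hypothetical, and the sketch is not the tool's implementation.

```python
import html


def encode_all_non_ascii(text: str) -> str:
    """Encode HTML special characters, then every code point above 127.

    Order matters: html.escape runs first so the & in the numeric
    entities added afterwards is not encoded a second time.
    """
    escaped = html.escape(text)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


result = encode_all_non_ascii("Café © 2026 <b>")
print(result)
print(result.isascii())
```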
They serve different typographic purposes. The en dash (– &ndash;) is roughly the width of the letter N and is used for ranges (pages 10–20, 2020–2026) and to connect related items. The em dash (— &mdash;) is the width of the letter M and is used as a strong parenthetical break — like this — or to signal an interruption in speech. Neither should be confused with a hyphen (-), which is a keyboard character used for compound words and word breaks.
HTML encoding increases the text length whenever the input contains characters that get encoded, because each one is replaced by a multi-character sequence. The ampersand & (1 character) becomes &amp; (5 characters). The copyright symbol © (1 character) becomes &copy; (6 characters). This is expected: the longer encoded form is how the browser knows to display the character rather than parse it as markup. Text with nothing to encode comes out unchanged, and the rendered output looks identical to the original in a browser.
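The character counts above can be verified with a few lines of Python. The helper named_entity is a hypothetical name for this sketch; it looks up the entity name in the standard library's html.entities table, since html.escape itself only handles the five special characters.

```python
import html
from html.entities import codepoint2name


def named_entity(ch: str) -> str:
    """Return the named entity for ch; raises KeyError if it has no name."""
    return f"&{codepoint2name[ord(ch)]};"


for original, encoded in [("&", html.escape("&")), ("©", named_entity("©"))]:
    print(repr(original), len(original), "->", encoded, len(encoded))

# Input without special characters is returned unchanged.
plain = "no special characters here"
print(html.escape(plain) == plain)
```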
For symbols like ©, €, and em dash, you can write them directly in a UTF-8 HTML document and they display fine. The only characters you must always encode are <, >, and & — these must be entities regardless of the character encoding because the HTML parser treats them as markup delimiters. In attribute values, you also need to encode " or ' depending on which quote character delimits the attribute.