Base64 is one of those things developers encounter constantly (in JWTs, CSS data URIs, HTTP Basic Auth, email attachments, and API responses), yet few have sat down and worked through how it operates at the bit level. This guide covers the underlying algorithm, real-world usage patterns, and the mistakes that trip people up most often.

What Is Base64?

Base64 is a binary-to-text encoding scheme. It takes arbitrary binary data — bytes that might include any value from 0 to 255 — and represents that data using only 64 specific printable ASCII characters. Those 64 characters are: A–Z (26), a–z (26), 0–9 (10), plus + and /. That's where the "64" in the name comes from.

The reason this exists is historical. Many protocols and systems were designed for text: SMTP for email, HTTP headers, HTML attributes, and XML documents. These systems either could not handle raw binary bytes or assigned special meaning to certain byte values, so binary data could be corrupted or truncated in transit. Base64 sidesteps the problem by converting everything to characters those systems handle safely.

The specification

Base64 is formally defined in RFC 4648 (2006), which covers both the standard alphabet (+/) and the URL-safe alphabet (-_). The older RFC 2045 defined Base64 for MIME email specifically.

How the Algorithm Works

Understanding the actual bit-manipulation behind Base64 makes everything else — including the 33% size overhead and the padding characters — immediately obvious.

Step 1: Take 3 bytes of input

Base64 processes input in 3-byte (24-bit) groups. Each byte is 8 bits, so 3 bytes gives us 24 bits total.

Input: "Man" → bytes 77, 97, 110
M = 77  = 01001101
a = 97  = 01100001
n = 110 = 01101110
Combined 24-bit stream, shown in 6-bit groups: 010011 010110 000101 101110

Step 2: Split into four 6-bit groups

Those 24 bits are re-grouped into four chunks of 6 bits each. 6 bits can represent values 0–63 — exactly 64 possible values, one for each character in the Base64 alphabet.

010011 = 19 → T
010110 = 22 → W
000101 = 5  → F
101110 = 46 → u
Output: TWFu
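
The regrouping in Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration for a single 3-byte group, not a full encoder; the function name split_into_sextets is made up for this example.

```python
def split_into_sextets(group: bytes) -> list[int]:
    """Regroup exactly 3 bytes (24 bits) into four 6-bit values."""
    if len(group) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Pack the three bytes into one 24-bit integer.
    bits = (group[0] << 16) | (group[1] << 8) | group[2]
    # Read the integer back 6 bits at a time, most significant first.
    return [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]


print(split_into_sextets(b"Man"))  # [19, 22, 5, 46]
```

The four values match the 19, 22, 5, 46 worked out by hand for "Man".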

Step 3: Look up each 6-bit value in the alphabet

Each 6-bit number maps to a character in the Base64 alphabet:

Characters   Values
A–Z          0–25
a–z          26–51
0–9          52–61
+            62
/            63
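
As a rough sketch, the lookup table can be built from Python's string constants and indexed directly. The names ALPHABET and lookup are illustrative only.

```python
import string

# The standard alphabet in index order: A-Z, a-z, 0-9, then + and /.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def lookup(values):
    """Map 6-bit values (0-63) to their Base64 characters."""
    return "".join(ALPHABET[v] for v in values)


print(len(ALPHABET))            # 64
print(lookup([19, 22, 5, 46]))  # TWFu
```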

Why Base64 Adds 33% to File Size

The size increase follows directly from the algorithm. Every 3 bytes of input produces 4 output characters. Since each ASCII character is 1 byte, you're storing 3 bytes of data using 4 bytes of text — a ratio of 4/3 = 1.333, or ~33% overhead.

Original size   Base64 size   Overhead
1 KB            ~1.33 KB      +~0.33 KB
10 KB           ~13.3 KB      +~3.3 KB
100 KB PNG      ~133 KB       +~33 KB
1 MB image      ~1.33 MB      +~0.33 MB
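
The 4/3 ratio can be checked against Python's standard base64 module. The sketch below assumes padded output with no line breaks; the helper name encoded_length is made up for this example.

```python
import base64


def encoded_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input."""
    # Every started group of 3 input bytes yields 4 output characters.
    return 4 * ((n_bytes + 2) // 3)


for size in (1, 3, 1024, 10 * 1024):
    actual = len(base64.b64encode(bytes(size)))
    print(size, encoded_length(size), actual)
```

For 1024 input bytes this gives 1368 characters, which is about 1.33 KB when a KB is 1024 bytes.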

For small files — icons, small logos, inline SVGs — the overhead is acceptable and you save an HTTP request. For anything above roughly 10–20 KB, a separate file request with proper HTTP caching is more efficient.

Padding Characters: What = Means

Base64 processes input 3 bytes at a time. When the input length isn't a multiple of 3, the last group has 1 or 2 bytes instead of 3. Padding characters (=) fill the gap so the output length is always a multiple of 4.

1 leftover byte → 2 output chars + ==
  "M"  → "TQ=="
2 leftover bytes → 3 output chars + =
  "Ma" → "TWE="

Decoders use the number of = characters to work out how many bytes the final group holds. Some implementations (URL-safe Base64 in particular) omit padding entirely, since the decoder can work out the missing padding from the length of the encoded string.
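
Putting the grouping, lookup, and padding rules together, a from-scratch encoder might look like the sketch below. It is meant only to illustrate the algorithm (the names ALPHABET and encode are made up here); in practice the standard base64 module should be used.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short
        # Zero-fill the short group so it is still 24 bits wide.
        bits = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that came purely from fill bits become "=".
        if missing:
            chars[-missing:] = "=" * missing
        out.append("".join(chars))
    return "".join(out)


for sample in (b"M", b"Ma", b"Man"):
    print(sample, encode(sample), base64.b64encode(sample).decode("ascii"))
```

The sketch agrees with base64.b64encode on all three inputs, including the two padded cases.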

Standard vs URL-Safe Base64

Standard Base64 uses + for value 62 and / for value 63. Both characters carry special meaning in URLs: + is interpreted as a space in query strings, and / separates URL path segments. Placing a standard Base64 string in a URL either corrupts it or requires percent-encoding (%2B, %2F).

URL-safe Base64 (RFC 4648 §5) solves this by substituting + → - and / → _. Padding is typically stripped too.

Feature             Standard Base64         URL-safe Base64
Value 62            +                       -
Value 63            /                       _
Padding             =                       Often omitted
Safe in URLs        No (needs escaping)     Yes
Safe in filenames   No (/ breaks paths)     Yes
Used in             MIME email, data URIs   JWT, OAuth, cookies, filenames

Quick rule

If the Base64 string goes in a URL, a cookie, a filename, or an HTTP header that doesn't allow + and /, use URL-safe. If it goes in a data URI or MIME email body, use standard.
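
To see the two alphabets diverge, here is a small sketch using Python's base64 module. The two input bytes are chosen deliberately so that the output contains values 62 and 63.

```python
import base64

raw = bytes([0xFB, 0xFF])  # chosen so the output uses values 62 and 63

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=
```

Only the characters for values 62 and 63 change. Note that Python's urlsafe_b64encode keeps the padding, so stripping it is a separate step.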

Base64 in JavaScript

Browsers and Node.js (version 16 and later) provide btoa() and atob() as globals for encoding and decoding:

btoa / atob — ASCII only
// Encode
btoa("Hello")      // → "SGVsbG8="
btoa("Man")        // → "TWFu"

// Decode
atob("SGVsbG8=")   // → "Hello"
atob("TWFu")       // → "Man"

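The catch: btoa() only accepts characters in the Latin-1 range (code points 0–255). Any character above that (emoji, Chinese, Arabic, or accented letters outside Latin-1 such as ő) throws DOMException: Failed to execute 'btoa': The string to be encoded contains characters outside of the Latin1 range. Characters inside the range, such as ü, do not throw, but they are encoded as single Latin-1 bytes rather than as UTF-8, which is rarely what the receiving side expects.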

Unicode-safe encoding
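// Encode Unicode (encode to UTF-8 bytes first).
// escape()/unescape() are deprecated but still widely supported; TextEncoder is the modern alternative.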
function b64Encode(str) {
  return btoa(unescape(encodeURIComponent(str)));
}

// Decode Unicode
function b64Decode(str) {
  return decodeURIComponent(escape(atob(str)));
}

b64Encode("Hello 🌍")  // → "SGVsbG8g8J+MjQ=="
b64Decode("SGVsbG8g8J+MjQ==")  // → "Hello 🌍"

URL-safe variant in JavaScript
function b64UrlEncode(str) {
  return b64Encode(str)
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');  // strip padding
}

function b64UrlDecode(str) {
  // Re-add padding
  let s = str.replace(/-/g, '+').replace(/_/g, '/');
  while (s.length % 4) s += '=';
  return b64Decode(s);
}

Base64 in Python

Python base64 module
import base64

# Standard encode/decode
encoded = base64.b64encode(b"Hello")
# → b"SGVsbG8="

decoded = base64.b64decode("SGVsbG8=")
# → b"Hello"

# URL-safe variant
url_enc = base64.urlsafe_b64encode(b"Hello")
# → b"SGVsbG8="  (identical here; the output differs only when + or / would appear)

# Encode a string (not bytes)
s = "Hello 🌍"
encoded = base64.b64encode(s.encode('utf-8'))

# Decode back to string
decoded = base64.b64decode(encoded).decode('utf-8')
# → "Hello 🌍"
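
Python's urlsafe_b64decode expects padding, so unpadded input (as found in JWTs) has to be padded first. A minimal sketch, with helper names b64url_encode and b64url_decode that are made up for this example:

```python
import base64


def b64url_encode(data: bytes) -> str:
    """URL-safe Base64 with the trailing padding stripped."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode URL-safe Base64 whether or not the padding was stripped."""
    padding = "=" * (-len(text) % 4)  # 0, 1, or 2 characters
    return base64.urlsafe_b64decode(text + padding)


print(b64url_encode(b"Hello"))   # SGVsbG8
print(b64url_decode("SGVsbG8"))  # b'Hello'
```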

Data URIs: Embedding Files in HTML and CSS

A data URI embeds a file's complete contents as a Base64 string directly in an attribute or stylesheet. The format is:

Data URI format
data:[mediatype][;base64],[data]

// Example — small PNG in an img tag
<img src="data:image/png;base64,iVBORw0KGgo..." />

// Example — font in CSS
@font-face {
  font-family: 'MyFont';
  src: url('data:font/woff2;base64,d09GMgAB...');
}
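
Building a data URI programmatically is mostly string formatting. The sketch below uses a one-line SVG as a stand-in asset; the helper name to_data_uri is made up for this example.

```python
import base64


def to_data_uri(data: bytes, media_type: str) -> str:
    """Wrap raw bytes in a Base64 data URI (standard alphabet, with padding)."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{payload}"


svg = b'<svg xmlns="http://www.w3.org/2000/svg" width="1" height="1"/>'
uri = to_data_uri(svg, "image/svg+xml")
print(uri)
```

Data URIs use the standard alphabet rather than the URL-safe one, in line with the quick rule earlier in this guide.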

When data URIs make sense

  • Email templates: email clients often block external image requests, so embedding can help, although support for data URIs varies between clients
  • Single-file HTML: self-contained pages that must include assets without external files
  • Small icons: tiny SVG icons or 1×1 pixel tracking images where the HTTP request overhead is larger than the image
  • Critical above-the-fold images: embedding a tiny hero thumbnail so it renders without waiting for an extra request

When they don't

  • Large images: the payload grows by about 33% and cannot be cached separately, so it is re-sent whenever the HTML or CSS that contains it is fetched
  • Images shared across multiple pages: a referenced file is cached once; an embedded one is duplicated in each page's HTML
  • CSS background images for frequently visited pages: separate files with long cache headers are significantly faster

Base64 and JWTs

JSON Web Tokens are one of the most common places developers encounter Base64 in practice. A JWT has three URL-safe Base64-encoded parts separated by dots:

JWT structure
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
.eyJzdWIiOiJ1c2VyXzEyMyIsIm5hbWUiOiJBbGljZSIsImlhdCI6MTcxOTQwMDAwMH0
.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c

Decoded header:  {"alg":"HS256","typ":"JWT"}
Decoded payload: {"sub":"user_123","name":"Alice","iat":1719400000}
Signature: HMAC-SHA256 of encoded header + "." + encoded payload, itself URL-safe Base64-encoded

The header and payload are plain JSON encoded with URL-safe Base64 (no padding). They are not encrypted. Anyone can decode them with any Base64 decoder, or with atob() after converting - and _ back to + and /. The signature is what proves the token hasn't been tampered with: verifying it requires the secret key (or, for asymmetric algorithms such as RS256, the public key).
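
To show how little protects the payload, the sketch below decodes the example token's header and payload with nothing but the Python standard library. It does not verify the signature; the helper name decode_segment is made up for this example.

```python
import base64
import json


def decode_segment(segment: str) -> dict:
    """Decode one unpadded URL-safe Base64 JWT segment into a dict."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


token = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
    ".eyJzdWIiOiJ1c2VyXzEyMyIsIm5hbWUiOiJBbGljZSIsImlhdCI6MTcxOTQwMDAwMH0"
    ".SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
)

header_b64, payload_b64, signature_b64 = token.split(".")
print(decode_segment(header_b64))   # {'alg': 'HS256', 'typ': 'JWT'}
print(decode_segment(payload_b64))  # {'sub': 'user_123', 'name': 'Alice', 'iat': 1719400000}
```

No key is involved at any point, which is why the payload should be treated as public.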

Critical misunderstanding

Never store secrets, passwords, or sensitive PII in a JWT payload assuming the Base64 encoding "hides" it. The payload is fully readable by anyone who has the token. Use JWE (JSON Web Encryption) if the payload itself needs to be confidential.

HTTP Basic Authentication

HTTP Basic Auth sends credentials in the Authorization header. The format is the username and password joined with a colon, then Base64-encoded:

Basic Auth header
// Credentials: alice:mysecretpassword
btoa("alice:mysecretpassword")
// → "YWxpY2U6bXlzZWNyZXRwYXNzd29yZA=="

// HTTP header
Authorization: Basic YWxpY2U6bXlzZWNyZXRwYXNzd29yZA==

// Anyone who intercepts this header can run:
atob("YWxpY2U6bXlzZWNyZXRwYXNzd29yZA==")
// → "alice:mysecretpassword"

As the code above shows, Basic Auth credentials are trivially reversible from the header value. This is why HTTP Basic Auth must only be used over HTTPS. Without TLS, the credentials are exposed to anyone on the network.
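
The same round trip in Python, as a minimal sketch. The helper names basic_auth_header and parse_basic_auth are made up for this example, and UTF-8 is assumed for the credentials.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization header for Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def parse_basic_auth(header_value: str) -> tuple[str, str]:
    """Recover (username, password) from a Basic auth header value."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic auth header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Split on the first colon only; the password may itself contain colons.
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("alice", "mysecretpassword")
print(header)                    # Basic YWxpY2U6bXlzZWNyZXRwYXNzd29yZA==
print(parse_basic_auth(header))  # ('alice', 'mysecretpassword')
```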

Base64 Is Not Security

This is the most common misconception. Base64 is purely a representation change — it moves the same bits around into a different character set. It adds zero confidentiality, zero integrity protection, and zero authentication.

Here's the full picture:

Property            Base64                    Encryption (AES-256)   Hashing (bcrypt)
Reversible          Yes, by anyone            Yes, with key only     No
Hides content       No                        Yes                    Yes (one-way)
Requires key        No                        Yes                    No (uses a salt, which is not a secret key)
Use for passwords   Never                     No                     Yes
Primary purpose     Binary → text transport   Confidentiality        Verification

Do not do this

Storing passwords as Base64 in a database is as insecure as storing them in plain text. Any attacker who reads the database can decode every password instantly. Use bcrypt, scrypt, or Argon2 for passwords.
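
For contrast, here is a rough sketch of salted password hashing with scrypt from Python's hashlib, next to the one-call reversal of Base64. The cost parameters (n=2**14, r=8, p=1) and the helper names are illustrative assumptions, not tuned recommendations.

```python
import base64
import hashlib
import hmac
import os


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the salt is random per password."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest and compare in constant time."""
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(digest, expected)


# Base64 "protection" is undone in one call:
leaked = base64.b64encode(b"hunter2")
print(base64.b64decode(leaked))  # b'hunter2'

salt, stored = hash_password("hunter2")
print(verify_password("hunter2", salt, stored))  # True
print(verify_password("wrong", salt, stored))    # False
```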

5 Common Base64 Mistakes

  1. Using btoa() with Unicode input. btoa() only accepts Latin-1 characters. Strings with emoji or non-Latin characters throw exceptions. Use btoa(unescape(encodeURIComponent(str))) instead.
  2. Forgetting URL-safe substitutions. Standard Base64 in a URL query string can be silently corrupted when + is decoded as a space. Use the -_ variant in URLs and cookies.
  3. Treating Base64 as encryption. Covered above — it provides no security. Don't use it to "obscure" sensitive values.
  4. Embedding large files as data URIs. Any image over ~10 KB is usually better served as a separate file with HTTP caching. An embedded image cannot be cached on its own and is re-sent with every copy of the HTML or CSS that contains it.
  5. Forgetting to re-add padding when decoding URL-safe Base64. URL-safe Base64 often strips the trailing = characters. When decoding, add them back: pad until str.length % 4 === 0.

Base64 vs Hex Encoding

Hex encoding is another common way to represent binary data as text. Instead of 64 characters, it uses 16 (0–9 and a–f), representing each byte as two hex digits.

Property           Base64                         Hex
Characters used    64 (A–Z, a–z, 0–9, +, /)       16 (0–9, a–f)
Size overhead      ~33%                           ~100% (doubles)
Readability        Opaque                         Easier to inspect bytes
Common uses        Files, tokens, data URIs, JWT  Cryptographic digests, color codes, memory addresses
URL-safe variant   Yes (RFC 4648)                 Not needed (no special characters)

Choose Base64 when compactness matters (API tokens, data URIs). Choose hex when human readability of individual bytes matters (cryptographic hashes, debugging binary protocols).
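
The size difference is easy to measure. The sketch below encodes a 32-byte SHA-256 digest both ways; the input string is arbitrary.

```python
import base64
import hashlib

digest = hashlib.sha256(b"example input").digest()  # 32 raw bytes

as_hex = digest.hex()
as_b64 = base64.b64encode(digest).decode("ascii")

print(len(digest))  # 32
print(len(as_hex))  # 64
print(len(as_b64))  # 44
```

Hex doubles the size, while Base64 adds about a third plus padding.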

Quick Reference

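JavaScript: Unicode-safe encode/decode
// Unicode-safe encode: string → UTF-8 bytes → binary string → Base64.
// TextEncoder/TextDecoder avoid the deprecated escape()/unescape() helpers.
const encode = str =>
  btoa(Array.from(new TextEncoder().encode(str), b => String.fromCharCode(b)).join(''));

// Unicode-safe decode: Base64 → binary string → UTF-8 bytes → string
const decode = b64 =>
  new TextDecoder().decode(Uint8Array.from(atob(b64), c => c.charCodeAt(0)));

// URL-safe encode (no padding)
const urlEncode = str => encode(str)
  .replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');

// URL-safe decode
const urlDecode = str => {
  let s = str.replace(/-/g, '+').replace(/_/g, '/');
  while (s.length % 4) s += '=';  // restore stripped padding
  return decode(s);
};

Node.js: Buffer-based (more reliable for binary)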
// Encode
Buffer.from('Hello 🌍').toString('base64');
// → "SGVsbG8g8J+MjQ=="

// Decode
Buffer.from('SGVsbG8g8J+MjQ==', 'base64').toString('utf8');
// → "Hello 🌍"

// URL-safe (Node 16+)
Buffer.from('Hello').toString('base64url');
// → "SGVsbG8"