Base64 Encoding: What It Is and When to Use It


Base64 is an encoding scheme that converts binary data into a string of ASCII characters. It appears everywhere in software development — email attachments, data URIs, HTTP Basic Auth, JWT tokens, and countless other protocols — yet it is often misunderstood. Here is what it actually does, why it exists, and when to use it.

The problem Base64 solves

Many protocols and systems were designed to handle text, not arbitrary binary data. SMTP (email), older HTTP headers, and many databases can carry text reliably but may corrupt binary data — they might strip high bytes, normalise line endings, or truncate at null bytes.

Base64 solves this by encoding binary data as a string that uses only 64 safe ASCII characters: the uppercase letters A–Z, the lowercase letters a–z, the digits 0–9, plus + and /, with = reserved for padding. Any sequence of bytes can be represented using only these characters, which pass unchanged through almost any text-oriented channel.

How the encoding works

Base64 works by taking 3 bytes of input (24 bits) and splitting them into 4 groups of 6 bits each. Each 6-bit group maps to one of the 64 characters in the alphabet. This means:

  • Every 3 bytes of input produce 4 characters of output
  • Base64-encoded data is approximately 33% larger than the original
  • If the input is not a multiple of 3 bytes, padding (= or ==) is added to make the output a multiple of 4 characters

For example, the string Man (3 bytes: 77, 97, 110) encodes to TWFu. The string Ma (2 bytes) encodes to TWE= (one padding character).
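The grouping described above can be sketched in a few lines of JavaScript. This is an illustrative encoder only, and the name encodeBase64Sketch is made up for this example. In real code, prefer the built-in functions covered later in the article.

```javascript
// Illustrative encoder: shows the 3-bytes-to-4-characters mapping.
// Not a replacement for the built-in functions.
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function encodeBase64Sketch(bytes) {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = bytes.length - i; // real bytes left for this group
    const b0 = bytes[i];
    const b1 = remaining > 1 ? bytes[i + 1] : 0; // missing bytes count as zero bits
    const b2 = remaining > 2 ? bytes[i + 2] : 0;

    // Pack the three bytes into one 24-bit number.
    const group = (b0 << 16) | (b1 << 8) | b2;

    // Slice it into four 6-bit indices into the alphabet.
    out += ALPHABET[(group >> 18) & 63];
    out += ALPHABET[(group >> 12) & 63];
    out += remaining > 1 ? ALPHABET[(group >> 6) & 63] : "=";
    out += remaining > 2 ? ALPHABET[group & 63] : "=";
  }
  return out;
}

console.log(encodeBase64Sketch([77, 97, 110])); // "TWFu" (Man)
console.log(encodeBase64Sketch([77, 97]));      // "TWE=" (Ma)
console.log(encodeBase64Sketch([77]));          // "TQ==" (M)
```

The padding characters stand in for the output positions that contain no real input bits, which is why a 1-byte remainder gets two of them and a 2-byte remainder gets one.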

Base64 vs Base64URL

Standard Base64 uses + and /, which have special meanings in URLs and file paths. Base64URL is a variant that replaces + with - and / with _, and usually omits the = padding (JWTs always omit it). This makes the output safe to use in URLs, query parameters, filenames, and JWT tokens without further encoding.

When decoding a token or URL parameter that looks like Base64, check which variant it uses. Mixing them up is a common source of "invalid character" errors.
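Converting between the two variants is a matter of swapping two characters and handling padding. The helpers below are a minimal sketch. The function names are hypothetical, and guessVariant is only a heuristic based on which characters appear.

```javascript
// Convert standard Base64 to the URL-safe alphabet and drop padding.
function toBase64Url(b64) {
  return b64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Convert Base64URL back to standard Base64, restoring padding.
function fromBase64Url(b64url) {
  const b64 = b64url.replace(/-/g, "+").replace(/_/g, "/");
  // Pad the result up to a multiple of 4 characters.
  const padLength = (4 - (b64.length % 4)) % 4;
  return b64 + "=".repeat(padLength);
}

// Heuristic: guess the variant from the characters present.
function guessVariant(s) {
  if (/[-_]/.test(s)) return "base64url";
  if (/[+/]/.test(s)) return "base64";
  return "ambiguous"; // only characters shared by both alphabets
}

console.log(toBase64Url("+/8="));  // "-_8"
console.log(fromBase64Url("-_8")); // "+/8="
console.log(guessVariant("TWFu")); // "ambiguous"
```

Many strings contain neither + / nor - _, so the variant often cannot be determined from the text alone. In that case the protocol or API documentation is the only reliable guide.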

Common uses in web development

  • HTTP Basic Authentication. The Authorization: Basic header carries credentials as Base64(username:password). This does not encrypt the credentials — it only encodes them. HTTPS is required for any security.
  • Data URIs. Inline images in HTML and CSS can be embedded as data:image/png;base64,iVBOR.... This is useful for small icons where you want to avoid an extra HTTP request, though it increases the size of the HTML or CSS that contains them.
  • JWT tokens. All three parts of a signed JWT (header, payload, and signature) are Base64URL encoded. The payload is not encrypted; anyone holding the token can decode it and read the claims.
  • File uploads in APIs. Some APIs accept file contents as Base64-encoded strings in JSON payloads rather than as multipart form data. This simplifies the request structure but increases payload size.
  • Storing binary data in text fields. When you need to put an image, PDF, or cryptographic key into a database field that only accepts text, Base64 is the usual answer.
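Two of the uses above, Basic Authentication and JWT payloads, can be sketched in Node.js as follows. The function names are hypothetical, the token is built inline purely for illustration, and the 'base64url' encoding name assumes a reasonably recent Node.js release. Note that decodeJwtPayload does not verify the signature, so it is only suitable for inspection.

```javascript
// Build the value of an HTTP Basic Authorization header.
function basicAuthHeader(username, password) {
  const credentials = `${username}:${password}`;
  return "Basic " + Buffer.from(credentials, "utf8").toString("base64");
}

// Read the claims of a JWT WITHOUT verifying its signature.
function decodeJwtPayload(token) {
  const parts = token.split(".");
  if (parts.length !== 3) {
    throw new Error("expected header.payload.signature");
  }
  const json = Buffer.from(parts[1], "base64url").toString("utf8");
  return JSON.parse(json);
}

console.log(basicAuthHeader("Aladdin", "open sesame"));
// "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="

// Hypothetical token assembled for illustration; the signature is a placeholder.
const demoHeader = Buffer.from(JSON.stringify({ alg: "HS256", typ: "JWT" })).toString("base64url");
const demoPayload = Buffer.from(JSON.stringify({ sub: "1234567890" })).toString("base64url");
const demoToken = `${demoHeader}.${demoPayload}.placeholder-signature`;

console.log(decodeJwtPayload(demoToken)); // { sub: '1234567890' }
```

Both examples reinforce the same point: the encoded values are readable by anyone who sees them.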

What Base64 is not

Base64 is an encoding, not encryption. Decoding Base64 requires no key — anyone can reverse it instantly. Do not use Base64 as a way to hide sensitive data. It is purely a format conversion.

Similarly, Base64 does not compress data — it makes data larger. Use gzip or Brotli for compression; use Base64 when you need binary data in a text channel.
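Both points are easy to check in Node.js. The sketch below uses a hypothetical helper, base64Length, to compute the padded output size from the input size, and then shows that decoding needs nothing beyond the encoded string itself.

```javascript
// Padded Base64 length for n input bytes: 4 characters per 3-byte group.
function base64Length(n) {
  return 4 * Math.ceil(n / 3);
}

for (const n of [1, 2, 3, 100, 1000]) {
  const actual = Buffer.alloc(n).toString("base64").length;
  console.log(`${n} bytes -> ${base64Length(n)} characters (actual: ${actual})`);
}
// 100 bytes become 136 characters; 1000 bytes become 1336.

// No key is involved: anyone can reverse the encoding.
console.log(Buffer.from("c2VjcmV0", "base64").toString("utf8")); // "secret"
```

For small inputs the overhead is well above 33% because of padding: a single byte still produces four characters.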

Encoding and decoding in code

In the browser (JavaScript): btoa(string) encodes and atob(string) decodes. Note that btoa only handles Latin-1 characters — for Unicode strings you need to encode to UTF-8 bytes first.
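One way to handle Unicode text is to go through TextEncoder and TextDecoder, as in the sketch below. The function names utf8ToBase64 and base64ToUtf8 are made up for this example, and the approach assumes an environment where btoa, atob, TextEncoder, and TextDecoder are available as globals (modern browsers and recent Node.js versions).

```javascript
// Encode any JavaScript string as Base64 by converting to UTF-8 bytes first.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte); // one Latin-1 character per byte
  }
  return btoa(binary);
}

// Decode Base64 back into a string, interpreting the bytes as UTF-8.
function base64ToUtf8(b64) {
  const binary = atob(b64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

console.log(utf8ToBase64("é"));                       // "w6k="
console.log(base64ToUtf8(utf8ToBase64("héllo ✓ 日本語"))); // "héllo ✓ 日本語"
```

Calling btoa("✓") directly throws, because the character falls outside the Latin-1 range.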

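In Node.js: Buffer.from(str).toString('base64') encodes a string (as UTF-8 by default); Buffer.from(b64, 'base64').toString() decodes back to a UTF-8 string. For binary data, keep the decoded Buffer rather than calling toString().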
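A short round trip with raw bytes shows the Node.js calls in practice. The byte values here are arbitrary test data (the first four bytes of the PNG signature plus two extra bytes), and the variable names are illustrative.

```javascript
// Arbitrary bytes, including values that are not valid UTF-8 text.
const originalBytes = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x00, 0xff]);

const encodedBytes = originalBytes.toString("base64");
console.log(encodedBytes); // "iVBORwD/"

// Decode to a Buffer and compare byte-for-byte with the input.
const decodedBytes = Buffer.from(encodedBytes, "base64");
console.log(decodedBytes.equals(originalBytes)); // true

// Text round trip: the string is encoded as UTF-8 and decoded the same way.
const encodedText = Buffer.from("héllo", "utf8").toString("base64");
console.log(Buffer.from(encodedText, "base64").toString("utf8")); // "héllo"
```

Converting decodedBytes to a string would be lossy in this case, since bytes such as 0x89 and 0xff do not form valid UTF-8 sequences.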

For a quick one-off conversion (decoding a data URI, checking what is in an API token, encoding a small file), a browser-based tool is often the fastest option; if it runs entirely client-side, your data also stays local.