Website Spec
Required

<meta charset>

Declare UTF-8 as the document character encoding in the first 1024 bytes of the HTML, so browsers parse text correctly before they hit any non-ASCII content.

What it is

<meta charset> tells the browser how to decode the bytes of the HTML document into characters. In 2026 there is only one correct value:

<meta charset="utf-8" />

It must appear inside <head>, and the entire <meta> element must fit within the first 1024 bytes of the document. Browsers only pre-scan that far when looking for an encoding declaration; one that appears later may be missed, or may force the browser to re-parse the page.
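The 1024-byte rule can be checked mechanically. The following is a minimal sketch, not the browser's actual pre-scan algorithm: it uses a simplified regular expression, and the names `META_CHARSET`, `PRESCAN_LIMIT`, and `charset_within_prescan` are made up for illustration.

```python
import re

# Simplified pattern for a <meta> element carrying a charset declaration.
# It covers the short form and the http-equiv form, but it is only an
# approximation of what a real HTML parser accepts.
META_CHARSET = re.compile(
    rb"<meta\b[^>]*?charset\s*=\s*[\"']?\s*[A-Za-z0-9_-]+[^>]*>",
    re.IGNORECASE,
)

# Number of leading bytes in which the declaration has to fit.
PRESCAN_LIMIT = 1024


def charset_within_prescan(document: bytes) -> bool:
    """Return True if the first charset <meta> element ends within the limit."""
    match = META_CHARSET.search(document)
    # match.end() is the offset just past the closing ">", so the whole
    # element has to fit, not just its opening characters.
    return match is not None and match.end() <= PRESCAN_LIMIT
```

Run it against the raw bytes of the served document, not a decoded string, because the limit is counted in bytes.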

Why it matters

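Before the browser can parse a single character of your page, it has to decide which encoding to apply to the byte stream. It looks for a byte-order mark, the charset parameter of the Content-Type HTTP header, and a <meta> declaration near the start of the document. If none of these is present, it guesses, using heuristics over the first chunk of bytes or a locale-dependent default. When the guess is wrong, non-ASCII text is decoded into garbled characters (mojibake).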
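To make the failure concrete, here is a small sketch of what happens when UTF-8 bytes are decoded with the wrong encoding. The helper name `decode_as` is hypothetical; `cp1252` is Python's codec name for windows-1252.

```python
def decode_as(text: str, assumed_encoding: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were
    written in assumed_encoding (simulating a wrong guess)."""
    return text.encode("utf-8").decode(assumed_encoding)
```

Calling `decode_as("café", "cp1252")` returns `"cafÃ©"`: the two bytes that make up "é" in UTF-8 are each read as a separate windows-1252 character. With `"utf-8"` as the assumed encoding, the text comes back unchanged.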

UTF-8 is the only encoding you should use. It is a superset of ASCII, covers every script in Unicode (Latin, Cyrillic, Arabic, Chinese, and so on) as well as emoji, is the default for JSON and XML, and is what every modern build tool produces. Legacy encodings (iso-8859-1, windows-1252, shift_jis) exist only for compatibility with old documents; do not create new content in them.
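Two of these properties are easy to confirm directly. This is an illustrative sketch; the sample strings and the function names are assumptions chosen for the example.

```python
# Sample text in several scripts, plus an emoji.
SAMPLES = ["Hello", "Привет", "مرحبا", "你好", "👋"]


def utf8_round_trips(text: str) -> bool:
    """True if encoding to UTF-8 and decoding again returns the same text."""
    return text.encode("utf-8").decode("utf-8") == text


def ascii_bytes_unchanged(text: str) -> bool:
    """True if text is pure ASCII and its UTF-8 bytes equal its ASCII bytes."""
    return text.isascii() and text.encode("utf-8") == text.encode("ascii")
```

Every entry in `SAMPLES` round-trips through UTF-8, and a pure-ASCII string such as "Hello" produces the same bytes in both encodings, which is why existing ASCII content needs no conversion.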

How to implement

Put the charset declaration as the very first child of <head>, before <title> or any other tag that could contain non-ASCII text:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>The Website Specification</title>
    ...
  </head>
</html>

Save the file itself as UTF-8 (most editors do this by default). The declaration in the HTML and the actual bytes on disk must match.
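One way to confirm that the bytes on disk really are UTF-8 is to attempt a strict decode. This is a minimal sketch; `is_valid_utf8` and `file_is_utf8` are hypothetical helper names, and the path you pass in is your own.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """True if data decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


def file_is_utf8(path: str) -> bool:
    """Read the file as raw bytes and check that they are valid UTF-8."""
    return is_valid_utf8(Path(path).read_bytes())
```

A file saved as iso-8859-1 that contains "é" fails this check, because the single byte 0xE9 is not a complete UTF-8 sequence. Pure ASCII files always pass.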

Also set the HTTP Content-Type header on the response. The header takes precedence over the meta tag, so if they conflict, the header wins:

Content-Type: text/html; charset=utf-8

If both agree on UTF-8, you are covered for every consumer: browsers, scrapers, RSS readers, and crawlers that read the bytes directly.
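The precedence between the header and the meta tag can be sketched as follows. This is a simplified model that ignores byte-order marks; `header_charset` and `effective_charset` are hypothetical names, and the header is parsed with the standard library's `email.message.Message`.

```python
from email.message import Message
from typing import Optional


def header_charset(content_type: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, or None if absent


def effective_charset(
    content_type: Optional[str], meta_value: Optional[str]
) -> Optional[str]:
    """Return the header's charset if it declares one, else the meta value."""
    from_header = header_charset(content_type) if content_type else None
    if from_header:
        return from_header
    return meta_value.lower() if meta_value else None
```

With a header of `text/html; charset=ISO-8859-1` and a meta value of `utf-8`, the result is `iso-8859-1`, so the page is decoded with the wrong encoding even though the HTML itself is correct. A header of plain `text/html` falls back to the meta value.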

The older pre-HTML5 form <meta http-equiv="Content-Type" content="text/html; charset=utf-8">, common in HTML 4 and XHTML 1.0 documents, still works but is verbose and unnecessary. Use the short form.

Common mistakes

Verification

Sources