The top search result on "Encryption vs Encoding vs Hashing" is wrong


TL;DR

  • “Hashing ensures integrity” is misleading: hashing is a broad concept, and most hashes are not security primitives.
  • If you need integrity against an attacker, you often want HMAC or signatures, not “just a hash.”
  • Use encoding for representation, encryption for secrecy, and cryptographic hashes/KDFs when you’re doing security work.

It’s always a little surreal to Google an “intro” topic in a field you already know—especially when it’s something people use for interviews—and see what floats to the top.

I did that with Encryption vs Encoding vs Hashing and found a popular result that presents hashing in a way that’s at best misleading.

The article claims:

“Hashing serves the purpose of ensuring integrity…”

That sentence reads as if integrity is what hashing “is for.” It isn’t.

What hashing is (in the general sense)

A hash function maps input data of arbitrary size to a fixed-size output.

That’s it. No security requirement is implied.

Hashes show up all over computing for non-security reasons:

  • hash tables / dictionaries
  • indexing and bucketing
  • deduplication
  • checksums and quick comparisons

Many hashes are designed for speed and distribution—not for resisting an attacker.
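As a minimal sketch of that non-security use, here is CRC32 from Python's standard `zlib` module used to assign keys to buckets. The `bucket_for` helper and the bucket count are hypothetical names chosen for illustration, not something from the article being discussed.

```python
import zlib


def bucket_for(key: str, num_buckets: int = 8) -> int:
    """Map a key of any length to one of num_buckets slots (hypothetical helper)."""
    # zlib.crc32 returns an unsigned 32-bit integer regardless of input size.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


short_key = "alice"
long_key = "a much longer key " * 1000

# Both inputs, whatever their size, land in the same small fixed range.
print(bucket_for(short_key), bucket_for(long_key))
```

Nothing here resists an attacker: someone who knows the function can pick keys that all land in the same bucket. That is acceptable for many lookup tables and unacceptable for integrity checks.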

Where integrity does come in

Cryptographic hash functions are a specific subset of hashes, designed for properties like collision resistance (it is infeasible to find two inputs with the same output) and preimage resistance (it is infeasible to recover an input from its output).

Even then, “integrity” depends on how you’re using the hash.

  • If you download a file and compare its SHA-256 to a trusted value you obtained out-of-band, you can detect changes.
  • If an attacker can change both the file and the displayed hash, a plain hash won’t save you.
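Both cases can be sketched with Python's standard `hashlib`. The helper names and the sample byte strings below are made up for illustration; `trusted` stands in for a digest you obtained out-of-band.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_digest(data: bytes, expected_hex: str) -> bool:
    """Compare the digest of data against an expected hex digest."""
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())


original = b"release-1.0 contents"
trusted = sha256_hex(original)  # stand-in for a value obtained out-of-band

tampered = b"release-1.0 contents + backdoor"

# Case 1: the trusted digest exposes the modified file.
print(matches_digest(original, trusted))   # True
print(matches_digest(tampered, trusted))   # False

# Case 2: the attacker also replaces the published digest.
# The check passes, because anyone can compute SHA-256 of anything.
attacker_published = sha256_hex(tampered)
print(matches_digest(tampered, attacker_published))  # True
```

The hash function is identical in both cases. What changes is whether the reference value can be trusted.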

So when people say “hashing ensures integrity,” it helps to ask which kind of integrity they mean:

  • Integrity against accidents: checksums or hashes can detect corruption.
  • Integrity against an attacker: you usually want a keyed construction (like HMAC) or a digital signature, not “just a hash”.
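A minimal sketch of the keyed approach, using HMAC-SHA256 from Python's standard `hmac` module. It assumes the sender and receiver already share a secret key; the `make_tag` and `verify_tag` helpers and the messages are hypothetical.

```python
import hashlib
import hmac
import secrets

# Assumption: both parties already share this key through some secure channel.
shared_key = secrets.token_bytes(32)


def make_tag(message: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(message: bytes, received_tag: bytes, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(message, key), received_tag)


message = b"amount=10&to=alice"
tag = make_tag(message, shared_key)

forged = b"amount=9999&to=mallory"
# Without the key, the attacker can only guess or use a key of their own.
attacker_tag = make_tag(forged, secrets.token_bytes(32))

print(verify_tag(message, tag, shared_key))          # genuine message and tag
print(verify_tag(forged, tag, shared_key))           # old tag reused on new message
print(verify_tag(forged, attacker_tag, shared_key))  # tag made with the wrong key
```

Unlike a plain hash, a valid tag cannot be recomputed by someone who can modify the message but does not hold the key. When the verifier should not be able to create tags, a digital signature is the usual alternative.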

A concrete example of the confusion

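CRC32 is a hash function in the general sense: it maps input of any size to a 32-bit value. It’s fast, useful, and common. It’s also not a security primitive: it was designed to detect accidental errors, not deliberate changes.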

That doesn’t make CRC32 “bad.” It just means it’s not what you pick if your threat model includes adversaries.
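One way to see why, as a sketch: CRC32 has linear structure under XOR. For three equal-length messages, the CRC of their XOR equals the XOR of their CRCs, so an attacker can predict how a change to the data changes the checksum. The messages and the `xor_bytes` helper below are invented for the demonstration.

```python
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    """XOR equal-length byte strings together (hypothetical helper)."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, value in enumerate(chunk):
            out[i] ^= value
    return bytes(out)


# Three messages of identical length (15 bytes each).
msg_a = b"pay alice $0010"
msg_b = b"pay bobby $0020"
msg_c = b"pay carol $0030"

# Predicted from the three checksums alone, without hashing the combined data.
predicted = zlib.crc32(msg_a) ^ zlib.crc32(msg_b) ^ zlib.crc32(msg_c)
actual = zlib.crc32(xor_bytes(msg_a, msg_b, msg_c))

print(predicted == actual)  # True
```

A cryptographic hash such as SHA-256 is designed so that no comparable relationship between inputs and outputs is known.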

Practical guidance

If you’re choosing a tool:

  • Need representation / transport (not security)? → encoding
  • Need confidentiality (a secret)? → encryption
  • Need fast lookup / bucketing / checksums? → hashing (non-crypto)
  • Need integrity or authenticity against an attacker? → cryptographic hashes, and usually HMAC or signatures, depending on the threat model
  • Need to turn a password into key material? → KDFs (PBKDF2/scrypt/Argon2, etc.)
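To make the first and last rows concrete, here is a standard-library sketch contrasting encoding with a KDF. The password, salt size, and iteration count are illustrative assumptions, not recommendations from this post. Encryption is left out because Python's standard library does not ship a general-purpose cipher.

```python
import base64
import hashlib
import secrets

# Encoding: a reversible change of representation. No key, no secrecy.
encoded = base64.b64encode(b"hello world")
decoded = base64.b64decode(encoded)
print(encoded)  # b'aGVsbG8gd29ybGQ='
print(decoded)  # b'hello world'

# KDF: deliberately slow derivation of key material from a password.
ITERATIONS = 600_000  # assumption for illustration; tune for your hardware and policy
password = b"correct horse battery staple"  # made-up example password
salt = secrets.token_bytes(16)  # random per password, stored alongside the result

derived_key = hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS, dklen=32)
print(len(derived_key))  # 32
```

Anyone can undo the Base64 step, which is the point of an encoding. The derived key can only be reproduced by someone who knows the password and the salt, and each guess costs an attacker the full iteration count.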

I may expand on this later, but I wanted a clearer resource for anyone learning—because these mix-ups happen everywhere, including in professional codebases.
