An Introduction to Hashing Algorithms
What cryptographic hash functions do, and where they earn their keep.
A cryptographic hash function takes an input of any size and boils it down to a fixed-size output, often called a hash or digest. Hash functions are among the quiet workhorses of computing and security, sitting underneath far more systems than most people realize. Here's what makes them tick and what they're used for. There's a hashing tool on this site if you'd like to see one in action.
Key properties
A good cryptographic hash function is built around a few properties. It's deterministic: the same input always gives the same output. Change the input even slightly and the output looks completely different, a behavior often called the avalanche effect. It runs one way only, so you can't practically work backward from the output to recover the original input. And it should be impractical to find two different inputs that land on the same output, a clash known as a collision. Together, these are what make hash functions so handy for verifying data and storing certain information safely.
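The first two properties are easy to observe directly. Here is a minimal sketch using Python's standard hashlib module; the sample strings and the helper names sha256_hex and differing_bits are illustrative choices, not anything prescribed by the algorithm.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Illustrative inputs: the second differs from the first by one character.
first = hashlib.sha256(b"hello world").digest()
again = hashlib.sha256(b"hello world").digest()
changed = hashlib.sha256(b"hello worle").digest()

print("same input, same output:", first == again)
print(sha256_hex(b"hello world"))
print(sha256_hex(b"hello worle"))
print(differing_bits(first, changed), "of", len(first) * 8, "bits differ")
```

For a well-behaved hash, you would expect roughly half of the output bits to flip after a one-character change, which is why the two hex strings look unrelated.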
Common algorithms
A few algorithms come up again and again. SHA-256, part of the SHA-2 family, produces a 256-bit output and is the one you'll see most today. SHA-512 is its close relative in the same family, with a 512-bit output. Older names like MD5 and SHA-1 were once everywhere, but they're no longer considered safe where collision resistance matters, because researchers have demonstrated practical collision attacks against both. You'll still bump into them in legacy systems or in non-security roles such as checksums that guard against accidental corruption rather than deliberate tampering, where that weakness doesn't bite.
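To compare the output sizes side by side, here is a small sketch using Python's hashlib. It assumes Python 3.9 or later, where the usedforsecurity=False flag is available; the flag lets MD5 and SHA-1 load on builds that restrict them for security use. The helper name digest_bits and the sample input are illustrative.

```python
import hashlib


def digest_bits(name: str) -> int:
    """Return the output size, in bits, of the named hash algorithm."""
    # usedforsecurity=False (Python 3.9+) marks this as a non-security use,
    # which keeps MD5 and SHA-1 usable on restricted builds.
    return hashlib.new(name, usedforsecurity=False).digest_size * 8


for name in ("md5", "sha1", "sha256", "sha512"):
    digest = hashlib.new(name, b"example input", usedforsecurity=False).hexdigest()
    print(f"{name}: {digest_bits(name)} bits -> {digest}")
```

Each hex character encodes four bits, so a SHA-256 digest prints as 64 characters and a SHA-512 digest as 128.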
Verifying data integrity
One of the most common jobs for a hash is confirming that data hasn't changed. When a file is published, its hash is often posted right alongside it. Once you've downloaded the file, you can compute its hash yourself and compare it to the published value. If they match, the file is almost certainly intact. If they don't, something has corrupted or altered it along the way. Notice that none of this requires the hash to be secret. It does require that the published hash come from a source you trust, since anyone able to swap the file could otherwise swap the posted hash too.
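The check described above might look something like this in Python. This is a sketch, assuming the publisher posts a SHA-256 digest as a hex string; the function names file_sha256 and matches_published and the 64 KiB chunk size are illustrative choices.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, published_hex: str) -> bool:
    """Compare a file's SHA-256 digest with a published hex value."""
    # Normalize whitespace and case, since posted hashes vary in formatting.
    return file_sha256(path) == published_hex.strip().lower()
```

You would call matches_published with the path of the downloaded file and the hex string copied from the publisher's page, and treat a False result as a reason to download the file again or investigate.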
Storing passwords
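Hashing also plays a big part in how passwords are stored. Rather than keeping the password itself, a system stores a hash of it. At login, it hashes whatever the user typed and checks that against the stored hash. There's an important wrinkle here: a plain general-purpose hash isn't enough on its own, because it's fast enough to let an attacker test enormous numbers of guesses. The recommended approach uses specialized password-hashing functions, such as Argon2, bcrypt, scrypt, or PBKDF2, that are deliberately slow and that fold in a unique random value called a salt. The slowness raises the cost of every guess, and the salt ensures that identical passwords produce different stored hashes, so precomputed tables can't be reused. Together they make large-scale guessing attacks far more expensive.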
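As a rough illustration of salted, deliberately slow password hashing, here is a sketch built on PBKDF2 from Python's standard library. PBKDF2 is used here only because it needs no third-party packages; it is one option among several. The iteration count, the 16-byte salt length, and the function names hash_password and verify_password are assumptions for the example, not recommendations from this article.

```python
import hashlib
import hmac
import secrets

# Assumed work factor for illustration; tune it to your own hardware and policy.
ITERATIONS = 600_000


def hash_password(password: str, iterations: int = ITERATIONS) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage alongside the user record."""
    salt = secrets.token_bytes(16)  # fresh random salt for every password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = ITERATIONS) -> bool:
    """Re-derive the key from the login attempt and compare it to the stored one."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # compare_digest runs in constant time, which avoids leaking timing information.
    return hmac.compare_digest(candidate, expected)
```

At signup you would store the salt and derived key together; at login you would pass the typed password and the stored pair to verify_password. Because each user gets a different salt, two users with the same password end up with different stored values.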
What hashing is not
It helps to be clear about what hashing isn't. It's not encryption. Encryption is meant to be reversed by someone holding the right key, so the original data can be recovered. Hashing is one-way by design and was never meant to be undone. It's also not compression in any recoverable sense, because you simply can't rebuild the original input from the hash.
Summary
To sum up, a cryptographic hash function turns an input into a fixed-size output in a way that's consistent, sensitive to the smallest change, and not practical to reverse. SHA-256 and SHA-512 are the common choices today, while MD5 and SHA-1 are best left to legacy or non-security work. You'll see hashes used to verify data integrity and, with the help of specialized password-hashing methods, to store passwords.