Denny's Blog: Cksum

Create Fingerprints Using Cksum

cksum(1) is a very important utility since it
can figure out the fingerprint/message digests
using several key algorithms employed in
cryptography.

     1. cksum
     2. md4
     3. md5
     4. rmd160
     5. sha1
     6. sha256
     7. sha384
     8. sha512
     9. sum
    10. sysvsum

sha512 is the best algorithm to use since
it gives the longest output and there is
very little chance of collision.

I think this article is not going to make
much sense without explaining the rationale
and the math behind the idea of cryptographic
hashes of message digests.

The basic goal is quite easy to state and
understand. The idea of a message digest is
to create a fixed length "fingerprint" from
any input data of any length, be it 2 bytes
or 2 Terabytes. This is done in a such a way
that the output varies significantly for
slight changes in input data.

All that is fine and dandy but the most
important aspect of the checksum algorithm
is its ability to avoid collisions.
Collisions are input values for which the
checksum algorithm produces the same output.
This can be quite dangerous and defeats the
very purpose of having a checksum in the
first place.

But mathematically speaking, nature enforces
a limit to the probability or possibilty of
collisions. But in practice this works quite
well as long as your output sample space is
quite big. Which is the case with sha512
digests. MD4 is broken. Don't use it. MD5 is
weak too.

The importance of cryptographic hashing comes
from many angles. First thing is that it is key
to generating digital signatures. A signature
is a private key encrypted message digest of
the input message. Simple and straight.

Then you have something called HMAC or hashed
message authentication code where a secret key
is used for generating message digests.
Normally message digests do not employ any
secret information. It is completely open.
Anyone can generate cryptographic hashes since
the algorithm is well known, there are no keys
and given the input, the output is fixed.

This is alright when we want to detect accidental
changes or integrity of file transfers. But this
does not protect us from malicious tampering. For
that we normally encrypt the hash with a secret key.
Or append it with the message and encrypt it. That
way we can detect tampering.

However HMAC is different. In this method,
the cryptographic hash is protected with a
secret key and only if you possess the secret
key you can generate the hash.

HMAC is widely used in TLS or SSL web security.
We have already seen many applications for
message digests or cryptographic hashes.

There is one important detail however. All
the public key cryptosystems in particular
the most widely used RSA algorithm relies
on cryptographic hashes in a interesting way.

RSA is a little complicated to explain in
this article but my idea is to illustrate
that cryptographic hashes have a much bigger
role to play than simple integrity checking.

Cryptographic hash functions are also known
as one way hash functions. Which is to say
that the function is not reversible. There
is no inverse of the function. You can only
get an output from input, never the other
way round.

RSA is nothing but a one way hash function
of the input data with a key. RSA relies on
the prime number factorization problem. So
the idea here is that you can multiply two
prime numbers trivially but you cannot divide
them. You can of course but not without a
significant computational overhead.

Now that we have seen enough theory, let us
get to the practical side and figure out how
it can help us in real life. After all math
has a great real life significance.

$ cksum /etc/passwd
3171604895 3646 /etc/passwd

$ cksum -a sha512 /etc/passwd
SHA512 (/etc/passwd) =
b4d6a742cada5305686832f1037b60f79b56fe6dfdf99
04e6070295e74c2341535db26b731e27e04a73f0cb70b
b589d31b8e9e18e207a8aae5aa81d06ea29f5a
(above cksum output line wrapped)

$ cksum -a sha256 /etc/passwd
SHA256 (/etc/passwd) =
fd2626c043a288c0a25bdc9772af4b19e001c890e9317
0a4de40043a9516e94a
(above cksum output line wrapped)

$ cksum -a rmd160 /etc/passwd
RMD160 (/etc/passwd) =
70455b60aad955b556aa4052af017ecf61bfe5a1

$ cksum -a md5 /etc/passwd
MD5 (/etc/passwd) =
e7818836fe36fa1fde1ab6198ee2da77

$ cksum -a sha1 /etc/passwd
SHA1 (/etc/passwd) =
9c80fdd62b69909a705c64fe79b6294d778ffef6

You can avoid the above mentioned issue of
collisions with the cksum(1) utility since
you have access to several state of the art
checksum algorithms from one command/utility.
So if you are paranoid kindly compare the sha1
and sha256 sums of the same file at both sides
after transfer. That way you can avoid issues
with collision.

openssl(1) also comes built in with access to
several checksum algorithms and so can sha1 and
md5 commands help under OpenBSD. But cksum has
an advantage of supporting many algorithms. Moreover
all these utilities come with base OpenBSD. There
is never a need to install any specific package.
In other words you are guaranteed to find them,
on any OpenBSD box!

Note on Authorship:
This article was contributed by Girish Venkatachalam
who is also a co-author on Denny's OpenBSD Newbies Blog.

Labels: cksum, cryptographic algorithms, encryption, hashes, md4, md5, openbsd, rmd160, security, sha1, sha256, sha384, sha512, sum, sysvsum