Although hash functions in general have many uses in computer programs,
in cryptography they are used to generate a small string (the message
digest) that can represent securely a much larger string, such as a
file or message. Since the hash functions are faster than the signing
functions, it is much more efficient to compute a digital signature
using a document's message digest, which is small, than using the
arbitrarily large document itself. Additionally, a digest can be made
public without revealing the contents of the document from which it
derives. This is important in digital time-stamping, where, using hash
functions, one can get a document time-stamped without revealing its
contents to the time-stamping service (see Question 3.3.18).
A hash function used for digital authentication must have certain
properties that make it secure enough for cryptographic use.
Specifically, it must be infeasible to find a message that hashes to a
given value and it must be infeasible to find two distinct messages
that hash to the same value. The ability to find a message hashing to a
given value would enable an attacker to substitute a fake message for a
real message that was signed. It would also enable someone to falsely
disown a message by claiming that he or she actually signed a different
message hashing to the same value, thus violating the non-repudiation
property of digital signatures. The ability to find two distinct
messages hashing to the same value could enable an attack whereby
someone is tricked into signing a message which hashes to the same
value as another message with a quite different meaning. The digest
must therefore be long enough to prevent an attacker from doing an
exhaustive search for a collision. For example, if a hash function
produces 100-bit strings, exhaustive search would take 2100 attempts
on average to match a given value, and approximately 250 attempts on
average to find two inputs producing the same digest.
A digital signature system can be broken by attacking either the difficult mathematical problem on which the signature method is based or the hash function used to create the message digests. When choosing an authentication system, it is generally a good idea to choose a signature method and a hash function that require comparable efforts to break; any extra security in one of the two components is wasted, since attacks will be directed at the weaker component. Actually, attacking the hash function is harder in practice, since it requires a large amount of memory and the ability to trick the victim into signing a special message. With 264 operations, an attacker can find two messages that hash to the same digest under any of the MD hash functions; this effort is comparable to that necessary to break 512-bit RSA; thus MD5 is a good choice when using RSA with a 512-bit modulus. However, those with greater security needs, such as certifying authorities, should use a longer modulus and a hash function that produces a longer message digest; either SHS (160-bit digest) or a modified version of MD4 that produces a 256-bit digest would suffice.