A quick-and-dirty cryptographic hash function

Cryptographic hash functions sit at the core of many secure applications. They are vital to digital signatures for example: rather than signing a message, the hash of the message is signed. A cryptographic hash may be considered as a sort of digital fingerprint: it identifies a message in the same way that a fingerprint identifies a person by associating a message – no matter its size – with a string of bits of fixed length. To do this, such a hash must satisfy the following:

  1. It must be fast to compute.
  2. Given a hash value h, it should not be possible to find a message m for which h=H(m). This is called pre-image resistance.
  3. Given a message m and its hash value h, it should not be possible to find another message m'ne m for which h=H(m'). This is called weak collision resistance.
  4. It should not be possible to find any two different messages m and m' for which H(m)=H(m'). This is called strong collision resistance.

Most cryptographic hash functions are based on bit operations, and are designed to be blisteringly fast, as well as satisfying the other properties above. See the links on the wikipedia page for some examples.

However, if we are prepared to trade speed for simplicity, a very nice hash function was developed by Adi Shamir. Take two large primes p and q, and a number a whose multiplicative order is large in the ring mathbb{Z}/ mathbb{Z}_{pq}. If a message m can be encoded as a large integer, the hash is defined as


By the discrete logarithm problem, this is pre-image resistant. It is also collision resistant, for if


then we would require


But determining phi(pq) requires factoring pq.

This hash function can be easily implemented. First we have a little function which turns any ASCII text into a large number, by treating the ASCII values of each character as digits in base 256:

def str2num(s):
    return ZZ(map(ord,s),256)

Next, to find a which has a large multiplicative order. The largest possible order modulo pq is mbox{lcm}(p-1,q-1), and this will be obtained if a is a primitive root of both p and q. Given that a number can be tested to be a primitive root by checking that

a^{(p-1)/d}ne 1

for all prime factors d of p-1, we have the following little program for finding a common primitive root:

def common_primroot(p,q):
    # Finds a primitive root common to both p and q
    ps = [f for f,e in factor(p-1)]
    qs = [f for f,e in factor(q-1)]
    while true:
        outp = true
        for d in ps:
            if power_mod(a,(p-1)//d,p)==1:
        outq =  true
        for d in qs:
            if power_mod(a,(q-1)//d,q)==1:
        if outp and outq:
            return a

Now the hash, for say, 160 bits. This can be computed with these steps:

  1. Choose two primes p and q and a suitable value a.
  2. Append the string “xx” to the text. This will ensure that the empty string has a non-trivial hash.
  3. Turn the text into a number m and compute


  4. Take the residue modulo 2^{160}, turn the result into hexadecimal, and pad out to 40 hexadecimal characters.

All this can be done in one fell swoop:

def qd_hash(pl,a,p,q):
    # Implements Shamir's hash function h = a^m (mod p*q) where a is a
    # number of high multiplicative order mod p*q.  This works well if
    # a is chosen to be a common primitive root of both p and q
    return hex(power_mod(a,str2num(pl+'xx'),p*q)%(2^160)).zfill(40)

Let’s try it out, we’ll choose two 150-bit primes, and find a common primitive root:

sage: p=next_prime(randint(2^150,2^151));p
sage: q=next_prime(randint(2^150,2^151));q
sage[142]: a=common_primroot(p,q); a

For the message text, we choose the first few lines of Shakespeare’s “Richard III”. Also take another text differing by one character, and the empty string:

sage: pl="Now is the winter of our discontent/ Made glorious summer by this
sun of York;/ And all the clouds that lour'd upon our house/ In the deep
bosom of the ocean buried."
sage: pl1="Now is the winter of our discontent/ Made glorious summer by this
sun of York;/ And all the clouds that lour'd upon our house/ In the deep
bosom of the ocean buried-"
sage: pl2=''

Now hash them:

sage: qd_hash(pl,a,p,q)
sage: qd_hash(pl1,a,p,q)
sage: qd_hash(pl2,a,p,q)

A tiny change in the plaintext causes a complete change in the hash, as we would hope.

One thought on “A quick-and-dirty cryptographic hash function

  1. I know this is an old post, but I just came across it. I like! Strange that this is the first time I’ve run into it. What’s the avalanche look like? Pretty uniform distribution collision potential? One thing to point out though is that it’s implementation in the real-world would be incredibly slow compared to other hashing algorithms as–no matter what you do–it depends on division operations. And of course, if we break out of native int limitations (undoubtedly as a target hash would need to be at least 128 bits to be even remotely useful), that’s a whole new world of slow. Simply ratcheting up the size of something like SHA (aka 512) or implementing an ISAAC derived hasher, etc, etc, would be far faster and, in the real-world, undoubtedly just as secure. Still, great food for thought though and I like how the algorithm demonstrates some of the fundamentals of modern cryptography (DH key exchange immediately pops in my mind).

Leave a Reply

Your email address will not be published. Required fields are marked *