Making a hash of things

The following is the article I submitted as part of my application for the Richard Casement internship at The Economist. My hope was to demonstrate an ability to deal with a very technical subject in a comprehensible way. This post will be automatically published once the contest has closed in all time zones.

Making a hash of things

A contest to replace a workhorse of computer security is announced

While Julius Caesar hoped to prevent the hostile interception of his orders through the use of a simple cipher, modern cryptography has far more applications. One of the key drivers behind that versatility is an important but little-known tool called a hash function. These consist of algorithms that take a particular collection of data and generate a smaller ‘fingerprint’ from it. That can later be used to verify the integrity of the data in question, which could be anything from a password to digital photographs collected at a crime scene. Hash functions are used to protect against accidental changes to data, such as those caused by file corruption, as well as intentional efforts at fraud. Cryptographer and security expert Bruce Schneier calls hash functions “the workhorse of cryptography” and explains that: “Every time you do something with security on the internet, a hash function is involved somewhere.” As techniques for digital manipulation become more accessible and sophisticated, the importance of such verification tools becomes greater. At the same time, the emergence of a significant threat to the most commonly used hashing algorithm in existence has prompted a search for a more secure replacement.

Hash functions modify data in ways subject to two conditions: that it be impossible to work backward from the transformed or ‘hashed’ version to the original, and that multiple originals not produce the same hashed output. As with standard cryptography (in which unencrypted text is passed through an algorithm to generate encrypted text, and vice versa), the standard of ‘impossibility’ is really one of impracticability, given available computing resources and the sensitivity of the data in question. The hashed ‘fingerprint’ can be compared with a file and, if they still correspond, the integrity of the file is affirmed. Also, computer systems that store hashed versions of passwords do not pose the risk of yielding all user passwords in plain text form, if the files containing them are accidentally exposed of maliciously infiltrated. When users enter passwords to be authenticated, they can be hashed and compared with the stored version, without the need to store the unencrypted form. Given the frequency of ‘insider’ attacks within organizations, such precautions benefit both the users and owners of the systems in question.

Given their wide range of uses, the integrity of hash functions has become important for many industries and applications. For instance, they are used to verify the integrity of software security updates distributed automatically over the Internet. If malicious users were able to modify a file in a way that did not change the ‘fingerprint,’ as verified through a common algorithm, it could open the door to various kinds of attack. Alternatively, malicious users who could work backward from hashed data to the original form could compromise systems in other ways. They could, for instance, gain access to the unencrypted form of all the passwords in a large database. Since most people use the same password for several applications, such an attack could lead to further breaches. The SHA-1 algorithm, which has been widely used since 1995, was significantly compromised in February 2005. This was achieved by a team led by Xiaoyun Wang and primarily based at China’s Shandong University. In the past, the team had demonstrated attacks against MD5 and SHA: hash functions prior to SHA-1. Their success has prompted calls for a more durable replacement.

The need for such a replacement has now led the U.S. National Institute of Standards and Technology to initiate a contest to devise a successor. The competition is to begin in the fall of 2008, and continue until 2011. Contests like the one ongoing have a promising history in cryptography. Notably, the Advanced Encryption Standard, which was devised as a more secure replacement to the prior Data Encryption Standard, was decided upon by means of an open competition between fifteen teams of cryptographers between 1997 and 2000. At least some of those disappointed in that contest are now hard at work on what they hope will become one of the standard hash functions of the future.

Author: Milan

In the spring of 2005, I graduated from the University of British Columbia with a degree in International Relations and a general focus in the area of environmental politics. In the fall of 2005, I began reading for an M.Phil in IR at Wadham College, Oxford. Outside school, I am very interested in photography, writing, and the outdoors. I am writing this blog to keep in touch with friends and family around the world, provide a more personal view of graduate student life in Oxford, and pass on some lessons I've learned here.

24 thoughts on “Making a hash of things”

  1. Well done, though you say nothing about how hash functions actually work.

  2. Very interesting and understandable too. How do the hash functions work?

  3. I have no idea how they work. Just as I do not know how coronary bypass operations work.

    The mechanics are not the important thing here.

  4. “The National Institute of Standards and Technology has opened a public competition for the development of a new cryptographic hash algorithm, which will be called Secure Hash Algorithm-3 (SHA-3), and will augment the current algorithms specified in the Federal Information Processing Standard (FIPS) 180-2. This is in response to serious attacks reported in recent years against cryptographic hash algorithms, including SHA-1, and because SHA-1 and the SHA-2 family share a similar design. Submissions are being accepted through October 2008, and the competition timeline indicates that a winner will be announced in 2012.”

    Read more of this story at Slashdot.

  5. Making a hash of it

    Dec 11th 2007
    A cast-iron way of identifying documents is looking a little rusty

    “Of course, it is all an illusion. He would certainly not have time, once the result is known, to construct a document containing the name of the winner in such a way that its hash would come out just right. That would be what cryptographers call a “pre-image attack”, and no way of mounting such an attack is known. Instead, Dr de Weger’s group has concentrated its efforts on the other property of hashes: that it is hard to find two documents that have the same hash. Hard, but as it turns out, not impossible. Constructing two such co-incidental documents is known as a “collision attack”. And it is this trick that the researchers have pulled off. Indeed, they have created not merely two, but 12 documents that have the same hash. Each of these documents contains the name of one of the 12 leading presidential candidates, so it is just a question of posting the right one once the result of the election is known.”

  6. “The point of all this is to expose the weakness of MD5 hashing. You could, for instance, present your boss with a document for him to sign. If this all happened electronically, the document could then be hashed to make sure it was not altered after the signing. But if you have a suitably prepared collision attack at your disposal, and have thus created two very different documents with the same hash, then your boss is at your mercy. Bear that in mind, next time you are negotiating a pay rise.”

  7. Ever Better Cryptanalytic Results Against SHA-1

    By Bruce Schneier

    The SHA family (which, I suppose, should really be called the MD4 family) of cryptographic hash functions has been under attack for a long time. In 2005, we saw the first cryptanalysis of SHA-1 that was faster than brute force: collisions in 2^69 hash operations, later improved to 2^63 operations. A great result, but not devastating. But remember the great truism of cryptanalysis: attacks always get better, they never get worse. Last week, devastating got a whole lot closer. A new attack can, at least in theory, find collisions in 2^52 hash operations — well within the realm of computational possibility. Assuming the cryptanalysis is correct, we should expect to see an actual SHA-1 collision within the year.

  8. MD6 Withdrawn from SHA-3 Competition

    By Bruce Schneier

    In other SHA-3 news, Ron Rivest seems to have withdrawn MD6 from the SHA-3 competition. From an e-mail to a NIST mailing list:

    We suggest that MD6 is not yet ready for the next SHA-3 round, and we also provide some suggestions for NIST as the contest moves forward.

    Basically, the issue is that in order for MD6 to be fast enough to be competitive, the designers have to reduce the number of rounds down to 30-40, and at those rounds, the algorithm loses its proofs of resistance to differential attacks.

  9. Generating Fast MD5 Collisions With ATI Video Cards

    “Yesterday at Black Hat USA 2009, a talk entitled MD5 Chosen-Prefix Collisions on GPUs (whitepaper) (Both PDFs) presented an implementation written in assembly language for ATI video cards that achieves 1.6 billion MD5 hash/sec, or 2.2 billion MD5 hash/sec with reversing, on an ATI Radeon HD 4850 X2. This is faster than the much-publicized 1.4-1.9 billion hash/sec figure that was supposedly reached on a PlayStation 3 by Nick Breese at Black Hat Europe 2008 (he later noticed an error in his benchmarking tool). Compared to the cluster of 215 PlayStation 3s that was used to create a rogue CA in December 2008, Marc Bevand claimed a cluster of 12 machines with 24 video cards would be a bit faster, consume 5 times less power, and be 10 times cheaper.”

  10. Dmitry Sklyarov and co. crack Canon’s “image verification” anti-photoshopping tool

    Cory Doctorow at 8:18 AM Tuesday, Nov 30, 2010

    Dmitry Sklyarov and his colleagues at Elcomsoft have cracked the “image verification” system in high-end Canon cameras; this system digitally signs the photos you take so any alternations, “touch ups” or other modifications can be detected. Sklyarov (who became a cause celebre when he broke the DRM on Adobe’s ebooks and was thrown in jail by the FBI at Adobe’s behest) and his team have a sense of humor — they’ve produced correctly signed images of astronauts planting the Soviet flag on the moon and the Statue of Liberty holding a sickle, among others.

  11. OSK-E3 is proved useless

    The credibility of photographic evidence becomes vital in numerous situations for insurance companies and courts, as they may accept digital image as indisputable evidence if it can be proven genuine. However, the discovered vulnerability in Canon Original Data Security system proves that verification data can be forged and, thus, the whole verification system cannot be relied upon.

    In brief, modern DSLR (Digital Single-Lens Reflex) cameras produced by Canon feature Original Data Security system which is meant to securely validate the authenticity of image data and prove image genuineness. Accordingly, one can use OSK-E3 (Canon Original Data Security Kit) which comprises smart card and special software to verify a digitally signed image.

    ElcomSoft discovered the vulnerability which allows producing images that will be positively validated by Canon’s own Original Data Security Kit (OSK-E3) regardless of whether or not the images are, in fact, genuine.

  12. France’s new data retention law requires online service providers to retain databases of their users’ addresses, real names and passwords, and to supply these to police on demand. Leaving aside the risk of retaining all this personal information (identity thieves, stalkers, etc — that which isn’t stored can’t be stolen and leaked), there’s the risk of requiring providers to store plaintext unhashed passwords, as Bruce Schneier points out.

    Well-designed systems don’t store passwords; rather, they take the password you supply and run it through a cryptographic hashing algorithm that turns it into another string (in theory, this string can’t be turned back into the password). When you re-visit the website and supply your password, it is run through the algorithm again, and then the result is compared to the stored version. That way, no one — not even the provider — knows your password (except you). Again, that which isn’t stored can’t be leaked. Requiring French online services to keep a record of unhashed passwords is a reversal of decades of best practices in security.

  13. “Elcomsoft claims to have broken Nikon’s Image Authentication system which — apparently only in theory — ensures that a photograph is authentic and not tampered with through a digital signature. They were able to extract the signing key from a camera and use it to have a modified image pass the software verification, rendering the rather expensive feature mostly marketed to law enforcement all but useless. So far Nikon has not given a statement. Canon’s competing system was cracked by the same company last December.”


    A new community-enhanced version of John the Ripper adds support for GPUs via CUDA and OpenCL, currently focusing on slow-to-compute hashes and ciphers such as Fedora’s and Ubuntu’s sha512crypt, OpenBSD’s bcrypt, encrypted RAR archives, WiFi WPA-PSK. A 5x speedup over AMD FX-8120 CPU per-chip is achieved for sha512crypt on NVIDIA GTX 570, whereas bcrypt barely reaches the CPU’s speed on an AMD Radeon HD 7970 (a high-end GPU). This result reaffirms that bcrypt is a better current choice than sha512crypt (let alone sha256crypt) for operating systems, applications, and websites to move to, unless they already use one of these ‘slow’ hashes and until a newer/future password hashing method such as one based on the sequential memory-hard functions concept is ready to move to. The same John the Ripper release also happens to add support for cracking of many additional and diverse hash types ranging from IBM RACF’s as used on mainframes to Russian GOST and to Drupal 7’s as used on popular websites — just to give a few examples — as well as support for Mac OS X keychains, KeePass and Password Safe databases, Office 2007/2010 and ODF documents, Firefox/Thunderbird/SeaMonkey master passwords, more RAR archive kinds, WPA-PSK, VNC and SIP authentication, and it makes greater use of AMD Bulldozer’s XOP extensions

  15. Today, more than 20 years after of SHA-1 was first introduced, we are announcing the first practical technique for generating a collision. This represents the culmination of two years of research that sprung from a collaboration between the CWI Institute in Amsterdam and Google. We’ve summarized how we went about generating a collision below. As a proof of the attack, we are releasing two PDFs that have identical SHA-1 hashes but different content.

    For the tech community, our findings emphasize the necessity of sunsetting SHA-1 usage. Google has advocated the deprecation of SHA-1 for many years, particularly when it comes to signing TLS certificates. As early as 2014, the Chrome team announced that they would gradually phase out using SHA-1. We hope our practical attack on SHA-1 will cement that the protocol should no longer be considered secure.

    We hope that our practical attack against SHA-1 will finally convince the industry that it is urgent to move to safer alternatives such as SHA-256.

Leave a Reply

Your email address will not be published. Required fields are marked *