Hash

Hash[expr]

gives an integer hash code for the expression expr.

Hash[expr,"type"]

gives an integer hash code of the specified type for expr.

Hash[expr,"type","format"]

gives a hash code in the specified format.

Details

  • Hash[expr,…] will always give the same result for the same expression expr.
  • Possible hash code types include:
    "Adler32"           Adler 32-bit checksum
    "CRC32"             32-bit cyclic redundancy check
    "MD2"               128-bit MD2 code
    "MD4"               128-bit MD4 code
    "MD5"               128-bit MD5 code
    "RIPEMD160"         160-bit RIPEMD code
    "RIPEMD160SHA256"   RIPEMD-160 following SHA-256 (as used in Bitcoin)
    "SHA"               160-bit SHA-1 code
    "SHA256"            256-bit SHA code
    "SHA256SHA256"      double SHA-256 code (as used in Bitcoin)
    "SHA384"            384-bit SHA code
    "SHA512"            512-bit SHA code
    "SHA3-224"          224-bit SHA3 code
    "SHA3-256"          256-bit SHA3 code
    "SHA3-384"          384-bit SHA3 code
    "SHA3-512"          512-bit SHA3 code
    "Keccak224"         224-bit Keccak code
    "Keccak256"         256-bit Keccak code
    "Keccak384"         384-bit Keccak code
    "Keccak512"         512-bit Keccak code
    "Expression"        expression hash code (default)
  • The "Expression" hash code is computed from the internal representation of an expression and may vary between computer systems and from one version of the Wolfram Language to another.
  • For hash code types (such as "SHA") that operate on sequences of bytes, Hash[expr,…] first converts expr to bytes according to:
    expr                bytes based on ToString[FullForm[expr]]
    "string"            bytes in the UTF-8 representation of string
    ByteArray[…]        literal bytes in the byte array
  • Possible formats include:
    "Integer"                 integer (default)
    "DecimalString"           decimal string
    "HexString"               hexadecimal string
    "HexStringLittleEndian"   hexadecimal string with little-endian byte order
    "Base36String"            base-36 alphanumeric string
    "Base64Encoding"          Base64 encoding
    "ByteArray"               hash code as an explicit byte array

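The relationship between the "Integer" and "HexString" formats listed above can be checked with a short sketch. It assumes a version in which the three-argument form is available; the names int and hex are chosen only for illustration.

```wolfram
(* The same SHA-256 digest requested in two formats *)
int = Hash["abc", "SHA256"];               (* default "Integer" format *)
hex = Hash["abc", "SHA256", "HexString"];  (* zero-padded hexadecimal string *)

(* A 256-bit digest occupies 64 hexadecimal digits *)
StringLength[hex]

(* Reading the hexadecimal string as a base-16 number recovers the integer *)
FromDigits[hex, 16] === int
```
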
Examples


Basic Examples  (3)

Hash a string:

Digital fingerprint of data:

SHA256 hash given in hexadecimal form:
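
A minimal sketch of such a call, assuming a version in which strings are hashed through their UTF-8 bytes as described under Details. The digest in the comment is the standard SHA-256 test vector for the three bytes of "abc":

```wolfram
Hash["abc", "SHA256", "HexString"]
(* "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad" *)
```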

Scope  (9)

Hash a general expression:

Equivalently:

Compare all the different hash codes:
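
One possible way to line up several hash types side by side is sketched below. The selection in types is an arbitrary subset of the table under Details, and the variable names are illustrative only.

```wolfram
(* A subset of the supported hash code types *)
types = {"Adler32", "CRC32", "MD5", "SHA", "SHA256", "SHA512"};

(* Map each type name to the hexadecimal digest of the same input *)
digests = AssociationMap[Hash["abc", #, "HexString"] &, types]

(* Digest widths in hexadecimal digits: 8, 8, 32, 40, 64 and 128 *)
StringLength /@ digests
```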

512-bit SHA code given as an integer:

512-bit SHA code given as a decimal string, including leading zeros:

Compare the different string representations of a hash:

The double SHA code given as a ByteArray:

The byte array contains the 256 bits of the result:

View the individual bytes in the array:

When using ByteArray or a string, literal bytes are hashed:

For non-ASCII characters, UTF-8 representation is used for hashing:
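
The UTF-8 rule can be illustrated with a single accented character. This is a sketch rather than the original example; \[EAcute] (the letter é) is chosen arbitrarily, and bytes is an illustrative name.

```wolfram
(* \[EAcute] occupies two bytes in UTF-8 *)
bytes = ToCharacterCode["\[EAcute]", "UTF-8"]
(* {195, 169} *)

(* Hashing the string agrees with hashing its UTF-8 bytes directly *)
Hash["\[EAcute]", "SHA256"] === Hash[ByteArray[bytes], "SHA256"]
```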

Compute a cryptographic hash of zero bytes:
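
A sketch of hashing an empty input, assuming that ByteArray[{}] evaluates to an empty byte array (this may not hold in older versions). The digest in the comment is the well-known SHA-256 value of the empty message:

```wolfram
(* Assumption: ByteArray[{}] is a valid empty byte array in this version *)
Hash[ByteArray[{}], "SHA256", "HexString"]
(* "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855" *)

(* The empty string also contributes zero bytes *)
Hash["", "SHA256", "HexString"] === Hash[ByteArray[{}], "SHA256", "HexString"]
```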

Applications  (2)

Provide a "checksum" to validate data integrity:

Change some of the data:

The checksum has changed:
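
The three steps above can be sketched as follows. The sample sentence and the names data, checksum and changed are illustrative, and "SHA256" stands in for whichever hash type the checksum uses.

```wolfram
(* Record a checksum for the original data *)
data = "The quick brown fox jumps over the lazy dog";
checksum = Hash[data, "SHA256", "HexString"];

(* Change a single character of the data *)
changed = StringReplace[data, "dog" -> "cog"];

(* The recomputed checksum no longer matches the stored one *)
Hash[changed, "SHA256", "HexString"] === checksum
(* False *)
```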

A concatenated cryptographic hash function:

Hash code of "abcdef":
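
One way to read "concatenated" is joining the digests of two different hash functions. The construction used in the original example is not shown here, so the following is only a sketch under that assumption; concatHash is a hypothetical name, and the choice of "MD5" and "SHA" is arbitrary.

```wolfram
(* Hypothetical helper: MD5 digest followed by SHA-1 digest, both in hexadecimal *)
concatHash[data_] := Hash[data, "MD5", "HexString"] <> Hash[data, "SHA", "HexString"]

concatHash["abcdef"]

(* 32 hexadecimal digits from MD5 plus 40 from SHA-1 *)
StringLength[concatHash["abcdef"]]
(* 72 *)
```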

Properties & Relations  (12)

The hash is always the same for identical expressions:

Distinct hash codes come from distinct inputs:

The default hash code is "Expression":

The "Expression" hash fits in a machine word:

The leading bit is zero:
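
A rough check of the default type and of the machine-word claim. It assumes that "machine word" means $SystemWordLength bits, so that a zero leading bit bounds the hash below 2^($SystemWordLength - 1); expr is an arbitrary sample expression.

```wolfram
expr = {a, b, c};

(* Omitting the type is the same as asking for "Expression" *)
Hash[expr] === Hash[expr, "Expression"]

(* Nonnegative and below 2^(word length - 1), i.e. the leading bit is zero *)
0 <= Hash[expr] < 2^($SystemWordLength - 1)
```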

"Integer" is the default format:

"DecimalString" is the string version of "Integer", padded with zeros if necessary:

"HexString" is a base 16 representation, padded with zeros if necessary:

"Base36String" is a base 36 representation, padded with zeros if necessary:

"Base64Encoding" encodes bytes of the result using Base64 encoding:

"ByteArray" is a base 256 representation:

Convert from base 256 to an integer:

The result is the same:
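
The conversion can be sketched as follows, using "MD5" as an arbitrary example type; bytes is an illustrative name.

```wolfram
(* The digest as a list of base-256 digits, most significant byte first *)
bytes = Normal[Hash["abc", "MD5", "ByteArray"]];
Length[bytes]
(* 16, that is, 128 bits *)

(* Interpreting the bytes as one base-256 number gives the "Integer" format *)
FromDigits[bytes, 256] === Hash["abc", "MD5"]
```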

Repeated hash can be obtained by using ByteArray as an intermediate result:
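
A sketch of a repeated hash, compared against the built-in "SHA256SHA256" type. The comparison is expected to give True if that type is plain SHA-256 applied twice, as the type table describes; once and twice are illustrative names.

```wolfram
(* First pass: keep the raw digest bytes *)
once = Hash["abc", "SHA256", "ByteArray"];

(* Second pass: hash the bytes of the first digest *)
twice = Hash[once, "SHA256", "HexString"];

(* Compare with the built-in double SHA-256 type *)
twice === Hash["abc", "SHA256SHA256", "HexString"]
```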

FileHash[file,code] is effectively equivalent to Hash[ReadByteArray[file],code]:
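
The equivalence can be tried on a small temporary file. This is a sketch: the file name hash-sketch.bin and the five sample bytes are made up for illustration.

```wolfram
(* Write a few bytes to a scratch file *)
file = FileNameJoin[{$TemporaryDirectory, "hash-sketch.bin"}];
BinaryWrite[file, {1, 2, 3, 4, 5}];
Close[file];

(* Hash the file directly and via its contents read as a byte array *)
FileHash[file, "SHA256"] === Hash[ReadByteArray[file], "SHA256"]

(* Remove the scratch file *)
DeleteFile[file]
```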

Possible Issues  (2)

Hash of a List of integers uses a serialized version of the list as an expression:

To hash literal bytes, use a ByteArray:

ASCII strings can be used to hash 7-bit byte values:
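
The distinction can be sketched with the three byte values 97, 98, 99 (the ASCII codes of "abc"); list is an illustrative name.

```wolfram
list = {97, 98, 99};

(* A List is hashed as an expression, so it does not match the literal bytes;
   per the text above, this comparison is expected to give False *)
Hash[list, "SHA256"] === Hash[ByteArray[list], "SHA256"]

(* An ASCII string built from the same codes is hashed as those bytes *)
Hash[FromCharacterCode[list], "SHA256"] === Hash[ByteArray[list], "SHA256"]
```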

The Hash of an expression and a string containing the FullForm of the expression are different:
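
A sketch of the comparison, with f[x, y] as an arbitrary sample expression. According to the statement above, the final comparison should not give True.

```wolfram
expr = f[x, y];

(* The FullForm of the expression as a string *)
str = ToString[FullForm[expr]]
(* "f[x, y]" *)

(* Hash of the expression versus hash of the string *)
Hash[expr, "SHA256"] === Hash[str, "SHA256"]
```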

Neat Examples  (2)

Hash collisions are very rare, but possible. Here are two lists of bytes:

They are not the same:

They differ in two locations:

Hash the two sequences of bytes:

Their hashes are identical:

Distribution of hash values for different types:

Introduced in 1988 (1.0) | Updated in 2007 (6.0), 2016 (11.0), 2018 (11.3), 2019 (12.0)