home     blog     about     contact
published: 2020-07-06
started: 2020-07-06
last change: 2020-07-06

C++ std::hash for GMP's big integer types mpz_class and mpz_t

Contents

Purpose
Code
Compile and Run
Approach
License

Purpose

Fast hashing algorithm to hash GMP's big integer type mpz_class and mpz_t so that these types can be used as keys for an unordered_map. The code runs on 32 bit and on 64 bit systems, and it is easily extensible for 128 bit.

Code

#ifndef HASH_MPZ_H_
#define HASH_MPZ_H_

#include <cstddef>
#include <gmpxx.h>

template<> struct std::hash<mpz_srcptr> {
    size_t operator()(const mpz_srcptr x) const;
};

template<> struct std::hash<mpz_t> {
    size_t operator()(const mpz_t x) const;
};

template<> struct std::hash<mpz_class> {
    size_t operator()(const mpz_class &x) const;
};

#endif /* HASH_MPZ_H_ */

Snippet 1: hash_mpz.h

#include "hash_mpz.h"
#include <string_view>

static constexpr size_t pi_size_t() {
    if (sizeof(size_t) == 4) {
        return 0xc90fdaa2; // floor(pi/4 * 2^32)
    } else if (sizeof(size_t) == 8) {
        return 0xc90fdaa22168c234; // floor(pi/4 * 2^64)
    } else {
        throw std::logic_error(
            "this sizeof(size_t) is not supported. only 32 or 64 bits are supported.");
    }
}

static inline size_t scramble(size_t v) {
    return v ^ (pi_size_t() + (v << 6) + (v >> 2));
}

size_t std::hash<mpz_srcptr>::operator()(const mpz_srcptr x) const {
    std::string_view view { reinterpret_cast<char*>(x->_mp_d), abs(x->_mp_size)
            * sizeof(mp_limb_t) };
    size_t result = hash<std::string_view> { }(view);

    // produce different hashes for negative x
    if (x->_mp_size < 0) {
        result = scramble(result);
    }

    return result;
}

size_t std::hash<mpz_t>::operator()(const mpz_t x) const {
    return hash<mpz_srcptr> { }(static_cast<mpz_srcptr>(x));
}

size_t std::hash<mpz_class>::operator()(const mpz_class &x) const {
    return hash<mpz_srcptr> { }(x.get_mpz_t());
}

Snippet 2: hash_mpz.cpp

#include <iostream>
#include <gmpxx.h>
#include <unordered_map>

#include "hash_mpz.h"

using std::cout;
using std::hash;
using std::unordered_map;

int main() {
    mpz_class a;

    mpz_ui_pow_ui(a.get_mpz_t(), 168, 16);

    cout << "a       : " << a << "\n";
    cout << "hash( a): " << (hash<mpz_class> { }(a)) << "\n";
    cout << "hash(-a): " << (hash<mpz_class> { }(-a)) << "\n";

    unordered_map<mpz_class, int> map;
    map[a] = 2;
    cout << "map[a]  : " << map[a] << "\n";
}

Snippet 3: main.cpp

Download all code in one zip archive: hash_gmp_mpz_1_0_0.zip.

Compile and Run

me@hostname:~/Downloads$ unzip hash_gmp_mpz_1_0_0.zip 
Archive:  hash_gmp_mpz_1_0_0.zip
   creating: hash_gmp_mpz_1_0_0/
  inflating: hash_gmp_mpz_1_0_0/main.cpp  
  inflating: hash_gmp_mpz_1_0_0/hash_mpz.cpp  
  inflating: hash_gmp_mpz_1_0_0/hash_mpz.h  
me@hostname:~/Downloads/hash_gmp_mpz_1_0_0$ cd hash_gmp_mpz_1_0_0/
me@hostname:~/Downloads/hash_gmp_mpz_1_0_0$ g++ -std=c++17 *.cpp -o main -lgmp -lgmpxx
hash_mpz.h:5:10: fatal error: gmpxx.h: No such file or directory
 #include <gmpxx.h>
          ^~~~~~~~~
[...]
me@hostname:~/Downloads/hash_gmp_mpz_1_0_0$ sudo apt-get install libgmp-dev
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Suggested packages:
  gmp-doc libmpfr-dev
The following NEW packages will be installed:
  libgmp-dev
[...]
me@hostname:~/Downloads/hash_gmp_mpz_1_0_0$ g++ -std=c++17 *.cpp -o main -lgmp -lgmpxx
me@hostname:~/Downloads/hash_gmp_mpz_1_0_0$ ./main 
a       : 402669288768856477614113920779288576
hash( a): 776996915127542148
hash(-a): 8581308568227109393
map[a]  : 2
me@hostname:~/Downloads/hash_gmp_mpz_1_0_0$ █
Snippet 4: Compiling and Running the Code

Approach

Since C++17, the standard library provides the specialization hash<string_view> which is used to produce the initial hash value.
First, the magnitude data of the big integer is wrapped into a string_view and then the string_view's hash value is calculated using hash<string_view>. This produces an initial hash value which only depends on the magnitude, but not on the sign, of the big integer.
To keep the hashes of positive and negative big integers different, for negative big integers only the initial hash value is scrambled once.

License

The code on this page and in the .zip file is distributed under the following licenses:

GNU LGPL v3
GNU GPL v2 or any later version of the GNU GPL

You can choose the license which is most suitable for your project. These licenses make the code free to use, share, and improve, and allow you to pass on the results. They also cover the licenses used by the GMP project, so there are no license compatibility issues.