Hash functions for MPI datatypes.
Julien Langou, George Bosilca, Graham Fagg and Jack Dongarra.
-
Presentation at ICL lunch presentation (09/23/2004
(PDF)
-
Technical paper (UT-CS-05-552) presented at Euro PVM-MPI 2005
(PDF)
In the Proceedings of the 12th European PVM/MPI Users'
Group Meeting, Sorrento, Italy, September 2005. LCNS-3666:76-83, 2005.
-
Code used to make the experiments of the paper (under modified BSD licence)
08/01/2006 hash-MPI-datatype--0.9.2.tgz : add modified BSD licence
05/03/2005 hash-MPI-datatype--0.9.1.tgz : first public release
-------------------------------------------------------------------------------------------
Addition Multiplication
------------------------------------------- -----------------------------------------------
mmm_bs1 + modulo (2^w)-1 left bitshift circulaire 1(<=> * modulo (2^w)-1 )
xor_bs1 xor left bitshift circulaire 1
gfd + in GF(2^w) * in GF(2^w)
crc cyclic redundancy check cyclic redundancy check
-------------------------------------------------------------------------------------------
--------------------------------------------
TESTING OF THE PROPERTIES OF CONCATENATION
--------------------------------------------
to create the executables: make testing
./xxx_test_concat_mmm_bs1
./xxx_test_concat_xor_bs1
check that
f( [ X1, X2 ] ) = f(X1) o f(X2)
for various value of X1, X2 (see OPTIONS)
and check that
f( [ X1, X2 , X3, X4 ] ) = ((f(X1) o f(X2))o(f(X3) o f(X4)))
for various value of X1, X2, X3, X4 (see OPTIONS)
OPTIONS:
-wg <WG>
the number of predefined datatype
DEFAULT: WG=48 (ask George why)
-w <W> -m <M>
the size of the signatures will be W*M bits
W*M should be smaller or equal to 32
CORRECT VALUE: (W=16,M=1) (W=8,M=2)
DEFAULT: (W=8,M=2)
-nx <NX>
nx is the length of the word to be encoded
the first and the second words are both of size nx/2
if -nx1 or -nx2, this option is ignored
DEFAULT: NX=100
-nx1 <NX1>
the size of the first word X1
if nx1 is defined and nx2 is not, nx2=nx1 whatever nx is
DEFAULT: NX1=50
-nx2 <NX2>
the size of the second word X2
if nx2 is defined and nx1 is not, nx1=nx2 whatever nx is
DEFAULT: 50
-seed <SEED>
the seed to generate the first random number
DEFAULT: 1
-iter <ITER>
the number of times the operations are performed
DEFAULT: 10
---------------------------
HASHING QUALITY
---------------------------
to create the executables: make quality
./xxx_bourrin_main_16
./xxx_bourrin_main_32
./xxx_bourrin_random_16
./xxx_bourrin_random_16
NO OPTIONS AVAILABLE
those programs analyze the quality of the hash of different hash functions.
Namely we compute the number of collisions and duplicates for the following hash:
1. crc 16 bit CRC/CCITT
2. crc XMODEM
3. crc ARC 1. crc 32 bit CRC/CCITT
4. xor bs1 (w=16,m= 1) 2. xor bs1 (w=16,m= 2)
5. mmm bs1 (w=16,m= 1) 3. mmm bs1 (w=16,m= 2)
6. gfd (w=16,m= 1) 4. gfd (w=16,m= 2)
7. xor bs1 (w= 8,m= 2) 5. xor bs1 (w= 8,m= 4)
8. mmm bs1 (w= 8,m= 2) 6. mmm bs1 (w= 8,m= 4)
9. gfd (w= 8,m= 2) 7. gfd (w= 8,m= 4)
for the following set of type signatures:
[[ (N:x) means N times the datatype x ]]
For all x for 1 to WG=13,
0. ( N1*N2:x )
For all x from 1 to WG=13 and for all y from x+1 to WG=13
1. ( N2:(1:x,(N1-1):y))
2. (1:x) ((N2-1):(1:x, (N1-1):y)) (1:x,(N1-2):y)
3. (floor(N1*N2/2):x) (ceil(N1*N2/2):y)
4. if N1 is different from N2, N1 and N2 are interchanged in 1.
5. if N1 is different from N2, N1 and N2 are interchanged in 2.
[[ For example N1=5,N2=3: ]]
[[ 0. xxxxxxxxxxxxxxx ]]
[[ 1. xyyyyxyyyyxyyyy ]]
[[ 2. xxyyyyxyyyyxyyy ]]
[[ 3. xxxxxxxyyyyyyyy ]]
[[ 4. yyxxxxyxxxxyxxx ]]
[[ 5. yyyyyyyyxxxxxxx ]]
In this codes (N1,N2) takes the following values:
(N1=15,N2=15) ; (N1=20,N2= 2) ; (N1=20,N2=10) ; (N1=20,N2=15) ;
(N1=20,N2=20) ; (N1=21,N2= 2) ; (N1=21,N2=21) ; (N1=31,N2=17) ;
(N1=32,N2=16) ; (N1=33,N2= 2) ;
This amounts in comparing 6994 hash values.
It reproduces the tabular presented in Table 1 in UT-CS-05-552.
Datatypes of the kind 0., 1. and 2. are first introduced in Gropp, Euro PVM MPI (2000)
---------------------------
TIMING QUALITY
---------------------------
to create the executables: make timing
./xxx_chrono_16
<!warning!> : line 37 of chrono_main_16.c, please hard code the frequency of your machine
otherwise will not have the correct timings
This programs performs three timings:
1- encoding of X in f(X).
X is a sentence of NX (or NX1*2, NX2*2, NX1+NX2) words, each word is chosen randomly among WG,
f(X) is of size W*M
2- encoding of X=(X1,X2) knowing f(X1) and f(X2), X1 is of size NX1, X2 is of size NX2
3- encoding of X knowing X=uni(X1,n), X1 is of size NX1
The signature f is gfd_signature or crc_signature.
For 2 and 3, we also check that the answer is correct; if it is not an error message is produced.
-------------------------------------
OPTIONS AVAILABLE IN THE COMMAND LINE
-------------------------------------
-wg <WG>
the number of predefined datatype
DEFAULT: WG=48 (ask George why)
-w <W> -m <M>
the size of the signatures will be W*M bits
W*M should be smaller or equal to 32
CORRECT VALUE: (W=16,M=1) (W=8,M=2)
DEFAULT: (W=8,M=2)
-nx <NX>
nx is the length of the word to be encoded
the first and the second words are both of size nx/2
if -nx1 or -nx2, this option is ignored
DEFAULT: NX=100
-nx1 <NX1>
the size of the first word X1
if nx1 is defined and nx2 is not, nx2=nx1 whatever nx is
DEFAULT: NX1=50
-nx2 <NX2>
the size of the second word X2
if nx2 is defined and nx1 is not, nx1=nx2 whatever nx is
DEFAULT: 50
-seed <SEED>
the seed to generate the first random number
DEFAULT: 1
-iter <ITER>
the number of times the operations are performed
DEFAULT: 10
Example of use:
./xxx_test_concat_mmm_bs1 -nx1 10 -nx2 5 -nx3 1 -nx4 2 -wg 255 -w 8 -m 3 -iter 10 -seed 1