Documentation
¶
Overview ¶
Package bbhash implements the BBHash algorithm for minimal perfect hash functions.
Index ¶
- Variables
- func Keys(hashFunc func([]byte) uint64, chunks iter.Seq[[]byte]) []uint64
- func ReadChunks(readerInfo io.Reader, bufSz int) iter.Seq[[]byte]
- type BBHash
- func (bb BBHash) AppendBinary(buf []byte) (_ []byte, err error)
- func (bb BBHash) BitVectors(varName string) string
- func (bb BBHash) BitsPerKey() float64
- func (bb BBHash) Find(key uint64) uint64
- func (bb BBHash) Key(index uint64) uint64
- func (bb BBHash) LevelVectors() [][]uint64
- func (bb BBHash) Levels() int
- func (bb BBHash) MarshalBinary() ([]byte, error)
- func (bb BBHash) String() string
- func (bb *BBHash) UnmarshalBinary(data []byte) error
- type BBHash2
- func (b2 BBHash2) AppendBinary(buf []byte) (_ []byte, err error)
- func (bb BBHash2) BitVectors(varName string) string
- func (bb BBHash2) BitsPerKey() float64
- func (bb BBHash2) Find(key uint64) uint64
- func (bb BBHash2) Key(index uint64) uint64
- func (bb BBHash2) LevelVectors() [][][]uint64
- func (b2 BBHash2) MarshalBinary() ([]byte, error)
- func (bb BBHash2) MaxMinLevels() (max, min int)
- func (bb BBHash2) Partitions() int
- func (bb BBHash2) SinglePartition() *BBHash
- func (bb BBHash2) String() string
- func (b2 *BBHash2) UnmarshalBinary(data []byte) error
- type Options
Examples ¶
Constants ¶
This section is empty.
Variables ¶
var FastHashFunc = func(buf []byte) uint64 { return fast.Hash64(123, buf) }
var SHA256HashFunc = func(buf []byte) uint64 { h := sha256.New() h.Write(buf) return binary.LittleEndian.Uint64(h.Sum(nil)) }
Functions ¶
Types ¶
type BBHash ¶
type BBHash struct {
// contains filtered or unexported fields
}
BBHash represents a minimal perfect hash for a set of keys.
func (BBHash) AppendBinary ¶
AppendBinary implements the encoding.BinaryAppender interface.
func (BBHash) BitVectors ¶
BitVectors returns a Go slice for BBHash's per-level bit vectors. This is intended for testing and debugging; no guarantees are made about the format.
func (BBHash) BitsPerKey ¶
BitsPerKey returns the number of bits per key in the minimal perfect hash.
func (BBHash) Find ¶
Find returns a unique index representing the key in the minimal hash set.
The return value is meaningful ONLY for keys in the original key set (provided at the time of construction of the minimal hash set).
If the key is in the original key set, the return value is guaranteed to be in the range [1, len(keys)].
If the key is not in the original key set, two things can happen: 1. The return value is 0, representing that the key was not in the original key set. 2. The return value is in the expected range [1, len(keys)], but is a false positive.
Example ¶
package main
import (
"fmt"
"github.com/relab/bbhash"
)
func main() {
keys := []uint64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
bb, err := bbhash.New(keys, bbhash.Gamma(1.5))
if err != nil {
panic(err)
}
for _, key := range keys {
hashIndex := bb.Find(key)
fmt.Printf("%d, ", hashIndex)
}
fmt.Println()
}
Output: 2, 6, 1, 4, 9, 3, 8, 5, 10, 7,
func (BBHash) Key ¶
Key returns the key for the given index. The index must be in the range [1, len(keys)], otherwise 0 is returned.
func (BBHash) LevelVectors ¶
LevelVectors returns a slice representation of the BBHash's per-level bit vectors.
func (BBHash) MarshalBinary ¶
MarshalBinary implements the encoding.BinaryMarshaler interface.
func (BBHash) String ¶
String returns a string representation of BBHash with overall and per-level statistics.
func (*BBHash) UnmarshalBinary ¶
UnmarshalBinary implements the encoding.BinaryUnmarshaler interface.
type BBHash2 ¶
type BBHash2 struct {
// contains filtered or unexported fields
}
BBHash2 represents a minimal perfect hash for a set of keys.
func New ¶
New creates a new BBHash2 for the given keys. The keys must be unique. Creation is configured using the provided options. The default options are used if none are provided. Available options include: Gamma, InitialLevels, Partitions, Parallel, and WithReverseMap. With fewer than 1000 keys, the sequential version is always used.
func (BBHash2) AppendBinary ¶
AppendBinary implements the encoding.BinaryAppender interface.
func (BBHash2) BitVectors ¶
BitVectors returns a Go slice for BBHash2's per-partition, per-level bit vectors. This is intended for testing and debugging; no guarantees are made about the format.
func (BBHash2) BitsPerKey ¶
BitsPerKey returns the number of bits per key in the minimal perfect hash.
func (BBHash2) Find ¶
Find returns a unique index representing the key in the minimal hash set.
The return value is meaningful ONLY for keys in the original key set (provided at the time of construction of the minimal hash set).
If the key is in the original key set, the return value is guaranteed to be in the range [1, len(keys)].
If the key is not in the original key set, two things can happen: 1. The return value is 0, representing that the key was not in the original key set. 2. The return value is in the expected range [1, len(keys)], but is a false positive.
func (BBHash2) Key ¶
Key returns the key for the given index. The index must be in the range [1, len(keys)], otherwise 0 is returned.
func (BBHash2) LevelVectors ¶
LevelVectors returns a slice representation of BBHash2's per-partition, per-level bit vectors.
func (BBHash2) MarshalBinary ¶
MarshalBinary implements the encoding.BinaryMarshaler interface.
func (BBHash2) MaxMinLevels ¶
MaxMinLevels returns the maximum and minimum number of levels across all partitions.
func (BBHash2) Partitions ¶
Partitions returns the number of partitions in the BBHash2. This is mainly useful for testing and may be removed in the future.
func (BBHash2) SinglePartition ¶
SinglePartition returns the underlying BBHash if it contains a single partition. If there are multiple partitions, it returns nil.
func (*BBHash2) UnmarshalBinary ¶
UnmarshalBinary implements the encoding.BinaryUnmarshaler interface.
type Options ¶
type Options func(*options)
func Gamma ¶
Gamma sets the gamma parameter for creating a BBHash. The gamma parameter is the expansion factor for the bit vector; the paper recommends a value of 2.0. The larger the value the more space will be consumed by the BBHash.
func InitialLevels ¶
InitialLevels sets the initial number of levels to use when creating a BBHash.
func Parallel ¶
func Parallel() Options
Parallel creates a BBHash by sharding the keys across multiple goroutines. This option is not compatible with the Partitions option.
func Partitions ¶
Partitions sets the number of partitions to use when creating a BBHash2. The keys are partitioned into the given the number partitions. Setting partitions to less than 2 results in a single BBHash, wrapped in a BBHash2.
func WithReverseMap ¶
func WithReverseMap() Options
WithReverseMap creates a reverse map when creating a BBHash.