A very performant, fairly lightweight HashSet
implementation in C.
Note
This HashSet
implementation solely relies on a "Open Adressing" implementation. A "Array of Linked List
s" implementation and "Treeifying" a node after a certain threshold yielded a negative performance impact from the extensive testing needed. Therefore, the current implementation uses "Open Adressing". This HashSet
implementation does all memory management and cleanup internally.
This document provides an overview and detailed description of the functions available in this HashSet
implementation.
This HashSet
implementation provides a flexible hash set with functions to create, manipulate, and destroy hash sets. It supports both default and custom hash functions.
typedef struct HashSet *HashSet;
typedef size_t (*hash)(const void *key, size_t key_size);
HashSet hs_create(size_t hs_capacity, size_t key_size, hash hash_func);
int hs_destroy(HashSet hs);
bool hs_contains(HashSet hs, const void *key);
int hs_add(HashSet hs, const void *key);
size_t hs_size(HashSet hs);
int hs_remove(HashSet hs, const void *key);
HashSet hs_create(size_t hs_capacity, size_t key_size, hash hash_func)
Description: Creates a new HashSet
with a hash function.
Tip
Recommended way to create a HashSet
is to use a custom hash function to avoid unwanted collisions.
-
Parameters:
hs_capacity
: The initial capacity of theHashSet
.key_size
: The size of the key.hash_func
: The hash function be be used, orNULL
for generic hashing.
-
Returns: A new
HashSet
.
int hs_destroy(HashSet hs);
Description: Destroys the HashSet
.
-
Parameters:
hs
: TheHashSet
to be destroyed.
-
Returns: A success code.
bool hs_contains(HashSet hs, const void *key);
Description: Tests if the HashSet
contains the specified key.
-
Parameters:
hs
: TheHashSet
.key
: The key to be tested.
-
Returns: True or Falsehood.
int hs_add(HashSet hs, const void *key);
Description: Adds a new key to the HashSet
.
-
Parameters:
hs
: TheHashSet
.key
: The key to be added.
-
Returns: A sucess code.
size_t hs_size(HashSet hs);
Description: Returns the size of the HashSet
.
-
Parameters:
hs
: TheHashSet
.
-
Returns: The size.
int hs_remove(HashSet hs, const void *key);
Description: Removes the key from the HashSet.
-
Parameters:
hs
: TheHashSet
.key
: The key to be removed.
-
Returns: A success code.
0
: Success.- Non-zero values indicate errors.
Warning
Micro benchmarks can be misleading, and performance can vary from system to system.
Note
All implementation used a struct/class
containg a randomly generated string
as the key. Same hashing function was used in each benchmark (DJB2). Each micro benchmark is was run 100 times and the average time for each operation was taken.
Note
Used gcc -std=c2x -Ofast hashset.h hashset.c main.c for compilation.
Operation | Average Time (100,000 elements) |
---|---|
Insertion | 0.001872 seconds |
Lookup | 0.001362 seconds |
Deletion | 0.001498 seconds |
Note
Used g++ -Ofast main.cpp.
Operation | Average Time (100,000 elements) |
---|---|
Insertion | 0.00908522 seconds |
Lookup | 0.000315724 seconds |
Deletion | 0.005854 seconds |
Note
Used OpenJDK 22.
Operation | Average Time (100,000 elements) |
---|---|
Insertion | 0.006076 seconds |
Lookup | 0.001066 seconds |
Deletion | 0.001115 seconds |
Note
Used CPython 3.11.9.
Operation | Average Time (100,000 elements) |
---|---|
Insertion | 0.007460 seconds |
Lookup | 0.003630 seconds |
Deletion | 0.003446 seconds |
Here is a basic example demonstrating how to use the HashSet
functions:
#include <stdio.h>
#include "HashSet.h"
int main()
{
// Create a HashSet
HashSet hs = hs_create(10, sizeof(int), NULL);
// Add a key-value pair
int key = 1;
hs_add(hs, &key);
// Check if the HashSet contains the key
printf("%s\n", hs_contains(hs, &key) ? "true" : "false");
// Destroy the HashSet
hs_destroy(hs);
return 0;
}