Alboin wrote:
- You run your key through the hash function. (I'm using FNV.)
- The hash is then the index to the array.
- Access the element in the array.
- Do chaining if collision occurs.
I'm somewhat confused by the whole thing:
How does one keep an arrays with hashed keys so large? For instance, "Hello World!" hashed is 1754836950. Am I then supposed to mod that by the array size? How would one go about this?
int modulated_hash = hash(text) % table_size;
Quote:
And then, with the whole chaining and open addressing situation: If implementing an associative array, would collisions be met? If so, how would one determine what element in the list (Considering I went with chaining.) was wanted?
struct item *curitem = items[modulated_hash];
while (curitem != 0 && curitem->key != requested_item_key) curitem = curitem->next;
if (curitem) return curitem->value;
return 0;
This is with closed addressing (the names confuse, yes). It's closed since given an item, you know in which box it must be. You could also use open addressing:
end_hash = modulated_hash;
second_hash = hash(text) % small_prime;
while (items[modulated_hash] != 0 && items[modulated_hash]->key != requested_item_key && ((modulated_hash + second_hash) != end_hash))
modulated_hash = modulated_hash + second_hash % table_size;
if (curitem && curitem->key == requested_item_key) return curitem->value;
return 0;
It's better to limit the hash during generation to the given limit instead of to an arbitrary other limit since you will get an uneven distribution given this method. Admitted, the difference is fairly negligible until you get to table sizes somewhat closer to your original hash size. For instance, if you have a hash function that gives 65536 equally distributed hashes and you use that as a hash on a 32767 entry table the first two entries will be a lot larger than the rest.
Any other specific questions?