Array-Lists, Linked Lists, Hash Tables (Dictionaries)

Dynamic arrays, or array-lists, have the ability to scale depending on how much memory is required to actually store all the data. Since a single array has a fixed size and cannot be resized easily, as in recreation of the array is necessary, the array-list is an abstract datatype that can be used to emulate lists through getter/setter functions. [1. Kay, 2019]

A linked list is a different take of making a dynamically resizable list as an abstract datatype. Instead of using larger arrays of fixed size and aggregating them, linked lists work by having a series of nodes each with a pointer to the “next” or “prev” node. When you step through a linked list, you can have theoretically infinite items as well as be able to remove/resize the list easily by just changing pointers to point to other nodes, perhaps even skipping some. The starting node, called the head, needs to be recognised as the root of the list. The final node, often called the tail, has a NULL (Or NoneType in Python) pointer, which signifies the end of the linked list. [2. Kay, 2019] [3. Garg, n.d.]

Single-linked-lists only have one “next” pointer per node.
Double-linked-lists have both a “next” and a “previous” pointer per node.

A hash table works by using a hash function to generate a UID for each object that you’d like to store. When you’d like to lookup the value, you can simply use the hash function to find the UID and access the item directly, wasting little time. (Ignoring the fact that hash functions tend to be somewhat computationally intensive, to counter this, a weaker hash function is usually used first to quickly lookup items, followed by a longer hash function with much much lower chance of collisions occurring, to verify that it’s the correct object. The chances of having a collision in both the weak & strong hash algorithm are exponentially minute, example given in the next paragraph.) [4. Kay, 2019] [5. Garg, n.d.]

Image Source (CC-BY-SA 3.0)

From the research I have done into the Python programming language, it uses the last 15 bits of the object the “weak” hashing function and then later uses a stronger hashing algorithm afterwards. One of the side-effects of this, is that you can get incredibly regular patterns for integers and the source code has a comment which I will paraphrase & condense three paragraphs into one for brevity: [6. Peters, 2001]

Having a function that collides somewhat frequently is useful to be able to fill up consecutive gaps in-between your objects. It’s also silly to slow your hash table down for the rare occurrences and it’s best to keep the initial check dirt-cheap, hence we just take the last 15 bits as our first check, which is also 100% collision-less for contiguous integers. Failing this, collision resolution has two different ways to also calculate hashes.

Python v3.8.0 Source Code
[python/cpython; /Objects/dictobject.c#L135]
Primary source found using ‘git blame’. [6. Peters, 2001]
As you can see in the example I’ve run on my machine, a range of integers [1, 2, 3, 4, …] maps directly to the hashes [1, 2, 3, 4, …]

References & Sources

1. Kay, A. (2018) The ‘Array-List’ Data Structure. [online] Data Structures and Algorithms. Published by: Birmingham City University, Available at: https://dsaa.werp.site/post/the-array-list-data-structure/ [Accessed 22 Feb. 2019].
2. Kay, A. (2018) The ‘The Linked List’ Data Structure. [online] Data Structures and Algorithms. Published by: Birmingham City University, Available at: https://dsaa.werp.site/post/the-linked-list-data-structure/ [Accessed 22 Feb. 2019].
3. Garg, P. (n.d.). Singly Linked List | Data Structures. [online] HackerEarth. Available at: https://www.hackerearth.com/practice/data-structures/linked-list/singly-linked-list/tutorial/ [Accessed 22 Feb. 2019]
4. Kay, A. (2018) The ‘The Hash Map’ Data Structure. [online] Data Structures and Algorithms. Published by: Birmingham City University, Available at: https://dsaa.werp.site/post/the-hash-map-data-structure/ [Accessed 22 Feb. 2019].
5. Garg, P. (n.d.). Basics of Hash Tables | Data Structures. [online] HackerEarth. Available at: https://www.hackerearth.com/practice/data-structures/hash-tables/basics-of-hash-tables/tutorial/ [Accessed 22 Feb. 2019]
6. Peters, T; et al. (Open-source-software, Accessed: 2019-05-02). Python v2.2 – v3.8.0 Source Code [online] Python Software Foundation. Available at: https://github.com/python/cpython/blob/4631da1242fc96002a3c0462a87d087e567368aa/Objects/dictobject.c#L135 | Published with commit on 2001-05-27: https://github.com/python/cpython/commit/15d4929ae4ad5206e2f7fa947d632bfba9b8bfd1#diff-2131209d0deb0e50c93a88ec6c7b0d52R34

Creative Commons Licence

Leave a comment