src.pydasa.structs.tables.scht#

Module to represent the SCHashTable data structure for the Hash Table in PyDASA.

Classes:

Bucket: Represents a bucket in the hash table using a single linked list. SCHashTable: Implements a hash table with separate chaining for collision resolution, supporting dynamic resizing and customizable comparison functions.

IMPORTANT: based on the implementations proposed by the following authors/books:

# . Algorithms, 4th Edition, Robert Sedgewick and Kevin Wayne. # . Data Structure and Algorithms in Python, M.T. Goodrich, R. Tamassia, M.H. Goldwasser.

Attributes#

DFLT_SC_ALPHA

Default load factor (alpha) for the SCHashTable, by default is 4.0.

MAX_SC_ALPHA

Maximum load factor (alpha) for the SCHashTable, by default is 8.0.

MIN_SC_ALPHA

Minimum load factor (alpha) for the SCHashTable, by default is 2.0.

Classes#

Bucket

Bucket class to represent a bucket in the Hash Table with the Separate Chaining method. The structure is based (inherits) on a custom singly linked list (SingleLinkedList) for PyDASA.

SCHashTable

Abstract base class for generic types.

Module Contents#

src.pydasa.structs.tables.scht.DFLT_SC_ALPHA: float = 4.0#

Default load factor (alpha) for the SCHashTable, by default is 4.0.

src.pydasa.structs.tables.scht.MAX_SC_ALPHA: float = 8.0#

Maximum load factor (alpha) for the SCHashTable, by default is 8.0.

src.pydasa.structs.tables.scht.MIN_SC_ALPHA: float = 2.0#

Minimum load factor (alpha) for the SCHashTable, by default is 2.0.

class src.pydasa.structs.tables.scht.Bucket#

Bases: Generic[pydasa.structs.types.generics.T], pydasa.structs.lists.sllt.SingleLinkedList[pydasa.structs.tables.htme.MapEntry[pydasa.structs.types.generics.T]]

Bucket class to represent a bucket in the Hash Table with the Separate Chaining method. The structure is based (inherits) on a custom singly linked list (SingleLinkedList) for PyDASA.

Parameters:
  • SingleLinkedList (dataclass) – PyDASA custom class for a single linked list.

  • Generic (T) – Generic type for a Python data structure.

__str__()#

__str__() function to return a string representation of the Bucket. It also extends the Node class.

Returns:

string representation of the Bucket.

Return type:

str

__repr__()#

__repr__() function to return a string representation of the Bucket. It also extends the Node class.

Returns:

string representation of the Bucket.

Return type:

str

class src.pydasa.structs.tables.scht.SCHashTable#

Bases: Generic[pydasa.structs.types.generics.T]

Abstract base class for generic types.

A generic type is typically declared by inheriting from this class parameterized with one or more type variables. For example, a generic mapping type might be defined as:

class Mapping(Generic[KT, VT]):
    def __getitem__(self, key: KT) -> VT:
        ...
    # Etc.

This class can then be used as follows:

def lookup_name(mapping: Mapping[KT, VT], key: KT, default: VT) -> VT:
    try:
        return mapping[key]
    except KeyError:
        return default
rehashable: bool = True#

Boolean to indicate if the hash table can be rehashed. By default is True.

nentries: int = 1#

Inicial number of entries (n) for the hash table. By default is 1, but should be set according to the number of entries expected to be stored.

NOTE: the reserved space (n) is NOT the capacity (M) of the hash table.

mcapacity: int = 1#

The capacity (M) of the hash table. By default is 1, but should be set according to the number of entries expected to be stored.

alpha: float | None = 4.0#

Load factor (alpha) for the hash table. By default is 4.0.

NOTE: alpha = n/M (n: number of expected entries, M: capacity of the hash table).

cmp_function: Callable[[Any, pydasa.structs.tables.htme.MapEntry[pydasa.structs.types.generics.T]], int] | None = None#

Customizable comparison function for SCHashTable and its MapEntry objects. Defaults to dflt_cmp_function_ht() from PyDASA, but can be overridden by the user.

hash_table: pydasa.structs.lists.arlt.ArrayList[Bucket[pydasa.structs.types.generics.T]] | None = None#

Index of the hash table where the Buckets are stored. By default is an empty ArrayList initialized with the configured capacity (M).

key: str | None = '_idx'#

Customizable key name for identifying elements in the SCHashTable. Defaults to DFLT_DICT_KEY = ‘_id’ from PyDASA, but can be overridden by the user.

prime: int = 109345121#

Prime number (P) for the MAD compression function. By default is 109345121, but can be overridden by the user.

NOTE: the MAD compression function is: h(k) = ((a*k + b) mod P) mod M, where a and b are two random integers, P is a prime number and M is the hash table capacity.

min_alpha: float = 2.0#

Minimum load factor (alpha) for the hash table. By default is 2.0. But can be overridden by the user.

max_alpha: float = 8.0#

Maximum load factor (alpha) for the hash table. By default is 8.0. But can be overridden by the user.

iodata: List[pydasa.structs.types.generics.T] | None = None#

Optional Python list for loading external data intho the SCHashTable. Defaults to None but can be provided during creation.

__post_init__()#

__post_init__() Initializes the SCHashTable after creation by setting attributes like rehashable, mcapacity, alpha, cmp_function, key, prime, scale, shift, and iodata. It also sets the default values for the min_alpha and max_alpha attributes, which are used to control the load factor of the hash table.

NOTE: Special method called automatically after object creation.

Return type:

None

default_compare(key1, entry2)#
default_compare() Default comparison function for the SCHashTable and its MapEntry objects. Compares the key of the MapEntry with the provided key key1 and reurns:
  • 0 if they are equal.

  • 1 if the MapEntry key is less than key1.

  • -1 if the MapEntry key is greater than key1.

Parameters:
  • key1 (Hashable) – Key from the first MapEntry to compare.

  • entry2 (MapEntry) – Second MapEntry to compare.

Returns:

Comparison result.

Return type:

int

property size: int#

size Property to retrieve the number if entries (n) in the SCHashTable. :returns: Number of entries (n) in the SCHashTable. :rtype: int

Return type:

int

property empty: bool#

empty Property to check if the SCHashTable has entries or not.

Returns:

True if the SCHashTable is empty, False otherwise.

Return type:

bool

property collisions: int#

collisions Property to retrieve the number of collisions in the SCHashTable.

Returns:

Number of collisions in the SCHashTable.

Return type:

int

clear()#

clear() function to reset the SCHashTable to its initial state. It clears all the entries in the hash table and resets the size, collisions and current load factor.

Return type:

None

insert(key, value)#

insert() adds a new entry to the SCHashTable. It creates a new MapEntry object with the key-value pair.

Parameters:
  • key (T) – key for the entry.

  • value (T) – value for the entry.

Return type:

None

get_entry(key)#

get_entry() retrieves an entry from the SCHashTable using the provided key.

Parameters:

key (T) – key for the entry.

Raises:

IndexError – error if the SCHashTable is empty.

Returns:

MapEntry object with the key-value pair if found, None otherwise.

Return type:

Optional[MapEntry]

get_bucket(key)#

get_bucket() retrieves the bucket containing the key-value pair from the SCHashTable using the provided key.

Parameters:

key (T) – key for the entry.

Raises:

IndexError – error if the SCHashTable is empty.

Returns:

Bucket object containing the key-value pair if found, None otherwise.

Return type:

Optional[Bucket]

is_present(key)#

is_present() checks if the provided key is present in the SCHashTable.

Parameters:

key (T) – key for the entry.

Raises:

IndexError – error if the SCHashTable is empty.

Returns:

True if the key is present in the SCHashTable, False otherwise.

Return type:

bool

delete(key)#

delete() removes an entry from the SCHashTable using the provided key.

Parameters:

key (T) – key for the entry.

Raises:

IndexError – error if the SCHashTable is empty.

Returns:

MapEntry object with the key-value pair if found, None otherwise.

Return type:

Optional[MapEntry]

keys()#

keys() returns a single linked list of keys from the SCHashTable.

Returns:

list of keys from the SCHashTable. e.g. [key1, key2, …].

Return type:

SingleLinkedList[T]

values()#

values() returns a single linked list of values from the SCHashTable.

Returns:

list of values from the SCHashTable. e.g. [value1, value2, …].

Return type:

SingleLinkedList[T]

entries()#

entries() returns a list of tuples with the key and value of each entry in the hash table.

Returns:

list of tuples with the key-value pair of each entry in the hash table. e.g. [(key1, value1), (key2, value2), …].

Return type:

SingleLinkedList[T]

resize()#

resize() rehashes the SCHashTable by creating a new hash table with a new capacity (M) and rehashing all the entries from the old hash table to the new one. It also updates the size, collisions and current load factor.

Return type:

None

__len__()#

__len__() function to return the number of entries (n) in the SCHashTable.

Returns:

Number of entries (n) in the SCHashTable.

Return type:

int

__str__()#

__str__() function to return a string representation of the SCHashTable.

Returns:

string representation of the SCHashTable.

Return type:

str

__repr__()#

__repr__() function to return a string representation of the SCHashTable.

Returns:

string representation of the SCHashTable.

Return type:

str