Data Structures#

Data structures are fundamental constructs in computer science, used to store and organise data efficiently. Using the right data structure for the task at hand can make your code much simpler and more performant.

Although Python and its associated libraries provide many data structures, in this notebook we’ll go through the four fundamental building blocks native to Python: lists, tuples, dictionaries and sets.

Lists#

We already discussed lists briefly in Control Structures, so we’ll quickly recap, then go through some functionalities that lists provide.

Lists can be instantiated as an empty object, or initialised with values:

instantiated_list = [] # Defining an empty list to add values to later.
initialised_list = [1, 2, 3] # Initialising the list with some values, this can still be added to later!

Each item within a list can be accessed by using its index. Python (as with all good programming languages) starts counting from 0, so the first element in a list is element 0:

initialised_list[0] # should give the first element in the list, which is 1.

It’s also possible to select a range of values by using a slice. A slice is written as index_to_start_from:index_to_end_before, so the element at the end index is not included:

initialised_list[0:2] # Returns a list containing 1 and 2.
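As a small sketch of how slices behave beyond the example above (the list `letters` is made up for illustration), omitting a bound defaults to the start or end of the list, and negative indices count from the end:

```python
letters = ["a", "b", "c", "d", "e"]  # hypothetical example list

first_two = letters[:2]      # start omitted: begins at index 0
from_third = letters[2:]     # end omitted: runs to the end of the list
last_item = letters[-1]      # a negative index counts from the end
every_other = letters[::2]   # an optional third value sets the step

print(first_two, from_third, last_item, every_other)
```

Slicing returns a new list, so `letters` itself is left unchanged.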

Let’s add some objects to our empty list:

instantiated_list.append(21) # We can add numerical values (integers, floats etc...)
instantiated_list.append("Yes") # strings
instantiated_list.append(initialised_list) # other lists

instantiated_list

A list is also mutable, meaning we can change any of these values. Let’s change the second element from "Yes" to "Changed":

instantiated_list[1] = "Changed"

instantiated_list
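Beyond append and item assignment, lists offer several other built-in methods for changing their contents. The sketch below uses a hypothetical list called `numbers` to show a few of the most common ones:

```python
numbers = [1, 2, 3]          # hypothetical example list

numbers.insert(0, 0)         # insert the value 0 at index 0
last = numbers.pop()         # remove and return the last element
numbers.remove(1)            # remove the first occurrence of the value 1
numbers.extend([7, 8])       # add each element of another list to the end

print(numbers)               # [0, 2, 7, 8]
print(len(numbers))          # 4
print(2 in numbers)          # True
```

All of these methods modify the list in place rather than returning a new one.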

Tuples#

Tuples are like immutable lists, so once they are initialised they cannot be changed.

first_tuple = ("a", "b", "c")

Try changing the first element in the tuple to another letter:

# Your code here

You should see an error. Tuples are really useful if you want to protect your data from being overwritten/changed in your code.
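For reference, the error raised is a TypeError. The sketch below uses a separate, hypothetical tuple called `point` to show one way of catching it, and that reading and unpacking values still work just as they do for lists:

```python
point = ("a", "b", "c")   # hypothetical tuple, separate from first_tuple

try:
    point[0] = "z"         # item assignment is not supported on tuples
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# Indexing, slicing and unpacking only read the tuple, so they are allowed
first, second, third = point
print(first, point[1:])
```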

Dictionaries#

Dictionaries store key-value pairs, where keys are unique and must be hashable, which in practice usually means immutable types such as strings, numbers or tuples.

first_dict = {"a":0, "b":1, "c":2} # initialise a dictionary

second_dict = {} # instantiate an empty dictionary
second_dict[0] = 3 # add the key 0 to the dictionary and assign it the value 3.

second_dict

It’s then possible to check the keys and values separately or iterate over them:

print(first_dict.keys()) # Just the keys
print(first_dict.values()) # Just the values 

print(first_dict.items()) # Returns tuples of (key, value)
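Iterating works well with items(), because each (key, value) tuple can be unpacked directly in the loop. A minimal sketch, using a made-up dictionary called `stock`:

```python
stock = {"apples": 3, "pears": 0, "plums": 5}  # hypothetical example data

# Unpack each (key, value) pair as the loop runs
for fruit, count in stock.items():
    print(f"{fruit}: {count}")

# The same views can feed comprehensions and built-in functions
in_stock = [fruit for fruit, count in stock.items() if count > 0]
total = sum(stock.values())

print(in_stock, total)
```

Dictionaries preserve insertion order (guaranteed since Python 3.7), so the loop visits the keys in the order they were added.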

Can you change the value associated with "b" in first_dict? And add a new key-value pair into the same dictionary?

# Your code here

Sets#

Sets are unordered collections of unique, hashable (typically immutable) objects. They’re extremely useful for set theory operations, such as unions, intersections and differences.

set_a = {"a", "b", "c"}
set_b = {"c", "d", "e"}

print(set_a.union(set_b)) # Combines the sets, notice that the "c" is not duplicated, and the order is not preserved
print(set_a.intersection(set_b)) # Returns a set containing values that are in both set_a and set_b
print(set_a.difference(set_b)) # Returns a set of values that are only in set_a and not in set_b
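The same operations are also available as operators, which some people find more readable. A short sketch with two hypothetical sets, `left` and `right`:

```python
left = {"a", "b", "c"}    # hypothetical example sets
right = {"c", "d", "e"}

combined = left | right          # union
shared = left & right            # intersection
only_left = left - right         # difference
either_not_both = left ^ right   # symmetric difference

print(combined, shared, only_left, either_not_both)
```

One difference worth knowing: the methods accept any iterable (for example a list), whereas the operators require both operands to be sets.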

Values can be added to an already existing set or removed:

set_b.remove("d") # Remove "d" from the set
set_b.add("f") # Add in an "f"
set_b
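Note that remove raises a KeyError if the value is not in the set. When you are not sure whether a value is present, discard is a safer alternative, as this sketch with a hypothetical set called `letters` shows:

```python
letters = {"c", "e", "f"}   # hypothetical example set

letters.discard("z")        # "z" is absent, so discard does nothing

try:
    letters.remove("z")     # remove raises KeyError for a missing value
except KeyError:
    removal_failed = True
else:
    removal_failed = False

print(letters, removal_failed)
```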

Converting between data structures#

Converting between data structures (often loosely called casting) in Python is simple! Each data structure has a built-in constructor of the same name:

  • list

  • tuple

  • dict

  • set

Calling the constructor on an existing collection does all the heavy lifting:

my_list = [1, 2, 3, 4, 5] # initialise the list with some values

my_tuple = tuple(my_list) # convert it to a tuple
my_set = set(my_list) # convert it to a set

print(my_list)
print(my_tuple)
print(my_set)

Dictionary conversion is slightly different, because you need to provide both keys and values. It is, however, possible to create a dictionary from keys alone with dict.fromkeys, in which case every value defaults to None:

dict.fromkeys(my_list)
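When you do have both keys and values, one common approach is to pair them up with zip and pass the result to dict. The sketch below uses made-up lists called `keys` and `values`, and also shows the optional second argument to dict.fromkeys:

```python
keys = ["a", "b", "c"]     # hypothetical example keys
values = [1, 2, 3]         # hypothetical example values

paired = dict(zip(keys, values))       # pair each key with its value
placeholders = dict.fromkeys(keys)     # every value is None
zeros = dict.fromkeys(keys, 0)         # every value is 0

# Going the other way: a dictionary back to a list of (key, value) tuples
back_to_pairs = list(paired.items())

print(paired, placeholders, zeros, back_to_pairs)
```

Be careful when passing a mutable default such as a list to dict.fromkeys: every key would share the same object.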

Task#

You have been given a list containing duplicates below. Can you remove all the duplicates and return the result in an immutable structure?

duplicate_containing_list = [7, 2, 3, 4, 7, 5, 9, 3]

# Your code here