Hi :wave:, in this post, I’m going to introduce something new I’ll be looking into in the next few posts where I go into detail with some of the fundamental algorithms and data structures that you use day to day - and more importantly - implementing our own version so we can see in more detail what is being done underneath and how this affects performance. The examples for this will be written in C#, but if I get chance I’ll also create a version in swift, potentially using swift playgrounds.

First up - we’ll be looking at some of the basic data structures, starting with lists.

I’ve created a repository that contains everything that I’ve created in this posts, which you can find on GitHub here.

What is a list?

A list is similar to an array in that it can store any type of object, and can store the same object multiple times. In some languages, lists can also store items of different data types.

The primary difference between an array and a list however is that lists can be dynamically sized. This means that their size can be increased or decreased as items are removed. This is different to an array where to increase the size of the array, you would need to create a new array, and then copy all of the existing data across.

Ironically, this is how lists are implemented in most programming languages, as they use an array internally, and re-size the array as needed.

What are lists good for?

Lists are primarily useful for when you don’t know the size of data you will be working with. They are also really useful when you want to frequently add or remove items from a collection of data, as they hide the need to change the size of the underlying array.

Implementation

There are two main parts of a list that we will implement:

  • Adding an item to the list
  • Removing an item from the list

Adding an item to the list

When adding an item to the list, the first thing we need to check is if we are already at the end of the internal array. If we are, we will need to create a new array of a larger size, copy the existing items across, and then change the internal array to point to the new, updated version.

To do this, we also need to keep track of the current index in the

We can do this using the following:

public class CustomList<T>
{
    private T[] _items = new T[16];
    private int _currIndex;

    public void AddItem(T item)
    {
        // If the array is full, extend the array and copy the values across.
        if (_currIndex + 1 == _items.Length)
        {
            var extendedItems = new T[_items.Length * 2];
            Array.Copy(_items, extendedItems, _items.Length);
            _items = extendedItems;
        }

        _items[_currIndex] = item;
        _currIndex++;
    }

    public int Count() => _currIndex;
}

This code makes a few assumptions, the first being that _items.Length is never 0, as if it is, the array would never grow.

This code also doubles the size of the array each time we need to extend it, which as the list gets larger will become very inefficient and will likely use more memory than required. The other side of this however is that the operation of extending the array is O(n), so we would like to avoid extending the array as much as possible. This means that extending the array for each item you add would also be inefficient, as this would mean there is a best case performance of θ(n)

Performance

As I’ve just hinted at, the worst case performance for adding an item to a list is O(n), where n is the number of items in the list, as if we need to extend the list, we would need to copy each of the items in the array to the new, extended array one by one.

The average case performance however is likely to be θ(1), as inserting an item into the array is a constant time operation and we will not need to extend the array on each addition.

Removing an item from the list

The main way to remove an item from a list is to remove an item at a given index, which is relatively straight forward, as all we need to do is:

  1. Remove the item from the array
  2. Move everything after the index backwards one place.
  3. Set the value at the end index to be the default
  4. Decrease the currIndex to decrease the size of the list.

We can also accomplish this by setting everything from the target index to the end of the array, equal to the value in the next index.

After this, we then need to se the value at the end of the list (currIndex in our case) to be the default value for the type. We can do this with the following:

public void RemoveAt(int index)
{
    for (int i = index; i < _items.Length - 1; i++)
    {
        _items[i] = _items[i + 1];
    }

    _items[_currIndex] = default;
    _currIndex--;
}

Now if this was production code, you’d want to do a bounds check on the index passed in before doing anything to check it is not out of bounds.

Performance

When removing from an index, we may need to iterate over the entire list to update the indexes of each of the items. This would occur if we needed to remove the item at the very start of the list.

This means that we would get a worst-case complexity of O(n), where n is the number of items in the list.

Finding an item in the list.

To find an item in the list, we will need to iterate through every index in the list and compare the value we are looking for to the value in the list. This is sometimes also known as a linear search.

This is because we can’t guarantee that the list is sorted, so we can’t make any assumptions about how the data is organized in the list. This means that, for example, if we were searching for an int value in our list, we would still need to check all of the value after a larger value, even if we are looking for a smaller value. This is where a binary search comes in.