Back to Data Structures and Algorithms Interview
Data Structures and Algorithms Interview

Data Structures and Algorithms Interview

19 of 20 Completed

Array Time Complexity And Dynamic Arrays

A big reason why arrays are one of the most used data structures is due to its efficiency and ability to remain lightweight. Most array operations run at lightning-fast speed, low constant overhead, and, due to its structure in memory (contiguous), are incredibly easy to optimize. But just how fast are arrays, really?

To fully understand how fast arrays are, let’s look at its operations and analyze what is the lower bound.

Lower Bounds

There are many implementations of algorithms that do the exact same thing. As such, the complexity of operations in a data structure is not solely reliant on the data structure itself but also its internal implementation. We can attribute the speed of an algorithm to its lower bound, as this often mirrors the usual speed of an algorithm.

A lower bound is the lower limit as to how fast an algorithm can possibly go, no matter how cleverly constructed. For example, printing a list will be required to be at a lower bound, O(n) O(n) time complexity since we need to somehow iterate over each element of the array. No matter how clever the code is, we will need to go over each item to print them out, thus incurring an O(n) O(n) cost.

On the other hand, some tasks such as sorting (specifically comparison-based sorting) have a lower bound of O(nlogn) O(n \log n) , the proof of why is explained through decision-tree analysis (we will not go through the extensive proof within this course, unfortunately).

Ergo, to fully analyze the speed of an array operation, we will take the lower bound of the worst-case scenario of the best implementation of said operation (confusing, we know). We can express this through the following:

Lower  Bound=Best  Algorithm(Worst  Case  Input) Lower\;Bound = Best\;Algorithm(Worst\;Case\;Input)

Addition Operations

Let’s look at the addition operations, particularly emplace and emplace_back. Let’s review what emplace_back does in order to place an algorithm at the back of the array.

  1. Place the element at index equal to the length of the current array.
  2. Increment the length by one.

For emplace_back, the algorithm is O(1) O(1) due to its complexity not relying on the length of the current array. Step one will remain the same regardless of the size of the array. The same can be said for step two, as incrementing does not have anything to do with the length of the array (even though we are altering the length variable, the speed of the operation remains the same).

On the other hand, emplace, and by extension, emplace_front (note: if you have not noticed yet, emplace_front is just a specialized version of emplace), will have a time complexity of O(n) O(n) . As to why this is the case, let’s look at its algorithm:

  1. From the back, move all the elements by one to the right until we reach the desired index.
  2. Place the element at the desired index.
  3. Increment the length of the array by one.

Steps two and three are all O(1) O(1) . However, step 1 is O(n) O(n) . Why? Let’s delve into the steps:

Emplace 0,1

Look at this shifting operation. Notice that since we need to shift the elements starting from index 1 up until the end of the array, we took up 5 steps / operations. Consider that this array grows even larger, and this array is of length 10. We will also need 10 steps to perform this operation. Notice that the growth of complexity is linear to the size of the array. This is the behavior of a O(n) O(n) operation, or operations that operate in a linear time complexity.

Now that we have an idea of how time complexity and algorithm analysis can be used in the context of array operations, let’s test how well you can do this on your own.

Challenge Question: Deletion Time Complexity

Consider the array deletion operations pop and pop_back. What are their time complexities?

A. O(1) O(1) and O(n) O(n)

B. O(1) O(1) and O(1) O(1)

C. O(n) O(n) and O(1) O(1)

D. O(n) O(n) and O(n) O(n)

Dynamic Arrays and Algorithm Complexity Amortization

So far, we have talked about the operations in the context of a fixed-length array. However, let’s introduce the concept of dynamic arrays. Dynamic arrays (sometimes called vectors) are types of arrays that expand automatically as more elements are added or removed, contrasting with fixed-length arrays that have a defined capacity. This feature of dynamic arrays introduces a new dimension to time complexity analysis, particularly regarding operations like emplace_back.

Additional variables are needed in order to perform proper resizing. Instead of just having the length stored, we will also store the current capacity. Typically, the resizing action is done with the following logic:

function improved_emplace(element, index):
	is length >= current capacity ?
		then resize_up() else do nothing
	regular_emplace(element, index)

function improved_pop(index):
	element = regular_pop(index)
	is length < current capacity * 0.5 ?
		then resize_down() else do nothing
	return element

Let’s visualize how a resizing would occur:

Improved Emplace

Resizing in Dynamic Arrays

Aside from emplacing, resizing consists of three major steps:

  1. Create a new copy with the size equal to the initial size * growth factor.
  2. Copy the elements from the smaller array into the larger one.
  3. Delete / disregard the new array.

An implementation might look like this:

def resize_up(self):
    self.capacity *= self.growth_factor
    new_array = [None] * self.capacity
    for index, element in enumerate(self._array):
        new_array[index] = element

Downsizing Algorithm Explanation

Downsizing uses the exact same algorithm, but instead of creating a larger array and copying the items from the smaller array to the larger array, you create a smaller array (calculated through capacity * 0.5) and copy the elements from the larger array to the smaller array, truncating the extra None values.

Explaining the Increased Cost

While emplace_back in a fixed-length array is typically an O(1)O(1) operation, as we discussed, this changes in the context of dynamic arrays. Here’s why:

  1. Initial Operations: Similar to fixed-length arrays, placing an element at the end and incrementing the length is an O(1)O(1) operation.
  2. Capacity Check and Expansion: However, before these operations, dynamic arrays need to check if there is enough space for the new element. If the array is at full capacity, it needs to expand.
  3. Expansion Process: This process involves creating a new, larger array and then copying all the elements from the old array to the new one. This operation is O(n)O(n) since each element must be individually copied.

Amortized Time Complexity

Given that expansions are not frequent (they occur after several emplace_back operations), we use amortized analysis to understand the overall time complexity. Amortized analysis spreads the cost of the expensive operation (expansion) over the sequence of all operations, providing an average time per operation.

In the case of dynamic arrays, even though expansion is O(n)O(n), it doesn’t happen with every emplace_back operation. Most additions are still O(1)O(1). The array typically grows by doubling its size each time it runs out of space, reducing the frequency of expansions as the array grows. This results in an amortized time complexity for emplace_back in dynamic arrays being O(1)O(1), despite the occasional O(n)O(n) operation for resizing.

Impact on Time Complexity

The concept of dynamic sizing impacts the time complexity in two major ways:

  1. Worst-Case Scenarios: Occasionally, certain operations (like emplace_back) can spike to O(n)O(n) due to the need to expand and copy elements.
  2. Average Case Analysis: On average, these costly operations are rare, and their cost is spread out over many O(1)O(1) operations, leading to an amortized complexity of O(1)O(1).

How would you resize arrays

Good job, keep it up!

95%

Completed

You have 1 sections remaining on this learning path.