Array Time Complexity And Dynamic Arrays
A big reason why arrays are one of the most used data structures is their efficiency and light weight. Most array operations run at lightning-fast speed with low constant overhead, and, because arrays are laid out contiguously in memory, they are incredibly easy to optimize. But just how fast are arrays, really?
To fully understand how fast arrays are, let’s look at their operations and analyze their lower bounds.
Lower Bounds
There are many implementations of algorithms that do the exact same thing. As such, the complexity of an operation on a data structure is not solely determined by the data structure itself but also by its internal implementation. We can attribute the speed of an operation to its lower bound, as this often mirrors the typical speed of a good implementation.
A lower bound is the lower limit on how fast an algorithm can possibly go, no matter how cleverly it is constructed. For example, printing a list has a lower bound of Ω(n) time complexity, since we need to somehow iterate over each element of the array. No matter how clever the code is, we will need to visit each item to print it out, thus incurring an Ω(n) cost.
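As a quick illustration (this snippet is our own and not part of the course code), here is the printing operation in Python; no matter how the loop is written, every element must be visited once:

def print_all(items):
    # Each of the n elements must be visited exactly once,
    # so printing a list can never be faster than Omega(n).
    for item in items:
        print(item)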
On the other hand, some tasks such as sorting (specifically comparison-based sorting) have a lower bound of Ω(n log n); the proof of why is established through decision-tree analysis (we will not go through the extensive proof within this course, unfortunately).
Ergo, to fully analyze the speed of an array operation, we will take the lower bound of the worst-case scenario of the best implementation of said operation (confusing, we know).
Addition Operations
Let’s look at the addition operations, particularly emplace and emplace_back. Let’s review what emplace_back does in order to place an element at the back of the array.
- Place the element at index equal to the length of the current array.
- Increment the length by one.
For emplace_back, the algorithm is O(1), since its complexity does not rely on the length of the current array. Step one will remain the same regardless of the size of the array. The same can be said for step two, as incrementing the length does not have anything to do with the size of the array (even though we are altering the length variable, the speed of the operation remains the same).
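To make this concrete, here is a minimal sketch of emplace_back in Python (the _array backing list and length attribute are our own naming assumptions; the course does not show this code):

def emplace_back(self, element):
    # Step one: place the element at the index equal to the current length.
    self._array[self.length] = element
    # Step two: increment the length. Neither step depends on how many
    # elements are stored, so the whole operation is O(1).
    self.length += 1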
On the other hand, emplace, and by extension emplace_front (note: if you have not noticed yet, emplace_front is just a specialized version of emplace), will have a time complexity of O(n). As to why this is the case, let’s look at its algorithm:
- From the back, move all the elements by one to the right until we reach the desired index.
- Place the element at the desired index.
- Increment the length of the array by one.
Steps two and three are both O(1). However, step one is O(n). Why? Let’s delve into the steps:
Look at this shifting operation. Notice that since we need to shift the elements starting from index 1 up until the end of the array, we took up 5 steps / operations. Now consider that this array grows even larger, to a length of 10. We will then need 10 steps to perform this operation. Notice that the cost grows linearly with the size of the array. This is the behavior of an O(n) operation, that is, an operation that runs in linear time.
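Under the same assumptions as before (a backing _array with spare capacity and a length attribute), a sketch of emplace shows where the O(n) cost comes from:

def emplace(self, element, index):
    # Step one: from the back, shift every element one slot to the right
    # until we reach the desired index. In the worst case (index 0) this
    # loop touches all n elements, which is the O(n) part.
    for i in range(self.length, index, -1):
        self._array[i] = self._array[i - 1]
    # Steps two and three: place the element and increment the length, both O(1).
    self._array[index] = element
    self.length += 1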
Now that we have an idea of how time complexity and algorithm analysis can be used in the context of array operations, let’s test how well you can do this on your own.
Challenge Question: Deletion Time Complexity
Consider the array deletion operations pop and pop_back. What are their time complexities?
A. O(1) and O(1)
B. O(1) and O(n)
C. O(n) and O(1)
D. O(n) and O(n)
Dynamic Arrays and Algorithm Complexity Amortization
So far, we have talked about these operations in the context of a fixed-length array. However, let’s introduce the concept of dynamic arrays. Dynamic arrays (sometimes called vectors) are arrays that resize automatically as elements are added or removed, contrasting with fixed-length arrays that have a defined capacity. This feature of dynamic arrays introduces a new dimension to time complexity analysis, particularly regarding operations like emplace_back.
Additional variables are needed in order to perform proper resizing. Instead of just having the length stored, we will also store the current capacity. Typically, the resizing action is done with the following logic:
def improved_emplace(self, element, index):
    # Resize up first if the array is already at capacity.
    if self.length >= self.capacity:
        self.resize_up()
    self.regular_emplace(element, index)

def improved_pop(self, index):
    element = self.regular_pop(index)
    # Resize down once the array drops below half of its capacity.
    if self.length < self.capacity * 0.5:
        self.resize_down()
    return element
Let’s visualize how a resizing would occur:
Resizing in Dynamic Arrays
Aside from emplacing, resizing consists of three major steps:
- Create a new array with a size equal to the initial size * growth factor.
- Copy the elements from the smaller array into the larger one.
- Delete / disregard the old array.
An implementation might look like this:
def resize_up(self):
    # Grow the capacity by the growth factor (typically 2).
    self.capacity = int(self.capacity * self.growth_factor)
    new_array = [None] * self.capacity
    # Copy each element from the old backing array into the new one.
    for index, element in enumerate(self._array):
        new_array[index] = element
    # The old array is discarded; the new one becomes the backing array.
    self._array = new_array
Downsizing Algorithm Explanation
Downsizing uses the exact same algorithm, but instead of creating a larger array and copying the items from the smaller array to the larger array, you create a smaller array (calculated through capacity * 0.5) and copy the elements from the larger array to the smaller array, truncating the extra None values.
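A resize_down counterpart, mirroring the resize_up implementation above and assuming the same capacity, length, and _array attributes, could look like this:

def resize_down(self):
    # Shrink the capacity to half of its current value.
    self.capacity = int(self.capacity * 0.5)
    new_array = [None] * self.capacity
    # Copy only the elements actually in use, truncating the extra None slots.
    for index in range(self.length):
        new_array[index] = self._array[index]
    self._array = new_array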
Explaining the Increased Cost
While emplace_back in a fixed-length array is typically an O(1) operation, as we discussed, this changes in the context of dynamic arrays. Here’s why:
- Initial Operations: Similar to fixed-length arrays, placing an element at the end and incrementing the length is an O(1) operation.
- Capacity Check and Expansion: However, before these operations, dynamic arrays need to check if there is enough space for the new element. If the array is at full capacity, it needs to expand.
- Expansion Process: This process involves creating a new, larger array and then copying all the elements from the old array to the new one. This operation is O(n) since each element must be individually copied.
Amortized Time Complexity
Given that expansions are not frequent (they occur only after several emplace_back operations), we use amortized analysis to understand the overall time complexity. Amortized analysis spreads the cost of the expensive operation (expansion) over the sequence of all operations, providing an average time per operation.
In the case of dynamic arrays, even though expansion is O(n), it doesn’t happen with every emplace_back operation. Most additions are still O(1). The array typically grows by doubling its size each time it runs out of space, reducing the frequency of expansions as the array grows. This results in the amortized time complexity of emplace_back in dynamic arrays being O(1), despite the occasional O(n) operation for resizing.
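To see this numerically, here is a small standalone sketch (not part of the course code) that counts how many element copies a doubling strategy performs across a run of appends; the total stays below twice the number of appends, which is exactly what O(1) amortized means:

def count_copies(num_appends, growth_factor=2):
    capacity, length, copies = 1, 0, 0
    for _ in range(num_appends):
        if length == capacity:
            copies += length              # the occasional O(n) expansion copy
            capacity *= growth_factor
        length += 1                       # the O(1) placement itself
    return copies

# For 1,000,000 appends, fewer than 2,000,000 copies are made in total,
# i.e. a constant amount of copying work per append on average.
print(count_copies(1_000_000))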
Impact on Time Complexity
The concept of dynamic sizing impacts the time complexity in two major ways:
- Worst-Case Scenarios: Occasionally, certain operations (like emplace_back) can spike to O(n) due to the need to expand and copy elements.
- Average Case Analysis: On average, these costly operations are rare, and their cost is spread out over many operations, leading to an amortized complexity of O(1).