What Are Data Structures?
At its very core, data structures are a data storage format built for efficient and fast data operations. However, this definition, while technically correct, is blunt and vague. Data itself is an abstract concept, and understanding how to structure abstract objects is intrinsically complex. Let’s start with a real-world example to get things rolling.
Let’s assume that you work as a librarian. To manage your books, some books are placed on an array inside shelves. On the other hand, others may be placed on top of each other, like a stack. For easier lookup, your library also implements an indexing system called the Dewey Decimal System which acts as a lookup table or a map.
Notice the words in bold, array, stack, and map. These words coincidentally are also the names of some data structures. Suppose that we treat each book as a unit of information, or data. The arrays and stacks of books, as well as the lookup table are tools that help us access, store, and search for books faster.
Data structures function in a similar vein. They are concepts of computer science and computing in general that allow for efficient data storage and use.
Why should I learn data structures?
Most of the essential data structures are already implemented as standard containers in most languages. In Python, for example, these are already built-in, like the {set}
, [list]
, {'map':''}
, and (tuple)
. On other programming languages like Java, these data structures can also be found in the standard library’s collections package.
In fact, when trying to use data structures, we should, as much as possible, stick to standard implementations.
If that is the case, then why do we need to learn data structures?
It’s one thing to have a hammer, and another to know how to use it.
Understanding data structures is akin to knowing not only that a hammer exists, but also when and how to use it effectively. Standard implementations in languages like Python or Java provide you with the tools, but without understanding the underlying structures, you might not use them to their fullest potential.
This, in of itself, isn’t bad. Sometimes, abstracting the logic away and being able to treat a data structure as an oracle can be a good thing. Often, we might want things to “just work”. However, there will be an instance where “just working” won’t be enough.
Example
Imagine you’re working on a large-scale web application that handles thousands of user requests per second. Initially, you choose a Python list to store and manage session data because it’s straightforward and easy to use. As the application grows, you start noticing performance issues: the server takes longer to respond, and users experience delays.
This slowdown is a bottleneck, and it stems from your choice of data structure. Python lists are not optimized for scenarios where you frequently need to search for elements, especially in large datasets. Every time a user session is queried, the list undergoes a linear search, which is time-consuming for large lists.
Upon analyzing this bottleneck, you realize that a different data structure, like a hash table (implemented as a dict
in Python), would be more efficient. Hash tables provide faster lookups compared to lists, significantly reducing the time it takes to find and retrieve user sessions.
In this scenario, not understanding the implications of using a list over a hash table led to a significant performance issue. While the list was easy to implement and worked fine initially (“just working”), it wasn’t the right tool for the job as the scale of data grew.
Once you know how to use a hammer, every problem is a nail.
If you have programming experience, you should at least know one data structure, most commonly a list or an array. However, if all you know is an array, then everything might look like an array problem.
Sure, many data structures are built on top of arrays (case in point, heaps), but often, these data structures maintain properties that make them able to do specific things really well.
There are a lot of research, math, and proof around data structures. Knowing at least a few basic and some complex ones won’t hurt, and knowing how they work, even just in theory, might alter your software design decisions for the better.
What are the different types of data structure?
Data structures come in many forms, and listing all of them is simply not feasible. However, data structures can be one of the following forms.
- Lists
- Stacks
- Queues
- Graphs
- Trees
Let’s explore these data structures in detail further along the course.
70%
CompletedYou have 6 sections remaining on this learning path.