Throughout the course of any data-based career, upskilling becomes a necessity but isn’t the easiest of tasks. When transitioning from tools like Excel to more complex and less visual tools like SQL, learning to recognize familiar patterns can help translate your current skills into new techniques and technologies.
One of the most-used functions by Excel specialists is the lookup function VLOOKUP. In SQL, however, there’s not quite a direct equivalent. How are you then supposed to look up values?
In this article, we’ll explore other options and discuss why VLOOKUP isn’t possible– or even a necessity– in SQL.
Let’s start by breaking down the logic of VLOOKUP and translating these operations to SQL.
Your typical VLOOKUP function has the following arguments:
lookup_value
: The value you want to search for in the leftmost column of the range or table.table_array
: The range of cells containing the data you want to search.col_index_num
: The index of where we’ll search for our value.range_lookup
: (optional) If set to TRUE or if not specified, an approximate match will be performed. If you set it to FALSE, an exact match will be required.Starting with just the required values, let’s look at an example:
Our VLOOKUP function can be expressed as the following:
“In the table A3:C8
, find the name (represented by col_index_num = 2
) of the employee with EmpID = 3
(from the value of the A6
cell).”
This would return the string “Arianna”.
? Note: in VLOOKUP, lookup_value
can either be a cell reference or a value. For example, we could change it to VLOOKUP(4, A3:C8, 2)
, and it will return “Racquel”.
Using SQL, solving this problem is a bit simpler. Let’s consider the following table:
Employees
EmpID | Fname | Age |
---|---|---|
1 | Erick | 24 |
2 | Rosa | 26 |
3 | Arianna | 30 |
4 | Racquel | 32 |
5 | Farhan | 19 |
Considering “Arianna” for our example again, we can use the following query:
SELECT Fname from employees where EmpId = 3;
Unlike Excel, there are no well-defined positions in SQL tables, so it’s not necessary to specify the table array or the cell reference. EmpID
won’t necessarily be the leftmost column, and “Farhan” may not be the last Fname
in the column. Instead, we’ll deal with direct values.
Let’s break down this query:
Fname
.table_array
. Here, we’re referring to Fname
from the Employees table.TRUE
.In VLOOKUP, the condition is that the lookup_value
of the data at col_index_num
should be the same. Given the table above, the lookup value is 3
, so we should retrieve the Fname
with an EmpID
equal to 3. The WHERE clause here does the same.
In Excel, VLOOKUP is commonly used to retrieve data from different sheets. With SQL, we instead use the JOIN
operation, which allows you to access and combine data from multiple tables.
Joins can be a challenging concept to grasp if you’re new to SQL. Put into simple terms, JOINs allow you to access data from other tables.
From the JOIN section in Interview Query’s SQL learning path, JOINs can be described as:
An operation that matches rows of two different tables and appends them.
Just like VLOOKUP
and WHERE
, JOIN
also has a matching process that allows it to return the appropriate data (or record) based on a condition.
Let’s look at the following for an example:
Employees
A | B | C | D | |
---|---|---|---|---|
1 | EmpID | Fname | Age | Department_ID |
2 | 1 | Erick | 24 | 3 |
3 | 2 | Rosa | 26 | 3 |
4 | 3 | Arianna | 30 | 1 |
5 | 4 | Racquel | 32 | 2 |
6 | 5 | Farhan | 19 | 1 |
Departments
A | B | |
---|---|---|
1 | Department_ID | Department_Name |
2 | 1 | Human Resources |
3 | 2 | Accounting |
4 | 3 | Analytics |
Suppose that, given an employee name, you want to retrieve their department. How would you do this in VLOOKUP and SQL?
Using VLOOKUP requires a two-step process. We’d need to find the Department_ID
of the employee from the Employees table first and then use this to find the department name in the Departments table.
Our full VLOOKUP solution should look like this:
VLOOKUP(VLOOKUP("name", Employees!B2:D6, 3, FALSE), Departments!A2:B4, 2, FALSE)
Our solution starts with looking up the Department_ID
. Then, we’ll conduct a secondary VLOOKUP to find the corresponding department name using the returned id.
Step 1:
VLOOKUP("Farhan", Employees!B2:D6, 3, FALSE) ' returns Farhan's department_id, '1'
Step 2:
VLOOKUP(1, Departments!A2:B4, 2, FALSE) ' the '1' here is from the value in STEP 1.
This would return “Human Resources”.
To solve this problem using SQL, we’d use a JOIN operation between the Employees table and the Departments table based on a condition. In this case, our condition is where the Department_ID
of the two tables matches.
Our solution would then look like the following:
SELECT Departments.Department_Name
FROM Employees
JOIN Departments
ON Employees.Department_ID = Departments.Department_ID
WHERE Employees.Fname = 'name';
Broken down into sections, we have:
department_name
value from the Departments
table. Unlike our first SQL example above, we need to be explicit about which table we’re taking our values from. While the SQL server might be able to infer this, it’s good practice to include them in the query for readability.Employees
table. In this instance, we’re getting the department name from an employee name. Fname
is a field of the table Employees
, so our FROM clause is based on this table. FROM
can take in more than one table, but, in this case, it’s best to stick to one for efficiency.Departments
to the Employee
table.Before this point, our query was operating on the following set with the FROM Employees
clause:
EmpID | Fname | Age | Department_ID |
---|---|---|---|
1 | Erick | 24 | 3 |
2 | Rosa | 26 | 3 |
3 | Arianna | 30 | 1 |
4 | Racquel | 32 | 2 |
5 | Farhan | 19 | 1 |
Now that we’ve added the JOIN operation, our set now includes the following:
EmpID | Fname | Age | Department_ID | Department_ID | Department_Name |
---|---|---|---|---|---|
1 | Erick | 24 | 3 | 1 | Human Resources |
1 | Erick | 24 | 3 | 2 | Accounting |
1 | Erick | 24 | 3 | 3 | Analytics |
2 | Rosa | 26 | 3 | 1 | Human Resources |
2 | Rosa | 26 | 3 | 2 | Accounting |
2 | Rosa | 26 | 3 | 3 | Analytics |
3 | Arianna | 30 | 1 | 1 | Human Resources |
3 | Arianna | 30 | 1 | 2 | Accounting |
3 | Arianna | 30 | 1 | 3 | Analytics |
4 | Racquel | 32 | 2 | 1 | Human Resources |
4 | Racquel | 32 | 2 | 2 | Accounting |
4 | Racquel | 32 | 2 | 3 | Analytics |
5 | Farhan | 19 | 1 | 1 | Human Resources |
5 | Farhan | 19 | 1 | 2 | Accounting |
5 | Farhan | 19 | 1 | 3 | Analytics |
This set is the Cartesian product of the two tables. A Cartesian product of two sets is the combination of all possible pairs formed by taking one element from each set.
However, our solution isn’t quite finished at this point. This is when we perform the matching process, using the ON clause.
ON Employees.Department_ID = Departments.Department_ID
This clause eliminates all the rows in the Cartesian product that don’t meet the match requirement, which is for the department id of the first set to match with the second set.
Using the last column as an example, Employees.Department_ID
is ‘1’, while the Departments.Department_ID
is ‘3’. This would not be included in the joined set.
EmpID | Fname | Age | Department_ID | Department_ID | Department_Name |
---|---|---|---|---|---|
5 | Farhan | 19 | 1 | 3 | Analytics |
After applying the ON clause, we should get the following set:
EmpID | Fname | Age | Department_ID | Department_ID | Department_Name |
---|---|---|---|---|---|
1 | Erick | 24 | 3 | 3 | Analytics |
2 | Rosa | 26 | 3 | 3 | Analytics |
3 | Arianna | 30 | 1 | 1 | Human Resources |
4 | Racquel | 32 | 2 | 2 | Accounting |
5 | Farhan | 19 | 1 | 1 | Human Resources |
With all of the unnecessary data eliminated, we can now clearly see the relationship between the employees and their departments.
The final WHERE clause narrows down the set to a specific employee name. If we’re using **"Farhan"
** as an example, this would select the row with the statement WHERE employees.Fname = "Farhan"
.
EmpID | Fname | Age | Department_ID | Department_ID | Department_Name |
---|---|---|---|---|---|
5 | Farhan | 19 | 1 | 1 | Human Resources |
With our SELECT clause, only the department name (”Human Resources”) will be returned, just like VLOOKUP.
While Excel’s VLOOKUP function and SQL’s JOIN operation are both capable of retrieving and matching data across multiple tables or sheets, their syntax and usage vary depending on the context. SQL provides more powerful and flexible operations, while Excel offers a more intuitive interface for non-programmers. To continue your SQL journey, learn more using our SQL learning path.
The best way to get better in SQL is practice. Interview Query offers a variety of SQL learning resources to help you practice and improve your SQL skills, including: