Top 13 Amazon SQL Interview Questions (Updated for 2024)

Written by IQ Team

IQ Team

Published July 24, 2024

Estimated reading time: 13 minutes

Table of contents

Overview

Amazon SQL Interview Process

Who Gets Asked SQL Interview Questions at Amazon?

Examples of Amazon SQL Interview Questions

Conclusion

Overview

At Amazon, one of the most well-known, highly sought-after FAANG companies, you have the chance to use your skills to help improve one of the most successful businesses in the world. Acing the Amazon interview is no easy feat.

In this article, we’ll be focusing on getting you through Amazon SQL interview questions.

Amazon SQL Interview Process

Interview Graphics

Amazon’s interviews are not unlike other tech companies. The overall process is as follows:

Initial Phone Screen
Technical Interview
Onsite Interview

The initial screen is a phone interview where your resume and the position will be discussed. This is a lower stakes interview, more of a ‘getting to know you’ than an actual test of your abilities.

During the technical interview, you should expect at least a few questions centered around the specifics of the job. For example, a data science technical screen consists of coding, algorithms, and machine learning. For most data science positions, at least one of these questions will be SQL-based.

Finally, the onsite interview will consist of several steps with interviewers. Here, you will be tested on more thorough and difficult concepts and coding questions. In addition, the interviewers will be looking for your understanding of Amazon’s 14 leadership principles.

Who Gets Asked SQL Interview Questions at Amazon?

1. Business Analyst

A business analyst is primarily responsible for helping a business achieve its goals by analyzing data to generate business-related insights. A business analyst will run analyses, maintain reports, and build dashboards. The following infographic compares the skills which are tested in the Amazon interview with skills tested in a typical business analyst interview.

Business Analyst interview

As you can see, SQL is tested almost twice as heavily in the Amazon interview versus the average interview.

According to an Amazon business analyst from Glassdoor, the Amazon interview consists of a technical phone screen, a behavioral phone screen, and an exhaustive third round of several interviews throughout the day. Expect to encounter SQL-related questions during the first and third rounds.

In particular, during the phone screen, one should expect basic SQL questions on DATE, GROUP BY, and JOIN. These typically start simple in nature and are followed by basic queries. The final round of interviews includes advanced SQL questions using CASE, JOIN, sub-queries, and complex queries.

You should be able to verbally explain the different kinds of joins as well as the difference between the WHERE statement and HAVING statement. Further, you should be familiar with deriving insights from given tables. This includes exploring the data by finding various metrics, such as totals and metrics over a given time. As far as particular SQL questions go, Amazon really likes to ask their business analysts about joins.

2. Business Intelligence (BI) Engineer

At Amazon, BI engineers work in teams to form business decisions based on collected data with the intention of improving the customer experience. BI engineers work to improve Prime, Alexa, Twitch, and more. Here is a breakdown of skills tested in an Amazon BI engineer interview.

Business Intelligence (BI) Engineer

Compared to all other domains, SQL is by far the most thoroughly emphasized skill in an Amazon BI engineer interview.

The interview process involves two phone screenings and several onsite interviews. SQL queries have been known to come up during the first phone screen and the first two onsite interviews. People have reported seeing SQL questions centered around JOIN, LEAD, LAG, pivoting, WHERE, HAVING, GROUP BY, and aggregate functions.

Typically, the phone-based SQL questions describe aggregate functions and find the approximate number of rows in an output table after a JOIN. You shouldn’t expect to write more advanced queries until the onsite interview.

3. Data Analyst

An Amazon data analyst performs important functions such as data interpretation, the building and maintaining of dashboards, system design for data collection and compiling, and more.

The specifics of the role can vary depending on what team you are working with. Data analysts who work with Twitch data may perform vastly different functions than those working with Alexa data. The graph below shows how Amazon tests data analyst skills like SQL compared to other companies.

Data Analyst interview

The interview process for a data analyst position involves three levels: behavioral, analytical, and technical interviews. The behavioral and technical interviews are conducted over the phone.

For the technical interview, you must also share your screen with the interviewer as you answer some questions. To start, you have your typical conceptual SQL questions, such as differentiating between WHERE and HAVING clauses. Later in the process, you can expect more query writing questions.

4. Data Scientist

Much like the other positions, data scientists at Amazon perform different roles depending on which branch of the company their work is focused on. A data scientist works with data to produce models and solutions in machine learning and natural language applications in order to make predictions and provide forecasting insights. Here is the comparison graph of Amazon data scientists’ tested skills versus average data scientists’ tested skills.

Data scientist

Since data science as a profession is conceptually deep and diverse, more topics are tested overall. This leaves less room for SQL, although it is still a major skill in data science. Amazon tests you in SQL slightly more than the average employer.

Hopefully, you see a pattern with these interview questions. The interviewers always start out with basic SQL syntax. Then they hit you with a few solvable queries.

Here is an example of one:

Given a table with three columns (id, category, value), with each id having three or fewer categories (price, size, color), find all ids for which two or more category values match.

Examples of Amazon SQL Interview Questions

You can expect an Amazon SQL question on the technical screen, and one or two of the on-site interviews will focus heavily on SQL and data analysis. In general, Amazon SQL questions tend to focus on customer metrics and e-commerce cases.

1. What are the different types of Joins?

Explain the difference between the types of joins.

Hint: Try thinking of the question as a series of Venn diagrams.

Types of SQL joins

2. Write a query to retrieve data on customers who purchased more than $100.

Consider the following two tables:

customer_id	purchase_date	product_id	unit_price	unit_purchased
10001	2021-02-01	35525	25.00	4
10002	2021-02-02	30321	10.00	8
10003	2021-02-14	35525	25.00	3
10004	2021-02-05	34826	300.00	1
10005	2021-02-23	30321	10.00	5
…..

customer_id	registration_date
10001	2020-10-10
10002	2021-01-31
10003	2021-02-10
10004	2021-01-15
10005	2021-02-15
…..

a. Write a query to retrieve data with unique customer ids that made over $100 in purchases during the first week of February.

b. Write a query to retrieve data with unique customer ids that made over $100 in purchases within ten days of registering.

3. Write a SQL query to create a histogram of the number of comments per user.

Consider the following tables:

Write a SQL query to create a histogram of the number of comments per user in the month of January 2020. Assume bin buckets class intervals of one.

users table

Columns	Type
id	INTEGER
name	STRING
created_at	DATETIME
neighborhood_id	INTEGER
mail	STRING

comments table

Columns	Type
`user_id`	INTEGER
`body`	TEXT
`created_at`	DATETIME

Here’s a hint:

What does a histogram represent? In this case, we’re interested in using a histogram to represent the distribution of comments each user made in January 2020.

A histogram with bin buckets of size one means that we can avoid the logical overhead of grouping frequencies into specific intervals.

For example, if we wanted a histogram of size five, we would have to run a SELECT statement like so:

SELECT CASE WHEN frequency BETWEEN 0 AND 5 THEN 5
WHEN frequency BETWEEN 5 AND 10 THEN 10
etc.

Test your abilities on this question on Interview Query.

4. Write a query that displays unique pairs of two locations.

Write a query to create a new table, named flight routes, that displays unique pairs of two locations.

Example:

Duplicate pairs from the flights table, such as Dallas to Seattle and Seattle to Dallas, should have one entry in the flight routes table.

flights table

Column	Type
`id`	INTEGER
`source_location`	STRING
`destination_location`	STRING

5. Write a query to get the cumulative number of new users added by day.

Given a users table, write a query to get the cumulative number of new users added by day, with the total reset every month.

users table

Columns	Type
`id`	INTEGER
`name`	VARCHAR
`created_at`	DATETIME

Here’s a hint:

This question first seems like it could be solved by just running a COUNT(*) and grouping by date. Or maybe it’s just a regular cumulative distribution function?

But we have to notice that we are actually grouping by a specific interval of month and date. And that when the next month comes around, we want to reset the count of the number of users.

Tangentially aside - the practical benefit of a query like this is that we can get a retention graph that compares the cumulative number of users from one month to another. If we have a goal to acquire 10% more users each month, how do we know if we’re on track for this goal on February 15th without having the same number to compare it to for January 15th?

Therefore how can we make sure that the total amount of users on January 31st rolls over back to 0 on February 1st?

6. Write a query to output a table that includes every product name a user has ever purchased.

With this question, you’re provided a table that contains data about products that a user purchased. Products are divided into categories. The column id is the primary key of table products and represents the order in which the products are purchased.

7. Write a query to get the distribution of the number of conversations created by each user by day in the year 2020.

In this question, you’re given a table that represents the total number of messages sent between two users by date on Messenger.

What are some insights that could be derived from this table? What do you think the distribution of the number of conversations created by each user per day looks like?

See a video solution for this question:

8. Write a SQL query to select the 2nd highest salary in the engineering department.

Write a SQL query to select the 2nd highest salary in the engineering department. If more than one person shares the highest salary, the query should select the next highest salary.

Here’s a hint: First, we need the name of the department to be associated with each employee in the employees table, to understand which department each employee is a part of.4

9. Write a query to identify the manager with the biggest team size.

Write a query to identify the manager with the biggest team size. You may assume there is only one manager with the largest team size.

Example:

Input:

employees table

Column	Type
`id`	INTEGER
`name`	VARCHAR
`manager_id`	INTEGER

managers table

Column	Type
`id`	INTEGER
`name`	VARCHAR
`team`	VARCHAR

Output:

Column	Type
`manager`	VARCHAR
`team_size`	INTEGER

Here’s a hint: We need data from both tables. Therefore we need to join them somehow.

10. Create a report displaying which shipments were delivered to customers during their membership period.

Let’s say that you’re a data scientist working on a distribution team at Amazon.

Create a report displaying which shipments were delivered to customers during their membership period.

If the transaction is shipped during the membership period, column is_member should have the value Y. If not, the column should have the value N.

Here’s a hint: You may assume that customer_id is unique and that shipments are made only to registered customers.

11. Time Series Discrepancies

Let’s say that you’re working at a global trading company and you’re analyzing the price of a particular asset over time. This dataset is super noisy, volatile, and you’re not guaranteed completely accurate data.

How would you analyze this data to ensure there aren’t any discrepancies?

12. Customer Orders

Write a query to identify customers who placed more than three transactions each in both 2019 and 2020.

13. Subscription Overlap

Given a table of product subscriptions with a subscription start date and end date for each user, write a query that returns true or false whether or not each user has a subscription date range that overlaps with any other completed subscription.

Completed subscriptions have end_date recorded.

Conclusion

Regardless of the position, SQL is a common feature of the Amazon interview. It’s best to be familiar with a basic syntax all the way through complex queries and sub-queries. Doing well in this portion of the interview could make the difference between getting the job or being asked to look elsewhere.

If you want to check out more SQL questions, check out our in-depth guide to approaching the different types of SQL interview questions you’ll see on the interview.

The interview process can feel long and intimidating for many. But it doesn’t need to be like that! We offer the guidance and tips that are needed to ensure you ace your interview with little stress and get started with your dream job.

20 Beginner to Advanced Data Science Projects for Students How to Land Your First Data Science Job: From Student to Data Scientist How to Become a Data Engineer in 2024: Your Guide to This In-Demand Career Path 9 In-Demand Entry Level SQL Jobs You Can Apply For in 2024 From Capital One to New Heights: Jayandra Lade’s Data Science Journey