Top 15 Google SQL Interview Questions (Updated for 2024)

Top 15 Google SQL Interview Questions (Updated for 2024)

Overview

SQL questions popup in Google data science interviews all the time. And if you’ve got an interview coming up, you might be wondering: How can I prepare for SQL interviews at Google?

First, you’ll want to be ready to answer any definition-based SQL questions, like “what is a join?” Basic questions test your competency level with SQL, and they’re often asked early in the interview process. More importantly, though, in Google SQL interviews, you’ll likely be asked SQL business-focused questions, e.g. writing SQL queries around customer data.

Today, we’re looking specifically at the SQL portion of Google interviews. We cover: What you need to know, how SQL questions align to various Google roles, and SQL practice problems you can use to prep.

Google Interview Process

Google has an extremely rigorous interview process. Each stage is as important as the next. But you likely won’t be asked SQL interview questions until the technical screen and on-site interview. Here’s what it looks like:

  • Initial Phone Screen - The initial call typically asks behavioral questions and questions about your experience. They’re typically conducted with a recruiter, and they’re designed to see if you’re a right fit for Google.

  • Technical Screen - During the technical screen, you should expect basic SQL questions, including definitions of basics like should expect basic SQL questions on DATE, GROUP BY, and JOIN. You might be asked to perform basic queries as well.

  • Onsite Interview - The final round of interviews includes intermediate to advanced SQL questions using CASE, JOIN, sub-queries and complex queries.

What Gets Asked in Google SQL Interviews?

Like all FAANG companies, Google relies heavily on data. And SQL is a go-to tool for processing and analyzing that data. Google SQL interviews don’t just ask basics. These interviews tend to ask case study SQL questions. In other words, you’ll be presented with more practical problems and real data, and be asked to write queries for that dataset. The most common questions to get asked include:

  • Basic SQL interview questions - These are typically definition-based questions, and they come up on the technical screen. One tip: Develop your ability to explain these basic SQL concepts in layman’s terms.

  • SQL query questions - These types of questions test your knowledge in writing queries and statements in SQL. You’ll be presented with a dataset and asked to write SQL code to return a specific value.

  • Advanced SQL questions - Finally, you may be asked SQL scenario-based interview questions that will require you to write advanced queries. With this type of questions, you’ll be asked to write queries that address a specific case and use a range of SQL clauses, from basics like SELECT and FROM to advanced like HAVING.

Who Gets Asked SQL Questions at Google?

SQL is used in many different roles at Google. It will most commonly come up in interviews for:

Business Intelligence Analyst

Business analysts at Google use SQL to generate insights, maintain reports and run analyses. In business analyst interviews, you can expect the basics, like DATE, GROUP BY and JOIN, and in the final round of interviews, more complex queries.

Business Intelligence Engineer

BI engineers at Google are at the “intersection of product, data, and business strategy.” In other words, you’ll be tasked with using data to make business decisions that improve customer experience. SQL is one of the most important skills tested in BI engineer interviews at Google.

Data Analyst

Google data analysts perform a range of key tasks. Google analysts tend to derive business insights from data, and provide that information to key stakeholders. These roles vary by the team you work with; for example, a data analyst on the Google Ads team will have a much different role than one working on Google Drive.

Data Scientist

At Google, data scientists perform a variety of roles, and they’re tested on many different tools and skills. Although SQL questions get asked, these interviews tend to focus more on statistics, algorithms and machine learning topics. But be prepared for basic SQL syntax questions, as well as solvable queries.

Example Google SQL Interview Questions

Google SQL Interview Questions

These are examples of SQL questions you might expect in a Google interview:

1. Explain the different types of joins in SQL.

A JOIN is a clause in SQL that’s used to join rows from two or more tables, based on a common column between the tables. It is used for merging tables, as well as retrieving data. The most common types of joins include:

  • Inner Join: An INNER JOIN is the most common, and it is used to return all rows from two or more tables, when the JOIN condition is met.
  • Left Join: This type of join returns rows from the left table, when a matching row from the right meets the JOIN condition.
  • Right Join: Similar to a left join, but rows are returned from the right table, when the JOIN condition on the left is met.
  • Full Join: Full joins returns all rows from the left and right, when there is a match in any of the tables.

2. What is the PRIMARY KEY in SQL?

Primary keys are constraints that uniquely identify each record. One thing to note: Primary keys cannot have NULL values, and all values must be UNIQUE. A table can have only one primary key, but the primary key can consist of single or multiple columns.

3. What are constraints?

Constraints in SQL are rules that can be applied to the type of data in a table. They are used to limit the type of data that can be stored in a particular column within a table. Some common types used in SQL are:

  • NOT NULL - This constraint prevents null values from being stored in a column.
  • UNIQUE - This constraint says that values in a column must be unique. The primary key uses the UNIQUE constraint.
  • PRIMARY KEY - This constraint is used to specify which field is the primary key.
  • FOREIGN KEY - A foreign key is a constraint that can uniquely identify a row in another table.

4. What’s the difference between DELETE and TRUNCATE statements in SQL?

DELETE is used to remove specific data from a table. This statement is a DML command, and it’s slower than TRUNCATE. One key difference: You can rollback data after using DELETE. TRUNCATE, on the other hand, is a DDL command, and it is used to delete all the rows from a table.

5. What is query optimization?

Inefficient SQL queries can drain a database, and lead to slow performance and loss of service. Optimization is especially critical when working with production databases. As such, query optimization is the process of making SQL queries more efficient. More efficient queries provide outputs faster, and minimizes the impact on the database.

6. Given the tables below, select the top three departments with at least ten employees and rank them according to the percentage of their employees making over $100,000 in salary.

employees table

Columns Type
id INTEGER
first_name VARCHAR
last_name VARCHAR
salary INTEGER
department_id INTEGER

departments table

Columns Type
id INTEGER
name VARCHAR

Output:

percentage_over_100K department name number of employees
.9 engineering 25
.5 marketing 50
.12 sales 12

Hint: What’s the question really asking? Breaking it down, we can subset this into separate clauses of conditions:

  1. Top 3 departments
  2. % of employees making over $100,000 in salary
  3. Departments must have at least 10 employees.

7. Given a users table, write a query to get the cumulative number of new users added by day, with the total reset every month.

users table

Columns Type
id INTEGER
name VARCHAR
created_at DATETIME

Output:

Date Monthly Cumulative
2020-01-01 5
2020-01-02 12
...
2020-02-01 8
2020-02-02 17
2020-02-03 23

Hint: This question first seems like it could be solved by just running a COUNT(*) and grouping by date. Or maybe it’s just a regular cumulative distribution function? But we have to notice that we are actually grouping by a specific interval of month and date. And when the next month comes around, we want to the reset the count of the number of users.

8. Given a table of product subscriptions with a subscription start date and end date for each user, write a query that returns true or false whether or not each user has a subscription date range that overlaps with any other user.

subscriptions table

Column Type
user_id INTEGER
start_date DATE
end_date DATE

Example:

user_id start_date end_date
1 2019-01-01 2019-01-31
2 2019-01-15 2019-01-17
3 2019-01-29 2019-02-04
4 2019-02-05 2019-02-10

Output:

user_id overlap
1 1
2 1
3 1
4 0

Hint: Let’s take a look at each of the conditions first and see how they could be triggered. Given two date ranges, what determines if the subscriptions would overlap?

9. Given a table of students and their SAT test scores, write a query to return the two students with the closest test scores with the score difference.

If there are multiple students with the same minimum score difference, select the student name combination that is higher in the alphabet.

scores table

Column Type
id INTEGER
student VARCHAR
score INTEGER

Input:

id student score
1 Jack 1700
2 Alice 2010
3 Miles 2200
4 Scott 2100

Output:

one_student other_student score_diff
Alice Scott 90

Hint: Given the problem statement is referencing one table with only two columns, we have to self-reference different creations of the same table. It’s helpful to think about this problem in the form of two different tables with the same values.

10. We’re given two tables, a users table with demographic information and the neighborhood they live in and a neighborhoods table. Write a query that returns all of the neighborhoods that have 0 users.

users table

Columns Type
id INTEGER
name VARCHAR
neighborhood_id INTEGER
created_at DATETIME

neighborhoods table

Columns Type
id INTEGER
name VARCHAR
city_id INTEGER

Output:

Columns Type
neighborhood_name VARCHAR

Hint: Our predicament is to find all the neighborhoods without users. In a sense we need all the neighborhoods that do not have a singular user living in them. This means we have to introduce a concept of existence of a column in one table, but not in the other.

11. Given a table of transactions and products, write a query to return the product id, product price, and average transaction price of all products with price greater than the average transaction price.

transactions table

Column Type
id INTEGER
user_id INTEGER
created_at DATETIME
product_id INTEGER
quantity INTEGER

products table

Column Type
id INTEGER
name STRING
price FLOAT

Hint: We first need to find the average price of all transactions. The total price of a transaction is price*quantity so we write a sub-query to find the average over all transactions.

12. Let’s say we have two tables, transactions and products. Hypothetically the transactions table consists of over a billion rows of purchases bought by users.

We are trying to find paired products that are often purchased together by the same user, such as wine and bottle openers, chips and beer, etc..

Write a query to find the top five paired products and their names.

Note: for the purposes of satisfying the test case, P1 should be the item that comes first in the alphabet. transactions table:

Column Type
id INTEGER
user_id INTEGER
created_at DATETIME
product_id INTEGER
quantity INTEGER

products table:

Column Type
id INTEGER
name STRING
price FLOAT

example output:

Column Type
P1 STRING
P2 STRING
count INTEGER

Hint: To solve this, we need to break this into several steps. First, we should find a way to select all the instances a user purchased 2 or more products at the same time. How can we use user_id and created_at to accomplish this?

13. Write a query to return the number of songs played on each date for each user

We have a table called song_plays that tracks each time a user plays a song.

Write a query to return the number of songs played on each date for each user

Note: If a user plays the same song twice during the day count should be two.

14. Write a SQL query using the database schema to find the percentage of customers that order drinks with their meal.

Design a database for a stand-alone fast food restaurant.

Based on the above database schema, write a SQL query to find the top three highest revenue items sold yesterday.

Note: We will only record the customer if they either sign up for our rewards program or order by delivery. Customers will be recorded in the orders and customers tables only if they match one of those criteria. Deliveries will be recorded in the orders and deliveries tables only if they were delivered.

15. Display the top three users by downloads each day. Order your data by date, and then by daily_rank

Let’s say you work at a file-hosting website. You have information on user’s daily downloads in the download_facts table

Use the window function RANK to display the top three users by downloads each day. Order your data by date, and then by daily_rank

Input:

download_facts table

Column Type
user_id INTEGER
date DATE
downloads INTEGER

Output:

Column Type
daily_rank INTEGER
user_id INTEGER
date DATE
downloads INTEGER

In interviews for many Google positions, especially analyst roles, SQL questions make up a sizeable part of the interview. Your prep should include a solid study of the basics and SQL definitions, yet you should also be comfortable with more complex queries and sub-queries. One tip: Ask the recruiter the types of questions that will come up in the interview. This will give you an idea of where to focus your study.

Learn SQL fundamentals in this easy-to-follow SQL course.

This course will help you brush up on your SQL skills and learn basic to advanced techniques.

More Resources

If you need additional help, be sure to check out the SQL module in our Data Science Course. It offers a solid review of basic-to-advanced concepts. You might also want to see our Top 25+ Data Science SQL Interview Questions.