35 Basic SQL Interview Questions for 2024

35 Basic SQL Interview Questions for 2024

Overview

If you’re applying for a job in the field of data science, data engineering, or business analytics, there’s a good chance that you’ll have to work with SQL at one point or another. Knowing how to manage and query your own data is incredibly important in these fields. If you’re new to SQL or need to brush off a rusty set of skills to address the current marketplace, it can be difficult to figure out what you need to know about SQL in order to be competitive in your field.

That being said, here are the basics of SQL:

Invented in 1974, SQL is a programming language designed to interface with Relational Database Management Systems (RDBMS). An RDBMS structures data into tables made up of rows (called “records”) and columns (called “fields”), which can then be queried, combined, and manipulated in various ways to achieve the desired outcome.

MySQL, Microsoft SQL Server, Oracle, Amazon Redshift, and IBM DB2 are all examples of RDBMS-based software. The chances that you will encounter one (or more) of these technologies in the course of your career or interview is pretty high. So, let’s make sure you’re prepared.

What Types of Basic SQL Questions Can You Expect in Interviews?

General Basic SQL Interview Questions

Here are some general basic SQL questions to help you prepare for foundational topics:

1. What is SQL?

Answer: SQL (Structured Query Language) is a standardized language used to manage and manipulate relational databases.

2. What is a relational database?

Answer: A relational database organizes data into tables that can be related to one another based on shared attributes, using primary and foreign keys.

3. What is a primary key?

Answer: A primary key is a unique identifier for each record in a table, ensuring that no duplicate or null values exist in the column(s) used.

4. What is a foreign key?

Answer: A foreign key links rows in one table to rows in another, establishing a relationship between tables based on primary key values.

5. Explain the difference between DELETE and TRUNCATE.

Answer: DELETE removes specific rows and logs each deletion, whereas TRUNCATE removes all rows without logging individual deletions, making it faster but less selective.

6. What is an SQL index?

Answer: An index speeds up data retrieval by allowing quicker search access to rows in a table, but it can slow down data modification operations.

7. What is a JOIN in SQL?

Answer: A join combines rows from two or more tables based on a related column, allowing data from multiple tables to be retrieved in a single query.

8. Differentiate between INNER JOIN and OUTER JOIN.

Answer: INNER JOIN returns only matching rows between tables, while OUTER JOIN returns matching rows plus unmatched rows from one or both tables.

9. What is normalization?

Answer: Normalization organizes data to reduce redundancy and dependency by dividing data into related tables, following rules up to the third or higher normal form.

10. What is a subquery?

Answer: A subquery is a nested query within another SQL query used for intermediate results or complex filtering conditions.

11. What is a VIEW in SQL?

Answer: A view is a virtual table based on a query, storing no data itself but providing a way to simplify complex queries or secure data access.

12. How can you retrieve all the flight records from the flights table, including details such as flight ID, source location, destination location, and associated plane ID?

Answer: To obtain a complete view of all flight entries, execute a SELECT statement that specifies the flights table. This will return every column and row, providing a comprehensive dataset that includes all pertinent information for each flight record, which is essential for further analysis or reporting on airline operations.

13. What is a stored procedure?

Answer: A stored procedure is a set of SQL statements stored in the database that can be executed as a single procedure, often to encapsulate complex logic.

14. What is a NULL value in SQL?

Answer: NULL represents a missing or unknown value in SQL, distinct from zero or empty strings.

15. What is the purpose of the GROUP BY clause?

Answer: GROUP BY groups rows sharing a specified column value, allowing aggregate functions like COUNT or SUM to operate on grouped data.

16. Calculate the weighted average score for each email campaign based on the number of users who opened the email and clicked on a link

Answer: To compute the weighted average score for each campaign, first determine the open rate by dividing the number of opens by the total users and the click rate by dividing the number of clicks by the total users. Then, apply the weights (0.3 for the open rate and 0.7 for the click rate), sum these products, and finally, round the result to two decimal places for a clear presentation of the effectiveness of each campaign in the output.

17. Explain the concept of transactions in SQL.

Answer: A transaction is a sequence of SQL operations executed as a single unit, ensuring that all operations are completed successfully or none are applied.

18. What are window functions in SQL?

Answer: Window functions perform calculations across a specified range of rows within a query result, commonly used for running totals, rankings, and moving averages.

19. Define the term Schema in a database.

Answer: A schema is the structure that defines tables, columns, data types, and relationships within a database, serving as the blueprint of the database.

20. What is the difference between HAVING and WHERE clauses?

Answer: WHERE filters rows before aggregation, while HAVING filters aggregated results, typically used with GROUP BY.

Basic SQL Coding Interview Questions

Here are some basic SQL interview questions that could be asked during your coding interview:

21. How can you identify all dates on which the hospital released more patients than the previous day, given a table of daily patient releases?

Approach: To find these dates, perform a self-join on the released_patients table, comparing each date’s released patient count to that of the previous day. By filtering for cases where the current day’s count exceeds the prior day’s count, you can extract the relevant release dates and patient counts for the output, highlighting days of increased patient discharges.

22. How can the marketing team calculate the weighted average score for each email campaign based on the open and click rates, with the open rate weighted at 0.3 and the click rate at 0.7?

Approach: To compute the weighted average, first calculate the open rate by dividing the number of opens by the total number of users and then the click rate by dividing the number of clicks by the total number of users. Then, apply the respective weights to these rates (0.3 for opens and 0.7 for clicks) and sum the results. Finally, round the output to two decimal places for clarity in presenting the effectiveness of each campaign.

23. How can you calculate the total mass of each grocery item required across three imported recipe tables?

Approach: To achieve this, perform a UNION ALL operation to combine the grocery items and their corresponding masses from the three recipe tables. Then, group the combined results by grocery item, summing the mass for each item to get the total required across all recipes. This will provide a comprehensive view of all grocery items needed for the event and their total mass in the output.

24. How can you estimate and fill in missing temperature readings in a time-series DataFrame containing daily temperatures for various cities using linear interpolation?

Approach: *Create a function that converts the date column to datetime format and sorts the DataFrame by* city and date. Then, use the groupby method on city and apply linear interpolation to the temperature column, ensuring that only data from the same city is used for estimating missing values. The result will be a complete dataset with interpolated temperatures for each city.

25. How can you calculate the total earnings for each job title, including the sum of regular salaries, overtime pay based on hours worked, and the total compensation?

Approach: To achieve this, aggregate the employee data by job_title, summing the salary for total salaries. For total overtime payments, multiply overtime_hours by overtime_rate and sum these values. Finally, the total compensation is calculated by adding the total salaries and total overtime payments. This will provide a clear breakdown of earnings for each role in the organization.

26. How can you identify the five lowest-paid employees who have completed at least three projects, considering only those projects with a non-null end date?

Approach: To find these employees, first join the employees table with the projects table based on the employee ID. Filter for projects where the End_dt is not NULL to ensure they are completed. Then, group the results by employee ID, counting the number of completed projects. Finally, filter for those who have completed at least three projects and order the results by salary to select the five lowest-paid employees.

27. How can you create a new table that tracks each student’s exam scores across four exams, ensuring each student’s scores are organized by exam ID?

Approach: To achieve this, you can use a pivot operation on the exam_scores table, grouping by student_name and transforming the exam_id into individual columns for exam_1, exam_2, exam_3, and exam_4. For scores that are not present, NULL values will be assigned. This will result in a structured table showing each student’s scores clearly aligned with their respective exams.

28. How can you calculate the total spending for the IT, HR, and marketing departments, as well as a combined total for other departments, grouped by fiscal quarters for transactions that occurred in 2023?

Approach: To accomplish this, first categorize transactions into fiscal quarters based on the transaction_date. *Use conditional aggregation to sum the amount for each department, assigning any transactions outside of IT, HR, or marketing to an “other” category. Finally, group the results by quarter and filter out any quarters with no transactions, resulting in a clear summary of departmental spending for each quarter of 2023.*

29. How can you categorize sales into Standard, Premium, and Promotional groups based on the sale amount, region, and specific conditions for the month of July, while summing these amounts by region for performance analysis?

Approach: To achieve this, first, apply the given categorization rules to determine the type of each sale. Use conditional aggregation to sum the sales amounts for each category—Standard, Premium, and Promotional—grouped by region. The categorization rules should account for the specified conditions for July and the East region, allowing for a comprehensive report that supports financial planning and bonus allocations.

30. How can you calculate the total salary of employees who have not completed any projects at all?

Approach: To find the total slack salary, first join the employees table with the projects table on employee ID. Filter for employees with at least one project assigned but with none of those projects having a non-null End_dt. Finally, sum the salaries of these employees to get the desired output of total slack salary, providing insight into potential cost-cutting areas.

31. How can you retrieve the total number of vacation bookings made in the last 90 days, last 365 days, and overall, given that today is January 1, 2022?

Approach: To get this information, you can use conditional aggregation to count bookings based on the check_in_date. Count bookings where the check_in_date falls within the last 90 days, the last 365 days, and total bookings regardless of the date. This will provide a comprehensive view of the booking activity over different time frames.

32. How can you calculate the cumulative sum of sales for each product, ordered by product ID and date?

Approach: To compute the cumulative sum, you can use a window function that partitions the results by product_id and orders them by date. This will allow you to sum the price for each product cumulatively up to the current row. The output will show the product ID, date, and the calculated cumulative sum for all sales made on or before that date.

33. How can you identify the user with the highest average number of unique item categories per order from the given user_orders and ordered_items tables?

Approach: To solve this, first join the user_orders and ordered_items tables on order_id. Then, group the results by user_id and count the distinct item_category for each order. Calculate the average number of unique item categories per order for each user. Finally, select the user with the highest average and return their name and average unique item categories, ensuring only one user meets the criteria.

34. How can you calculate the median household income for each city from the survey responses?

Approach: To compute the median household income, you can use a common SQL method involving window functions. First, order the household incomes for each city and then use the ROW_NUMBER() and COUNT() functions to determine the middle value(s). For odd sets, select the middle value; for even sets, average the two middle values. Finally, group the results by city to get the median income for each city.

35. How can you rank users by the number of downloads each day, displaying only the top three users for each date?

Approach: To achieve this, use the RANK() window function to assign a rank to each user based on their downloads for each day. Partition the results by date and order by downloads in descending order. Finally, filter the results to show only those rows where the rank is 1, 2, or 3, ensuring you list the top three users for each date.

Find more SQL interview questions, including more advanced ones, and their solutions on our database to prepare better for your upcoming interview.

Strategies for Answering SQL Questions

When faced with SQL interview questions, especially for foundational concepts, it’s essential to have a structured approach. Here are key strategies to effectively tackle these questions:

Restate the Problem Statement

Begin by actively listening to the question and then rephrasing it to confirm your understanding. This not only helps clarify the problem but also provides the interviewer with an opportunity to correct any misconceptions. However, if you’re given a take-home assignment, you’ll have plenty of time to reassess the problem.

Identify Edge Cases

Consider various scenarios that could impact your results. Take time to discuss base cases and edge cases. For example, if asked for the average order value, clarify whether to include canceled orders or how to treat null values. Document your assumptions and feel free to ask questions.

Work Backwards from the Desired Output

Visualizing the expected output can guide your SQL problem-solving approach. If the question asks for the monthly average sales, consider what the intermediate results might look like (e.g., daily sales per product) and how you can aggregate them. Creating a rough mock-up of the output helps in identifying the necessary SQL functions, like GROUP BY, AVG(), and DATE_TRUNC().

Recognize Common Patterns

Many SQL problems share similar patterns and structures. Familiarize yourself with recurring techniques, such as:

  • Aggregations: Using functions like SUM(), COUNT(), and AVG() for summarizing data
  • Joins: Understanding how to use inner, left, and right joins to combine data from different tables
  • Subqueries: Identifying when to use subqueries for filtering or aggregation
  • Window Functions: Applying functions like ROW_NUMBER(), RANK(), or NTILE() for advanced data analysis

By recognizing these patterns, you can apply previously learned SQL techniques to new problems more efficiently.

Start Writing SQL Code

Don’t hesitate to begin coding, even if your initial solution isn’t perfect. Writing something down can help you clarify your thoughts and highlight areas needing adjustment. As you write your SQL code, verbalize your reasoning and assumptions. You might start with the Interview Query Database for updated questions and their proper solutions.

Optimize and Refine Your Query

After constructing your initial query, review it for optimization opportunities. Look for:

  • Index Utilization: Ensure that you are leveraging indexes on columns used in joins and where clauses to improve performance.
  • Filtering Early: Apply filters (using WHERE) before aggregation to reduce the data volume processed, thus enhancing performance.
  • Using Aliases: Employ table and column aliases for better readability, making your SQL statements easier to understand.

Test Your Query

If you have time, mentally run through your query to check for logical errors or unintended consequences. Consider the edge cases we mentioned earlier and ensure your query accounts for them.

More SQL Interview Resources and Questions

Interview Query provides a wealth of resources designed to enhance your SQL skills and help you prepare for interviews. Premium members gain access to our comprehensive data science course, which includes a dedicated SQL module, as well as sections on Python, statistics, and product metrics. You can also check out our SQL questions tailored for data analysts and explore our top 10 SQL questions for data engineers.

If you’d like to delve deeper into our discussions on Interview Query, visit our blog, where we cover a variety of topics, including:

Learn More about SQL

Check out our Introduction to SQL Course that will help you get started with the foundation of SQL: