Considered the leading consulting firm with a prestige ranking of 9.06, McKinsey & Company helps its clients develop and implement strategies to achieve their operational, organizational, and sales goals. As a firm focused on delivering insights to its clients, McKinsey relies heavily on data analysis and engineering.
Data engineers at McKinsey are responsible for designing and implementing scalable data architectures, developing efficient methods of collecting and cleaning data, and creating appropriate storage solutions. As a data engineer, your efforts will shape data-driven business decisions across multiple departments within the firm.
If you have a data engineer interview at McKinsey lined up, or are simply exploring the role, you’ve come to the right place. Drawing on our years of experience guiding candidates through their interviews, we’ve compiled a list of probable questions.
But first, let’s briefly discuss the interview process.
McKinsey believes in hiring talents with “diverse backgrounds and with great problem-solving skills,” meaning the data engineer interview process is rigorous and reflects the firm’s high standards. You can expect these steps from a typical McKinsey data engineer interview, with variations based on the role’s seniority and location:
After submitting your application and being shortlisted for the role, you’ll be contacted by a McKinsey HR representative to facilitate your interview process. This round is usually over the phone and assesses your basic qualifications, interest in the role, and availability. Feel free to ask about the position, compensation, or anything else you’re curious about during this round.
Also, be prepared to answer a few foundational pre-defined behavioral and technical questions.
The technical screening rounds involve coding assessments and live coding sessions. You may be asked to complete a take-home coding assignment to demonstrate your programming skills, especially in data structures, algorithms, and data manipulation.
The live coding session often revolves around solving a coding problem related to SQL and ETL processes, explaining your thought process throughout.
McKinsey is known for its case study interviews, even for technical roles like data engineers. You may be given an assignment involving a real-world issue with imaginary datasets. You’ll be expected to break down the problem and communicate your thought process clearly.
In addition to the technical case, you might also have a more business-oriented case interview. This could involve working through a business scenario where you need to use data to make strategic decisions.
However, these types of questions fall more within the domain of product sense than traditional case studies.
If you clear the previous stages, you’ll be invited for on-site interviews, which are typically technical and behavioral in nature. During this round, product sense questions could also be asked to assess your critical thinking skills. Expect your interviewers to dive deep into data engineering concepts, system designs, and programming fundamentals.
Usually conducted by the hiring manager or partners, these interviews focus on your behavioral competencies, such as leadership, problem-solving, and teamwork. You’ll be asked to share specific examples from your past that illustrate these qualities.
When discussing strengths, first ask yourself, “What sets me apart from others?” Focus on strengths that you can back up with examples using the STAR method, showing how your strength solved a business issue.
Now on to weaknesses. The key here is to avoid simply listing things you are not good at or are insecure about. An unframed list does not help anyone, especially an interviewer trying to get a well-rounded picture of you.
You’ll find different variations to this question, but the objective is always the same: to assess your ability to communicate complex subject matter and make it accessible.
When companies ask questions like these, they want to distinguish you from other applicants by how good a fit you are for the company. Some things you can lean into to signal how invested you will be in their culture and corporate goals are Company Values, Company Projects/Reputation, Employee Benefits, and Passion.
Given a table called `employees`, find the largest salary of any employee by department.
Example:
Input:
`employees` table

| Column | Type |
|---|---|
| id | INTEGER |
| department | VARCHAR |
| salary | INTEGER |
Output:
| Column | Type |
|---|---|
| department | VARCHAR |
| largest_salary | INTEGER |
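One possible solution is a simple `GROUP BY` with an aggregate. The sketch below demonstrates it with Python’s built-in `sqlite3` module; the sample departments and salaries are invented for illustration.

```python
import sqlite3

# In-memory database with a toy employees table (rows are made up)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, department VARCHAR, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    (1, "engineering", 120000),
    (2, "engineering", 95000),
    (3, "sales", 80000),
    (4, "sales", 88000),
])

# MAX(salary) per department answers the question directly
rows = conn.execute("""
    SELECT department, MAX(salary) AS largest_salary
    FROM employees
    GROUP BY department
    ORDER BY department
""").fetchall()
print(rows)  # [('engineering', 120000), ('sales', 88000)]
```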
This question requires some memorization. At first glance, we can infer that it’s a binomial distribution problem, since we have to count the number of heads across a fixed number of coin flips. The setup therefore matches a binomial distribution with n independent trials and probability of success p on each trial.
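Assuming the question asks for the probability of exactly k heads in n fair coin flips (the exact wording isn’t reproduced above), the binomial formula can be computed directly with Python’s standard library:

```python
from math import comb

def binomial_pmf(k, n, p):
    # P(exactly k successes in n independent trials, success probability p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# e.g., the probability of exactly 3 heads in 5 fair coin flips
print(binomial_pmf(3, 5, 0.5))  # 0.3125
```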
[Given a dictionary with a key-value of `{string: number}`, return a list of values that occur only once in the dictionary.](https://www.interviewquery.com/questions/dictionary-unique-values)

You are given a dictionary with a key-value of `{string: number}`, where values in the dictionary could be duplicates. Extract the values that occur only once and return them as a list.
Note: You can return the values in any order.
Input:
dictionary = {"key1": 1, "key2": 1, "key3": 7, "key4": 3, "key5": 4, "key6": 7}
Output:
find_unique_values(dictionary) -> [3,4]
#Only 3 and 4 occurred once.
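One way to sketch a solution: count how often each value appears across all keys, then keep only the values seen exactly once.

```python
from collections import Counter

def find_unique_values(dictionary):
    # Count how many times each value appears across all keys
    counts = Counter(dictionary.values())
    # Keep only the values that occur exactly once
    return [value for value, count in counts.items() if count == 1]

dictionary = {"key1": 1, "key2": 1, "key3": 7, "key4": 3, "key5": 4, "key6": 7}
print(find_unique_values(dictionary))  # [3, 4]
```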
Given an array of integers, write a function `min_distance` to calculate the minimum absolute distance between two elements, then return all pairs having that absolute difference.
Note: Make sure to return the pairs in ascending order.
Example:
Input:
v = [3, 12, 126, 44, 52, 57, 144, 61, 68, 72, 122]
Output:
def min_distance(v) ->
[[57, 61], [68, 72], [122, 126]]
# The minimum absolute distance between any two elements is 4
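A common approach is to sort the array first, so the closest elements become adjacent; one sketch of a solution along those lines:

```python
def min_distance(v):
    # Sort so that the closest elements end up adjacent
    s = sorted(v)
    # Minimum absolute difference between any two elements
    best = min(b - a for a, b in zip(s, s[1:]))
    # All adjacent pairs at that distance, already in ascending order
    return [[a, b] for a, b in zip(s, s[1:]) if b - a == best]

v = [3, 12, 126, 44, 52, 57, 144, 61, 68, 72, 122]
print(min_distance(v))  # [[57, 61], [68, 72], [122, 126]]
```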
You’re given a dataframe of students named `students_df`:

`students_df` table
table
| name | age | favorite_color | grade |
|---|---|---|---|
| Tim Voss | 19 | red | 91 |
| Nicole Johnson | 20 | yellow | 95 |
| Elsa Williams | 21 | green | 82 |
| John James | 20 | blue | 75 |
| Catherine Jones | 23 | green | 93 |
Write a function named grades_colors
to select only the rows where the student’s favorite color is green or red and their grade is above 90.
Example:
Input:
import pandas as pd
students = {"name" : ["Tim Voss", "Nicole Johnson", "Elsa Williams", "John James", "Catherine Jones"], "age" : [19, 20, 21, 20, 23], "favorite_color" : ["red", "yellow", "green", "blue", "green"], "grade" : [91, 95, 82, 75, 93]}
students_df = pd.DataFrame(students)
Output:
def grades_colors(students_df) ->
| name | age | favorite_color | grade |
|---|---|---|---|
| Tim Voss | 19 | red | 91 |
| Catherine Jones | 23 | green | 93 |
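One possible pandas approach is a boolean mask combining `isin` for the colors with a grade comparison:

```python
import pandas as pd

def grades_colors(students_df):
    # Boolean mask: favorite color is green or red AND grade above 90
    mask = students_df["favorite_color"].isin(["green", "red"]) & (students_df["grade"] > 90)
    return students_df[mask]

students = {"name": ["Tim Voss", "Nicole Johnson", "Elsa Williams", "John James", "Catherine Jones"],
            "age": [19, 20, 21, 20, 23],
            "favorite_color": ["red", "yellow", "green", "blue", "green"],
            "grade": [91, 95, 82, 75, 93]}
students_df = pd.DataFrame(students)
print(grades_colors(students_df))
```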
Let’s say we have a table representing vacation bookings. Write a query that returns columns representing the total number of bookings in the last 90 days, last 365 days, and overall.
Note: You may assume that today is the 1st of January 2022.
Example:
Input:
`bookings` table

| Column | Type |
|---|---|
| reservation_id | INTEGER |
| guest_id | INTEGER |
| check_in_date | DATE |
| check_out_date | DATE |
Output:
| Column | Type |
|---|---|
| num_bookings_last90d | INTEGER |
| num_bookings_last365d | INTEGER |
| num_bookings_total | INTEGER |
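One way to produce all three columns in a single row is conditional aggregation. The sketch below demonstrates the idea with Python’s built-in `sqlite3`; the three sample bookings are invented for illustration, and the date arithmetic uses SQLite’s `DATE` function with the stated “today” of 2022-01-01.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE bookings
    (reservation_id INTEGER, guest_id INTEGER, check_in_date DATE, check_out_date DATE)""")
conn.executemany("INSERT INTO bookings VALUES (?, ?, ?, ?)", [
    (1, 10, "2021-12-20", "2021-12-25"),  # within the last 90 days
    (2, 11, "2021-03-01", "2021-03-05"),  # within the last 365 days
    (3, 12, "2019-07-04", "2019-07-10"),  # older than a year
])

# Each SUM counts the rows whose check-in date falls inside the window
row = conn.execute("""
    SELECT
        SUM(check_in_date >= DATE('2022-01-01', '-90 days'))  AS num_bookings_last90d,
        SUM(check_in_date >= DATE('2022-01-01', '-365 days')) AS num_bookings_last365d,
        COUNT(*)                                              AS num_bookings_total
    FROM bookings
""").fetchone()
print(row)  # (1, 2, 3)
```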
Let’s say you’re tasked with building the YouTube video recommendation algorithm.
How would you design the recommendation system?
What are important factors to keep in mind when building the recommendation algorithm?
Why is it standard practice to explicitly put foreign key constraints on related tables instead of creating a normal BIGINT field? When considering foreign key constraints, when should you consider a cascade delete or a set null?
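A foreign key constraint lets the database itself enforce referential integrity, which a plain BIGINT column cannot. This toy SQLite sketch (table names invented for illustration) shows `ON DELETE CASCADE` removing child rows when the parent is deleted:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id) ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO users VALUES (1)")
conn.execute("INSERT INTO orders VALUES (100, 1)")

# Deleting the parent user cascades to the dependent order rows
conn.execute("DELETE FROM users WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(remaining)  # 0
```

With `ON DELETE SET NULL` instead, the child rows would survive with `user_id` nulled out, which is the better choice when the children remain meaningful without their parent.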
While most of the survey data is pre-quantified, a decent chunk is pure text data in the region’s different languages. To centralize the analytics, a translation module has been utilized to ensure that analysts can derive a cohesive analysis of the region.
As a consulting engineer, you have been tasked to look at the ETL pipeline connecting PayPal’s data marts with the survey platform’s data warehouses. Within this layer also comes another layer of ETL pipelines, connecting transactional data stores with the survey platform’s data warehouse, as well as the pipeline normalizing this data through translation modules.
How would you ensure the data quality across these different ETL platforms?
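A strong answer usually mentions automated checks at each pipeline boundary: reconciling row counts between source and target, and validating completeness thresholds on key columns. A toy Python sketch of two such checks (function names and thresholds are illustrative, not part of the question):

```python
def check_row_counts(source_count, target_count, tolerance=0.0):
    # Flag a load where the target row count drifts from the source
    drift = abs(source_count - target_count) / max(source_count, 1)
    return drift <= tolerance

def check_null_rate(rows, column, max_null_rate=0.05):
    # Flag a column whose share of missing values exceeds a threshold
    nulls = sum(1 for row in rows if row.get(column) is None)
    return nulls / max(len(rows), 1) <= max_null_rate

rows = [{"survey_id": 1, "answer": "yes"}, {"survey_id": 2, "answer": None}]
print(check_row_counts(2, 2))           # True  (counts reconcile)
print(check_null_rate(rows, "answer"))  # False (50% nulls exceeds 5%)
```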
An online marketplace company has introduced a new feature that allows potential buyers and sellers to conduct audio chats with each other prior to transacting.
Let’s say we have two tables that represent this data.
Example:
Input:
`chats` table

| Column | Type |
|---|---|
| id | INTEGER |
| buyer_user_id | INTEGER |
| seller_user_id | INTEGER |
| call_length | INTEGER |
| call_connected | INTEGER |
`marketplace_purchases` table

| Column | Type |
|---|---|
| id | INTEGER |
| buyer_user_id | INTEGER |
| seller_user_id | INTEGER |
| item_id | INTEGER |
| purchase_amount | FLOAT |
from collections import Counter
import re

def findStopWords(paragraphs):
    # Extract words (letters, digits, apostrophes), ignoring punctuation and case
    words = re.findall(r"[\w']+", paragraphs.lower())
    # Count occurrences of each word
    counts = Counter(words)
    # Return the three most frequent words
    return [word for word, _ in counts.most_common(3)]
Preparing for a data engineer interview at McKinsey requires a combination of technical preparation, case study practice, and an understanding of McKinsey’s unique interview style. Here’s a structured approach to help you prepare:
You likely already have an idea of what a data engineer does at McKinsey, but learn more about the particular role you’re applying for and how it contributes to the firm’s overall operations. Review the job description to identify the key skills and experiences McKinsey is seeking, and customize your resume and approach accordingly.
Data engineers at McKinsey are expected to demonstrate proficiency in SQL, data structures and algorithms, big data technologies, and cloud platforms. As a candidate, be prepared to tackle complex queries, joins, and SQL performance optimization techniques.
Moreover, understand Hadoop, Spark, Kafka, and their use cases, and familiarize yourself with AWS, GCP, or Azure services for data engineering.
Understand ETL processes, data ingestion, transformation, and loading. Learn about dimensional modeling, star schemas, and data marts. Furthermore, practice creating data models for different use cases. Know how to ensure data accuracy, completeness, and consistency.
Despite data engineering being more on the technical side, case studies are an integral part of most data-based roles at McKinsey. Practice structuring problems, identifying key issues, and developing solutions. Learn to apply data analysis to business challenges. Familiarize yourself with McKinsey’s problem-solving approach to further solidify your skills.
Get insight into McKinsey’s interview process and practice answering questions with confidence with our AI Interviewer and P2P Mock Interview Portal.
The average base salary for a data engineer at McKinsey is around $117,000, depending on the location and job responsibilities. The average total compensation, meanwhile, can reach up to $138,000 with stocks and bonuses for more experienced data engineers.
Beyond McKinsey, numerous companies across various industries, such as Goldman Sachs, Google, and Netflix, are actively seeking skilled data engineers.
Yes, we have the latest McKinsey data engineer job postings listed on our job board. However, keep an eye out on the company’s official career pages for more updated info.
Cracking a McKinsey Data Engineer interview requires a solid technical foundation, analytical prowess, and alignment with McKinsey’s culture. Focus on SQL, Python, data structures, and big data technologies. Practice problem-solving, case studies, and behavioral questions. If interested, you may explore other roles, such as business analyst, software engineer, and data scientist, in our main McKinsey Interview Guide to find your best fit. All the best!