Founded in 2015 by Nik Storonsky and Vlad Yatsenko, Revolut stands among the leading fintech providers, processing more than 400 million transactions a month. Its services include cross-border transfers, stock trading, crypto trading, personal loans, and commodity trading.
As a data scientist looking to join Revolut, your responsibilities will likely include developing fraud detection algorithms, analyzing customer spending patterns, optimizing user experience, and enhancing financial services in collaboration with cross-functional teams.
If you’re curious about how to respond to Revolut data scientist interview questions, this article is for you.
Your experience throughout the interview process may vary depending on the data scientist role (risk, credit, computer vision, etc.) you’re applying for. However, as far as we can tell, Revolut follows a predefined pattern in interviews with data scientist candidates.
If a Revolut recruiter hasn’t encouraged you to apply, you can apply directly for your preferred role through the Revolut Careers portal. The application process for data scientist candidates involves a questionnaire covering your contact details and relocation preferences, along with your CV submission. Questions about your experience developing algorithms, your participation in similar projects, and your programming skills may also follow.
Submit the application and wait for a recruiter to screen your CV. If you pass this phase, you’ll be invited to attend an initial screening call.
Revolut prioritizes soft skills and job-related interview questions. During this stage of the data science interview, which usually lasts 15-20 minutes, interviewers assess your soft skills and knowledge of data science. They may ask behavioral and basic algorithm questions and discuss problems with machine learning models.
Revolut interviewers typically refrain from asking generic coding questions unrelated to your potential position. Your hiring manager may also request a video meeting during this round.
After passing the initial screening, you’ll be assigned a take-home task that helps the technical managers assess your understanding of machine learning models, product metrics, and programming skills. The task is usually a simulated work assignment of the kind you would handle as a data scientist. Depending on the task, you may or may not be provided with datasets.
If you’ve been successful in the previous rounds, you’ll be asked to appear for an on-site interview at your nearest Revolut office. During this stage, you’ll meet your potential immediate seniors, colleagues, and the hiring manager. They may conduct multiple one-on-one interviews and organize a group exercise with other data science candidates to judge your collaborative and leadership skills.
This round often includes discussions of cross-validation techniques, neural network architectures, regularization and data visualization techniques, and big data technologies.
If Revolut hires you as a data scientist, you’ll be notified via email or phone after the on-site meeting. They may also host a partner interview or integration round to help you settle into the office environment.
Revolut data science interviewers typically discuss machine learning, programming skills, and statistics. They will also ask behavioral questions to draw out your understanding of Revolut’s vision, goals, and culture.
Our experienced members have found the following questions representative of a Revolut data science interview.
A Revolut interviewer may ask this question to understand how you handle challenges and achieve success in projects.
How to Answer
Describe a project where you set ambitious goals, took initiative, and used your skills effectively to exceed expectations. Highlight your approach to problem-solving, collaboration with team members, and any innovative strategies you employed.
Example
“In a recent project, I was tasked with optimizing our fraud detection system. I took the initiative to analyze historical transaction data, identifying patterns and anomalies using advanced machine learning algorithms. By collaborating closely with the engineering team, we implemented a real-time monitoring system that significantly reduced false positives by 30%, exceeding the initial target. My proactive approach and data-driven insights were key to achieving this success.”
The interviewer at Revolut may ask this to gauge your understanding of your capabilities and areas for improvement as a data scientist.
How to Answer
Identify three strengths relevant to the position, highlighting skills like data analysis, problem-solving, and communication. For weaknesses, mention areas you’re actively working to improve, such as learning new programming languages or enhancing statistical modeling techniques.
Example
“In terms of strengths, I excel in data analysis, particularly in extracting actionable insights from complex datasets. My problem-solving skills allow me to tackle challenges efficiently, and my strong communication skills enable me to convey technical findings to non-technical stakeholders effectively. As for weaknesses, I’m currently focusing on improving my proficiency in advanced statistical modeling techniques to enhance predictive analytics.”
This question assesses your experience in handling complex data and your proficiency with analytical tools and techniques.
How to Answer
Describe a specific project where you dealt with a large and complex dataset. Discuss your approach to data cleaning, preprocessing, and analysis, as well as the tools and techniques you used, such as Python libraries like Pandas, NumPy, and scikit-learn, or SQL queries for data manipulation.
Example
“In a previous role, I worked on a project that involved analyzing customer transaction data from multiple sources to identify spending patterns and detect potential fraud. I first performed data cleaning and preprocessing to handle missing values and outliers using Python libraries like Pandas and NumPy. Then, I applied advanced statistical techniques and machine learning algorithms, such as clustering and anomaly detection, to uncover actionable insights from the dataset. Tools like Jupyter Notebooks and SQL were instrumental in managing and analyzing the large volume of data efficiently.”
Your resilience and problem-solving skills when facing obstacles in a data analysis project will be assessed through this question.
How to Answer
Describe a challenge or setback you encountered in a data analysis project, how you addressed it, and the lessons learned. Emphasize your ability to adapt, collaborate with team members, and implement alternative solutions to overcome obstacles.
Example
“In a recent data analysis project, we faced challenges with data quality issues, resulting in inconsistencies and inaccuracies in the dataset. To address this setback, I worked closely with the data engineering team to identify the root causes and implement robust data validation processes. Additionally, I recalibrated our analytical models and performed sensitivity analyses to mitigate the impact of unreliable data. This experience taught me the importance of proactive data quality management and effective collaboration across teams to ensure accurate and reliable insights.”
Revolut may ask this to check your organizational skills and time management abilities as a data scientist who may be required to work on multiple projects simultaneously.
How to Answer
Describe a situation where you had to juggle multiple tasks or projects simultaneously. Discuss your approach to prioritization, including the factors you considered and the strategies you employed to meet deadlines. Emphasize your ability to delegate tasks, set realistic timelines, and adapt to changing priorities.
Example
“My previous role required frequently dealing with competing priorities and tight deadlines. When faced with multiple tasks, I first evaluated the urgency and importance of each task, considering factors like project deadlines and impact on business objectives. I then organized tasks into a prioritized list, focusing on high-impact projects while ensuring essential tasks were addressed promptly. To manage my workload effectively, I used project management tools like Trello to track progress and allocate time efficiently. By staying organized and adaptable, I consistently met deadlines and delivered quality results across multiple projects.”
Given a DataFrame df with a single column 'var', write a function to calculate the t-value for the mean of 'var' against a null hypothesis that μ = μ0.
Note: You do not have to calculate the p-value of the test or run the test.
Example:
Input:
mu0 = 1
print(df)
...
   var
0  -34
1   40
2  -89
3    5
4  -26
Output:
def t_score(mu0, df) ->
var   -1.015614
dtype: float64
Your data science interviewer at Revolut will use this question to assess your ability to calculate the t-value for a mean against a null hypothesis using a Pandas DataFrame.
How to Answer
To calculate the t-value for the mean of a single column against a null hypothesis, you can use the formula: t = (mean - μ0) / (std / sqrt(n)), where mean is the sample mean, μ0 is the hypothesized population mean, std is the standard deviation of the sample, and n is the sample size.
Example
import pandas as pd
import numpy as np

def t_score(mu0, df):
    # One-sample t-statistic: t = (mean - mu0) / (std / sqrt(n))
    mean = df['var'].mean()
    std = df['var'].std()  # sample standard deviation (ddof=1)
    n = len(df)
    t = (mean - mu0) / (std / np.sqrt(n))
    return t

mu0 = 1
df = pd.DataFrame({'var': [-34, 40, -89, 5, -26]})
print(t_score(mu0, df))  # -1.0156...
Revolut may ask this to gauge your ability to implement solutions for compliance and regulatory purposes.
How to Answer
Describe a systematic approach to build a model that detects firearm sales listings in a marketplace. This may involve using natural language processing (NLP), image recognition, and machine learning algorithms to classify listings as firearm sales or non-firearm sales.
Example
“To detect firearm sales listings, I would first preprocess listing descriptions using NLP techniques to extract relevant keywords related to firearms. Additionally, I would use image recognition algorithms to scan listing images for recognizable firearm objects. Finally, I would train a machine learning model on labeled data to classify listings as either firearm sales or non-firearm sales based on textual and visual features.”
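To make the textual half of this approach concrete, here is a minimal sketch of a listing-description classifier using TF-IDF features and logistic regression. The training data and labels below are hypothetical placeholders, and a production system would pair this with the image-recognition model described above.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: listing descriptions with firearm/non-firearm labels.
descriptions = [
    "9mm handgun, lightly used, holster included",
    "vintage oak coffee table, great condition",
    "hunting rifle with scope, boxed",
    "mountain bike, 21-speed, recently serviced",
]
labels = [1, 0, 1, 0]  # 1 = firearm listing, 0 = non-firearm

# TF-IDF turns raw text into keyword-weighted features; logistic regression
# then learns which terms indicate a firearm listing.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(descriptions, labels)

print(model.predict(["used handgun with holster"]))  # shares firearm terms; likely [1]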
You ask the data department in the company for a subset of data to get started working on the problem. The data includes different features about applicants such as age, occupation, zip code, height, number of children, favorite color, etc. You decide to build multiple machine learning models to test out different ideas before settling on the best one. How would you explain the bias-variance tradeoff with regard to building and choosing a model to use?
This question examines your understanding of the bias-variance tradeoff in the context of machine learning model selection.
How to Answer
Explain the bias-variance tradeoff in the context of machine learning model selection. Discuss how models with high bias may oversimplify the data, leading to underfitting, while models with high variance may capture noise in the data, leading to overfitting. Emphasize the need to find the right balance between bias and variance to optimize model performance.
Example
“The bias-variance tradeoff refers to the delicate balance between the simplicity and flexibility of a machine learning model. Models with high bias, such as linear regression, may oversimplify the underlying relationships in the data, resulting in underfitting and poor performance on both training and test datasets. On the other hand, models with high variance, such as decision trees with no constraints, may capture noise in the training data, leading to overfitting and poor generalization to unseen data. To find the optimal model, it’s essential to strike a balance between bias and variance by selecting a model complexity that minimizes both training and test errors.”
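A short simulation can make the tradeoff tangible. This is a minimal sketch on synthetic data (the setup is illustrative, not from the interview itself): a degree-1 polynomial underfits, a moderate degree fits well, and a very high degree overfits, which shows up as low training error but higher test error.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.3, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):  # low degree: high bias; high degree: high variance
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")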
Your interviewer at Revolut may ask this to assess your proficiency in data structures and algorithms. This question evaluates your ability to manipulate linked lists in programming.
How to Answer
Implement a function that traverses the singly linked list until it reaches the last node and returns it. If the list is empty, return None.
Example
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def last_node(head):
    # An empty list has no last node.
    if not head:
        return None
    # Walk forward until there is no next node.
    while head.next:
        head = head.next
    return head

# Example usage:
# head = ListNode(1)
# head.next = ListNode(2)
# head.next.next = ListNode(3)
# print(last_node(head).val)  # Output: 3
Write a function bootstrap_conf_interval that takes an array of values, a number of bootstrap samples, and a confidence level, and returns a confidence interval for the mean via bootstrapping.
Note: The function should return a tuple containing the minimum and maximum values of the confidence interval rounded to the tenths place.
Example:
Input:
values = [1, 2, 3, 4, 5]
Output:
bootstrap_conf_interval(values, 1000, 0.95) -> (1.2, 4.8)
In this case, the function returns a tuple indicating that, based on our bootstrap samples, we are 95% confident that the population parameter lies between 1.2 and 4.8.
Note: Results may vary due to the randomness of bootstrap sampling.
Your ability to implement bootstrapping and calculate confidence intervals will be assessed through this question. You may be asked this to evaluate your statistical reasoning and coding skills.
How to Answer
Implement a function to perform bootstrap sampling on the given array and calculate the confidence interval based on the given size. The confidence interval can be calculated by taking percentiles of the bootstrap sample distribution.
Example
import numpy as np

def bootstrap_conf_interval(values, num_samples, confidence_level):
    # Draw num_samples bootstrap resamples (with replacement) and take their means.
    bootstraps = np.random.choice(values, size=(num_samples, len(values)), replace=True)
    sample_means = np.mean(bootstraps, axis=1)
    # The interval endpoints are percentiles of the bootstrap mean distribution.
    lower_percentile = (1 - confidence_level) / 2
    upper_percentile = 1 - lower_percentile
    lower_bound = np.percentile(sample_means, lower_percentile * 100)
    upper_bound = np.percentile(sample_means, upper_percentile * 100)
    return round(lower_bound, 1), round(upper_bound, 1)

# Example usage:
# values = [1, 2, 3, 4, 5]
# print(bootstrap_conf_interval(values, 1000, 0.95))  # e.g. (1.2, 4.8); varies with sampling
Because we are a financial company, we must provide each rejected applicant with a reason. Given that we don’t have access to the feature weights, how would we give each rejected applicant a reason?
This question examines your problem-solving skills in providing reasons for rejection in a binary classification model without access to feature weights.
How to Answer
Discuss a systematic approach to provide reasons for rejection to unqualified applicants without access to feature weights. The solution may involve analyzing misclassified instances, identifying common patterns or features among rejected applicants, and developing rules or decision trees based on these patterns.
Example
“To provide reasons for rejection without access to feature weights, I would first analyze misclassified instances to identify common patterns among rejected applicants. For example, if a significant portion of rejected applicants have low credit scores and high debt-to-income ratios, these factors could be potential reasons for rejection. I would then develop rules or decision trees based on these patterns to explain to applicants why their application was rejected.”
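One way to prototype the decision-tree idea from the example is a surrogate model: fit a shallow, readable tree to the opaque model’s accept/reject decisions and read rejection reasons off its splits. The sketch below is a hedged illustration; the features, thresholds, and the stand-in for the black-box model are all hypothetical.

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
# Hypothetical applicant features: [credit_score, debt_to_income]
X = np.column_stack([rng.integers(450, 850, 500), rng.uniform(0.05, 0.9, 500)])

# Stand-in for the black-box model's decisions (1 = reject): we can observe
# its outputs even though we cannot see its feature weights.
black_box_reject = ((X[:, 0] < 600) | (X[:, 1] > 0.5)).astype(int)

# Fit a shallow surrogate tree to mimic those decisions, then print its rules.
surrogate = DecisionTreeClassifier(max_depth=2).fit(X, black_box_reject)
print(export_text(surrogate, feature_names=["credit_score", "debt_to_income"]))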
Write a function knn that returns the nearest data point from a list of data points to a given query point. Use Euclidean distance as the similarity measure. For the purpose of this task, consider the scenario where k=1, meaning you only need to find the single closest data point.
Note: Using external libraries such as NumPy and scikit-learn is not allowed.
Example:
Input:
data_points = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
query_point = [1, 9, 7]
Output:
def knn(data_points: List[List[float]], query_point: List[float]) -> [4, 5, 6]
The interviewer at Revolut may ask this to evaluate your proficiency in algorithmic coding and computational thinking as a data scientist.
How to Answer
Implement a function that calculates the Euclidean distance between the querying point and each data point in the list of data points. Then, return the data point that is nearest to the querying point based on the calculated distances.
Example
from typing import List

def knn(data_points: List[List[float]], query_point: List[float]) -> List[float]:
    min_distance = float('inf')
    nearest_point = None
    for point in data_points:
        # Euclidean distance between the candidate point and the query point.
        distance = sum((x - y) ** 2 for x, y in zip(point, query_point)) ** 0.5
        if distance < min_distance:
            min_distance = distance
            nearest_point = point
    return nearest_point

# Example usage:
data_points = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
query_point = [1, 9, 7]
print(knn(data_points, query_point))  # Output: [4, 5, 6]
Example:
Input:
employees table

Column     | Type
-----------|---------
id         | INTEGER
name       | VARCHAR
manager_id | INTEGER

managers table

Column | Type
-------|---------
id     | INTEGER
name   | VARCHAR
team   | VARCHAR

Output:

Column    | Type
----------|---------
manager   | VARCHAR
team_size | INTEGER
As a data scientist candidate, you may be asked this question to assess your ability to extract relevant information from a database and solve complex SQL queries.
How to Answer
Write an SQL query that joins the employees and managers tables, groups by manager, and calculates the size of each manager’s team. Then select the manager with the largest team.
Example
SELECT managers.name AS manager,
COUNT(employees.id) AS team_size
FROM employees
JOIN managers ON employees.manager_id = managers.id
GROUP BY managers.name
ORDER BY team_size DESC
LIMIT 1;
This question assesses your understanding and ability to provide examples of the fundamental differences between supervised and unsupervised learning.
How to Answer
Explain what supervised learning and unsupervised learning are, and give an example of each.
Example
“Supervised learning involves training a model on labeled data, where the model learns to make predictions based on input-output pairs. An example of supervised learning is training a spam email classifier using labeled emails (spam or not spam).
In contrast, unsupervised learning involves training a model on unlabeled data, where the model learns to find patterns or structures in the data without explicit guidance. An example of unsupervised learning is clustering customer data to identify distinct customer segments based on their purchasing behavior.”
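Here is a minimal sketch contrasting the two paradigms on scikit-learn’s built-in iris data: the classifier learns from labels, while k-means sees only the features.

from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Supervised: the labels y guide the model toward correct predictions.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("supervised accuracy:", clf.score(X, y))

# Unsupervised: KMeans sees only X and discovers structure on its own.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", [int((clusters == k).sum()) for k in range(3)])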
Revolut may ask this to assess your ability to address challenges commonly encountered in real-world data analysis scenarios.
How to Answer
Explain the techniques involved in handling imbalanced classes. Emphasize the importance of understanding the problem context and selecting the most suitable approach based on the specific dataset and business requirements.
Example
“To handle imbalanced classes, I would first explore resampling techniques such as oversampling the minority class using methods like Synthetic Minority Over-sampling Technique (SMOTE) or undersampling the majority class. Additionally, I would consider using evaluation metrics like precision-recall instead of accuracy to assess model performance more effectively in imbalanced datasets. Lastly, I would experiment with algorithms like Random Forest or Gradient Boosting Machines, which can handle class imbalance by adjusting class weights or incorporating sampling strategies, to improve model performance on imbalanced datasets.”
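Two of these ideas, class weighting and precision/recall-based evaluation, can be sketched with scikit-learn alone on synthetic data (SMOTE lives in the separate imbalanced-learn package and is omitted here):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# 95% negative / 5% positive, mimicking rare-event labels such as fraud.
X, y = make_classification(n_samples=5000, weights=[0.95], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# class_weight='balanced' upweights the rare class during training.
clf = RandomForestClassifier(class_weight="balanced", random_state=0)
clf.fit(X_train, y_train)

# Precision and recall on the minority class matter far more than raw accuracy.
print(classification_report(y_test, clf.predict(X_test)))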
Revolut may ask this to evaluate your knowledge of techniques for preventing overfitting and improving model generalization, which are necessary skills for a data scientist.
How to Answer
Explain what regularization is and how it’s used to prevent overfitting and improve the generalization of machine learning models by adding a penalty term to the loss function.
Example
“Regularization in machine learning is a technique used to prevent overfitting by adding a penalty term to the loss function. This penalty term discourages overly complex models with large coefficients, leading to improved generalization performance. Common regularization techniques include L1 regularization (Lasso), which adds the absolute values of coefficients to the loss function; L2 regularization (Ridge), which adds the squared values of coefficients to the loss function; and elastic net regularization, which combines both L1 and L2 penalties.”
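A minimal sketch on synthetic data shows the practical difference between the two penalties: Ridge (L2) shrinks every coefficient, while Lasso (L1) tends to drive uninformative ones exactly to zero.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# 10 features, only 3 of which actually drive the target.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)  # L2: shrinks all coefficients
lasso = Lasso(alpha=1.0).fit(X, y)  # L1: zeroes out uninformative ones

print("ridge nonzero coefficients:", int(np.sum(ridge.coef_ != 0)))  # typically 10
print("lasso nonzero coefficients:", int(np.sum(lasso.coef_ != 0)))  # typically ~3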
This question assesses your knowledge of decision trees and random forests, including their differences and when to choose one over the other.
How to Answer
Describe the differences between decision trees and random forests. Explain how decision trees are used when interpretability is important or the dataset is small, and describe how random forests are used when robustness and performance are priorities.
Example
“Decision trees are simple, interpretable models that recursively split the data based on feature thresholds to make predictions. However, they are prone to overfitting, especially with complex datasets. On the other hand, random forests are ensembles of decision trees where each tree is trained on a random subset of the data and features. Random forests reduce overfitting by averaging predictions from multiple trees, leading to better generalization performance. I would choose decision trees when interpretability is crucial or when working with a small dataset. In contrast, I would choose random forests when robustness and performance are priorities, especially with large and complex datasets.”
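The gap is easy to demonstrate on synthetic data: a single unconstrained tree memorizes the training set and generalizes worse, while the forest’s averaged trees usually score higher on held-out data. This is a minimal sketch, not a benchmark.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Both may fit the training data perfectly; compare the test scores instead.
print("tree   train/test:", tree.score(X_train, y_train), tree.score(X_test, y_test))
print("forest train/test:", forest.score(X_train, y_train), forest.score(X_test, y_test))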
Your ability to identify key metrics for evaluating user engagement with a mobile banking app like Revolut will be evaluated through this question.
How to Answer
Identify and discuss key metrics that may help evaluate user engagement with Revolut. Include metrics such as active users, user retention rate, average session duration, and frequency of app usage.
Example
“To evaluate user engagement with a mobile banking app like Revolut, I would track key metrics such as active users, user retention rate, average session duration, frequency of app usage (daily, weekly, monthly), number of transactions per user, user satisfaction ratings through surveys or app store reviews, and conversion rates for specific features like account opening or card activation. These metrics provide insights into how users interact with the app and indicate the overall engagement level.”
The interviewer at Revolut may ask this to evaluate your proficiency in user behavior analytics as a data scientist and your ability to derive insights from data.
How to Answer
Explain what cohort analysis is and how it can be based on characteristics like signup date, acquisition channel, or demographic attributes. Mention that it helps identify trends, patterns, and differences in user behavior.
Example
“Cohort analysis is a powerful method for understanding user behavior by grouping users based on common characteristics or actions and analyzing their behavior over time. For example, we can create cohorts based on the signup date, acquisition channel, or demographic attributes of users. By tracking metrics like retention rate, engagement, and conversion rate for each cohort over time, we can identify trends, patterns, and differences in user behavior. Cohort analysis helps us understand how user behavior evolves and provides valuable insights for product improvement and targeted marketing strategies.”
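In pandas, a signup-month cohort table is a few lines of groupby work. This is a hedged sketch: the event-log schema (user_id, signup_date, activity_date) is a hypothetical stand-in for real product data.

import pandas as pd

events = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 3, 3],
    "signup_date": pd.to_datetime(["2024-01-05", "2024-01-05", "2024-01-20",
                                   "2024-01-20", "2024-02-03", "2024-02-03",
                                   "2024-02-03"]),
    "activity_date": pd.to_datetime(["2024-01-06", "2024-02-10", "2024-01-21",
                                     "2024-01-25", "2024-02-04", "2024-03-01",
                                     "2024-04-02"]),
})

# Cohort = signup month; age = whole months since signup at each event.
events["cohort"] = events["signup_date"].dt.to_period("M")
events["age"] = (events["activity_date"].dt.to_period("M")
                 - events["cohort"]).apply(lambda d: d.n)

# Distinct active users per cohort per month of life: the classic cohort table.
cohort_table = (events.groupby(["cohort", "age"])["user_id"]
                .nunique().unstack(fill_value=0))
print(cohort_table)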
This question examines your knowledge of the central limit theorem and how it works in statistics.
How to Answer
Explain the central limit theorem and how it enables us to make inferences about population parameters based on sample statistics, even when the population distribution is unknown or non-normal.
Example
“The central limit theorem is a fundamental concept in statistics that states that the distribution of sample means from a population approaches a normal distribution as the sample size increases, regardless of the population distribution. This theorem is important because it enables us to make inferences about population parameters, such as the mean or variance, based on sample statistics, even when the population distribution is unknown or non-normal. For example, when estimating the population mean from a sample, we can use the normal distribution to calculate confidence intervals or conduct hypothesis tests, assuming the sample size is sufficiently large.”
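A quick simulation demonstrates the theorem: the exponential distribution is strongly skewed, yet the skewness of its sample means shrinks toward zero (normality) as the sample size grows. The parameters below are purely illustrative.

import numpy as np

rng = np.random.default_rng(0)

for n in (2, 30, 500):
    # 10,000 sample means, each computed from n draws of a skewed population.
    means = rng.exponential(scale=2.0, size=(10_000, n)).mean(axis=1)
    skew = np.mean(((means - means.mean()) / means.std()) ** 3)
    print(f"n={n:4d}  mean of means={means.mean():.3f}  skewness={skew:.3f}")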
This question is likely asked in a Revolut Data Scientist interview to assess your ability to work with time series data and financial metrics, which are crucial for a fintech company like Revolut.
How to Answer
When answering, explain that the key is to first aggregate the transactions by day to isolate daily deposits. Then build a rolling three-day window, either with a window function (AVG() OVER a ROWS BETWEEN frame) where the dialect supports one, or with a self-join otherwise. Either way, you compute the rolling average by aggregating the relevant rows within each window.
Example
“To tackle this problem, I would first aggregate the transactions by day to focus solely on deposits. After that, I would use a self-join to simulate a rolling three-day window, which works even in SQL dialects without window-function support. This approach allows me to calculate the rolling average over the last three days for each date, providing a clear view of the trend in deposit activity.”
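A hedged SQL sketch of that self-join approach follows; the table and column names (transactions, transaction_date, amount, transaction_type) are assumptions, and the date arithmetic uses PostgreSQL syntax:

WITH daily_deposits AS (
    SELECT transaction_date AS dt,
           SUM(amount) AS total_deposits
    FROM transactions
    WHERE transaction_type = 'deposit'  -- assumed deposit marker
    GROUP BY transaction_date
)
SELECT a.dt,
       AVG(b.total_deposits) AS rolling_3day_avg
FROM daily_deposits a
JOIN daily_deposits b
  ON b.dt BETWEEN a.dt - INTERVAL '2 days' AND a.dt
GROUP BY a.dt
ORDER BY a.dt;
-- Where window functions are available, AVG(total_deposits) OVER
-- (ORDER BY dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) is simpler.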
This question might be asked in a Revolut Data Scientist interview to assess your problem-solving skills and ability to write efficient code. Anagram detection is a common string manipulation problem that tests your understanding of algorithms and data structures, particularly around sorting and hash maps.
How to Answer
If the two strings are not the same length, or if they are the identical word, they are not a valid anagram. Otherwise, convert the two strings into lists of characters and sort them: two anagrams become equal when sorted, because an anagram is a rearrangement of the same letters.
Example
“I would start by checking if the two strings are the same length and if they are identical, as these conditions would immediately rule out them being anagrams. Then, I could convert both strings into lists of characters and sort them. If the sorted lists are identical, it would mean the strings are anagrams since sorting arranges the characters in the same order.”
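A minimal Python sketch of that sorting approach:

def is_anagram(s1: str, s2: str) -> bool:
    # Different lengths, or the identical word, cannot be a valid anagram.
    if len(s1) != len(s2) or s1 == s2:
        return False
    # Sorting normalizes letter order, so true anagrams compare equal.
    return sorted(s1) == sorted(s2)

print(is_anagram("listen", "silent"))  # True
print(is_anagram("listen", "listen"))  # False (same word)
print(is_anagram("cat", "dog"))        # False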
Technical, behavioral, and analytical skills are critical in proving yourself as an efficient data scientist to Revolut. Here is a rough guideline on how to prepare for the role:
Understand and learn to apply the core concepts of data science, such as algorithms, statistical modeling, data manipulation, and data visualization. Also, dive deeper into popular Python libraries and frameworks, such as NumPy and Pandas. An extensive understanding of statistics and A/B testing could also help you succeed in the data science interview at Revolut.
Acquire knowledge of big data technologies such as Apache Flink, Spark, and Hadoop to solidify your claim to the data scientist role at Revolut. Also, consider learning about financial and product metrics that are often used in real-world data science projects involving marketing and risk management. Additionally, be sure you know something about distributed computing frameworks and batch-processing modes.
Modeling and machine learning have become integral parts of the data science domain and are used for fraud detection, risk assessment, and personalization. Revolut especially focuses on deep learning, natural language processing (NLP), machine learning system design, and reinforcement learning. Make sure you have put in hours of learning and practice in these topics before interviewing for the data science role at Revolut.
It’s not enough to know concepts and answers. You need to convey your thought process to the Revolut interviewers. For that, practice a lot of data science behavioral questions and participate in our P2P mock interviews to refine your collaboration and communication skills. Moreover, religiously prepare the data science case study questions to avoid being caught off-guard during the interview rounds.
During the technical rounds, you’ll be asked to solve a Take-Home Challenge and multiple Python and SQL interview questions. Be well prepared for the challenges to avoid fumbling in front of the hiring manager.
For more details, follow our extensive data science interview guide.
The salary of data scientists at Revolut can vary based on factors such as experience, location, and specific job responsibilities. Depending on your level of experience, you may expect an average base salary of $123,000 and a total compensation of $179,000 as a data scientist at Revolut. However, as per our data scientist salary guide, senior employees command a more robust package.
You’re welcome to explore our Slack community to read about other people’s interview experiences for the Revolut Data Scientist role. And after your interview, feel free to share your experience. We have real-time discussions about job interviews and share informative tips to help our candidates improve.
Yes. We have up-to-date info on job postings for the Revolut Data Scientist role. Check our Job Board to gain insight into the available positions and leave your application with them.
As a data scientist candidate at Revolut, you must have a deep understanding of machine learning models, Python, algorithms, and product metrics. We hope this guide has helped you navigate the complexities of the Revolut interview process and its technical and project interview questions. If you have more queries, follow our main Revolut Interview guide and explore other positions such as data analyst and software engineer.