Interview Query
Data Science Interview

Introduction

All areas of study in math can roughly be divided into two camps: discrete mathematics and continuous mathematics. Perhaps the best way to describe the difference between the two is to talk about what each of the branches means by “number.”

Discrete mathematics studies structures that are analogous to the integers, that is, the set of all whole numbers $\{\dots,-2,-1,0,1,2,\dots\}$. The integers are most commonly denoted by the symbol $\mathbb{Z}$ (the notation comes from the German word “Zahlen”, meaning “numbers”). Continuous mathematics, on the other hand, studies structures that are analogous to the real numbers, denoted $\mathbb{R}$.

You might notice that we did not give a list of real numbers as we did for the integers. This was not a mistake. In fact, the impossibility of listing out the numbers in $\mathbb{R}$ is one of the defining features of $\mathbb{R}$! Indeed, if we tried to begin listing out the real numbers in order, for example:

$$\mathbb{R}\approx_?\left\{\dots,-9000,-\sqrt{2},\,0,\frac{1}{2},\pi,\dots\right\}$$

we can always find a number that is not included in our “list” of real numbers. For example, the real number $\pi-1$ is not included in our “list” above.

In data science, numerical data can always be categorized in a similar way as either discrete or continuous. Rainfall, for example, is a continuous variable, since there is no well-defined next “step” after $1$ inch of rainfall. If we said the next “step” was $2$ inches, we would preclude the possibility of $1.1$ inches. Likewise, if we said the next step was $1.1$ inches, we would forget the possibility of $1.01$ inches, and so on.

In contrast, the number of car crashes on a given day is a discrete value, since it doesn’t make sense to talk about $1.5$ car crashes. The number of car crashes must be a non-negative integer.
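The contrast between the two examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `midpoint` and the variable names are hypothetical, and exact fractions stand in for real-valued measurements so that floating-point rounding does not cloud the point.

```python
from fractions import Fraction

def midpoint(a, b):
    """Return the value halfway between a and b as an exact fraction."""
    return (Fraction(a) + Fraction(b)) / 2

# Continuous (rainfall in inches): any candidate "next step" after 1 inch
# can be undercut by a value that lies strictly between the two.
low, candidate_next = Fraction(1), Fraction(2)
for _ in range(5):
    candidate_next = midpoint(low, candidate_next)
# candidate_next is now 1 + 1/32, still strictly greater than 1 inch.

# Discrete (car crashes per day): the midpoint of two neighboring counts
# is not a whole number, so it is not a valid count.
crash_midpoint = midpoint(1, 2)
is_valid_count = crash_midpoint.denominator == 1
```

However many times the loop runs, the candidate stays strictly above 1 inch, which is why no “next” rainfall value exists. The crash count, by comparison, has nothing valid between 1 and 2.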

For this reason, traditional probability theory treats discrete and continuous variables in different, but somewhat analogous, ways. Technically, there is a more modern formulation of probability theory, built on an area of advanced mathematics called measure theory, that avoids the need to make this distinction. However, for most applications, the formality and rigor gained from measure-theoretic probability theory are not worth the cost in simplicity.

In this section, we will explore the foundations of how we model discrete data.
