
Data Science Interview
Introduction
All areas of study in math can roughly be divided into two camps: discrete mathematics and continuous mathematics. Perhaps the best way to describe the difference between the two is to talk about what each of the branches means by “number.”
Discrete mathematics studies structures that are analogous to the integers, that is, the set of all whole numbers $\{\ldots, -2, -1, 0, 1, 2, \ldots\}$. The symbol $\mathbb{Z}$ is the most common symbol used to represent the integers (the notation comes from the German word “Zahlen”, meaning “numbers”). Continuous mathematics, on the other hand, studies structures that are analogous to the real numbers, denoted as $\mathbb{R}$.
You might notice we did not give a list of real numbers like we did for the integers. This was not a mistake. In fact, the impossibility of listing out the numbers in $\mathbb{R}$ is one of the defining features of $\mathbb{R}$! Indeed, if we tried to list the real numbers in order, we could always find a number that is not included in our “list”: between any two consecutive entries there is another real number, such as their midpoint, that the list skips.
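The "always another number in between" argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the starting list is made up, the helper name `missing_between` is hypothetical, and exact fractions stand in for real numbers, which a computer cannot represent in full.

```python
from fractions import Fraction


def missing_between(listed):
    """For each pair of neighbors in the sorted list, return their midpoint,
    a number strictly between them that the list leaves out."""
    ordered = sorted(listed)
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]


# Hypothetical attempt at "listing" numbers in order (values chosen for illustration).
attempt = [Fraction(0), Fraction(1, 10), Fraction(2, 10)]

gaps = missing_between(attempt)
print(gaps)  # [Fraction(1, 20), Fraction(3, 20)]

# Adding the midpoints does not help: the longer list has new gaps of its own.
print(missing_between(attempt + gaps))
```

No matter how many times the midpoints are added back in, the next pass finds new ones, so there is never a well-defined "next" number. The integers behave differently: after 1 comes 2, with nothing in between.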
In data science, numerical data can always be categorized in a similar way as either discrete or continuous. Rainfall, for example, is a continuous variable, since there is no well-defined next “step” after a given amount of rainfall. Whatever amount we propose as the next “step”, we preclude the possibility of an amount strictly between the two. Likewise, if we move the next step closer, we forget the possibility of an amount that is closer still, and so on.
In contrast, the number of car crashes on a given day is a discrete value, since it doesn’t make sense to talk about a fractional number of car crashes. The number of car crashes must be a non-negative integer.
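As a rough illustration of the distinction, the sketch below simulates one week of each kind of data and applies a simple whole-number check. Everything here is an assumption made for the example: the data is simulated rather than real, the crash probability and rainfall range are arbitrary, and `looks_discrete` is a hypothetical helper, not a standard function.

```python
import random

random.seed(0)  # fixed seed so the simulated values are reproducible

# Simulated, hypothetical week of records (not real data).
# Each day's crash count is the number of "crash" outcomes among 100 trips.
crashes = [sum(random.random() < 0.02 for _ in range(100)) for _ in range(7)]
# Each day's rainfall is a measurement in inches drawn from a continuous range.
rainfall = [random.uniform(0.0, 2.0) for _ in range(7)]


def looks_discrete(values):
    """Heuristic: True when every value is a whole number."""
    return all(float(v).is_integer() for v in values)


print(crashes, looks_discrete(crashes))
print(rainfall, looks_discrete(rainfall))
```

The check is only a heuristic. Rainfall rounded to the nearest inch would pass it while still measuring a continuous quantity, so the decision ultimately rests on what the variable represents, not on how it happens to be stored.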
For this reason, traditional probability theory treats discrete and continuous variables in different, but somewhat analogous, ways. Technically, there is a more modern formulation of probability theory, which relies on an area of advanced mathematics called measure theory, that avoids the need to make this distinction. However, the formality and rigor gained from using measure-theoretic probability theory is not worth the cost in simplicity for most applications.
In this section, we will explore the foundations of how we model discrete data.