Population vs Sample


The first step of every statistical analysis you will perform is to determine whether the data you are dealing with is a population or a sample.
A population is the collection of all items of interest to our study, and its size is usually denoted with an uppercase N. The numbers we obtain when using a population are called parameters. A sample is a subset of the population, and its size is denoted with a lowercase n; the numbers we obtain when working with a sample are called statistics. Now you know why the field we are studying is called statistics 😊
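
To make the parameter/statistic distinction concrete, here is a minimal Python sketch. The population of 10,000 scores is entirely made up for illustration, and the names `population`, `mu` and `x_bar` are assumptions rather than anything from the survey example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: N = 10,000 made-up scores (illustration only).
population = [random.gauss(70, 10) for _ in range(10_000)]
N = len(population)
mu = statistics.fmean(population)   # a parameter: computed from the whole population

# A sample of n = 50 items drawn from that population.
sample = random.sample(population, 50)
n = len(sample)
x_bar = statistics.fmean(sample)    # a statistic: computed from the sample only

print(f"N = {N}, population mean (parameter) = {mu:.2f}")
print(f"n = {n}, sample mean (statistic)     = {x_bar:.2f}")
```

The two means will usually be close but not identical, which is exactly why we keep separate words for them.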

Let’s say we want to survey the job prospects of the students studying at New York University (NYU).
What is the population? 
You can simply walk into New York University and find every student, right? Well, probably not; that would not be the whole population of NYU students. The population of interest includes not only the students on campus but also the ones at home, on exchange, abroad, distance education students, part-time students, even the ones who have enrolled but are still in high school. And even this long list probably misses someone. Populations are hard to define and hard to observe in real life.
A sample, however, is much easier to contact. 
It is less time-consuming and less costly. Time and resources are the main reasons we prefer drawing samples to analyzing an entire population. So, let’s draw a sample then.
As we first wanted to do, we can just go to the NYU campus. 
Next, let’s enter the canteen, because we know it will be full of people. We can then interview 50 of them.
This is a sample. 
But what are the chances these 50 people provide us answers that are a true representation of the whole university?
The sample is neither random nor representative.
A random sample is collected when each member of the sample is chosen from the population strictly by chance. 
We must ensure each member of the population is equally likely to be chosen.
Let’s go back to our example. 
We walked into the university canteen and violated both conditions. People were not chosen by chance; they were a group of NYU students who were there for lunch. 
Most members of the population did not even get the chance to be chosen, as they were not on campus. Thus, we conclude the sample was not random.
What about the representativeness of the sample? 
A representative sample is a subset of the population that accurately reflects the members of the entire population.
Our sample was not random, but was it representative?
Well, it represented a group of people, but definitely not all students in the university.
To be exact, it represented the people who have lunch at the university canteen.
Had our survey been about the job prospects of NYU students who eat in the university canteen, we would have done well.
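
A small simulation can show why the canteen sample misses whole groups of students. This is only a sketch: the four student groups, their shares, and the simplifying assumption that only on-campus students eat in the canteen are all invented for illustration.

```python
import random
from collections import Counter

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical student groups and their assumed shares of the population.
groups = ["on campus", "distance", "exchange", "part-time"]
shares = [0.5, 0.2, 0.1, 0.2]
population = rng.choices(groups, weights=shares, k=20_000)

# Simplifying assumption: only on-campus students are ever in the canteen.
canteen_visitors = [g for g in population if g == "on campus"]

canteen_sample = rng.sample(canteen_visitors, 50)  # the convenience sample
random_sample = rng.sample(population, 50)         # every student equally likely

print("Canteen sample:", Counter(canteen_sample))
print("Random sample: ", Counter(random_sample))
```

Under these assumptions the canteen sample contains only one group, while the random sample draws from several, roughly in proportion to their shares.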
By now, you must be wondering how to draw a sample that is both random and representative.
Well, the safest way would be to get access to the student database and contact individuals in a random manner.
However, such surveys are almost impossible to conduct without assistance from the university!
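
If we did have such a database, drawing the sample could look roughly like the sketch below. The list of student IDs and the helper `draw_simple_random_sample` are hypothetical stand-ins for a real database; the point is only that `random.sample` picks without replacement and gives every ID the same chance.

```python
import random

def draw_simple_random_sample(student_ids, n, seed=None):
    """Return n distinct IDs, each ID in student_ids being equally likely to be picked."""
    rng = random.Random(seed)
    return rng.sample(student_ids, n)

# Hypothetical stand-in for the university's student database.
student_ids = [f"S{i:05d}" for i in range(1, 50_001)]

sample = draw_simple_random_sample(student_ids, 50, seed=42)
print(sample[:5])
```

Because the selection ignores where students happen to be, distance, exchange and part-time students are just as likely to be contacted as the ones having lunch on campus.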
We said populations are hard to define and observe. 
Then, we saw that sampling is difficult.
But samples have two big advantages. 
First, after you have experience, it is not that hard to recognize if a sample is representative. Second, a small mistake while sampling is not always a problem.
