P value (Part 1)

People often think of a “p-value” as a “probability”. The two are related, but they are not the same thing.

Let's say you flip a coin two times:

  1. On the first flip, there is a 50% chance of heads and a 50% chance of tails.
  2. On the second flip, there is again a 50% chance of heads and a 50% chance of tails.

So, the question now could be either:

a. What is the probability of getting two heads in a row?

or

b. What is the p-value for getting two heads in a row?

Now, let's break down the questions:

a. What is the probability of getting two heads in a row?

After two flips there are four possible outcomes (HH, HT, TH, TT), and each one is equally likely. Because they are equally likely, we can use the following formula to calculate the probability:

Number of outcomes with two heads / total number of outcomes = 1/4 = 0.25

So, there is a 25% chance of getting two heads in two flips.

If the question is the probability of getting two tails, it is again 1/4 = 0.25, which means a 25% chance of getting two tails.

What about one head and one tail? Two of the four outcomes (HT and TH) fit, so:

2/4 = 0.50

So we are twice as likely to get one head and one tail as we are to get two heads (or two tails).
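The counting above can be checked with a short Python sketch. It simply lists the four outcomes of two fair flips and counts them; the variable names are made up for illustration and are not from the original video.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# List every equally likely outcome of two fair coin flips: HH, HT, TH, TT.
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]

# Group the outcomes by how many heads they contain (order is ignored).
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability = number of matching outcomes / total number of outcomes.
probabilities = {
    heads: Fraction(count, len(outcomes))
    for heads, count in heads_counts.items()
}

print(probabilities[2])  # two heads: 1/4
print(probabilities[0])  # two tails: 1/4
print(probabilities[1])  # one head and one tail: 1/2
```

Using fractions instead of floats keeps the results exact, so 2/4 shows up as 1/2.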

b. So what is the p-value for HH?

P-value definition: a p-value is the probability that random chance generated the data, or something else that is equally rare or rarer.

(A p-value ranges from 0 to 1. By convention, if it is less than the significance level alpha, usually 0.05, the result is called significant; otherwise it is not significant.)

So, it consists of three parts:

  1. The probability that random chance generated the observed data

This is the probability of getting two heads (HH) in our two random flips, which is 0.25.

  2. The probability of something else that is equally rare

This is getting two tails (TT), which is just as likely as getting two heads (HH), so it is also 0.25.

  3. The probability of anything rarer

No outcome is rarer than HH, so this part is 0.

So, the p-value for getting two heads (HH) = 0.25 + 0.25 + 0 = 0.50, which is different from the probability of getting two heads, which is 0.25.
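The three-part recipe above can be sketched in Python as follows. This is only an illustration under the same assumptions as the coin example (a fair coin, outcomes grouped by number of heads); the function name `p_value` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def p_value(observed, n_flips=2):
    """Add up the probabilities of every result that is as likely as,
    or rarer than, the observed one (results grouped by number of heads)."""
    outcomes = list(product("HT", repeat=n_flips))

    # Probability of each possible number of heads.
    prob_by_heads = {}
    for outcome in outcomes:
        heads = outcome.count("H")
        prob_by_heads[heads] = prob_by_heads.get(heads, 0) + Fraction(1, len(outcomes))

    observed_prob = prob_by_heads[observed.count("H")]

    # Observed result + equally rare results + rarer results.
    return sum(p for p in prob_by_heads.values() if p <= observed_prob)


print(p_value("HH"))  # 1/4 (HH) + 1/4 (TT) + 0 (nothing rarer) = 1/2
print(p_value("HT"))  # 1/2 + 1/4 + 1/4 = 1
```

Notice that the most common result (one head and one tail) gets a p-value of 1, because every other result is rarer than it.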

How about making this more complex?

It is easy to list every outcome for a coin, but consider human heights. Is it easy to list each possible outcome? Definitely not: how many decimal places would you need to record a height exactly? That is not practical. So, for that we use a density plot (distribution plot):

[Image: example density plot]

From the study “Height of nations: a socioeconomic analysis of cohort differences and patterns among women in 54 low- to middle-income countries”, we find that the heights of Brazilian women (between 15 and 49 years old), measured in 1996, mostly lie between

142 cm (4.6 ft) and 169 cm (5.5 ft)

The area under the curve over a range of heights gives the probability of someone having a height in that range.

If we break down and analyse the density plot:

  1. 95% of the women have heights between 142 and 169 cm, so there is a 95% probability that each time you measure a Brazilian woman's height it will be between 142 and 169 cm.
  2. There is a 2.5% probability that each time you measure a Brazilian woman she will be shorter than 142 cm.
  3. There is a 2.5% probability that each time you measure a Brazilian woman she will be taller than 169 cm.

So, what is the p-value for someone who is 142 cm tall?

To calculate:

  1. There is a 2.5% chance that someone will be 142 cm or shorter = 0.025 (the probability of the event, including anything shorter).
  2. There is a 2.5% chance that someone will be 169 cm or taller = 0.025 (the equally rare or rarer events on the other side).
  3. Nothing rarer is left to add, because the two tails already include the more extreme heights = 0

So, the p-value = 0.025 + 0.025 = 0.05

(So in statistics, it is right at the usual 0.05 cutoff, which makes it borderline rather than clearly significant.)
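To see the same tail calculation in code, here is a minimal sketch. It assumes, purely for illustration, that the heights follow a normal (bell-shaped) curve centred halfway between 142 and 169 cm, with a spread chosen so that 95% of the curve lies in that range. The study itself is not quoted as saying the heights are exactly normal, so treat the curve as a stand-in.

```python
from statistics import NormalDist

# ASSUMPTION: heights are normally distributed, and 142-169 cm is the
# central 95% range. Neither the mean nor the sd comes from the study.
LOW, HIGH = 142.0, 169.0
mean = (LOW + HIGH) / 2                    # 155.5 cm
z_975 = NormalDist().inv_cdf(0.975)        # about 1.96
sd = (HIGH - mean) / z_975                 # spread that puts 95% in the range
heights = NormalDist(mu=mean, sigma=sd)

# Part 1: the event itself, 142 cm or shorter.
lower_tail = heights.cdf(LOW)

# Part 2: the equally rare or rarer events on the other side, 169 cm or taller.
upper_tail = 1 - heights.cdf(HIGH)

p_value_142 = lower_tail + upper_tail
print(round(lower_tail, 3), round(upper_tail, 3), round(p_value_142, 3))
```

Each tail comes out at about 0.025, so the two-sided p-value is about 0.05, matching the hand calculation.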

How about the p-value for someone who is between 155.4 and 156 cm tall?

To calculate:

  1. The probability of the event, a person with a height between 155.4 and 156 cm, is 4% or 0.04.
  2. For equally rare or more extreme values: 48% of people are taller than 156 cm and 48% of people are shorter than 155.4 cm, which gives 0.48 + 0.48 = 0.96.

So, the p-value = 0.04 + 0.96 = 1

So, in statistics it is not significant. In other words, measuring someone between 155.4 and 156 cm is not significant, even though the probability of the event itself is small (0.04).
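The same idea can be sketched for a range of heights near the middle of the curve. As before, the normal curve below is an assumption made only for illustration (centred at 155.5 cm with 95% of heights between 142 and 169 cm), so its individual pieces differ slightly from the rounded 4% / 48% / 48% used above, but they still add up to 1.

```python
from statistics import NormalDist

# ASSUMPTION: same illustrative normal curve, not taken from the study.
sd = 13.5 / NormalDist().inv_cdf(0.975)
heights = NormalDist(mu=155.5, sigma=sd)

a, b = 155.4, 156.0

in_range = heights.cdf(b) - heights.cdf(a)   # the event: between 155.4 and 156 cm
shorter = heights.cdf(a)                     # more extreme on the short side
taller = 1 - heights.cdf(b)                  # more extreme on the tall side

p_value_range = in_range + shorter + taller
print(round(in_range, 3), round(p_value_range, 3))
```

The event itself has a small probability, but everything else on the curve is further from the middle, so the p-value comes out as 1.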

Reference:

Josh Starmer, StatQuest video (https://www.youtube.com/watch?v=5Z9OIYA8He8&t=92s)
