this is a video on hypothesis testing. hypothesis testing is like a debate or argument. it has lots of technical ideas, but in the end we're going to have to decide: do we have evidence to show that we are right, or maybe we don't have evidence? let me show you what we're going to be looking at and all the details. first, i will talk about something called left, right and two tailed tests. the idea is: do we think we are on the low side, the high side, or is it different from what we had expected or what the other guy said was true? then we'll talk about type i and type ii errors. those have to do with the idea that you make a conclusion and maybe your conclusion isn't right, and then you might think, well, if the conclusion is wrong
what kind of bad things might happen? then we will look at something called p-values and rejection regions, the kind of math that we've been talking about in the last few chapters. the p-value tells you the rarity: the probability that you would get evidence as extreme as what you found if the other guy was right. the rejection region is a region under the graph of the z or the t distribution, we will have some others too, and that graph will help us understand the p-value, because a probability, if you remember, is the area under the curve. then after that, i will get into the details, the step-by-step process in conducting a hypothesis test.
and then finally we will do some examples. so the first type of example we will look at is a hypothesis test for a proportion. that's when we want to find out if the proportion is less than or greater than or maybe not equal to some proportion that the other guy's saying is true. then after that we will do the same thing except we will look at a hypothesis test for a mean. and we'll find out whether the population mean is greater than, less than or not equal to whatever the other guy is saying was true. let's start getting into some of the details. first let's talk about the null and alternative hypotheses. i like to think of a hypothesis test as a debate.
the idea is that your opponent in the debate says "this thing is true." and you say "no it's not! that's ridiculous." but we know, of course, in statistics unless you collect data then you really don't have good evidence to show that the other guy's ridiculous idea is wrong. so we will look at the data and will say, well, i think that based on my data i can show that the other guy is wrong or at least it's very likely that he is wrong. the null hypothesis will be the other guy's statement. the alternative hypothesis is
my statement. the other guy we can think of as the villain and that's the villain's statement, and we're the heroes, so we can think of the alternative hypothesis as the hero's statement. here are some examples. let's suppose that you want to find out if more than 50% of voters support a candidate, because if more than 50% support that candidate, then the candidate wins. so let's write down the idea of the null hypothesis and the alternative hypothesis. here's how the symbols work.
first we use the letter "h", capital "h", to represent the word hypothesis. now the subscript here is zero but unfortunately or fortunately, depending on where you're from, the brits got to it before the americans. so we don't say "h zero". we say "h naught". that means the standard, what the other guy says is true. we don't think it's true, but we write that down. in this case the other guy says "ah, it's 50%." that's the proportion of voters. now 50% isn't enough to win. it turns out in a null hypothesis we always write down first the variable and it is for the population parameter. remember, the letter we use for the population proportion is p and for the sample proportion is p with a hat over it.
so we write p. and for the null hypothesis we will always write "=". so p = 0.5. but then in the debate, we say "no it is not 0.5. it is greater than 0.5 because my candidate is going to win. and if my candidate is going to win then the proportion will be more than 50%. so p is greater than 0.5." let's look at another example. let's suppose you want to find out
if the mean number of snowboards rented is greater than 60 snowboards per day. this is different in that in the last one we were looking for a proportion and in this one we are looking for a mean. now remember when we're talking about the null hypothesis and the alternative hypothesis, whether we're looking for a mean or a proportion, it's always the population parameter. let's look at what the hypotheses are in symbols. the null hypothesis: h and then naught, that's the zero, and then not p anymore because we're talking about a population mean, and recall that the symbol we use for a population mean is mu. then always "=". i want to compare it with the number 60. so the other guy, our opponent, is saying
the mean number of snowboards rented is 60, and we are saying, no it's not, it's greater than 60. the opponent is the null hypothesis: mu = 60. the alternative hypothesis is h1. some people by the way write an "a" for alternative instead of 1. they use "0" and "a". i like to write down h1. i have 0 and 1. that seems good in numbers. there is about a 50-50 split in terms of what people use. h1, so you always use a ":", this guy is not an "=". this is a ":" and over here, this is a ":" so, never ever use h1 =, use h1:, which means that there is a statement that follows. that's what ":" means. and then mu is greater than, because of this word "greater" right here, 60. let's look at one more example and let's see what we get. we want to find out
if the average speed that people drive on highway 50 is different from 40 miles per hour. so here, let's look at the keywords. one of the keywords is "average", so we're not talking about a proportion, we're talking about an average or a mean. another keyword is "different". notice that in the earlier examples we had greater than 50% and greater than 60. we could also have "less than". i didn't do that example. the less than would be "<", but different is the third inequality sign. let's take a look at what that looks like. so again ho with a ":". notice that this always happens: ho:
and then mu because we're looking at the average. equal is always what we write for the ho and then 40 because we are comparing with 40. the other guy says it's 40. we say no, it's not 40. so h1: mu and the symbol for different is not equal to. so mu is not equal to 40. so hopefully you have an idea on how to construct the null hypothesis and the alternative hypothesis given a hypothesis test.
let's move on to talk about left, right and two tails. so we call a hypothesis test a left tailed hypothesis test if it involves a hypothesis that the parameter is less than a number. so ho is, for example, mu = 10. but then when you talk about left tailed tests, that is the h1: mu < 10, and that's a less than, and a less than on a graph is the left side of the graph, and that's why we call it a "left tailed hypothesis test". you might be able to guess what a right tailed hypothesis test is. that means we're going to involve the hypothesis that the parameter is bigger than
a number. so in this case the alternative hypothesis is that mu > 10, and when we have a ">" sign we call that a right tailed test or right tailed hypothesis. then finally the third type, which as we saw on the last screen, is called the two tailed hypothesis. so left tailed means that we're going to say we're right if it's very far left, so very far left of 10 for example. right tailed means that we will say we are correct if the data show that it is very far to the right, and two tailed means
that we're going to say we're correct if the data indicate it's either far to the left or far to the right, and in that case the two tailed hypothesis test is going to have the not equal to sign. again, this could be a proportion or it could be a mean, so we use left tailed, right tailed and two tailed. let's talk about type i and type ii errors. remember a hypothesis test is a debate. your opponent is saying, "i'm right," but i am going to say, "no, no, no. i'm right!" and the idea is you will collect evidence to decide or at least to support the claim that i'm right.
or maybe you will support the claim, actually you won't support the claim but we might say, "well we don't really have enough evidence to say i'm right." so it could be the case that i have really good evidence, i have evidence to say that i'm right, but i wasn't. that happens sometimes, not very often, but it does happen. have you ever been in that case where you said, "i'm right and i know it. see look, i looked at a bunch of things and ya ya ya, you're wrong and i'm right" and then the next day you realize you're wrong? then you have to go back to your opponent and say, "ya ya, i didn't do it right. i had evidence. i thought i was right, but no, i'm an honest guy and you were really right." that would be a type i error. a type ii error is kind of the opposite.
that means your opponent is saying your opponent is right. i say, "no, no, no i'm right. i think i'm right" and then we're going to get the evidence. then we will collect the data and end up saying, "you know, my data doesn't show it. i took a survey. it doesn't look like i'm right at all. maybe i'm not right. maybe i'm right but the evidence doesn't show it." and then the next day you find out you're right and you go, "oh no! i was right and my opponent was wrong and i didn't say i was right. i said my evidence wasn't strong enough." that's called a type ii error. so in general
a type i error means that we rejected ho when ho is true. for the probability of a type i error, we use the letter "alpha". this is a greek letter and that's "alpha". then a type ii error means we fail to reject the null hypothesis when the null hypothesis was false. and this is a detail we won't talk much about. the probability of a type ii error is often denoted by the greek letter "beta", and one minus beta is called the power of the test.
those are type i and type ii errors in general, but let's look at some examples because that's really when you find out what these errors are all about. so let's suppose a hypothesis test is done to see if a proposed vaccination for hiv is effective in reducing the chance of a person getting infected. let's discuss the implications of a type i and a type ii error. it is the implications that really matter. in other words if you have an error, errors are never a good thing, that means you messed up. maybe you collected your data just right but you just had bad luck and your data didn't indicate the truth. we want to find out what will happen if our data doesn't indicate the truth. let's start out with a type i error.
a type i error means we're going to make that decision: we will say that we rejected the null hypothesis but the null hypothesis was true. the null hypothesis here is that the vaccination doesn't make any difference at all, so the infection rate for hiv with the vaccination or without is the same. so with a vaccination it is the same as it had always been, which means that the vaccination is not working. h1 means that the infection rate for hiv with the vaccination is less than the infection rate
without the vaccination. so if you have the vaccination the infection rate will be less than if you don't have the vaccination. so then the question is if you have a type i error, why is it bad? what that means is that we collected a sample. we checked it out and we looked at our sample and we said, "wow look at this! in our sample the vaccinated people are getting hiv at a much smaller rate than the general public that we have data from before."
and then what are we going to do? we will go through and vaccinate everyone. we will say, "we have the most amazing vaccination in the world." and guess what will happen. we will think we are wonderful. we have a great vaccine, and then later on we will realize that this vaccine doesn't work at all. it does nothing. it does nothing to prevent hiv. so maybe we've given everyone a shot. maybe we spent lots of money. maybe people will even have a false sense of security because they've been vaccinated. they think, "i've been vaccinated. i don't have to worry anymore" and they may have unsafe sex or maybe use bad needles or whatever it might be. that would be horrible. that would be really bad. that's a type i error. that's the problem with a type i error: we make the decision and we say our medicine works, we vaccinate everybody and then it doesn't work.
let's look at a type ii error. a type ii error means that we looked at our sample. our sample might have been a little better, but not enough to say that our vaccination is really making the difference when it comes to reducing the rate of hiv. and because our sample did not show that difference, the fda is not going to approve the vaccine to be given.
remember, this is a type ii error, which means we fail to reject, so that means no approval of this vaccination, but actually it does work. so what will happen is we have this vaccine that is wonderful. it works. it drops the number or the percent of hiv significantly, but our sample, just from bad luck, maybe we got a bunch of people that engage in unsafe sex more than other people. the vaccine would have helped but we just had really bad luck with the sample. the vaccine works and it's never used. had we chosen a sample that showed the vaccine worked, because it really does work, then we would be able to reduce hiv, reduce the percentage of people that end up with hiv. and that's terrible also. that's another error and that's a type ii error in this situation.
let's move on to the p-value. as i mentioned, in this last example we looked at this vaccination. so the idea is: if it's true that this vaccine that we're trying to use is useless, and we took a sample and our sample showed the rate was lower, the question is what would be the probability, due to random chance, if the vaccination had been useless, so if ho had been true, what would the probability be that we would end up with a sample mean that is as
ridiculously low, or ridiculously high if that happens to be an h1 with a greater than, or ridiculously different: a sample mean that is that extremely high, low, or different, depending on whether we are doing a left tailed, right tailed or two tailed test, as the extreme result that the sample showed. let's move on. this is a tough concept and we will show examples of this. here's an example. let's suppose a hypothesis test was conducted with 200 students. we will always use n for the sample size, so n equals 200 students, to see if more than 70% of college students gain weight
in their first year of college. sometimes they call that the freshman 15. you might have heard that before. maybe we read an article that said that 70% of all college students gain weight in their first year of college. we say, "no, that article can't be right. it's higher than that." so ho would be that p is equal to 0.7, that's 70%, and h1 would be that p is greater than 0.7. now let's talk about the p-value. let's suppose that the sample proportion was found to be 0.76. in our sample 76% of these 200 students gained weight in their first year of college. notice that 76% is higher than 70%, but we've learned enough, we're pretty far into statistics, that we know that random chance can happen.
yeah, we got 76% which is higher. but the question is: had it been 70% and we did a sample with 200 students, is it unlikely to just randomly get 76% of these 200? again, you don't expect exactly 70%. it might be a little bit higher. so let's suppose that the probability, the p-value, was 0.03 or 3%. and this is how we interpret that 3%. it means that if the proportion really was 70%, so if that article we had read really was true that 70% of all freshmen gain weight, and if we took many samples of size 200 students, then 3% of the samples
will produce a sample proportion at least as big as 76%. in other words we can say that there would be, just out of random chance, a 3% chance that if we take a random sample of 200 students we will end up with a proportion of freshmen who gain weight in their first year of at least 76%. that's what the p-value means. here's the bad news. there is no way of stating this in a very short way. you have to start with, "if the proportion really was 70%", so in general, if ho had been true. ho was that the proportion was 70%.
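as a rough check on that 3%, here is a minimal python sketch, not something shown in the lesson itself. it assumes the normal approximation for a sample proportion, and the function name right_tail_p_value is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def right_tail_p_value(p_hat, p0, n):
    """Approximate P(sample proportion >= p_hat), assuming H0: p = p0 is true."""
    std_dev = sqrt(p0 * (1 - p0) / n)   # standard deviation of p-hat under H0
    z = (p_hat - p0) / std_dev          # how many standard deviations above p0
    return 1 - NormalDist().cdf(z)      # area to the right of z

# freshman example: the article says 70%, the sample of 200 showed 76%
p_value = right_tail_p_value(p_hat=0.76, p0=0.70, n=200)
print(round(p_value, 3))
```

under these assumptions the result comes out close to the 0.03 used in the example.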
and then "if", and you could either say "we took many samples" or "we took a sample of size 200". you could either say 3% of these samples, or there would be a 3% chance that such a random sample, would produce a sample proportion at least as large as 76%. so notice that this probability is for sample proportions. ho is for the population proportion but we've only taken a sample. we are saying that
given a sample, what will be the probability of getting something as extreme as what we got? hopefully this makes a little bit of sense on what a p-value is. we will look at some more. now let's look at this graphically. what we will talk about is called the rejection region. that's the region under the z or the t. we've learned in confidence intervals that we can use z, for example, for proportions as long as np and nq are greater than 5, and also you can use z for means if you happen to know the population standard deviation, which is really rare. usually you don't. otherwise use t when you're talking about a hypothesis test for a mean and you don't know the population standard deviation. so the rejection region is the region under the graph such that a test statistic in that area
will result in ho being rejected, which by the way, if you reject the null hypothesis that means you say the other guy is wrong, and if you say the other guy is wrong that means you're right. so h1 is accepted. a test statistic outside of the region will result in failing to reject ho. this takes some getting used to. this is one of those kind of bad sportsmanship kind of things that we do with hypothesis testing. never admit you are wrong. all you do is you say that i didn't have evidence to say i was right.
it is a double negative and that's the way we do things in hypothesis testing. it might be that our sample size is too low. it might be that i was wrong. we really don't know, so we just fail to reject. let's look at this graphically. if we have a left tailed test, and this is showing the normal curve, but a t distribution is similar, and later on in this class we will have some other distributions, then the critical value would be a z, and we've seen this number: -1.645. the area to the left of that -1.645
is 0.05. i am doing the example where we have what's called a 5% level of significance. what happens is we will collect our sample and we will calculate the test statistic, which will be some value of z, and the p-value is the area to the left of that test statistic. if the area to the left of it is less than 0.05, that means that our value of z is less than -1.645, and we reject the null hypothesis. we call all of these values here the rejection region. on the other hand if we're greater than -1.645 then we will fail to reject the null hypothesis. if it is way out here we will say that it would be very rare
to get a z-score so far to the left, so the other guy is wrong. we are right. at least we have evidence to say so. whereas if we get something out here to the right, that's not so rare, that's a high probability, so we will say that we fail to reject ho. we won't say the other guy is right. we say we really don't know who's right. "failed to reject the null hypothesis" is a very fancy way of saying that i did a whole lot of work and i know nothing. that's all you are saying. let's look at a right tailed test. it is very similar. for a right tailed test we draw the graph such that we shade in the right hand side of this critical value: positive 1.645. and if the test statistic happens to be over here, which we will show you in a minute on a calculator,
that means the area to the right is less than, say, 0.05, then we will reject the null hypothesis. on the other hand, if we're over here and the area to the right is greater than 0.05, then we will fail to reject the null hypothesis. then for a two tailed test, that means we have a left and a right hand side, so you can go to either side. because you can go to either side, we want to say the combined area on both sides in this rejection region should be 0.05. we have seen this with confidence intervals.
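the cutoffs for these pictures can be recovered from the standard normal curve. the following is a small python sketch under that assumption (z only, not t), and critical_values is a hypothetical helper name, not something from the calculator.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the z cutoff(s) that bound the rejection region.

    tail is "left", "right" or "two"; alpha is the level of significance.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # split alpha evenly between the far left and the far right
        cutoff = std_normal.inv_cdf(1 - alpha / 2)
        return (-cutoff, cutoff)
    raise ValueError("tail must be 'left', 'right' or 'two'")

for tail in ("left", "right", "two"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

a test statistic beyond a cutoff, on the shaded side, falls in the rejection region.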
if you had, say, a 5% level of significance, which corresponds to a 95% confidence interval, we use 5% instead of 95%, then that means we will split the area in two pieces: 2 1/2% to the far left, 2 1/2% to the far right, and 1.96 and -1.96 are those barriers. so if we're out here, far left, we will reject the null hypothesis, or the far right, we also reject the null hypothesis. on the other hand if we are in this middle area, we fail to reject. hopefully the graph makes sense. let's talk about the order in which we do things. it is very important that you do things in the right order. otherwise it is just poor statistics.
it is something different. sometimes we call that data mining. let me show you the work. the very first thing is we have to determine the hypotheses in words. it might be i read that article. the article said that 70% of freshmen gain weight in their first year. i say no, i don't think that's true. i think it is actually bigger than 70%. more than 70% gain weight in their first year. and that means all students in the whole country. so the next step after we have it in words is write down the symbols ho and h1. ho in that case would be that p was equal to
0.7 and h1 was p was greater than 0.7. after you have written down ho and h1 in symbols, then what you do is you determine the level of significance. that's the "alpha". the level of significance is determined by how terrible it is for there to be a type i error. how awful would it be if you ended up concluding that more than 70% of freshmen have
gained weight in their first year but really, that's not true. it was 70%. in that case you would probably lose face. maybe you would publish but your publication would be wrong. you would be embarrassed. i don't think anybody would die. you may end up changing the cafeteria, convincing the cafeteria director to have lower calorie food when really it's not that big of a deal, and students will be mad at you because you have a bunch of this healthy, not as tasty and delicious food because you are not throwing sugar and fat in, when the freshmen can handle the sugar and fat. who knows? that would be the problem with the type i error. if you find that's really bad then you will have a level of significance very very small like 0.01.
if you find that it's really no big deal at all then you might use 0.1. the standard level of significance is 5%, 0.05, and that's kind of the flip side of the standard confidence level. 95% confidence is standard, a 5% level of significance is standard. once you determine the level of significance then you go ahead and conduct the survey or the experiment. that's actually the hard part. in a statistics course, that will usually be given to you. sometimes you have to do a project or two where you have to do that. that's a lot of work. you have to actually go out and ask people this question. you ask them at the beginning of the year and the end of the year. you weigh them and see if they weigh more at the end or not. that can be a lot of work. it can be expensive and can take a year to do, so you
want to make sure that you do everything else correctly so that when you conduct the survey or experiment that takes all the work, you got it right and you don't have to redo it. then after that we calculate the test statistic. that can be done using a calculator or a computer, and we'll talk about that later. then after that we calculate the p-value. we talked about the p-value. that is the probability, if ho is true and we did another study, that our study would result in something as extreme as what we got. after we calculate the p-value we compare the p-value with the level of significance. if the p-value is less than the level of significance, say we get a p-value of 0.02
and the level of significance is 0.05, then we have a small p-value and we get to say that there is statistically significant evidence to say we're right, to say that ho was false and h1 is true. if the p-value is large, in particular larger than the level of significance, say we got a p-value of 0.42 and the level of significance was 0.05, then the p-value is too big. then we make our conclusion. in that case you don't say that i was wrong. remember, never say you're wrong. instead the conclusion you make is that we don't have statistically significant evidence. sometimes we say we have insufficient evidence or we say we have statistically
insignificant evidence to make a conclusion or support the claim about h1. i like the mnemonic: if you go into a restaurant and there is a side dish of peas, do you want to see the little nice cute peas on your plate or do you want to see these gargantuan scary looking peas? i like to have little peas when i order peas for my side dish on my plate. little peas are good and big peas are bad. so if you have a little p-value like 0.0001, that's really little, then we can say we have strong evidence, strong statistically significant evidence, to say we were right. whereas if you have one of those big ugly p's like 0.87, there we say we have insufficient evidence to support the claim.
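the "little p's are good" rule is just a comparison between the p-value and alpha. here is a tiny python sketch of that comparison. the function name decide and the exact wording of the returned messages are my own, chosen for illustration.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the level of significance and word the conclusion."""
    if p_value < alpha:
        # little p: the result would be rare if H0 were true
        return "reject H0: statistically significant evidence for H1"
    # big p: never say H0 is true, only that the evidence is not strong enough
    return "fail to reject H0: insufficient evidence for H1"

for p in (0.02, 0.42, 0.0001, 0.87):
    print(p, "->", decide(p))
```

the two wordings mirror the two possible conclusions in the lesson: either we have evidence for h1, or we fail to reject ho.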
ok so let's now look at how it kind of works mathematically. we state ho and h1. in this case it involves p. we have seen some examples. then we find the test statistic. i'm not expecting you to memorize this formula, but this is the z-score for proportions. p-hat was what we observed. p was the hypothesized value of p. then if you remember, the square root of pq over n, that was the standard deviation when we have a large sample size.
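written out in symbols, the formula being described looks like the following. the subscript 0 on p and q is my notation for the hypothesized value from ho (the lesson just calls it p), with q0 = 1 - p0.

```latex
\[
  z \;=\; \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0\, q_0}{n}}},
  \qquad q_0 = 1 - p_0
\]
```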
p-hat minus p divided by the standard deviation, observed minus our hypothesized proportion divided by the standard deviation, that was the z-score and that is called the test statistic. the calculator can do it for us but it's nice to see where it comes from. we can tie it in with all the math that we've been doing. then we get the p-value, and the p-value is the probability that corresponds to the z, and the calculator will do that. on the ti 84 we will use the 1-propztest. then if the p-value is less than alpha, reject ho and accept h1 and state the conclusion.
if the p-value is greater than alpha, fail to reject ho and make no conclusion, and say there's insufficient evidence, there is statistically insignificant evidence. let's look at an example. here is an example. a candidate for the upcoming election has done a survey of registered voters to see if she will win. in order for the candidate to win she will have to get more than 50% of the votes.
of the 300 registered voters that she surveys, 170 of them indicated that they would vote for this candidate. what can be concluded at the 0.05 level of significance? we will use a 1-propztest. let's take a look. the first thing is we state our hypotheses: ho is that p = 0.5, h1 is that p > 0.5. you have to get more than 50% in order to win. we will use an alpha of 0.05. this is standard in social science, and political science is one of the social sciences.
an alpha of 0.05 is your standard level of significance. gallup poll did this. the new york times decided that this was standard and everyone listens to the new york times and it became 0.05. the next step is we need to go to our calculator. let's go to our calculator and let's calculate what the p-value is given the information. so here's the calculator. i go to stat, we have seen that button before. i go to tests. and then instead of a confidence interval, we're looking at a hypothesis test, so in this case, because we want to conduct a hypothesis test, we will conduct the hypothesis test for a population proportion
with a single sample. we go to 1-propztest, which is number five here. i hit enter. it will ask us some things. the first thing it asks us is p-naught. p-naught is the proportion that is contained in ho. and if you remember, that was 50%, but we always write it as a decimal, that's 0.5. then it asks for x. x is the number in the sample that said "yes". if you remember, 170 of these voters that the candidate asked said "yes". so: 170
and then n. we have seen n before. n is the sample size. the sample size is 300. there were 300 voters. 300. then it asks: prop not equal to p0, less than p0, or greater than p0. if you remember, our hypothesis test was that we are looking at p was greater than 0.5, so that means that we're looking at a ">", which remember is a right tailed test. i hit enter on "> p0". then i go to calculate.
then enter. there's our p-value. it gives us a few things. the test statistic in this case is z and that's 2.309 etc. the p-value, and that is p, is 0.01. p-hat, that's the sample proportion, is about 57%. 57% is greater than 50%, but then is it significantly greater than 50%? that's what the p-value tells us, the 0.01. let's go back to our powerpoint. here's the powerpoint and if you recall ho was that p = 0.5, h1 was that p > 0.5 and our level of significance was 0.05.
the calculator gave us the p-value, but the question is can we really use that p-value? remember, in order to use the normal distribution, remember there was that z there when we looked at the test statistic, you have to have np, which is the number of yes's, which we had 170 of, and that certainly is greater than 5, much greater than 5. and nq, which is the number of no's, that's 130. if there are 170 yes's and there are 300 people surveyed, 300 - 170 is 130. and that's also much bigger than 5. so, yes we can go ahead and say that the p-value is valid. so the p-value was 0.01. in particular 0.01 is less than 0.05.
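for anyone without a ti 84 handy, the same numbers can be reproduced in a few lines of python. this is only a sketch of what the 1-propztest appears to compute for a right tailed test, assuming the normal approximation. the function name one_prop_z_test is hypothetical, and the size check uses the counts of yes's and no's the same way the lesson does.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(x, n, p0):
    """Right tailed z test for one proportion: H0: p = p0 versus H1: p > p0."""
    p_hat = x / n                                # sample proportion
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # test statistic
    p_value = 1 - NormalDist().cdf(z)            # area to the right of z
    return p_hat, z, p_value

x, n, p0 = 170, 300, 0.5

# size check from the lesson: number of yes's and number of no's both above 5
yes_count, no_count = x, n - x
normal_ok = yes_count > 5 and no_count > 5

p_hat, z, p_value = one_prop_z_test(x, n, p0)
print(normal_ok, round(p_hat, 2), round(z, 3), round(p_value, 3))
```

the z and p-value it prints should agree with the calculator screen to the digits shown there.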
little p's are good. remember that. that means it would be very rare, very rare for this to happen with random chance: a 1% chance, which is less than our definition of rare that we defined as 5%. so we reject ho and accept h1. let's state our conclusion. the conclusion is there is statistically significant evidence to support the claim that she will receive more than 50% of the votes. and this isn't 50% of the 300 votes. we already know about that. this is 50% of the entire electorate that votes for this candidate or against this candidate. so yes, she is very confident that she is going to win because the p-value is so small.
let's interpret the p-value. our p-value is 0.01. that means that if it were true that she could only get 50%, that isn't a win by the way, if she can only get exactly 50% of the votes, and if another randomly selected 300 voters were surveyed, then there would only be a 1% chance that at least 170, that's how many said they'd vote for her, of them will vote for this candidate.
notice this is a small percent chance. that's how you interpret the p-value. notice this is interpreting the p-value. it's not stating the conclusion. the conclusion is yes, we have evidence to say she's going to win and get more than 50% of the votes. this is an interpretation of the p-value. now let's interpret the level of significance. the level of significance was 5% or 0.05. what that means is that if 50% of the vote were supporting the candidate, and remember 50% is not a win, so if a candidate can only get 50%
and if another randomly selected 300 voters were surveyed then there would be a 5% chance, that's our alpha, that the new survey would lead us to the false conclusion that the candidate will receive more than 50% of the votes. notice that alpha, this level of significance, is the probability of a type i error. a type i error means that if it were true that p was 0.5, if ho were true, and another survey was done, there would be that 5% chance, the probability of a type i error, that we'd end up concluding falsely that more than 50% will support her.
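one way to see alpha as the probability of a type i error is to pretend ho is true and repeat the survey many times on a computer. the python sketch below does that. it is an illustration under assumed settings (2000 simulated surveys, a fixed random seed, the normal approximation for the p-value), and simulate_type_i_rate is a made-up name.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulate_type_i_rate(p0=0.5, n=300, alpha=0.05, trials=2000, seed=1):
    """Fraction of simulated surveys that wrongly reject H0 when p really is p0."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    std_dev = sqrt(p0 * (1 - p0) / n)
    rejections = 0
    for _ in range(trials):
        # each simulated voter says yes with probability p0, so H0 is true
        yes_votes = sum(rng.random() < p0 for _ in range(n))
        z = (yes_votes / n - p0) / std_dev
        p_value = 1 - std_normal.cdf(z)   # right tailed test
        if p_value < alpha:
            rejections += 1               # a type i error
    return rejections / trials

rate = simulate_type_i_rate()
print(rate)
```

the printed fraction should land somewhere near 0.05. it will not be exactly 5%, partly from simulation noise and partly because the number of yes votes has to be a whole number.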
notice in this situation we ended up rejecting the null hypothesis. so, in this case a type i error is of concern because yes, we rejected the null hypothesis and we have concern that maybe we rejected the null hypothesis but the null hypothesis was true. we are pretty confident because we got a p-value much less than 0.05. but it still can happen. a type ii error is no longer a concern at all because we rejected the null hypothesis. a type ii error happens when you fail to reject the null hypothesis, which didn't happen in this situation. let's look at another example where we're just going to look at the
interpretation, we will look at the conclusion and not p and alpha. ok here's another example. 12% of tahoe residents were born in tahoe, let's suppose. if you know about tahoe, most of the people who live here are transplants like myself. i wasn't born in tahoe. most of the people weren't. let's say 12% of all residents were born in tahoe. a researcher wants to see if the proportion of students at ltcc, at lake tahoe community college, is different from 12%. she surveys 400 ltcc students and finds that 45 of them were born in tahoe. notice that 12% of 400 is 48. so yes, 45 is not 48,
but the question is, is it far enough from 48 to be able to say that it would be really too unusual to end up with such a low number as 45, which is less than 48, that we can say that this is too unusual to have happened. so maybe this whole 12% is just wrong for ltcc students. so she finds that 45, which is actually a little less than 12%, were born in tahoe. what can we conclude at the 5% level of significance? so the first thing is you state your null and alternative hypotheses.
again it is a proportion. ho is that the proportion of ltcc students, of all ltcc students, so it is a population proportion, is equal to 12%. h1 is that the population proportion is different from 12%, so not equal to 0.12. alpha was 0.05. we need to next try to find the p-value. again, i will use a calculator for this. let's go to our calculator. here's the calculator. just like last one, i am going to go to stat
tests and then number 5 is the 1-propztest. we have one sample and we're talking about a proportion and are talking about a hypothesis test, not a confidence interval. hit enter. if you remember, we want to find out whether or not it's 12% for the population proportion of ltcc students. i type 0.12. go down to x, and we found there were 45 of the ones we looked at who were born in tahoe
n was 400. there were 400 students surveyed. in this case our h1 was that p was not equal to 12%. hit enter on not equal to p0. i go to calculate and hit enter. notice that our test statistic is -0.46. that's a z. the p-value is 0.64. it is really big. p-hat was about 11%, which is not equal to 12%, but it's not that much different. now let's go back to the powerpoint. here's the powerpoint. before we proceed, we better check to make sure that what we did was right. was our sample size large
enough? we compute np. np was the number of yes's. that was 45. that certainly is greater than 5. nq is the number of no's. there were 400 sampled and 45 said yes. so 400 - 45 is 355. that is much bigger than 5. so yes, we can actually say the p-value is valid and we can use that z as our test statistic. the p-value was 0.64, which is much greater than 0.05. so we fail to reject the null hypothesis. and the conclusion
is that there is insufficient evidence to support the claim that the proportion, and that's a population proportion, of all ltcc students who were born in tahoe differs from 0.12. i want to note a couple things. one is, this isn't even close. the p-value is 0.64. had we gotten say 0.06 we might think, well, you know, that's hopeful, that's close. we didn't quite get there, but maybe we were still right. here we were so far away that it might even be a bad idea to even try to do another sample with a larger sample size
so that we can kind of show the truth better. when you have a 0.64, a very high p-value, much higher than alpha, you usually give up. then you usually say, well, maybe i wasn't right. i could've been right, but probably not. i don't think i want to spend a whole lot more time and money to try. on the other hand, if you get, say 0.06, you'd say "ah shucks, i just missed it!" maybe if i surveyed 4000 instead of 400 i'd get it. you can't do 400 again because if you keep trying, remember there is always a 5% chance that you will just get lucky, or unlucky depending on how you look at it, and reject the null hypothesis when the null hypothesis was correct.
so you can't use the same sample size but you can use a much larger sample size and kind of show the truth a little better. on the other hand if the p-value is less than 0.05, like in the first example, say 0.01, you don't need to do another sample because you got your evidence. maybe someone else might want to do one because they don't believe you, but you have your evidence and you can publish a paper. you can say, "yes, i was right." in this case, pretty much kind of turn around, put your head down. don't say you are wrong, but say, "well, i don't know." don't say, "i'm hopeful", but just say, "i don't know", and "insufficient evidence" is a very fancy way of saying, "i don't know."
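the tahoe numbers can be reproduced without the calculator. this is a minimal python sketch of a two-tailed one-proportion z-test, assuming the normal approximation; the variable names are made up for this illustration, and the sample-size check copies the one used in this example (count the yes's and the no's).

```python
from math import sqrt
from statistics import NormalDist

x, n, p0, alpha = 45, 400, 0.12, 0.05

# Sample-size check used in this example: more than 5 "yes" and more than 5 "no" answers.
large_enough = x > 5 and (n - x) > 5

p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
# Two-tailed test: count results at least this far from 12% on either side.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-hat = {p_hat:.4f}, z = {z:.2f}, p-value = {p_value:.2f}, {decision}")
```

the z of about -0.46 and p-value of about 0.64 match what the calculator showed, and since 0.64 is far above 0.05 the decision is to fail to reject ho.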
so let's move on and talk about hypothesis tests for a mean. it's very similar. we state the null and alternative hypotheses ho and h1, but this time instead of ho and h1 involving p, ho and h1 involve mu because we are talking about a population mean. you determine alpha, the level of significance. 0.05 is standard but sometimes you go higher or lower. we get our p-value. in this case most of the time we will use a t-test and that's because most of the time you really don't know the population standard deviation
and anyway, if you knew the population standard deviation you would probably know the population mean and you wouldn't need to conduct a hypothesis test. you'd just know. so almost all the time it will be a t-test. occasionally you might happen to know the population standard deviation, probably not, and then we use the z-test. if the p-value is less than alpha we reject ho and accept h1 and state the conclusion. same idea as we had with proportions. if the p-value is greater than alpha we fail to reject the null hypothesis and make no conclusion. you find a fancy way of saying you know nothing: that there is no evidence, statistically insignificant evidence, there is insufficient evidence, lots of fancy ways of saying, "i know nothing." a study was done to see if the average college student
receives less than the recommended 8 hours of sleep per night. the 45 students who were surveyed averaged 7.4 hours of sleep and had a standard deviation of 2.1 hours. what can be concluded at the 5% level of significance? this is very similar to what we had before. let's start out with the null and alternative hypotheses. ho is that the population mean number of hours of sleep
for all college students is 8 hours. h1 is that the population mean, mu, is less than 8. and we're using a 5% level of significance. now let's go ahead and go to our calculator. here's our calculator. i hit the stat key. i go to number 2, which is a t-test. we don't know the population standard deviation. again, we usually won't. it is just a standard test for a mean, single sample. here it asks for data or stats. in this example we were given statistics.
we were given the mean of the sample and given the standard deviation of the sample. if you happen to be given the data from the sample instead of the mean and standard deviation, then we would go to data instead. we would put the data in l1, l1 only in this case. it is pretty self explanatory. use l1. but we have stats. and we have mu0. that is what the hypothesis was. remember, the hypothesis was that the mean was 8 hours of sleep. so 8. we go down. x-bar was the sample mean, which is 7.4.
then we go to sx. that's the sample standard deviation, which was 2.1. n was 45. our alternative hypothesis was that mu was less than 8 hours. so less than. i go to "<" and hit enter. go down to calculate and hit enter. there it is. notice our test statistic here is t, not z any more. that's about -1.9
and our p-value is about 0.03. here's the powerpoint. before i even look at the p-value and consider it, i always want to make sure my sample size is large enough. now if you remember, the sample size, n, was 45. it's really important to check: 45 is greater than our benchmark. we're using 30. some people use 25. i like 30 because it is safer. it is bigger than 30 and the central limit theorem says that when the sample size is bigger than 30 the sampling distribution is approximately normal, and then we use the t distribution
for our hypothesis test because we didn't know our population standard deviation, and the t distribution accounts for that little error of using the sample standard deviation instead of the population standard deviation. our n was 45. we can say that we have a valid test statistic and the p-value is valid. our p-value is 0.03, which is less than 0.05, so we reject the null hypothesis. when we reject the null hypothesis we say there is statistically significant evidence to support the claim that the population mean hours of sleep that college students get is less than
there's our h1, less than 8 hours per night. let's look at the p-value. the p-value is 0.03. let's interpret it. what that tells us is that if the population mean number of hours of sleep that college students get is 8, that means that they are really getting the right amount of sleep, and if another 45 college students were randomly surveyed, then there would be a 3% chance, there's our p-value: 0.03, that the sample mean number of hours of sleep for this new survey would be less than 7.4. 7.4 was the sample mean that we got and there would only be a 3% chance given random
chance, had the true population mean been 8, that we would end up with something as small or smaller than 7.4. just a note, had this been a two tailed test with a p-value of 0.03, had this been a not equal to, then that means you go to both sides. 7.4 is 0.6 away from 8. on the other side, if you go 0.6 above 8, you'd end up with 8.6. so had this been a two tailed test, we would say there would be a 3% chance that the sample mean number of hours of sleep for the new survey would either be less than 7.4 or greater than 8.6.
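here is a minimal python sketch of the sleep example worked from the summary statistics. the calculator's t-test is replaced by a hand-written t distribution (the density integrated with simpson's rule), which is an assumption made only to keep the sketch free of extra libraries; the names t_pdf, t_cdf, p_left, p_two and mirror are made up for this illustration.

```python
from math import gamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|], then symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

mu0, x_bar, s, n = 8, 7.4, 2.1, 45
t_stat = (x_bar - mu0) / (s / sqrt(n))   # test statistic, about -1.9
p_left = t_cdf(t_stat, n - 1)            # left-tailed p-value for H1: mu < 8
mirror = mu0 + (mu0 - x_bar)             # same distance on the other side of 8
p_two = 2 * p_left                       # what a two-tailed test would report
print(f"t = {t_stat:.2f}, left-tailed p = {p_left:.3f}")
print(f"two-tailed: below {x_bar} or above {mirror:.1f}, p = {p_two:.3f}")
```

the left-tailed p-value comes out near 0.03. the two-tailed p-value is twice that, because the area beyond 8.6 on the right is added to the area below 7.4 on the left.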
had our p-value been 3%. it wouldn't be, because everything changes a little bit when you have two tailed tests, but the point is that if you want to find out for two tailed tests, you have to go left and right, and if we are only given the value on the left, you have to see how far it is from the mean, that's 0.6, so you go right 0.6 and that's 8.6. let's look at the level of significance, which was 0.05, and interpret that. what that says is if the population mean number of hours of sleep that college students get is 8 and if another 45 students were randomly surveyed then there would be a 5% chance, that's our alpha, that this new survey would lead us to falsely conclude that the population mean number of hours of sleep per night
for college students is less than 8 hours per night. that's the type i error. let's do one last example before we call it a day. the last example is the following: the average fuel efficiency for u.s. passenger cars is 24.1 mpg. it must be true because i found it online. a study was done to see if the average fuel efficiency for cars in tahoe is different. maybe it is worse, because our cars happen to look like this, big and huge, which happens to be a chevy tahoe. so the question
is, is it different? it might be better. it might be worse. i think it will be worse actually but we'll see. the 36 cars that were part of the study had an average fuel efficiency of 22.3 mpg, that is different but is it different enough?, and a standard deviation of 7 mpg. what can be concluded at the 1% level of significance? what we will do is go to our calculator. i hit stat just like before. again we want to find out about the mean, a single population mean. go down to t-test and i hit enter. we have stats. we don't have data. again, had i given you the data for the 36 cars,
what their individual mpg were, then we would go to data. we would be putting in l1 all the different numbers first, but here to save time i am just showing what the stats are. i go to mu0. in this case it is not 8. we want to find out if it's different from 24.1, so 24.1. that's what the problem stated. x-bar was the sample mean, which is 22.3. i put that in. the sample standard deviation was 7. i put 7 in. our sample size was 36.
we want to find out if it was different. it is a not equal to mu0. i hit enter on not equal to. i go to calculate. here what we have is our t, our test statistic, is -1.54. we say that the statistic is about t = -1.54. the p-value is about 0.13. let's go back to our powerpoint and discuss this. here's the powerpoint. the first thing is always check, hopefully you remember, to make sure the sample size is large enough. n was 36 and yes, 36 is bigger than 30, big enough to do this, to make the p-value valid.
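the fuel efficiency numbers can be checked the same way. this is a minimal python sketch of a two-tailed one-sample t-test from summary statistics, compared against the 1% level of significance stated in the problem. as before, the t distribution is computed by hand with simpson's rule as an assumption to avoid extra libraries, and the function names are made up for this illustration.

```python
from math import gamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T >= |t|): 0.5 minus the Simpson's-rule area between 0 and |t|."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def two_tailed_t_test(x_bar, s, n, mu0):
    """t statistic and two-tailed p-value for H0: mu = mu0."""
    t_stat = (x_bar - mu0) / (s / sqrt(n))
    return t_stat, 2 * upper_tail(t_stat, n - 1)

alpha = 0.01                                   # 1% level from the problem
t_stat, p_value = two_tailed_t_test(x_bar=22.3, s=7, n=36, mu0=24.1)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"t = {t_stat:.2f}, p-value = {p_value:.2f}, {decision}")
```

the t of about -1.54 and p-value of about 0.13 agree with the calculator, and 0.13 is well above 0.01.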
we can say the p-value was 0.13, which is greater than 0.01, our level of significance for this problem. when it's greater than alpha we fail to reject the null hypothesis. big p's are bad. remember that big p's mean we fail. now we can state our conclusion that there is statistically insignificant evidence to support the claim that the population mean fuel efficiency for cars in tahoe, not just the 36 cars but all cars in tahoe, is not 24.1 mpg. remember this is just a very fancy way
of saying i did a lot of work and at the end i really don't know anything. it's kind of a drag, but that's what you have to say. at least if you have to say you don't know anything, say you don't know anything with authority. there is statistically insignificant evidence. that's all i have for today and for this chapter so i hope you understand. otherwise go through it again, watch some of the many videos, go online and find more stuff on hypothesis testing, and we'll move on to hypothesis testing for more than one sample
after this in the next video. thanks a lot for coming. i hope this all makes sense. otherwise please ask your instructor or ask me. i will be happy to help you to clarify any difficulties you might have had. thank you very much. this is the end of the video.