The null hypothesis in statistics: concept, examples, and testing

STATISTICAL HYPOTHESES

Sample data obtained in experiments are always limited and largely random in nature. That is why mathematical statistics is used to analyze such data: it makes it possible to generalize the patterns found in the sample and extend them to the entire general population.

The data obtained from an experiment on any sample serve as the basis for judging the general population. However, owing to random causes, an estimate of the parameters of the general population made from experimental (sample) data is always accompanied by an error, and therefore such estimates should be regarded as conjectural rather than as final statements. Such assumptions about the properties and parameters of the general population are called statistical hypotheses. According to G.V. Sukhodolsky, "a statistical hypothesis is usually understood as a formal assumption that the similarity (or difference) of some parametric or functional characteristics is accidental or, conversely, non-accidental."

The essence of testing a statistical hypothesis is to establish whether the experimental data agree with the hypothesis put forward, that is, whether the discrepancy between the hypothesis and the result of the statistical analysis of the experimental data can be attributed to random causes. Thus, a statistical hypothesis is a scientific hypothesis that admits statistical testing, and mathematical statistics is the scientific discipline whose task is the scientific testing of statistical hypotheses.

Statistical hypotheses are classified into null and alternative, directional and non-directional.

The null hypothesis (H0) is the hypothesis that there are no differences. If we want to prove that the differences are significant, the null hypothesis must be refuted; otherwise it must be confirmed.

The alternative hypothesis (H1) is the hypothesis that the differences are significant. This is what we want to prove, which is why it is sometimes called the experimental hypothesis.

There are problems in which we want to prove precisely the insignificance of differences, that is, to confirm the null hypothesis: for example, if we need to make sure that different subjects receive tasks that, although different, are balanced in difficulty, or that the experimental and control samples do not differ in any significant characteristics. More often, however, we need to prove the significance of differences, for they are more informative in our search for something new.

Null and alternative hypotheses can be directional and non-directional.

Directional hypotheses are formulated if it is assumed that the values of the characteristic are higher in one group and lower in the other:

H0: X1 does not exceed X2,

H1: X1 exceeds X2.

Non-directional hypotheses are formulated if it is assumed only that the distributions of the characteristic in the groups differ:

H0: X1 does not differ from X2,

H1: X1 differs from X2.

If we have noticed that in one of the groups the individual values of the subjects on some characteristic, for example social activity, are higher, and in the other lower, then to test the significance of these differences we need to formulate directional hypotheses.

If we want to prove that in group A, under some experimental influence, more pronounced changes occurred than in group B, then we also need to formulate directional hypotheses.

If we want to prove that the distributions of the characteristic in groups A and B differ in form, then non-directional hypotheses are formulated.

Hypothesis testing is carried out using criteria for the statistical assessment of differences.

The conclusion reached is called a statistical decision. Let us emphasize that such a decision is always probabilistic. When testing a hypothesis, the experimental data may contradict the hypothesis H0, in which case this hypothesis is rejected. Otherwise, i.e. if the experimental data agree with H0, it is not rejected; in such cases it is often said that the hypothesis H0 is accepted. This shows that statistical testing of hypotheses on experimental sample data is inevitably associated with the risk (probability) of making a false decision. Errors of two kinds are possible. An error of the first kind occurs when a decision is made to reject the hypothesis H0 although it is in fact true. An error of the second kind occurs when a decision is made not to reject the hypothesis H0 although it is in fact false. Obviously, correct conclusions can also be reached in two cases. Table 7.1 summarizes the above.

Table 7.1

Decision            | H0 is true               | H0 is false
H0 is rejected      | error of the first kind  | correct decision
H0 is not rejected  | correct decision         | error of the second kind

It is possible that the psychologist may be mistaken in his statistical decision; as Table 7.1 shows, these errors can be of only two kinds. Since errors in accepting statistical hypotheses cannot be excluded entirely, the possible consequences of accepting an incorrect statistical hypothesis must be minimized. In most cases, the only way to reduce both errors is to increase the sample size.

STATISTICAL CRITERIA

A statistical test (criterion) is a decision rule that ensures reliable behavior, that is, acceptance of a true hypothesis and rejection of a false hypothesis with high probability.

The term "statistical criterion" also refers both to the method of calculating a certain number and to that number itself.

When we say that the reliability of differences was determined by the φ* criterion (Fisher's angular transformation), we mean that the φ* method was used to calculate a specific number.

By comparing the empirical value of the criterion with its critical value, we can judge whether the null hypothesis is confirmed or refuted.

In most cases, for differences to be recognized as significant, the empirical value of the criterion must exceed the critical value, although there are criteria (for example, the Mann-Whitney test or the sign test) for which the opposite rule applies.

In some cases, the calculation formula of the criterion includes the number of observations in the sample under study, denoted n. In this case, the empirical value of the criterion by itself suffices for testing the statistical hypothesis: using a special table, we determine what level of statistical significance of differences a given empirical value corresponds to. An example of such a criterion is the φ* criterion based on Fisher's angular transformation.
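To make the computation concrete, here is a minimal Python sketch of the φ* statistic. It is a sketch, not taken from this text: the proportions and sample sizes are hypothetical, and the critical values 1.64 (p ≤ 0.05) and 2.31 (p ≤ 0.01) are the standard tabulated ones for this criterion.

```python
import math

def fisher_phi_star(p1, n1, p2, n2):
    """Empirical value of Fisher's angular transformation criterion (phi*).

    p1, p2 -- proportions of subjects showing the effect in the two samples;
    n1, n2 -- sample sizes.
    """
    phi1 = 2 * math.asin(math.sqrt(p1))
    phi2 = 2 * math.asin(math.sqrt(p2))
    return abs(phi1 - phi2) * math.sqrt(n1 * n2 / (n1 + n2))

# Hypothetical data: the effect in 60% of a sample of 20 vs 25% of a sample of 16.
phi_emp = fisher_phi_star(0.60, 20, 0.25, 16)
print(round(phi_emp, 2))  # ~2.16; compare with 1.64 (p = 0.05) and 2.31 (p = 0.01)
```

Because n1 and n2 enter the formula directly, the result is compared with fixed critical values that do not depend on the sample sizes.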

In most cases, however, the same empirical value of the criterion may turn out to be significant or insignificant depending on the number of observations in the sample (n) or on the so-called number of degrees of freedom, denoted ν or df.

The number of degrees of freedom ν equals the number of classes of the variation series minus the number of conditions under which it was formed. These conditions include the sample size (n), the means, and the variances.

Suppose a group of 50 people was divided into three classes according to the following principle:

Knows how to work on a computer;

Knows how to perform only certain operations;

Can't work on a computer.

The first and second classes included 20 people each, the third, 10.

We are constrained by one condition, the sample size. Therefore, even if we lose the data on how many people cannot work on a computer, we can recover it, knowing that the first and second classes contain 20 subjects each. We are not free in determining the number of subjects in the third category; "freedom" extends only to the first two cells of the classification, so the number of degrees of freedom here is ν = 3 − 1 = 2.

Let's get acquainted with the terminology used in hypothesis testing.

· H0, the null hypothesis (the skeptic's hypothesis), is the hypothesis that there is no difference between the compared samples. The skeptic believes that the differences between the sample estimates obtained from the research results are accidental.

· H1, the alternative hypothesis (the optimist's hypothesis), is the hypothesis that there are differences between the compared samples. The optimist believes that the differences between the sample estimates are caused by objective reasons and correspond to differences between the general populations.

Testing statistical hypotheses is feasible only when it is possible to construct some quantity (criterion) whose distribution law, assuming H0 is true, is known. For this quantity one can then specify an interval into which its value falls with a given probability P_d. This interval is the region of acceptance of the hypothesis; the values outside it form the critical region. If the value of the criterion falls into the critical region, hypothesis H0 is rejected and hypothesis H1 is accepted; otherwise, hypothesis H0 is accepted.

In medical research, P_d = 0.95 or P_d = 0.99 is used. These values correspond to significance levels α = 0.05 and α = 0.01.

In testing statistical hypotheses, the level of significance (α) is the probability of rejecting the null hypothesis when it is true.

Note that, in essence, the hypothesis testing procedure aims to detect differences, not to confirm their absence. When the value of the criterion falls outside the acceptance region, we can say with a clear conscience to the "skeptic": what more do you want? If there were no differences, then with probability 95% (or 99%) the calculated value would have remained within the specified limits. But it did not!

If, on the other hand, the value of the criterion remains within the acceptance region, there are no grounds for rejecting the hypothesis H0. But this does not prove that H0 is true; it most likely indicates one of two possible reasons.



a) The sample sizes are not large enough to detect the existing differences. It is likely that continued experimentation will bring success.

b) There are differences, but they are so small that they have no practical value. In this case, continuing the experiments makes no sense.

Let's move on to considering some of the statistical hypotheses used in medical research.

§ 3.6. Testing hypotheses about the equality of variances: Fisher's F-test

In some clinical studies, a positive effect is evidenced not so much by the magnitude of the parameter under study as by its stabilization, the reduction of its fluctuations. In this case, the question arises of comparing two general variances on the basis of a sample survey. This problem can be solved with Fisher's test.

Formulation of the problem

Two samples have been drawn from general populations with a normal distribution law. The sample sizes are n1 and n2, and the sample variances are s1² and s2², respectively. It is required to compare the general variances with each other.

Testable hypotheses:

H0: the general variances are the same;

H1: the general variances are different.

It has been shown that if the samples are drawn from general populations with a normal distribution, then, when hypothesis H0 is true, the ratio of the sample variances obeys the Fisher distribution. Therefore, the value taken as the criterion for checking the validity of H0 is F, calculated by the formula

F = s1² / s2²,

where s1² and s2² are the sample variances.

This ratio obeys the Fisher distribution with ν1 = n1 − 1 degrees of freedom in the numerator and ν2 = n2 − 1 degrees of freedom in the denominator. The boundaries of the critical region are found from Fisher distribution tables or using the computer function FRASPINV.

For the example presented in Table 3.4 we get: ν1 = ν2 = 20 − 1 = 19; F = 2.16 / 4.05 = 0.53. At α = 0.05 the boundary points are, respectively, F_left = 0.40 and F_right = 2.53.

The value of the criterion falls between these boundaries, i.e. within the acceptance region; therefore, hypothesis H0 is accepted: the general variances of the samples are the same.
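The boundary values quoted above can be reproduced with scipy's implementation of the Fisher distribution; a minimal sketch, assuming only the summary figures given in the example (scipy.stats.f.ppf plays the role of FRASPINV here):

```python
from scipy import stats

# Two samples of size 20 with sample variances 2.16 and 4.05 (from Table 3.4).
n1 = n2 = 20
s2_1, s2_2 = 2.16, 4.05

F = s2_1 / s2_2                       # empirical value of the criterion, ~0.53
df1, df2 = n1 - 1, n2 - 1             # degrees of freedom: 19 and 19

alpha = 0.05
F_left = stats.f.ppf(alpha / 2, df1, df2)        # ~0.40
F_right = stats.f.ppf(1 - alpha / 2, df1, df2)   # ~2.53

print(round(F, 2), round(F_left, 2), round(F_right, 2))
# F = 0.53 lies between the boundaries, so H0 (equal variances) is accepted.
```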

§ 3.7. Testing hypotheses about the equality of means: Student's t-test

The problem of comparing the means of two general populations arises when the magnitude of the characteristic under study is of practical importance: for example, when comparing the durations of treatment by two different methods, or the numbers of complications arising from their use. In this case, Student's t-test can be used.

Formulation of the problem.

Two samples (X1) and (X2) have been obtained, drawn from general populations with a normal distribution law and equal variances. The sample sizes are n1 and n2, the sample means are x̄1 and x̄2, and the sample variances are s1² and s2², respectively. It is required to compare the general means with each other.

Testable hypotheses:

H0: the general means are the same;

H1: the general means are different.

It is shown that, when hypothesis H0 is true, the value t calculated by the formula

t = (x̄1 − x̄2) / √{ [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2) · (1/n1 + 1/n2) },    (3.10)

is distributed according to Student's law with ν = n1 + n2 − 2 degrees of freedom. Here ν1 = n1 − 1 and ν2 = n2 − 1 are the numbers of degrees of freedom for the first and second samples, so that ν = ν1 + ν2.

The boundaries of the critical region are found from tables of the t-distribution or using the computer function STYUDRASP. The Student distribution is symmetric about zero, so the left and right boundaries of the critical region are equal in magnitude and opposite in sign: −t_gr and t_gr.

For the example presented in Table 3.4 we get: ν1 = ν2 = 20 − 1 = 19; t = −2.51, ν = 38. At α = 0.05, t_gr = 2.02.

The value of the criterion goes beyond the left boundary of the critical region; therefore, we accept hypothesis H1: the general means are different, and the mean of the general population from which the first sample was drawn is smaller.
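A minimal sketch of formula (3.10) from summary statistics. Since Table 3.4 is not reproduced here, the sample means below are hypothetical values chosen so that t matches the quoted −2.51; the variances and sample sizes are those of the example.

```python
import math
from scipy import stats

n1 = n2 = 20
mean1, mean2 = 3.1, 4.5          # assumed sample means (difference -1.4)
s2_1, s2_2 = 2.16, 4.05          # sample variances from the example

# pooled variance and the t statistic of formula (3.10)
s2_pooled = ((n1 - 1) * s2_1 + (n2 - 1) * s2_2) / (n1 + n2 - 2)
t = (mean1 - mean2) / math.sqrt(s2_pooled * (1 / n1 + 1 / n2))

df = n1 + n2 - 2                        # 38 degrees of freedom
t_gr = stats.t.ppf(1 - 0.05 / 2, df)    # ~2.02 at alpha = 0.05

print(round(t, 2), round(t_gr, 2))
# |t| = 2.51 > 2.02: H1 is accepted, the general means are different.
```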

5. The main problems of applied statistics - data description, estimation and hypothesis testing

Basic concepts used in hypothesis testing

A statistical hypothesis is any assumption about the unknown distribution of random variables (elements). Here are formulations of several statistical hypotheses:

1. The observation results have a normal distribution with zero mathematical expectation.
2. The observation results have a distribution function N(0,1).
3. The observation results have a normal distribution.
4. The results of observations in two independent samples have the same normal distribution.
5. The results of observations in two independent samples have the same distribution.

One distinguishes between null and alternative hypotheses. The null hypothesis is the hypothesis to be tested. An alternative hypothesis is any admissible hypothesis other than the null one. The null hypothesis is denoted H0, the alternative H1 (from the English word "hypothesis").

The choice of one or another null and alternative hypothesis is determined by the applied problems facing the manager, economist, engineer, or researcher. Let us consider some examples.

Example 11. Let the null hypothesis be hypothesis 2 from the above list and the alternative be hypothesis 1. This means that the real situation is described by a probabilistic model in which the observation results are regarded as realizations of independent identically distributed random variables with distribution function N(0, σ), where the parameter σ is unknown to the statistician. Within this model the null hypothesis is written as follows:

H0: σ = 1,

and the alternative as follows:

H1: σ ≠ 1.

Example 12. Let the null hypothesis still be hypothesis 2 from the above list and the alternative be hypothesis 3 from the same list. Then, in a probabilistic model of a managerial, economic, or industrial situation, it is assumed that the observation results form a sample from the normal distribution N(m, σ) for some values of m and σ. The hypotheses are written as follows:

H0: m = 0, σ = 1

(both parameters take fixed values);

H1: m ≠ 0 and/or σ ≠ 1

(i.e. either m ≠ 0, or σ ≠ 1, or both m ≠ 0 and σ ≠ 1).

Example 13. Let H0 be hypothesis 1 from the above list and H1 be hypothesis 3 from the same list. Then the probabilistic model is the same as in Example 12, and

H0: m = 0, σ arbitrary;

H1: m ≠ 0, σ arbitrary.

Example 14. Let H0 be hypothesis 2 from the above list, and let H1 state that the observation results have a distribution function F(x) that does not coincide with the standard normal distribution function Φ(x). Then

H0: F(x) = Φ(x) for all x (written F(x) ≡ Φ(x));

H1: F(x0) ≠ Φ(x0) for some x0 (i.e. it is not true that F(x) ≡ Φ(x)).

Note. Here ≡ is the sign of identical coincidence of functions (i.e. coincidence for all possible values of the argument x).

Example 15. Let H0 be hypothesis 3 from the above list, and let H1 state that the observation results have a distribution function F(x) that is not normal. Then

H0: F(x) ≡ Φ((x − m)/σ) for some m, σ;

H1: for any m, σ there exists x0 = x0(m, σ) such that F(x0) ≠ Φ((x0 − m)/σ).

Example 16. Let H0 be hypothesis 4 from the above list; according to the probabilistic model, two samples are drawn from populations with distribution functions F(x) and G(x) that are normal with parameters m1, σ1 and m2, σ2, respectively, and H1 is the negation of H0. Then

H0: m1 = m2, σ1 = σ2, with m1 and σ1 arbitrary;

H1: m1 ≠ m2 and/or σ1 ≠ σ2.

Example 17. Suppose that, under the conditions of Example 16, it is additionally known that σ1 = σ2 = σ. Then

H0: m1 = m2, σ > 0, with m1 and σ arbitrary;

H1: m1 ≠ m2, σ > 0.

Example 18. Let H0 be hypothesis 5 from the above list; according to the probabilistic model, two samples are drawn from populations with distribution functions F(x) and G(x), respectively, and H1 is the negation of H0. Then

H0: F(x) ≡ G(x), where F(x) is an arbitrary distribution function;

H1: F(x) and G(x) are arbitrary distribution functions with F(x0) ≠ G(x0) for some x0.

Example 19. Let, under the conditions of Example 18, it be additionally assumed that the distribution functions F(x) and G(x) differ only by a shift, i.e. G(x) = F(x − a) for some a. Then

H0: F(x) ≡ G(x), where F(x) is an arbitrary distribution function;

H1: G(x) = F(x − a) with a ≠ 0, where F(x) is an arbitrary distribution function.

Example 20. Let, under the conditions of Example 14, it be additionally known that, according to the probabilistic model of the situation, F(x) is a normal distribution function with unit variance, i.e. it has the form N(m, 1). Then

H0: m = 0 (i.e. F(x) = Φ(x) for all x, written F(x) ≡ Φ(x));

H1: m ≠ 0 (i.e. it is not true that F(x) ≡ Φ(x)).

Example 21. In the statistical regulation of technological, economic, managerial, or other processes, one considers a sample drawn from a population with a normal distribution and known variance, and the hypotheses

H0: m = m0,

H1: m = m1,

where the parameter value m = m0 corresponds to the smooth running of the process, and the transition to m = m1 indicates a disorder.

Example 22. In statistical acceptance control, the number of defective units in the sample obeys a hypergeometric distribution; the unknown parameter is p = D/N, the defectiveness level, where N is the size of the product batch and D is the total number of defective units in the batch. Control plans used in regulatory, technical, and commercial documents (standards, supply agreements, etc.) are often aimed at testing the hypothesis

H0: p ≤ AQL

against

H1: p ≥ LQ,

where AQL is the acceptance level of defectiveness and LQ is the rejection level of defectiveness (obviously AQL < LQ).

Example 23. As indicators of the stability of a technological, economic, managerial, or other process, a number of characteristics of the distributions of the controlled indicators are used, in particular the coefficient of variation v = σ/M(X). It is required to test the null hypothesis

H0: v ≤ v0

against the alternative hypothesis

H1: v > v0,

where v0 is some predetermined limit value.

Example 24. Let the probabilistic model of two samples be the same as in Example 18, and denote the mathematical expectations of the observation results in the first and second samples by M(X) and M(Y), respectively. In a number of situations one tests the null hypothesis

H0: M(X) = M(Y)

against the alternative hypothesis

H1: M(X) ≠ M(Y).

Example 25. The great importance in mathematical statistics of distribution functions symmetric about 0 was noted above. When checking symmetry, the hypotheses are

H0: F(−x) = 1 − F(x) for all x, with F otherwise arbitrary;

H1: F(−x0) ≠ 1 − F(x0) for some x0, with F otherwise arbitrary.

In probabilistic-statistical methods of decision-making, many other formulations of problems for testing statistical hypotheses are used. Some of them are discussed below.

A specific problem of testing a statistical hypothesis is fully described only when both the null and the alternative hypothesis are given. The choice of the method for testing a statistical hypothesis, and the properties and characteristics of the methods, are determined by both the null and the alternative hypothesis. Generally speaking, different methods should be used to test the same null hypothesis under different alternative hypotheses. Thus, in Examples 14 and 20 the null hypothesis is the same while the alternatives differ. Therefore, under the conditions of Example 14 one should apply methods based on goodness-of-fit criteria for a parametric family (of the Kolmogorov type or the omega-square type), while under the conditions of Example 20 one should apply methods based on the Student's test or the Cramer-Welch test. If the Student's t-test is used under the conditions of Example 14, it will not solve the problem posed. If, under the conditions of Example 20, a Kolmogorov-type goodness-of-fit test is used, it will, on the contrary, solve the problem, although possibly worse than the Student's t-test specially adapted to this case.
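The difference between the two settings can be illustrated with a short sketch on simulated data (the data and the random seed are arbitrary): the Kolmogorov test addresses the general alternative of Example 14, while the one-sample Student's test addresses the narrower alternative of Example 20.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=200)   # simulated observations under H0

# Example 14's alternative (F may differ from the standard normal in any way):
# a goodness-of-fit criterion such as the Kolmogorov test is appropriate.
ks_stat, ks_p = stats.kstest(x, "norm")

# Example 20's alternative (only the mean may differ, variance is fixed at 1):
# the Student's test is the specially adapted tool.
t_stat, t_p = stats.ttest_1samp(x, popmean=0.0)

print(ks_p, t_p)   # with this null data both p-values are typically large,
                   # so H0 is not rejected in either setting
```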

When processing real data, the correct choice of the hypotheses H0 and H1 is of great importance. The assumptions made, for example the normality of the distribution, must be carefully substantiated, in particular by statistical methods. Note that in the overwhelming majority of specific applied settings the distribution of observation results differs from the normal one.

A situation often arises in which the form of the null hypothesis follows from the formulation of the applied problem while the form of the alternative hypothesis is not clear. In such cases one should consider an alternative hypothesis of the most general type and use methods that solve the problem for all possible H1. In particular, when testing hypothesis 2 (from the above list) as the null, one should use as the alternative the hypothesis H1 from Example 14, not the one from Example 20, unless there is special justification for the normality of the distribution of observation results under the alternative hypothesis.


On the basis of the data collected in statistical studies, after their processing, conclusions are drawn about the phenomena under study. These conclusions are made by putting forward and testing statistical hypotheses.

A statistical hypothesis is any statement about the form or properties of the distribution of experimentally observed random variables. Statistical hypotheses are tested by statistical methods.

The hypothesis to be tested is called the main (null) hypothesis and is denoted H0. Besides the null hypothesis, one also considers an alternative (competing) hypothesis H1 that denies the main one. Thus, as a result of testing, exactly one of the hypotheses will be accepted and the other rejected.

Types of errors. The hypothesis put forward is tested on the basis of a sample obtained from the general population. Because of the randomness of the sample, the test does not always lead to a correct conclusion. The following situations may arise:
1. The main hypothesis is correct and accepted.
2. The main hypothesis is correct, but it is rejected.
3. The main hypothesis is incorrect and it is rejected.
4. The main hypothesis is not correct, but it is accepted.
In case 2 one speaks of an error of the first kind; in case 4, of an error of the second kind.
Thus, for some samples the correct decision is made, and for others the wrong one. The decision is made from the value of some function of the sample, called a statistical characteristic, a statistical criterion, or simply a statistic. The set of values of this statistic can be divided into two disjoint subsets:

  • the subset of statistic values for which H0 is accepted (not rejected), called the hypothesis acceptance region (admissible region);
  • the subset of statistic values for which the hypothesis H0 is rejected and the hypothesis H1 is accepted, called the critical region.

Conclusions:

  1. A criterion is a random variable K that allows one to accept or reject the null hypothesis H0.
  2. When testing hypotheses, errors of two kinds can be made.
    An error of the first kind consists in rejecting the hypothesis H0 when it is in fact correct (a "false alarm"). The probability of committing an error of the first kind is denoted α and is called the level of significance. Most often in practice it is taken that α = 0.05 or α = 0.01.
    An error of the second kind consists in accepting the hypothesis H0 when it is in fact incorrect (a "missed target"). The probability of an error of this kind is denoted β.

Classification of hypotheses

The main hypothesis H0 about the value of an unknown parameter q of the distribution usually looks like this:
H0: q = q0.
The competing hypothesis H1 can then have one of the following forms:
H1: q < q0, H1: q > q0, or H1: q ≠ q0.
Accordingly, one obtains a left-sided, right-sided, or two-sided critical region. The boundary points of the critical regions (the critical points) are determined from the distribution tables of the corresponding statistic.

When testing a hypothesis, it is prudent to reduce the probability of making wrong decisions. The admissible probability of an error of the first kind is usually denoted α and called the level of significance. Its value is usually small (0.1, 0.05, 0.01, 0.001, ...). But a decrease in the probability of an error of the first kind leads to an increase in the probability of an error of the second kind (β); i.e., the desire to accept only correct hypotheses causes an increase in the number of rejected correct hypotheses. Therefore, the choice of the level of significance is determined by the importance of the problem posed and the severity of the consequences of a wrong decision.
Statistical hypothesis testing consists of the following steps:
1) defining the hypotheses H0 and H1;
2) selecting the statistic and setting the level of significance;
3) determining the critical points K_cr and the critical region;
4) computing the value of the statistic from the sample, K_exp;
5) comparing the value of the statistic with the critical region (K_cr and K_exp);
6) making the decision: if the value of the statistic does not fall into the critical region, then hypothesis H0 is accepted and hypothesis H1 is rejected; if it falls into the critical region, then hypothesis H0 is rejected and hypothesis H1 is accepted. The results of testing a statistical hypothesis should be interpreted as follows: if hypothesis H1 has been accepted, it can be considered proven; if hypothesis H0 has been accepted, it has only been recognized as not contradicting the observation results, a property that other hypotheses besides H0 may share.

Classification of hypothesis tests

Let us now consider several different statistical hypotheses and mechanisms for testing them.
I) Hypothesis about the general mean of a normal distribution with unknown variance. We assume that the general population has a normal distribution whose mean and variance are unknown, but there is reason to believe that the general mean is equal to a. At the significance level α the hypothesis H0: x̄ = a is to be tested; as the alternative, one of the three hypotheses discussed above can be used. The statistic here is the random variable

t = (x̄ − a)√n / s,

where s is the corrected sample standard deviation; it has a Student's distribution with n − 1 degrees of freedom. The corresponding experimental (observed) value t_exp is computed. The critical value t_cr under the alternative H1: x̄ > a is found from the significance level α and the number of degrees of freedom n − 1; if t_exp < t_cr, the null hypothesis is accepted. Under the alternative H1: x̄ ≠ a, the critical value is found from the significance level α/2 and the same number of degrees of freedom; the null hypothesis is accepted if |t_exp| < t_cr.

II) Hypothesis of the equality of two means of normally distributed general populations (large independent samples). At the significance level α the hypothesis H0: x̄ = ȳ is to be tested. If both samples are large, we may assume that the sample means have a normal distribution and that their variances are known. In this case the statistic is the random variable

Z = (x̄ − ȳ) / √(D(X)/n_x + D(Y)/n_y),

which has a normal distribution with M(Z) = 0 and D(Z) = 1. The corresponding experimental value z_exp is computed, and the critical value z_cr is found from the table of the Laplace function. Under the alternative hypothesis H1: x̄ > ȳ it is found from the condition Φ(z_cr) = 0.5 − α; if z_exp < z_cr, the null hypothesis is accepted, otherwise it is rejected. Under the alternative hypothesis H1: x̄ ≠ ȳ, the critical value is found from the condition Φ(z_cr) = 0.5(1 − α); the null hypothesis is accepted if |z_exp| < z_cr.

III) Hypothesis of the equality of two means of normally distributed general populations whose variances are unknown but equal (small independent samples). At the significance level α the main hypothesis H0: x̄ = ȳ is to be tested. As the statistic we use the random variable

t = (x̄ − ȳ) / √{ [(n_x − 1)s_x² + (n_y − 1)s_y²] / (n_x + n_y − 2) · (1/n_x + 1/n_y) },

which has a Student's distribution with (n_x + n_y − 2) degrees of freedom. The corresponding experimental value t_exp is computed, and the critical value t_cr is found from the table of critical points of the Student's distribution. Everything is then solved similarly to hypothesis (I).

IV) Hypothesis of the equality of two variances of normally distributed general populations. In this case, at the significance level α, the hypothesis H0: D(X) = D(Y) is to be tested. The statistic is the random variable F = s_b² / s_m², which has the Fisher-Snedecor distribution with f1 = n_b − 1 and f2 = n_m − 1 degrees of freedom (s_b² is the larger of the two sample variances and n_b is the size of its sample). The corresponding experimental (observed) value F_exp is computed. The critical value F_cr under the alternative hypothesis H1: D(X) > D(Y) is found from the table of critical points of the Fisher-Snedecor distribution by the significance level α and the degrees of freedom f1 and f2. The null hypothesis is accepted if F_exp < F_cr.


V) Hypothesis of the equality of several variances of normally distributed general populations based on samples of the same size. In this case, at the significance level α, the hypothesis H0: D(X1) = D(X2) = ... = D(Xl) is to be tested. The statistic is the random variable G = s_max² / (s_1² + s_2² + ... + s_l²), which has the Cochran distribution with degrees of freedom f = n − 1 and l (n is the size of each sample, l is the number of samples). This hypothesis is tested in the same way as the previous one, using a table of critical points of the Cochran distribution.
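A sketch of the computation on simulated data. The statistic G = s_max²/Σs_i² is computed directly; since tables of the Cochran distribution are not bundled with scipy, the final comparison is still made against a tabulated critical value.

```python
import numpy as np

# l = 4 samples of equal size n = 10, simulated here under H0 (equal variances).
rng = np.random.default_rng(1)
samples = [rng.normal(0, 1, size=10) for _ in range(4)]

variances = [np.var(s, ddof=1) for s in samples]   # corrected sample variances
G = max(variances) / sum(variances)                # Cochran's statistic

print(round(G, 3))   # compare with the table value G_cr(alpha, f = 9, l = 4)
```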

VI) Hypothesis about the significance of the correlation. In this case, at the significance level α, the hypothesis H0: r = 0 is to be tested (if the correlation coefficient is zero, the corresponding quantities are not related to each other). The statistic in this case is the random variable

t = r√(n − 2) / √(1 − r²),

which has a Student's distribution with f = n − 2 degrees of freedom. This hypothesis is tested in the same way as hypothesis (I).
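A small sketch with hypothetical values of r and n:

```python
import math
from scipy import stats

def corr_t_test(r, n, alpha=0.05):
    """t statistic for H0: r = 0 and the two-sided critical value."""
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    t_cr = stats.t.ppf(1 - alpha / 2, n - 2)
    return t, t_cr

# Hypothetical values: r = 0.45 computed from n = 30 pairs of observations.
t, t_cr = corr_t_test(0.45, 30)
print(round(t, 2), round(t_cr, 2))   # 2.67 > 2.05: the correlation is significant
```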


VII) Hypothesis about the value of the probability of occurrence of an event. A fairly large number n of independent trials has been carried out, in which the event A occurred m times. There is reason to believe that the probability of this event occurring in a single trial is p0. At the significance level α, the hypothesis that the probability of event A equals the hypothetical probability p0 is to be tested. (Since the probability is estimated by the relative frequency, the hypothesis being tested can also be formulated as follows: do the observed relative frequency and the hypothetical probability differ significantly or not?)
The number of trials is large enough for the relative frequency m/n of the event A to be distributed according to the normal law. If the null hypothesis is true, its mathematical expectation is p0 and its variance is p0(1 − p0)/n. In accordance with this, as the statistic we choose the random variable

Z = (m/n − p0) / √(p0(1 − p0)/n),

which is distributed approximately according to the normal law with zero mathematical expectation and unit variance. This hypothesis is tested in exactly the same way as in case (I).
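A sketch with hypothetical counts m and n and a hypothetical p0:

```python
import math
from scipy import stats

# Hypothetical data: the event occurred m = 63 times in n = 100 trials;
# the hypothetical probability is p0 = 0.5.
m, n, p0, alpha = 63, 100, 0.5, 0.05

Z = (m / n - p0) / math.sqrt(p0 * (1 - p0) / n)
z_cr = stats.norm.ppf(1 - alpha / 2)   # two-sided critical value, ~1.96

print(round(Z, 2), round(z_cr, 2))     # 2.6 > 1.96: the relative frequency
                                       # differs significantly from p0
```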


At different stages of statistical research and modeling, it becomes necessary to formulate and experimentally verify certain assumptions (hypotheses) about the nature and magnitude of unknown parameters of the general population (populations) being analyzed. For example, a researcher makes the assumption "the sample is drawn from a normal general population" or "the general mean of the analyzed population is five". Such assumptions are called statistical hypotheses.

Comparing the stated hypothesis about the general population with the available sample data, accompanied by a quantitative assessment of the degree of reliability of the resulting conclusion, is carried out with one or another statistical criterion and is called statistical hypothesis testing.

The hypothesis put forward is called the null (main) hypothesis and is customarily denoted H0.

In relation to the stated (main) hypothesis one can always formulate an alternative (competing) hypothesis contradicting it. An alternative (competing) hypothesis is usually denoted H1.

The purpose of statistical hypothesis testing is to decide, on the basis of the sample data, whether the main hypothesis H0 is valid.

If the hypothesis put forward reduces to the statement that the value of some unknown parameter of the general population is exactly equal to a given value, the hypothesis is called simple, for example: "the average per capita total income of the population of Russia is 650 rubles a month"; "the unemployment rate (the share of the unemployed in the economically active population) in Russia is 9%". In other cases the hypothesis is called composite.

As the null hypothesis H0 it is customary to put forward a simple hypothesis, since it is usually more convenient to test the stricter assertion.

The most commonly tested statistical hypotheses include:

Hypotheses about the form of the distribution law of the random variable under study;

Hypotheses about the numerical values of the parameters of the general population under study;

Hypotheses about the homogeneity of two or more samples or some characteristics of the analyzed populations;

Hypotheses about the general form of the model describing the statistical relationship between features, etc.

Since statistical hypotheses are tested on the basis of sample data, i.e. a limited number of observations, decisions about the null hypothesis H0 are probabilistic in nature. In other words, such a decision is inevitably accompanied by some probability, possibly very small, of an erroneous conclusion in either direction.



Thus, in some small fraction of cases α, the null hypothesis H0 may be rejected although it is actually true in the general population. This mistake is called an error of the first kind, and its probability is called the level of significance and denoted α.

Conversely, in some small fraction of cases β, the null hypothesis H0 is accepted although it is actually erroneous in the general population and the alternative hypothesis H1 is valid. This mistake is called an error of the second kind, and its probability is denoted β. The probability 1 − β is called the power of the criterion.

With a fixed sample size, one can choose at one's discretion the probability of only one of the errors, α or β; decreasing the probability of one of them increases the other. It is customary to fix the probability of an error of the first kind, the significance level α, using certain standard values: 0.1; 0.05; 0.025; 0.01; 0.005; 0.001. Then, obviously, of two criteria characterized by the same probability α of rejecting a valid hypothesis H0, one should accept the one with the smaller probability β of an error of the second kind, i.e. with the greater power. Reducing the probabilities of both errors α and β can be achieved by increasing the sample size.
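The trade-off between α and β can be made concrete with a sketch for a one-sided z-test; the effect size, σ, and n below are assumed purely for illustration.

```python
import math
from scipy import stats

# One-sided z-test of H0: m = 0 against H1: m = 0.5, with sigma = 1 and size n.
def beta_error(alpha, n, m1=0.5, sigma=1.0):
    z_cr = stats.norm.ppf(1 - alpha)        # rejection boundary under H0
    shift = m1 * math.sqrt(n) / sigma       # mean of the statistic under H1
    return stats.norm.cdf(z_cr - shift)     # beta = P(accept H0 | H1 is true)

for alpha in (0.1, 0.05, 0.01):
    print(alpha, round(beta_error(alpha, n=25), 3))
# Decreasing alpha raises beta; increasing n lowers both errors at once.
```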

A correct decision about the null hypothesis H0 can also be of two kinds:

The null hypothesis H0 is accepted while in fact, in the general population, the null hypothesis H0 is true; the probability of such a decision is 1 − α;

The null hypothesis H0 is rejected in favor of the alternative H1 while in fact, in the general population, the null hypothesis H0 is false and the alternative H1 is valid; the probability of such a decision, 1 − β, is the power of the criterion.

The possible outcomes of deciding on the null hypothesis can be illustrated by Table 8.1.

Table 8.1

Decision        | H0 is true in the population             | H0 is false (H1 is true)
H0 is accepted  | correct decision (probability 1 − α)     | error of the second kind (probability β)
H0 is rejected  | error of the first kind (probability α)  | correct decision, power (probability 1 − β)

Statistical hypothesis testing is carried out using a statistical criterion (call it K in general form), which is a function of the observation results.

A statistical criterion is a rule (formula) that determines the measure of discrepancy between the results of the sample observation and the stated hypothesis H0.

A statistical criterion, like any function of the observation results, is a random variable and, under the assumption that the null hypothesis H0 is valid, obeys some well-studied (and tabulated) theoretical distribution law with distribution density f(k).

The choice of a criterion for testing statistical hypotheses can be based on various principles. Most often the likelihood ratio is used, which allows one to construct the most powerful criterion among all possible ones. Its essence reduces to choosing a criterion K with a known density function f(k) under the validity of hypothesis H0, so that at a given significance level α one can find the critical point K_cr of the distribution f(k) that divides the range of values of the criterion into two parts: the range of admissible values, in which the results of the sample observation look most plausible, and the critical region, in which the results of the sample observation look less plausible with respect to the null hypothesis H0.

If such a criterion K has been selected and the density of its distribution is known, then the task of testing the statistical hypothesis reduces to calculating, at a given significance level α, the observed value of the criterion K_obs from the sample data and determining whether it is more or less plausible with respect to the null hypothesis H0.

Each type of statistical hypothesis is tested using the appropriate criterion that is the most powerful in each particular case. For example, the hypothesis about the form of the distribution law of a random variable can be tested using Pearson's goodness-of-fit test χ²; the hypothesis about the equality of the unknown variances of two general populations, using Fisher's F-test; a number of hypotheses about unknown values of parameters of general populations are tested using the criterion Z (a normally distributed random variable) and the criterion T (Student's t), etc.

The value of the criterion calculated according to special rules from the sample data is called the observed value of the criterion (K_obs).

The values of the criterion that divide its set of values into the range of admissible values (most plausible with respect to the null hypothesis H0) and the critical region (the range of values less plausible with respect to H0) are called critical points (K_cr); they are found from the distribution tables of the random variable K chosen as the criterion.

The range of admissible values (the region of acceptance of the null hypothesis H0) is the set of values of the criterion K for which H0 is not rejected.

The critical region is the set of values of the criterion K for which the null hypothesis H0 is rejected in favor of the competing hypothesis H1.

One distinguishes one-sided (right-sided or left-sided) and two-sided critical regions.

If the competing hypothesis is right-sided, for example H1: a > a0, then the critical region is right-sided (Figure 1). With a right-sided competing hypothesis, the critical point (K_cr, right-sided) takes positive values.

If the competing hypothesis is left-sided, for example H1: a < a0, then the critical region is left-sided (Figure 2). With a left-sided competing hypothesis, the critical point (K_cr, left-sided) takes negative values.

If the competing hypothesis is two-sided, for example H1: a ≠ a0, then the critical region is two-sided (Figure 3). With a two-sided competing hypothesis, two critical points are determined (K_cr left-sided and K_cr right-sided).
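Assuming, for illustration, that the criterion K has the standard normal distribution, the three kinds of critical points at α = 0.05 can be computed as follows:

```python
from scipy import stats

alpha = 0.05

# Right-sided critical region: K > K_cr
k_right = stats.norm.ppf(1 - alpha)      # ~1.64, positive

# Left-sided critical region: K < K_cr
k_left = stats.norm.ppf(alpha)           # ~-1.64, negative

# Two-sided critical region: |K| > K_cr, with alpha split between the tails
k_two = stats.norm.ppf(1 - alpha / 2)    # ~1.96

print(round(k_right, 2), round(k_left, 2), round(k_two, 2))
```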


[Figure: the range of admissible values and the critical region of the criterion's distribution]