Mann-Whitney U test

From CEOpedia | Management online
Revision as of 13:27, 2 November 2022 by Korten S

The Mann-Whitney U test (also known as the Wilcoxon rank-sum test, WRS) tests whether the central tendencies of two independent samples differ. It is used when the requirements for a t-test for independent samples are not met. The question it answers is often summarized as: "Do the central tendencies of two independent samples differ?"

Differences from the t-test

The Mann-Whitney U test is the nonparametric counterpart of the independent-samples t-test and is used when the conditions for a parametric procedure are not met. Nonparametric procedures are also called "prerequisite-free procedures" because they place fewer requirements on the distribution of the measured values in the population: the data need not be normally distributed, and the variables need only be ordinally scaled. The Mann-Whitney U test can also be used with small samples and with data that contain outliers (UZH, online).

Assumptions

  • Null Hypothesis:
    • The null hypothesis states that both groups under investigation are drawn from the same population.
    • Under the null hypothesis, the two independent groups are homogeneous and have the same distribution.

In a two-sided test, the alternative hypothesis H1, which is tested against the null hypothesis, states that the first population differs from the second population. If the data support H1, the null hypothesis is rejected.

  • The two groups studied must be drawn at random from the target population. The concept of randomness implies the absence of measurement and sampling error (Robert et al., 1988). In practice, errors of these types may be present but must remain small.
  • Each measurement or observation must correspond to a different participant. Statistically, there is independence within groups and mutual independence between groups.
  • The measurement scale is ordinal or continuous, so the observed values are of ordinal, relative, or absolute scale type (Nachar, 2008, 14-15).
  • There is an independent variable by means of which the two groups to be compared are formed (UZH, online).


The Test

The Mann-Whitney U test first requires the calculation of a U statistic for each group. These statistics have a known distribution under the null hypothesis established by Mann and Whitney (1947). Mathematically, the Mann-Whitney U statistic for each group is defined as follows:

$$U_x = n_x n_y + \frac{n_x (n_x + 1)}{2} - R_x$$

$$U_y = n_x n_y + \frac{n_y (n_y + 1)}{2} - R_y$$


Here n_x is the number of observations/participants in the first group, n_y the number of observations/participants in the second group, R_x the sum of the ranks of the first group, and R_y the sum of the ranks of the second group. After calculating the U statistics and choosing an appropriate significance threshold (α), the null hypothesis is either rejected or retained.
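The two formulas can be sketched in a few lines of Python. This is only an illustration: the function name `u_statistics` and the toy rank sums are assumptions made for the example and do not come from the cited sources.

```python
def u_statistics(n_x, n_y, r_x, r_y):
    """Return (U_x, U_y) from the group sizes and the two rank sums."""
    u_x = n_x * n_y + n_x * (n_x + 1) / 2 - r_x
    u_y = n_x * n_y + n_y * (n_y + 1) / 2 - r_y
    return u_x, u_y


# Hypothetical toy case: ranks 1, 2, 4 fall in group x and ranks 3, 5, 6 in group y.
u_x, u_y = u_statistics(3, 3, 1 + 2 + 4, 3 + 5 + 6)
print(u_x, u_y)        # 8.0 1.0
print(min(u_x, u_y))   # 1.0, the value that is compared against the tables
```

A useful sanity check is that the two statistics always add up to the product of the group sizes (here 8 + 1 = 3 · 3).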

H0 is rejected if, according to Mann and Whitney's tables, the p-value corresponding to min(U_x, U_y), the smaller of the two calculated U values, is below the specified α threshold. In short: reject H0 if p of min(U_x, U_y) < α (Nachar, 2008, 14-17).
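For small samples without ties, a p-value of this kind can also be obtained without printed tables by enumerating every way the ranks could be split between the two groups, since all splits are equally likely under H0. The sketch below takes that approach; the function name `exact_two_sided_p`, the two-sided definition used (share of splits whose smaller U is at most the observed one), and the toy input are assumptions for illustration.

```python
from itertools import combinations


def exact_two_sided_p(n_x, n_y, u_min_observed):
    """Share of equally likely rank splits whose min(U_x, U_y) is <= the observed one.

    Assumes there are no tied observations.
    """
    n = n_x + n_y
    total = 0
    extreme = 0
    for ranks_x in combinations(range(1, n + 1), n_x):
        r_x = sum(ranks_x)
        u_x = n_x * n_y + n_x * (n_x + 1) // 2 - r_x
        u_y = n_x * n_y - u_x  # the two U values always sum to n_x * n_y
        total += 1
        if min(u_x, u_y) <= u_min_observed:
            extreme += 1
    return extreme / total


alpha = 0.05  # assumed significance threshold
p = exact_two_sided_p(3, 3, 0)
print(p)          # 0.1
print(p < alpha)  # False: with 3 + 3 observations H0 cannot be rejected at 5 %
```

The enumeration grows quickly with the sample size, so this is only practical for small groups.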

Example

Twenty hospital patients are examined: 12 are under cardiological treatment and 8 are not. They all answer a questionnaire on general well-being (scores from 0 to 35, with 0 representing very high and 35 very low well-being). The aim is to test whether the central tendency of well-being differs between the cardiac patients and the other patients. In addition to the subject number (ID), the dataset contains the grouping variable (Group), which takes the value 1 for cardiac patients and 2 for other patients, and the well-being value (Data).

The Mann-Whitney U test is based on the idea of ranking the data. That is, it is not calculated with the measured values themselves, but these are replaced by ranks with which the actual test is performed. Thus, the calculation of the test is based exclusively on the ordering of the data (greater than, less than). The absolute distances between the values are not taken into account.

In the first step, the measured values of both groups are pooled and sorted by magnitude (see the General well-being column), regardless of group membership. The sorted values are then assigned ranks, starting from 1 and ascending, and each rank is entered in the column of the group to which the observation belongs. If the same measured value occurs several times, each of the tied observations receives the mean of the ranks they would otherwise occupy (so-called "tied ranks"). Finally, the ranks are added up within each group to give the two rank sums.
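The ranking step, including the treatment of tied values, might look as follows in Python. The helper name `average_ranks` and the five toy scores are hypothetical and chosen only to show one tie.

```python
def average_ranks(values):
    """Rank values ascending from 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the block [start, end] over all values equal to the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Hypothetical toy data: the value 7 occurs twice and occupies ranks 3 and 4.
scores = [7, 3, 7, 1, 9]
groups = [1, 2, 2, 1, 1]
ranks = average_ranks(scores)
print(ranks)  # [3.5, 2.0, 3.5, 1.0, 5.0]

# Rank sums per group, formed by adding up the ranks within each group.
rank_sums = {g: sum(r for r, gg in zip(ranks, groups) if gg == g) for g in set(groups)}
print(rank_sums[1], rank_sums[2])  # 9.5 5.5
```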

Example Data
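| ID | Group | General well-being | Ranking Group 1 | Ranking Group 2 |
|---|---|---|---|---|
| 5 | 1 | 0 | 1 | |
| 6 | 2 | 1 | | 2 |
| 14 | 2 | 2 | | 3 |
| 9 | 2 | 3 | | 4 |
| 18 | 2 | 4 | | 5 |
| 10 | 1 | 5 | 6 | |
| 19 | 1 | 5.5 | 7 | |
| 1 | 2 | 6 | | 8 |
| 8 | 2 | 6.5 | | 9 |
| 17 | 1 | 7 | 10 | |
| 15 | 2 | 7.5 | | 11 |
| 11 | 1 | 8 | 12 | |
| 3 | 2 | 8.5 | | 13 |
| 2 | 1 | 9 | 14 | |
| 20 | 1 | 11 | 15 | |
| 12 | 1 | 13 | 16 | |
| 16 | 1 | 28 | 17 | |
| 4 | 1 | 29 | 18 | |
| 7 | 1 | 32 | 19 | |
| 13 | 1 | 33 | 20 | |
| Rank sum | | | 155 | 55 |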

For group 1 the rank sum is 155 (n = 12); for group 2 it is 55 (n = 8). To calculate U, the larger of the two rank sums is used.

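Here n1 is the sample size of the group with the larger rank sum, n2 is the sample size of the group with the smaller rank sum, and R1 is the larger rank sum. Thus, it follows:

$$U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1 = 12 \cdot 8 + \frac{12 \cdot 13}{2} - 155 = 19$$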
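The rank sums and the U value for the example can be reproduced with a short Python sketch. The tuples below are typed in from the example table; because no well-being value occurs twice, the rank is simply the position in the sorted list. The variable names are chosen for illustration.

```python
# (ID, group, general well-being) for the 20 patients of the example table.
data = [
    (5, 1, 0), (6, 2, 1), (14, 2, 2), (9, 2, 3), (18, 2, 4),
    (10, 1, 5), (19, 1, 5.5), (1, 2, 6), (8, 2, 6.5), (17, 1, 7),
    (15, 2, 7.5), (11, 1, 8), (3, 2, 8.5), (2, 1, 9), (20, 1, 11),
    (12, 1, 13), (16, 1, 28), (4, 1, 29), (7, 1, 32), (13, 1, 33),
]

# Sort by well-being, ignoring group membership, and rank from 1 upwards.
ordered = sorted(data, key=lambda row: row[2])
rank_sum = {1: 0, 2: 0}
size = {1: 0, 2: 0}
for rank, (_patient_id, group, _score) in enumerate(ordered, start=1):
    rank_sum[group] += rank
    size[group] += 1

# n1 and R1 belong to the group with the larger rank sum, n2 to the other group.
larger = max(rank_sum, key=rank_sum.get)
smaller = min(rank_sum, key=rank_sum.get)
n1, n2, r1 = size[larger], size[smaller], rank_sum[larger]

u = n1 * n2 + n1 * (n1 + 1) // 2 - r1
print(rank_sum)  # {1: 155, 2: 55}
print(u)         # 19
```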

Significance

If the sample size is large enough (n1 + n2 > 30), the U statistic is approximately normally distributed and significance can be tested with a z-statistic:

$$z = \frac{U - \mu_U}{\sigma_U} = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$

μ_U = mean of the U distribution (the value of U expected when there is no difference between the groups)

σ_U = standard error of U

n1= sample size of the group with the larger rank sum

n2= sample size of the group with the smaller rank sum


$$z = \frac{U - \mu_U}{\sigma_U} = \frac{19 - \frac{12 \cdot 8}{2}}{\sqrt{\frac{12 \cdot 8 \, (12 + 8 + 1)}{12}}} = \frac{19 - 48}{\sqrt{168}} \approx -2.24$$

This z-value is now compared with the critical value of the standard normal distribution, which is ±1.96 for a two-sided significance level of .05. If the absolute value of the test statistic exceeds the critical value, the difference between the two groups is significant (UZH, online). In the example, |z| ≈ 2.24 exceeds 1.96, so the difference in well-being between the two groups is significant at the .05 level.
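The normal approximation can be checked numerically with a minimal Python sketch. It assumes no correction for ties or continuity; the function name `u_to_z` is hypothetical.

```python
import math


def u_to_z(u, n1, n2):
    """Standardize U with the mean and standard error of its null distribution."""
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mu_u) / sigma_u


z = u_to_z(19, 12, 8)   # values from the example: U = 19, n1 = 12, n2 = 8
critical = 1.96         # two-sided critical value for a significance level of .05
print(round(z, 2))      # -2.24
print(abs(z) > critical)  # True
```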

References

  • Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of 2 random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18, 50‐60.
  • Nachar, Nadim. (2008). The Mann-Whitney U: A Test for Assessing Whether Two Independent Samples Come from the Same Distribution. Tutorials in Quantitative Methods for Psychology, 13-17.
  • UZH (2022). Mann-Whitney-U-Test. University of Zurich.