r - Hypothesis Testing: difference of distributions of a multi-level factor between two samples



This is probably an easy question, but I have searched the Internet and some books without success. I probably don't know the correct statistical terms, so I'd appreciate any help.

I have a set of samples that is split into two subsets (e.g., people who write with their left or right hand), and for each sample a factor with multiple levels is known (e.g., eye color: blue, brown, or green).

I want to know whether the distribution of the levels differs between the two subsets.

For example:

```r
# Toy data: 5 left-handed and 25 right-handed people, 10 of each eye color
data <- data.frame(
  hand = c(rep("left", 5), rep("right", 25)),
  eyecolor = c(rep("blue", 10), rep("green", 10), rep("brown", 10))
)
```

I now need to know whether the distribution of eyecolor levels is significantly different between left- and right-handed people.

It's a silly example but I hope that I got my point across.
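To make the comparison concrete, the two columns can be cross-tabulated so that each row holds the eye color counts for one hand. This is only a sketch of the data layout, not a test; the names `counts` and `props` are made up for illustration, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Rebuild the toy data from the example
data <- data.frame(
  hand = c(rep("left", 5), rep("right", 25)),
  eyecolor = c(rep("blue", 10), rep("green", 10), rep("brown", 10))
)

# Contingency table: rows are hand, columns are eye color
counts <- table(data$hand, data$eyecolor)
print(counts)

# Row-wise proportions: the eye color distribution within each hand group
props <- prop.table(counts, margin = 1)
print(round(props, 2))
```

The two rows of `props` are the distributions I would like to compare.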

Which hypothesis test is required? binom.test and prop.test only work with two levels, but I need a test that handles more than two.