Frequency table: a table that records the number of observations falling into each class (group) when a dataset is grouped by a single column.
Contingency table: also a frequency table, but one that records the number of observations in each group when the dataset is grouped by two or more variables. It is also called a cross-classification table.
Introduction
In particular, if we use two attributes of the sample data to construct a contingency table, and each attribute has only two levels, we get a contingency table with two rows and two columns, also known as a 2×2 table or fourfold table. For example, the two attributes gender (male/female) and color vision (normal/color blind) can be used to build such a table. More generally, suppose we use two attributes $A$ and $B$ of a dataset to build a contingency table. If $A$ has $r$ levels $A_1, A_2, \ldots, A_r$ and $B$ has $c$ levels $B_1, B_2, \ldots, B_c$, we obtain a contingency table with $r$ rows and $c$ columns, called an $r \times c$ table, in which $n_{ij}$ is the number of observations at levels $A_i$ and $B_j$. If we use more than two attributes, we get a multi-dimensional contingency table.
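As a minimal sketch of the construction described above, the following Python snippet counts the cell frequencies $n_{ij}$ for the gender and color-vision example. The records are invented purely for illustration, and the helper name `contingency_table` is hypothetical, not something the original text defines.

```python
from collections import Counter

# Hypothetical sample: (gender, color vision) pairs, invented for illustration.
records = [
    ("male", "normal"), ("male", "normal"), ("male", "color blind"),
    ("female", "normal"), ("female", "normal"), ("female", "normal"),
    ("male", "normal"), ("female", "color blind"), ("male", "color blind"),
]

def contingency_table(pairs):
    """Return (row_levels, col_levels, counts), where counts[i][j] is n_ij."""
    cell_counts = Counter(pairs)                 # frequency of each (A_i, B_j) pair
    row_levels = sorted({a for a, _ in pairs})   # levels of attribute A
    col_levels = sorted({b for _, b in pairs})   # levels of attribute B
    counts = [[cell_counts[(a, b)] for b in col_levels] for a in row_levels]
    return row_levels, col_levels, counts

rows, cols, table = contingency_table(records)

# Print the 2x2 table with row labels.
print("level", cols)
for level, row in zip(rows, table):
    print(level, row)
```

Because each attribute has two levels here, `table` is a 2×2 (fourfold) table. The same function yields an $r \times c$ table when the attributes have more levels.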
Function
The basic question in contingency table analysis is: are the attributes independent? In the previous example, is gender related to color blindness? In an $r \times c$ table, let $p_{i\cdot}$, $p_{\cdot j}$, and $p_{ij}$ denote the probability that an observation belongs to level $A_i$, to level $B_j$, and to both $A_i$ and $B_j$, respectively ($p_{i\cdot}$ and $p_{\cdot j}$ are marginal probabilities; $p_{ij}$ is the cell probability). The hypothesis that attributes $A$ and $B$ are independent can then be expressed as $H_0: p_{ij} = p_{i\cdot}\, p_{\cdot j}$ for all $i$ and $j$.
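The text does not say which test to use for this hypothesis. One common choice is Pearson's chi-square test, which compares each observed count $n_{ij}$ with the count expected under $H_0$ (row total × column total / $n$). The sketch below assumes that test; the function name and the example counts are hypothetical.

```python
def chi_square_statistic(counts):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row total and column total is positive, so that no
    expected count is zero.
    """
    n = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]

    chi2 = 0.0
    for i, row in enumerate(counts):
        for j, n_ij in enumerate(row):
            # Expected count under independence: n * p_i. * p_.j
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (n_ij - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof

# Hypothetical 2x2 counts, invented for illustration.
example_table = [[10, 20], [20, 10]]
chi2, dof = chi_square_statistic(example_table)
print(chi2, dof)
```

The statistic is then compared with a chi-square distribution with $(r-1)(c-1)$ degrees of freedom. A large value is evidence against independence.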
If the independence hypothesis is rejected, we need a measure that characterizes the strength of the association between the variables. For example, for an $r \times c$ table, the contingency coefficient can be used to measure the degree of association.
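The text does not give a formula for the contingency coefficient. Assuming it means Pearson's contingency coefficient, $C = \sqrt{\chi^2 / (\chi^2 + n)}$, a minimal sketch might look like this. The function name and the example counts are hypothetical.

```python
import math

def contingency_coefficient(counts):
    """Pearson's contingency coefficient C = sqrt(chi2 / (chi2 + n)).

    Assumes every row total and column total is positive.
    """
    n = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]

    # Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(counts):
        for j, n_ij in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (n_ij - expected) ** 2 / expected

    return math.sqrt(chi2 / (chi2 + n))

# Hypothetical 2x2 counts, invented for illustration.
c_value = contingency_coefficient([[10, 20], [20, 10]])
print(round(c_value, 4))
```

$C$ equals 0 when the observed counts match the independence model exactly and grows with the strength of association. It always stays below 1, and its maximum depends on the table dimensions.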