I have a class of 68 students. The final consisted of 70 multiple-choice questions (A, B, C, or D). Two students missed the same 17 questions and chose exactly the same wrong answer on each of them. What is the probability of observing this by chance?
Is there a general framework that I can use to filter out cheaters? For example, I have two more students who missed nearly the same number of questions: one missed 21 and the other missed 22. Of the questions they missed, 20 are in common, and on 17 of those 20 they chose the same wrong answer.
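As a naive starting point, the chance of matching wrong answers can be sketched with a binomial model. This assumes (and it is only an assumption) that on a question both students missed, each picks independently and uniformly among the three wrong options, so their wrong answers agree with probability 1/3. The function name `prob_at_least_k_matches` is made up for illustration. The sketch conditions on which questions were jointly missed and does not model the chance of missing the same questions in the first place.

```python
from math import comb

def prob_at_least_k_matches(n_shared: int, k_same: int, p_match: float = 1 / 3) -> float:
    """Binomial tail: P(at least k_same identical wrong answers
    among n_shared jointly missed questions).

    p_match = 1/3 is an ASSUMPTION: three equally attractive wrong
    options, chosen independently by the two students.
    """
    return sum(
        comb(n_shared, k) * p_match**k * (1 - p_match) ** (n_shared - k)
        for k in range(k_same, n_shared + 1)
    )

# First pair: 17 jointly missed questions, all 17 wrong answers identical.
pair_a = prob_at_least_k_matches(17, 17)

# Second pair: 20 jointly missed questions, 17 wrong answers identical.
pair_b = prob_at_least_k_matches(20, 17)

print(f"pair A: {pair_a:.3g}")
print(f"pair B: {pair_b:.3g}")
```

Real distractors are rarely equally attractive, so the true chance of agreement per question is likely higher than 1/3 and these figures probably understate how often honest students coincide. With 68 students there are also 68 × 67 / 2 = 2278 pairs, so a small per-pair probability has to be read against that many comparisons.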
I also have data on the number of people who missed each question. Across the 68 tests the class average was 51.3. For example:
6 people missed #1
29 people missed #2
23 people missed #3
7 people missed #4
10 people missed #5
etc.
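The per-question miss counts could feed into the same kind of estimate. The sketch below uses only the five counts listed above and assumes, purely for illustration, that every student misses question i independently with probability (number who missed it) / 68. That ignores differences in student ability, so it is a rough baseline rather than a proper model. The names `missed_counts` and `joint_miss_prob` are hypothetical.

```python
N_STUDENTS = 68

# Only the five counts given above; the remaining questions are not listed.
missed_counts = {1: 6, 2: 29, 3: 23, 4: 7, 5: 10}

def joint_miss_prob(count: int, n_students: int = N_STUDENTS) -> float:
    """P(two given students both miss a question), ASSUMING each misses it
    independently with probability count / n_students."""
    p_miss = count / n_students
    return p_miss * p_miss

# Per-question probability that a given pair both miss it.
joint = {q: joint_miss_prob(c) for q, c in missed_counts.items()}

# Expected number of jointly missed questions among these five.
expected_shared = sum(joint.values())

for q, p in joint.items():
    print(f"question {q}: P(both miss) = {p:.4f}")
print(f"expected jointly missed (questions 1-5 only): {expected_shared:.4f}")
```

Extending the dictionary to all 70 questions would give a baseline for how many jointly missed questions to expect from an arbitrary pair, against which counts like 17 or 20 could be compared.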