I decided to investigate this claim. Let's first check the case when the professor distributes the correct answers evenly and randomly, not favouring any one letter. Take for example a test with 30 questions, where each answer is independent of the others and there are four choices per question (A, B,C, or D). Assuming that you really have no clue which is the correct answer, any guess has a 25% probability of being correct. This is true regardless of whether you choose randomly (let's ignore for the moment that humans are notoriously bad at generating random values on their own) or if you decide to choose the same letter consistently. Here's why:
The instructor randomly chooses which letter is correct, without bias, so the probability of any letter happening to be the correct answer on a particular question is 25%. If you also guess randomly without favouring any letter (i.e. each letter is guessed, on average, 25% of the time), you should then expect that 25% of the correct answers are 'A', 25% of your guesses are 'A', thus 6.25% of your guesses (25% of 25%) are correct. The same is true of 'B', 'C', and 'D', so that overall you expect 4*6.25% = 25% of your guesses to be correct. Now if you consistently choose 'C' for the same random test, the probability that 'A' is correct is 25% but the probability that you guess 'A' is 0%. This is also true of 'B' and 'D'. When you choose 'C' 100% of the time, overall you should expect that 3*0.25*0 + 1*0.25*1 = 25% of your guesses are correct.
Since the guess is either right or wrong, this is a problem for the binomial distribution. The mean is n*p and the variance is sqrt[n*p*(1-p)], where n is the number of trials and p is the probability of the desired outcome. In our example multiple-choice test, whether you guess randomly or choose 'C' every time, you would expect, on average, to get 30*0.25 = 7.5 correct answers. To reinforce the point that both approaches to guessing are equal here, I simulated 50,000 of these tests in Excel and generated the following plot.
|Probability distribution for correct guesses on tests with random answers to four-choice questions.|
Now that it's clear that there's no advantage if answers are randomly and evenly distributed, let's investigate the case where the instructor favours one letter over the others. It turns out that no matter how biased the instructor's distribution of correct answers might be, if you randomly guess each letter 25% of the time, your probability of choosing the right answer is still 25%. Let's assign unknown probabilities for each letter being assigned as correct by the instructor: pA, pB, pC, and pD. The sum of these unknown probabilities must be unity. So when the probability of guessing A, B, C, or D is 25% each, the probability of being correct is then 0.25pA+0.25pB+0.25pC+0.25pD = 0.25(pA+pB+pC+pD) = 0.25*1 = 25%. That all changes if you consistently choose 'C' though. In this case, your probability of getting the right answer is 0pA+0pB+1*pC+0pD = pC. I varied pC between the most unfavourable case (where 'C' is never the correct answer) to the most favourable (where 'C' is always the correct answer) to generate the following plot:
|Probability distribution for correct answers when 'C' guessed consistently for different values of pC.|
I assumed that pB + pC is 54% on average, which is just a guess on my part but I feel it is reasonable because answers aren't randomly assigned a letter. Numerical answers are often ordered from smallest to largest and statements like "All of the above" are reserved only for 'D' because they'd be confusing if they weren't. I used the normal distribution to generate the random variations in individual tests, so that pB + pC can vary from 0 to 1, but is usually close to the average. I similarly used normally distributed random numbers to split up pB and pC, so that on average pC is half of (pB + pC), but can vary from 0 to 100%. Same idea with pA and pD. Random guesses by the test taker are still equally distributed on average between the four choices. Plotted below are the results of 250,000 simulated tests.
|Probability distribution for correct answers based on "realistic" multiple choice tests.|
To summarize, if you happen to know that the creators of the test favour 'C' for correct answers, guessing 'C' consistently gives you an edge. In the long run, sticking with 'C' probably gives you a very slim advantage over random guessing, though guessing randomly gives you better consistency in the results of your guesses.