Sherlock Holmes and Machine Learning
Nov. 6, 2014, 4:20 p.m.

Sir Arthur Conan Doyle's legendary hero, Sherlock Holmes, undoubtedly had a tremendous influence on forensic science. Yet his influence need not be limited to forensic scientists: his methods have much to teach many others, including those interested in machine learning.

Sherlock Holmes knows the importance of data and observation. He tries his best to avoid provisional theories, which can easily lead to confirmation bias. He knows the value of unusual information and the importance of separating what is relevant from what is irrelevant. Here I list some of my favourite quotes of his, putting some of them under titles that should be more familiar to those interested in information theory and machine learning. And if you haven't yet read any Sherlock Holmes stories, you may want to know that they are available for free!

"I had come to an entirely erroneous conclusion which shows, my dear Watson, how dangerous it always is to reason from insufficient data."
"Data! Data! Data! I can't make bricks without clay."

Feature selection:
"It is of the highest importance in the art of detection to be able to recognize, out of a number of facts, which are incidental and which are vital."
"The principal difficulty in your case lay in the fact of there being too much evidence. What was vital was overlaid and hidden by what was irrelevant."

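To make the link to feature selection concrete, here is a minimal sketch that ranks clues by their mutual information with the outcome. The case file, the clue names and the choice of mutual information as the score are my own illustrative assumptions, not anything taken from the stories.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * math.log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

# Hypothetical toy case file: one entry per case, 1 = present, 0 = absent.
guilty = [1, 1, 1, 1, 0, 0, 0, 0]
clues = {
    "muddy_boots": [1, 1, 1, 1, 0, 0, 0, 0],  # vital: tracks guilt exactly
    "left_handed": [1, 1, 1, 0, 1, 0, 0, 0],  # partly informative
    "wore_a_hat":  [1, 0, 1, 0, 1, 0, 1, 0],  # incidental: independent of guilt
}

# Score every clue against the outcome and rank from most to least informative.
scores = {name: mutual_information(values, guilty) for name, values in clues.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.3f} bits")
```

In this toy data the vital clue scores a full bit and the incidental one scores zero, which is the separation Holmes is asking for.
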
Entropy (the significance of high-entropy information):
"It is a mistake to confound strangeness with mystery. The most commonplace crime is often the most mysterious because it presents no new or special features from which deductions may be drawn. [...] These strange details, far from making the case more difficult, have really had the effect of making it less so."
"The more bizarre a thing is the less mysterious it proves to be."
"The very point which appears to complicate a case is, when duly considered and scientifically handled, the one which is most likely to elucidate it."
"I have already explained to you that what is out of the common is usually a guide rather than a hindrance."

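In information-theoretic terms, an observation with probability p carries -log2(p) bits, so a strange detail tells us far more than a commonplace one. The sketch below illustrates this with made-up numbers; the suspect pool and the two probabilities are hypothetical.

```python
import math

def surprisal_bits(p):
    """Self-information -log2(p), in bits, of an event with probability p."""
    return -math.log2(p)

suspects = 1024  # hypothetical size of the suspect pool

# Hypothetical fraction of suspects who share each detail.
details = {
    "commonplace detail": 1 / 2,
    "strange detail": 1 / 256,
}

for name, p in details.items():
    bits = surprisal_bits(p)
    remaining = suspects * p  # suspects still consistent with the detail
    print(f"{name}: {bits:.0f} bits, about {remaining:.0f} suspects remain")
```

The commonplace detail is worth one bit and leaves half the pool standing, while the strange one is worth eight bits and leaves only a handful.
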
Avoid premature theorizing and confirmation bias:
"I have no data yet. It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts."
"The temptation to form premature theories upon insufficient data is the bane of our profession."
"One forms provisional theories and waits for time or fuller knowledge to explode them. A bad habit, Mr. Ferguson, but human nature is weak."
"Let me run over the principal steps. We approached the case, you remember, with an absolutely blank mind, which is always an advantage. We had formed no theories. We were simply there to observe and to draw inferences from our observations."

Exclusion principle:
"[...] when you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth."

Backward reasoning:
"Most people, if you describe a train of events to them, will tell you what the results would be. They can put those events together in their minds, and argue from them that something will come to pass. There are a few people, however, who, if you told them a result, would be able to evolve from their inner consciousness what the steps were which led up to that result. This power is what I mean when I talk of reasoning backwards, or analytically."

Prior information:
[after Holmes makes a complex and correct deduction about Watson] "I have the advantage of knowing your habits, my dear Watson."

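Knowing Watson's habits amounts to having a good prior. As a rough sketch, Bayes' rule shows how the same clue leads to very different conclusions for a stranger and for Holmes. All the probabilities below are invented for illustration.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a binary hypothesis: P(hypothesis | evidence)."""
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence

# Hypothetical clue: reddish mud on Watson's boots.
# Hypothesis: Watson has been to the post office this morning.
p_mud_if_visited = 0.8   # assumed chance of the mud if he went
p_mud_otherwise = 0.2    # assumed chance of the mud if he did not

stranger = posterior(0.05, p_mud_if_visited, p_mud_otherwise)  # knows nothing of Watson
holmes = posterior(0.60, p_mud_if_visited, p_mud_otherwise)    # knows his habits

print(f"stranger: {stranger:.2f}, Holmes: {holmes:.2f}")
```

The evidence is identical in both calls; only the prior differs, and that alone moves the conclusion from unlikely to very likely.
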
The obvious can be deceptive (can we link this to the pitfalls of maximum likelihood?):
"Circumstantial evidence is occasionally very convincing."
"There is nothing more deceptive than an obvious fact."

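One possible reading of the question in the heading: with little data, the maximum likelihood estimate treats what has been seen so far as the whole truth. The sketch below compares it with Laplace's rule of succession, which is the posterior mean under a uniform prior. The witness scenario is hypothetical, and this is only one of several ways the link could be drawn.

```python
def mle(successes, trials):
    """Maximum likelihood estimate of a Bernoulli probability."""
    return successes / trials

def laplace(successes, trials):
    """Posterior mean under a uniform Beta(1, 1) prior (rule of succession)."""
    return (successes + 1) / (trials + 2)

# Hypothetical evidence: three witnesses, all pointing at the same suspect.
agree, witnesses = 3, 3

print(f"MLE:     {mle(agree, witnesses):.2f}")      # treats the matter as certain
print(f"Laplace: {laplace(agree, witnesses):.2f}")  # leaves room for doubt
```

Three agreeing witnesses make the maximum likelihood estimate exactly 1, an "obvious fact" that the smoothed estimate declines to accept at face value.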
