I was thinking about writing a post to explain overfitting in simple terms.
The way I wanted to start was something like:
"Overfitting is the phenomenon of building a model that agrees well with the observed data but has no predictive ability (it does not agree with unseen or future data)"
This is probably fine, but somewhat formal, and sometimes a simple example is better than a long explanation. The example I needed was given to me by Bernhard Schölkopf, who noticed a connection between the psychological condition called "obsessive compulsive disorder" and the overfitting phenomenon:
Someone who is O.C. will, for instance, put on and take off his trousers five times in a row, because he once did so and something positive happened afterwards (or he thinks it helped avoid a catastrophe).
A lot of children have this in a mild form (e.g. they do not want to step on the edges of the floor tiles). So maybe it is actually part of our inference engine, which tries to learn which decisions and actions lead to good outcomes. If this is true, then O.C. is nothing but an error in the inference engine: perhaps the wrong capacity, leading to overfitting.
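For readers who prefer numbers to analogies, the definition quoted at the top can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the data come from a made-up noisy line, `fit_line` is a plain least-squares fit, and `fit_memorizer` is a hypothetical high-capacity model that simply remembers every observation, coincidences included.

```python
import random

def fit_line(points):
    """Ordinary least-squares fit of y = a*x + b (a low-capacity model)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def fit_memorizer(points):
    """Answer with the y of the closest observed x (it memorizes the data)."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mse(model, points):
    """Mean squared error of a model on a list of (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

rng = random.Random(0)  # fixed seed so the run is repeatable

def noisy(x):
    """Hypothetical ground truth: a line plus Gaussian noise."""
    return 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)

# Observed data: 20 evenly spaced points. Unseen data: 200 random points.
train = [(0.5 * i, noisy(0.5 * i)) for i in range(20)]
test = [(x, noisy(x)) for x in (rng.uniform(0.0, 9.5) for _ in range(200))]

line = fit_line(train)
memorizer = fit_memorizer(train)

print("line:      train", mse(line, train), "test", mse(line, test))
print("memorizer: train", mse(memorizer, train), "test", mse(memorizer, test))
```

The memorizer agrees perfectly with the observed data, because it treats every noisy coincidence as a rule, yet it should do noticeably worse than the simple line on the unseen points. That gap between observed and unseen data is what the definition describes.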