In my opinion, the field of Machine Learning still lacks theoretical foundations. Indeed, even though many people work on "Learning Theory", most of them (including myself) actually work on theoretical problems inspired by Machine Learning.
So, what is probably missing is some kind of agreement on what exactly learning theory is.
What is called "Statistical Learning Theory" is probably the best-defined part of learning theory, as it rests on a mathematically sound framework. Unfortunately, not all learning problems can be studied in this framework, so there is room either for a broader framework or for complementary theories.
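For concreteness, here is a minimal sketch of the framework in question; the notation below is the standard one, used purely for illustration, not something introduced in this discussion.

```latex
% Minimal sketch of the standard statistical learning setup
% (standard notation, for illustration only).
Data: pairs $(X_1,Y_1),\dots,(X_n,Y_n)$ drawn i.i.d.\ from an
unknown distribution $P$ on $\mathcal{X}\times\mathcal{Y}$.
Given a loss function $\ell$ and a class of functions $\mathcal{F}$,
the goal is to find $f\in\mathcal{F}$ with small \emph{expected risk}
\[
  R(f) \;=\; \mathbb{E}_{(X,Y)\sim P}\bigl[\ell(f(X),Y)\bigr],
\]
while only the \emph{empirical risk}
\[
  \widehat{R}_n(f) \;=\; \frac{1}{n}\sum_{i=1}^{n} \ell(f(X_i),Y_i)
\]
is observable. Typical results bound the uniform deviation
$\sup_{f\in\mathcal{F}} \bigl|R(f)-\widehat{R}_n(f)\bigr|$ in terms
of the capacity of $\mathcal{F}$ (e.g., its VC dimension).
```

The i.i.d. sampling assumption is precisely the kind of simplification that keeps some learning problems outside this framework's reach.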
Of course, it is rarely the case that the framework comes before the results: usually people write research papers proving results in a particular setting, and eventually a body of such papers forms a theory (once many people have read them, digested them, and extracted the main ideas).
Moreover, it is probably a waste of time to argue about which existing results do or do not count as learning theory.
So some questions arise:
- Why do we need to define learning theory?
- What do we expect from such a theory?
- Assuming we want to do it, what are the first steps to take?
Here are some possible answers:
- Defining learning theory would, first, allow the existing results to be gathered and unified, and second, help identify the remaining open questions the theory should address.
- Ideally, such a theory should make it possible to build better learning machines. If that turns out to be impossible, it should at least help us understand why.
- The first steps could be to classify existing models of learning and extract their commonalities.
Here are some constraints one should keep in mind:
- A theory is merely a formal representation (or model) of some "real" phenomenon. As such, it has limitations, partly due to the simplifications it necessarily introduces.
- Within a theory, positive and negative results are equally important. In other words, the limits of the theory should be made explicit, and impossibility results (such as the no-free-lunch theorems) should be produced to probe these limits.
- Predictions made by a theory should be testable (this is connected to Popper's notion of falsifiability).
The last point is probably the most important, but also the most controversial. Indeed, testing predictions is feasible, to some extent, for theories about natural phenomena (mainly Physics); but in the case of learning, the goal is not necessarily to build a theory that accounts for the forms of learning (animal or human) that we observe in Nature, but rather to build some kind of "general" theory of learning. Hence testability may become an issue. In a way, probability theory raises similar questions: is it a theory about natural phenomena? Can its predictions be tested?
Answering these questions requires building some kind of connection between the abstract concepts of the theory and a concrete interpretation of them. Since this is still being debated for probability theory, it is probably far from being resolved for learning theory...