Comments

lorenzo

Hi! I read your weblog with pleasure. I am not sure I understood your intuition on sparsity, but a paper I read recently that seems to go in this direction is an old paper by Federico Girosi:

An Equivalence Between Sparse Approximation and Support Vector Machines

I think the interesting point is that it gives some insight into the role played by the loss function and the penalty term with respect to sparsity.
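
As a concrete toy illustration of that role (a minimal sketch using scikit-learn, not Girosi's construction; all names and settings below are illustrative assumptions), one can watch an L1 penalty set coefficients exactly to zero as its weight grows, while an L2 penalty only shrinks them:

```python
# Toy comparison of the L1 vs L2 penalty on the same squared loss.
# Data generation and penalty weights are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
w_true = np.array([3.0, -2.0, 1.5, 0, 0, 0, 0, 0, 0, 0])  # only 3 relevant features
y = X @ w_true + 0.1 * rng.standard_normal(100)

for alpha in [0.01, 0.1, 1.0]:
    lasso_nnz = np.sum(np.abs(Lasso(alpha=alpha).fit(X, y).coef_) > 1e-8)
    ridge_nnz = np.sum(np.abs(Ridge(alpha=alpha).fit(X, y).coef_) > 1e-8)
    print(f"alpha={alpha}: L1 nonzeros={lasso_nnz}, L2 nonzeros={ridge_nnz}")
```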

pierre

Hello,
If I followed, you ask "Why do sparse models have good properties?", while your intuition is "non-regularity of the optimised criterion implies sparsity".

So even if your intuition is right, you don't answer the question, do you? The question is just shifted to "why does non-regularity of the criterion imply good properties?" (in particular, statistical robustness).

From another point of view, would it be correct to link sparsity with short description length, simplicity, Occam's razor, and hence good generalisation?


Yaroslav Bulatov

Encouraging sparsity increases bias, and hence reduces variance. This is a good thing to do if you think that, in your parametrization, setting most parameters to 0 won't hurt you too much.

Andrew Ng showed in his "rotational invariance" paper that if only a fixed number of features is needed to predict well, the sample complexity of L1-regularized logistic regression grows logarithmically in the number of irrelevant features, whereas for L2 or any other rotationally invariant regularization scheme it grows at least linearly. http://ai.stanford.edu/~ang/papers/icml04-l1l2.pdf
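
A rough empirical companion to that result (a sketch under assumed synthetic data and scikit-learn solvers, not a reproduction of the paper) is to add many irrelevant features and count the nonzero weights under each penalty:

```python
# L1 vs L2 regularized logistic regression with mostly irrelevant features.
# Data generation and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d_rel, d_irr = 200, 5, 500
X = rng.standard_normal((n, d_rel + d_irr))
w = np.zeros(d_rel + d_irr)
w[:d_rel] = 2.0                                   # only the first 5 features matter
y = (X @ w + 0.5 * rng.standard_normal(n)) > 0

l1 = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(X, y)
l2 = LogisticRegression(penalty="l2", solver="liblinear", C=1.0).fit(X, y)
print("L1 nonzero weights:", int(np.sum(np.abs(l1.coef_) > 1e-8)))
print("L2 nonzero weights:", int(np.sum(np.abs(l2.coef_) > 1e-8)))
```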

There's an issue of parametrization dependence: encouraging sparsity in one parametrization may be a good thing, while in a reparametrized version of the problem it may be terrible.

So this raises the question: how can we come up with parametrizations where sparse solutions are not very far from the best solution?

We could let the "best solution" be chosen adversarially. Then, for discrete distributions, the uniform distribution minimizes the worst-case loss, so it makes sense to pick a parametrization where the parameter vector = 0 corresponds to the uniform distribution. But what will the rest of the parametrization look like?
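
One standard choice with this property (an illustration, not necessarily what was intended here) is the softmax, or multinomial-logit, parametrization of a discrete distribution over $k$ outcomes,

$$ p_\theta(i) = \frac{\exp(\theta_i)}{\sum_{j=1}^{k} \exp(\theta_j)}, $$

for which $\theta = 0$ gives exactly the uniform distribution $p(i) = 1/k$; a sparse $\theta$ then means "uniform except on a few outcomes".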

Truncated series expansions are a kind of "sparse approximation" for functions, so perhaps there are results in approximation theory that answer this?
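
In that spirit, a minimal numerical sketch (a toy under assumed settings, not a result from approximation theory) keeps only the few largest Fourier coefficients of a signal and checks how little is lost:

```python
# Sparse approximation by truncation: keep the k largest Fourier coefficients.
# The test signal and the value of k are illustrative assumptions.
import numpy as np

t = np.linspace(0.0, 1.0, 256, endpoint=False)
signal = np.sin(2 * np.pi * 3 * t) + 0.3 * np.sin(2 * np.pi * 17 * t)

coeffs = np.fft.rfft(signal)
k = 4
keep = np.argsort(np.abs(coeffs))[-k:]            # indices of the k largest coefficients
sparse_coeffs = np.zeros_like(coeffs)
sparse_coeffs[keep] = coeffs[keep]

approx = np.fft.irfft(sparse_coeffs, n=signal.size)
print("relative L2 error:", np.linalg.norm(signal - approx) / np.linalg.norm(signal))
```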

zwald

Hi Olivier, I am not sure I understand your intuition, but I have a question: I was wondering whether the Kernel Projection Machine (NIPS'04) is a sparse algorithm or not.

Olivier Cappé

Bonjour Olivier,

It does seem to me that your intuition is more than vaguely related to some (non-ML) work by Mila Nikolova (currently at ENS Cachan), in particular:

Local strong homogeneity of a regularized estimator, SIAM Journal on Applied Mathematics, vol. 61, no. 2, 2000, pp. 633-658
http://www.cmla.ens-cachan.fr/Utilisateurs/nikolova/Hom.pdf

where she shows that, in image reconstruction problems, using a penalty function (regularization term) that is nonsmooth at zero is instrumental in obtaining reconstructed signals that display, in her terminology, strongly homogeneous zones (neighbouring constant values being a typical example).
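
A one-dimensional version of that mechanism (standard soft-thresholding, shown here only to illustrate why nonsmoothness at zero matters, not as a summary of Nikolova's result) is

$$ \hat{x}(y) = \arg\min_{x} \tfrac{1}{2}(x - y)^2 + \lambda |x| = \operatorname{sign}(y)\,\max(|y| - \lambda,\, 0), $$

so $\hat{x}(y) = 0$ for every $|y| \le \lambda$: the kink of $|x|$ at the origin pins a whole interval of data to exactly zero, whereas a smooth penalty such as $\lambda x^2$ gives $\hat{x}(y) = y/(1 + 2\lambda)$, which vanishes only when $y = 0$.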

Benjamin Guo

This seems to be a fairly long-running discussion. I like your intuition. My question is: since computing exponentials is not that expensive (I mean computing kernel functions, e.g. in the Gaussian case), how can sparsity be useful other than for balancing bias and variance?

childhood obesity statistics

When the function is twice differentiable everywhere, there is no reason for the solution to be "trapped" at 0 rather than ending up somewhere else.

Account Deleted

Insightful. I think you should take a look at the compressive sampling / compressed sensing area. I got a quick introduction recently, and sparsity of a given signal (either the signal/vector itself or some transformation of it) is a central concept there. It looks like sparsity is quite common.
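
A bare-bones version of that idea (a sketch under assumed sizes, with scikit-learn's Lasso standing in for basis pursuit) recovers a sparse vector from far fewer random linear measurements than its dimension:

```python
# Toy compressed sensing: recover a k-sparse vector from m << n measurements
# via L1-regularized least squares. All sizes and settings are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, m, k = 200, 60, 5                              # dimension, measurements, nonzeros
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)

A = rng.standard_normal((m, n)) / np.sqrt(m)      # random measurement matrix
y = A @ x_true                                    # noiseless measurements

x_hat = Lasso(alpha=1e-3, max_iter=100000).fit(A, y).coef_
print("relative recovery error:",
      np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```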

Apart from that, non-linearity and sparsity are common traits of natural systems. There is a tendency to rule out weak alternatives quickly and converge to a few results. Linearity is like an undecided mind, never able to converge. There is a research group on frugal heuristics (I think).

Seo Company in Joplin Missouri

It seems logical that this is where sparsity would come from, but I'm sure you're not the first and only person to consider this. Are there any other case studies on how it performed after optimization?

buying and selling cars for profit

I'm not sure the people commenting have actually read the article, but I think it was well written.
Your premises logically lead to the conclusion, but it seems like a relatively simple idea. Surely it has been considered and tried before...

alcachofa

There are a lot of phenomena in math when it comes to 0.

The comments to this entry are closed.