
I'm looking at a claim in "An algorithmic theory of learning: Robust concepts and random projection" by R. I. Arriaga and S. Vempala (2006):

Further, it is NP-hard to learn a disjunction of k variables as a disjunction of fewer than k log n variables.

I believe this is meant in the PAC model, and I believe without membership queries. The authors state this without proof. This usually means the result is well known or obvious, but I haven't been able to track it down in an online resource (including a few papers surveying the basics), and the claim lacks a citation.

(I asked a simpler question here but this wasn't strong enough to clear up my confusion on the claim in the paper.)

djechlin
  • What is your question? Have you attempted the proof yourself? – Raphael Dec 17 '15 at 20:57
  • @Raphael my question is the title, "Why is it NP-hard to learn a disjunction of k variables as a disjunction of fewer than k log n variables?" Honestly no not really, given that it seems to be a well-known result I thought it would be easily searchable and should be if it's not. I'll spend significant time solving it myself as a last resort, as I believe this is good practice most of the time. – djechlin Dec 17 '15 at 22:03
  • Well, "why"? Because you can reduce 3-CNF-SAT to it. Not the answer you were looking for, right? ;) As a learner, trying yourself should be your first resort! Our reference question is there to help you with that. – Raphael Dec 17 '15 at 22:56
  • @Raphael No need to moralize how I should or shouldn't be learning. I'm reading a recent research paper in computational learning theory, should I seriously be penalized for being unsure whether this result can be solved as an exercise, or requires serious machinery that most experts in the field know and I don't? How many hours of work do you suggest before I try researching the answer literally in any other way? – djechlin Dec 17 '15 at 23:18
  • It is literally at the level of an exercise: see exercise 5 here: https://cs7545.wordpress.com/lecture-notes/hw3/. If you read the lecture notes you should be able to solve this exercise. – Yuval Filmus Dec 18 '15 at 01:14
  • The VC dimension of the set of all monotone disjunctions of length at most $k$ is $\Theta(k\log n)$. This is a known fact at the level of an exercise. This should imply the hardness. – Yuval Filmus Dec 18 '15 at 01:26
  • @djechlin If friendly suggestions are "penalizing" for you, you should not look for help on the internet. "How many hours of work" -- that's your decision and depends on how important the result is for you. Do you trust the author and only want to use the result in a minor place? Does your whole thesis rest on the result? Do you want to gain insight by understanding the proof? – Raphael Dec 18 '15 at 06:42
  • @Raphael While this may be standard for computational learning theorists, it is not standard for most people with a standard complexity background, so the question is very reasonable. An expert can answer it immediately, but for the rest of us it seems daunting at first. Why shouldn't an expert help here? This is so much better than the bread and butter homework questions of this site. – Yuval Filmus Dec 18 '15 at 22:39
  • @YuvalFilmus: Unlike that exercise, the OP's quote appears to depend on the base of the log. –  Dec 19 '15 at 07:06
  • @RickyDemer I don't think so. The statement is that $o(k\log n)$ doesn't suffice. – Yuval Filmus Dec 19 '15 at 07:47
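
For small cases, the VC-dimension fact mentioned in the comments can be checked by brute force. The sketch below is only an illustration, not a proof and not taken from the paper or the linked exercise: the function names are made up here, and it assumes that "length at most $k$" includes the empty (always-false) disjunction. It enumerates every monotone disjunction of at most $k$ of the $n$ variables and searches for the largest point set in $\{0,1\}^n$ that they shatter. It says nothing about NP-hardness by itself.

```python
from itertools import combinations


def monotone_disjunctions(n, k):
    """Bitmasks of all variable subsets of size <= k.

    Assumption: the empty mask (the always-false disjunction) is included.
    """
    return [
        sum(1 << i for i in subset)
        for size in range(k + 1)
        for subset in combinations(range(n), size)
    ]


def is_shattered(points, hypotheses):
    """True if the hypotheses realize all 2^|points| labelings of the points."""
    # A disjunction h accepts assignment x iff they share a variable set to 1.
    labelings = {tuple((h & x) != 0 for x in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)


def vc_dimension(n, k):
    """Brute-force VC dimension; exponential in n, so only for tiny n."""
    hypotheses = monotone_disjunctions(n, k)
    domain = range(2 ** n)  # every assignment to n variables, as a bitmask
    best = 0
    for d in range(1, n + 1):
        if 2 ** d > len(hypotheses):
            break  # too few hypotheses to produce all 2^d labelings
        if any(is_shattered(pts, hypotheses) for pts in combinations(domain, d)):
            best = d
        else:
            break  # if no d-set is shattered, no larger set can be
    return best


if __name__ == "__main__":
    for n, k in [(2, 1), (3, 1), (4, 1), (4, 2), (3, 3)]:
        print(f"n={n}, k={k}: VC dimension {vc_dimension(n, k)}")
```

The search space grows as $2^n$ points, so this is only feasible for very small $n$; it is meant for building intuition before attempting the exercise.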
