Error Tolerance (PAC learning)

(Intending to add this as a section of the PAC page rather than as its own, kept separate for now because multiple people are working on that one.)

Classification Noise

In the classification noise model, the learning algorithm is given examples that are bit strings, each with a one-bit label. The examples themselves are undisturbed, but each label is flipped, independently, with probability η.[1] The parameter η is called the classification noise rate.
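
A short sketch (in Python) of this noisy example oracle may help fix the model; here draw_example and target_concept are hypothetical stand-ins for sampling from the unknown distribution and for the unknown target concept.

    import random

    def noisy_example(draw_example, target_concept, noise_rate, rng=random):
        """Simulate one call to the classification noise oracle.

        draw_example   -- hypothetical callable returning one bit-string example
                          drawn from the (unknown) distribution
        target_concept -- hypothetical callable giving the true 0/1 label
        noise_rate     -- eta, the probability that the reported label is wrong
        """
        x = draw_example()
        label = target_concept(x)
        # The example is passed through undisturbed; only the label is
        # flipped, independently, with probability eta.
        if rng.random() < noise_rate:
            label = 1 - label
        return x, label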

Suppose a PAC learning algorithm L receives examples of length n labeled according to a concept in class C, outputs a concept in class H that with probability at least 1-δ has error at most ε, and runs in time polynomial in n, 1/ε, 1/δ, and 1/(1-2η′), where η′ (with η ≤ η′ < 1/2) is an upper bound on the noise rate given to the learner. Then C is efficiently PAC learnable using H in the presence of classification noise.[2]
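
One classical way to meet this definition for a finite hypothesis class, due to Angluin and Laird, is to draw a sample from the noisy oracle and output the hypothesis that disagrees with the fewest of the reported labels; a sample of size polynomial in 1/ε, log(1/δ), log|H|, and 1/(1-2η′) suffices. A minimal sketch of that minimum-disagreement step, with hypotheses represented as callables:

    def minimize_disagreement(hypotheses, noisy_sample):
        """Return the hypothesis that disagrees with the fewest noisy labels.

        hypotheses   -- finite iterable of callables h(x) -> 0 or 1
        noisy_sample -- list of (x, reported_label) pairs from the noise oracle
        """
        def disagreements(h):
            return sum(1 for x, y in noisy_sample if h(x) != y)
        return min(hypotheses, key=disagreements)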

Statistical Query Learning

Statistical query learning is a kind of active learning in which the learning algorithm, rather than seeing individual labelled examples, can ask for the probability that a binary predicate χ is satisfied by a random labelled example, and receives an answer accurate to within an additive tolerance τ. A concept class C is efficiently learnable from statistical queries using H if there exists a learning algorithm L such that, for any sample distribution, any concept c in C whose shortest representation has size(c) bits, and any error rate 0 < ε < 1/2: every query χ made by L can be evaluated in time polynomial in 1/ε, n, and size(c); the inverse tolerance 1/τ of every query is bounded by a polynomial in 1/ε, n, and size(c); the running time of L is bounded by a polynomial in 1/ε, n, and size(c); and L outputs a hypothesis in H with error rate less than ε.[3]
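
The statistical query oracle can be simulated from correctly labelled random examples by simple averaging. The sketch below answers a query STAT(χ, τ) using a Hoeffding-style sample size; this is one way of realizing the oracle rather than part of the definition, which only requires an answer within τ of the true probability, and draw_labeled_example is a hypothetical helper returning (example, correct label) pairs.

    import math

    def stat_oracle(chi, tolerance, draw_labeled_example, confidence=0.99):
        """Estimate Pr[chi(x, c(x)) = 1] to within the requested tolerance.

        chi                  -- binary predicate chi(x, label) -> 0 or 1
        tolerance            -- tau; the answer should lie within tau of the truth
        draw_labeled_example -- hypothetical callable returning (x, correct_label)
        """
        # By a Hoeffding bound, about ln(2 / (1 - confidence)) / (2 * tau^2)
        # samples make the empirical average tau-accurate with the given confidence.
        m = math.ceil(math.log(2.0 / (1.0 - confidence)) / (2.0 * tolerance ** 2))
        hits = sum(chi(*draw_labeled_example()) for _ in range(m))
        return hits / m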

The statistical query model is strictly weaker than the PAC model: any efficiently SQ-learnable class is efficiently PAC learnable in the presence of classification noise, but there exist efficiently PAC-learnable classes, such as parity functions, that are not efficiently SQ-learnable.[3]
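
The parity example illustrates the separation. In the noise-free PAC setting a parity function over n bits can be recovered from labelled examples by Gaussian elimination over GF(2), yet distinct parities are uncorrelated under the uniform distribution, so statistical queries with inverse-polynomial tolerance reveal essentially nothing about which parity is the target. A sketch of the elimination step, assuming each example is a list of n bits labelled by an unknown parity:

    def learn_parity(examples):
        """Find a parity function consistent with noise-free labelled examples.

        examples -- list of (x, label) pairs, x a list of n bits, label the
                    parity of x over some unknown subset of coordinates.
        Returns a 0/1 vector w such that sum_i w[i]*x[i] mod 2 matches every label.
        """
        n = len(examples[0][0])
        rows = [list(x) + [label] for x, label in examples]  # augmented matrix over GF(2)
        pivot_cols, r = [], 0
        for col in range(n):
            pivot = next((i for i in range(r, len(rows)) if rows[i][col] == 1), None)
            if pivot is None:
                continue  # no pivot in this column; the variable stays free
            rows[r], rows[pivot] = rows[pivot], rows[r]
            for i in range(len(rows)):
                if i != r and rows[i][col] == 1:
                    rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]  # XOR = addition mod 2
            pivot_cols.append(col)
            r += 1
        w = [0] * n  # free variables set to 0: one consistent solution
        for i, col in enumerate(pivot_cols):
            w[col] = rows[i][n]
        return w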

Other Noise Models

The PAC learning model has also been extended to include malicious errors,[4][5] noise on inputs but not labels,[6][7][8] errors in on-line learning[9] and more.[10]

References

  1. ^ Angluin, D., & Laird, P. (1988). Learning from noisy examples. Machine Learning, 2(4), 343-370.
  2. ^ Kearns, M. J., & Vazirani, U. V. (1994). An introduction to computational learning theory, chapter 5. MIT press.
  3. ^ a b Kearns, M. (1998). Efficient noise-tolerant learning from statistical queries. Journal of the ACM (JACM), 45(6), 983-1006. www.cis.upenn.edu/~mkearns/papers/sq-journal.pdf
  4. ^ Valiant, L. G. (1985, August). Learning Disjunctions of Conjunctions. In IJCAI (pp. 560-566).
  5. ^ Kearns, M., & Li, M. (1993). Learning in the presence of malicious errors. SIAM Journal on Computing, 22(4), 807-837. www.cis.upenn.edu/~mkearns/papers/malicious.pdf
  6. ^ Shackelford, G., & Volper, D. (1988, December). Learning k-DNF with noise in the attributes. In Proceedings of the first annual workshop on Computational learning theory (pp. 97-103). Morgan Kaufmann Publishers Inc.
  7. ^ Goldman, S. A., & Sloan, R. H. (1991). The difficulty of random attribute noise. Technical Report WUCS-91-29, Washington University, Department of Computer Science.
  8. ^ Sloan, R. H. (1989). Computational learning theory: New models and algorithms (Doctoral dissertation, Massachusetts Institute of Technology).
  9. ^ Littlestone, N. (1991, August). Redundant noisy attributes, attribute errors, and linear-threshold learning using winnow. In Proceedings of the fourth annual workshop on Computational learning theory (pp. 147-156). Morgan Kaufmann Publishers Inc.
  10. ^ Laird, P. D. (1988). Learning from good and bad data. Kluwer Academic Publishers.

Category:Computational learning theory Category:Theoretical computer science Category:Machine learning