7. Computational Learning Theory 신수용 2018-11-27
Overview
- Introduction
- PAC
- Sample Complexity for Finite Hypothesis Spaces
- Sample Complexity for Infinite Hypothesis Spaces
- Mistake Bound Model
7.1 Introduction
Questions that arise when we study machine learning:
- Can we tell whether a learning problem is inherently easy or hard, independently of the learning algorithm?
- Can we tell how many training examples are needed for successful learning?
- What effect does allowing the learner to pose queries to the trainer have?
- How many mistakes will the learner make before it finishes learning?
- Can we characterize the computational complexity of a learning problem?
7.1 (2)
Our goal is to answer these questions in terms of:
- Sample complexity: the number of training examples the learner needs in order to converge to a successful hypothesis
- Computational complexity: the computation time the learner needs in order to converge to a successful hypothesis
- Mistake bound: the number of training examples the learner misclassifies before converging to a successful hypothesis
7.2 Probably Learning an Approximately Correct Hypothesis
Problem setting
- X: the set of all possible instances
- C: the set of target concepts
- D: the probability distribution from which instances are drawn; condition: D is stationary (it does not change over time)
7.2 PAC (2)
Error of a Hypothesis
- True error error_D(h): the true error of h with respect to target concept c and distribution D is the probability that h will misclassify an instance drawn at random according to D:
  error_D(h) ≡ Pr_{x ~ D}[c(x) ≠ h(x)]
- Training error: the fraction of training examples misclassified by h
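To make the distinction between the true error and the training error concrete, here is a minimal Python sketch; the uniform instance distribution, the threshold concept, and the names target_c, hypothesis_h, error_rate are illustrative assumptions, not anything from the slides.

```python
import random

# Toy setup: instances are integers in [0, 100), drawn uniformly (this plays the role of D).
def target_c(x):          # the target concept c
    return x < 50

def hypothesis_h(x):      # a hypothesis h that disagrees with c exactly on [45, 50)
    return x < 45

def error_rate(h, c, instances):
    """Fraction of the given instances that h misclassifies."""
    return sum(h(x) != c(x) for x in instances) / len(instances)

random.seed(0)
train = [random.randrange(100) for _ in range(20)]       # a small training sample
large = [random.randrange(100) for _ in range(100_000)]  # approximates drawing from D

print("training error:", error_rate(hypothesis_h, target_c, train))
print("estimated true error error_D(h):", error_rate(hypothesis_h, target_c, large))
# Here error_D(h) is exactly 5/100 = 0.05, while the training error measured on only
# 20 examples can easily differ from 0.05 because the sample is small.
```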
7.2 PAC (3)
Sample error
- The sample error of h with respect to a set S of samples is the fraction of S misclassified by h.
- When S is the set of training examples, the sample error equals the training error.
PAC Learnability
- Requiring error_D(h) = 0 is not workable: unless every possible instance is provided as a training example, several hypotheses remain consistent with the given training examples, and the learner cannot be certain it has chosen the one that matches the target concept.
7.2 PAC (4)
- Because the training examples are chosen at random, a misleading (unrepresentative) training sample may be selected.
- We therefore weaken the requirements:
  - The error need not be zero; an error smaller than some sufficiently small constant ε is allowed.
  - The learner need not succeed on every training sequence; a failure probability smaller than some sufficiently small constant δ is allowed.
- In short, we only require that the learner probably learn a hypothesis that is approximately correct (PAC learning).
7.2 PAC (5)
PAC-learnable
- C is PAC-learnable when a polynomial-time learning algorithm (and a polynomial bounding function) exists.
- Definition: C is PAC-learnable by L using H if, for every c ∈ C, every distribution D over X, every ε with 0 < ε < 1/2, and every δ with 0 < δ < 1/2, learner L will with probability at least (1 − δ) output a hypothesis h ∈ H such that error_D(h) ≤ ε, in time that is polynomial in 1/ε, 1/δ, n, and size(c).
7.3 Sample Complexity for Finite Hypothesis Spaces
- Consistent learner: a learner that outputs hypotheses that perfectly fit the training data.
- Version space VS_{H,D}: the set of all hypotheses in H that correctly classify the training examples D.
7.3 Sample Complexity for Finite Hypothesis Spaces (1)
- The number of examples a consistent learner needs equals the number of examples required before the version space contains only acceptable hypotheses.
- ε-exhausted version space: VS_{H,D} is said to be ε-exhausted with respect to c and D if every hypothesis h in VS_{H,D} has error less than ε with respect to c and D, i.e., all the hypotheses consistent with the observed training examples happen to have true error less than ε.
7.3 Sample Complexity for Finite Hypothesis Spaces (3)
ε-exhausting the version space
- Theorem: if the hypothesis space H is finite and D is a sequence of m ≥ 1 independently drawn random examples of the target concept c, then for any 0 ≤ ε ≤ 1, the probability that the version space VS_{H,D} is not ε-exhausted is less than or equal to |H|·e^(−εm).
- Introducing the PAC requirement: bounding this probability by δ and solving |H|·e^(−εm) ≤ δ for m gives the sample complexity m ≥ (1/ε)(ln|H| + ln(1/δ)).
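A small Python sketch of this bound (the numbers |H| = 100, ε = 0.1, δ = 0.05 are illustrative assumptions, not values from the slides):

```python
import math

def finite_H_sample_complexity(H_size, eps, delta):
    """Smallest m satisfying m >= (1/eps) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(H_size) + math.log(1 / delta)) / eps)

# With |H| = 100, eps = 0.1 and delta = 0.05, 77 training examples suffice for a
# consistent learner to be probably (1 - delta) approximately (eps) correct.
print(finite_H_sample_complexity(100, eps=0.1, delta=0.05))   # -> 77
```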
7.3 (4) Agnostic Learning and Inconsistent Hypotheses
- Agnostic learner: a learner that makes no assumption about whether the target concept is representable in H and simply finds the hypothesis with the minimum training error.
- h_best: the hypothesis from H having the lowest training error over the training examples.
7.3 (5)
- Error in practical learning: Hoeffding bounds.
- The restriction so far: error_D(h_best) = 0, i.e. the target concept is representable in H; agnostic learning drops this assumption.
- Hoeffding bounds characterize the deviation between the true probability of some event and its observed frequency over m independent trials: Pr[error_D(h) > error_S(h) + ε] ≤ e^(−2mε²).
- Applied to all |H| hypotheses in H, this yields the sample complexity m ≥ (1/(2ε²))(ln|H| + ln(1/δ)).
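A hedged numerical illustration of the agnostic bound (the parameters are the same assumed values used in the previous sketch, not values from the slides); note how the 1/ε² dependence makes it far more demanding than the consistent-learner bound:

```python
import math

def m_agnostic(H_size, eps, delta):
    """Smallest m satisfying m >= (1/(2*eps**2)) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(H_size) + math.log(1 / delta)) / (2 * eps ** 2))

# For |H| = 100, eps = 0.1, delta = 0.05 the consistent-learner bound gave 77 examples;
# the agnostic bound, which only promises error within eps of the best hypothesis in H,
# requires many more.
print(m_agnostic(100, eps=0.1, delta=0.05))   # -> 381
```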
7.3 (6) Conjunctions of Boolean Literals Are PAC-Learnable
- For conjunctions of up to n boolean literals, |H| = 3^n, so the sample complexity m ≥ (1/ε)(n·ln 3 + ln(1/δ)) is polynomial in n, 1/ε, and 1/δ.
- Using the FIND-S algorithm, boolean conjunctions are therefore PAC-learnable.
PAC-Learnability of Other Concept Classes
- Unbiased learners
7.3 (7) k-term DNF and k-CNF Concepts
- Unbiased learners: when the set C of all definable target concepts corresponds to the power set of X, the class is unbiased and the sample complexity is exponential in n.
- k-term DNF concepts: a class with polynomial sample complexity that nevertheless cannot be learned in polynomial time (unless RP = NP).
- k-CNF: every k-term DNF expression can be rewritten as a k-CNF expression, and k-CNF is learnable in polynomial time, so k-term DNF is PAC-learnable using H = k-CNF.
7.4 Sample Complexity for Infinite Hypothesis Spaces
- Drawbacks of expressing sample complexity in terms of |H|:
  - The resulting bounds are quite weak.
  - An infinite hypothesis space cannot be handled at all.
- An alternative measure of H's complexity (instead of |H|): the Vapnik-Chervonenkis dimension of H (VC dimension, or VC(H)).
  - Gives tighter bounds.
  - Measured by the number of distinct instances from X that can be completely discriminated using H.
7.4 Sample Complexity for Infinite Hypothesis Spaces (2)
Shattering a Set of Instances
- A set of instances S is shattered by hypothesis space H if and only if for every dichotomy of S there exists some hypothesis in H consistent with this dichotomy.
The Vapnik-Chervonenkis Dimension
- VC(H) is the size of the largest finite subset of X shattered by H.
- If arbitrarily large finite sets of X can be shattered by H, then VC(H) ≡ ∞.
7.4 Sample Complexity for Infinite Hypothesis Spaces (3)
7.4 Sample Complexity for Infinite Hypothesis Spaces (4)
Illustrative Examples
- X: the set of real numbers; H: the set of intervals on the real number line.
- S = {3.1, 5.7}: four hypotheses suffice to realize every dichotomy, e.g. (1 < x < 2), (1 < x < 4), (4 < x < 7), (1 < x < 7), so VC(H) ≥ 2 (at least 2).
- S = {1.5, 4.2, 5.1}: not shattered (no interval can include the two outer points while excluding the middle one), so VC(H) = 2 (final).
- Note: H is infinite, yet VC(H) is finite.
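The interval example can be checked mechanically. Below is a brute-force Python sketch; the particular candidate intervals are assumptions added for illustration (the four from the slide plus a few extra endpoints):

```python
def interval_shatters(points, candidates):
    """True if every dichotomy of `points` is realized by some interval (a, b)
    in `candidates`, where a point is labeled positive iff a < x < b."""
    n = len(points)
    for mask in range(2 ** n):                        # enumerate all 2^n dichotomies
        wanted = {p for i, p in enumerate(points) if mask & (1 << i)}
        if not any(wanted == {p for p in points if a < p < b} for a, b in candidates):
            return False
    return True

# Candidate intervals: the four from the slide plus a few extra endpoints.
candidates = [(1, 2), (1, 4), (4, 7), (1, 7), (4, 6), (3, 5), (1, 5)]

print(interval_shatters([3.1, 5.7], candidates))       # True  -> VC(H) >= 2
print(interval_shatters([1.5, 4.2, 5.1], candidates))  # False
# The second call fails on the dichotomy {1.5, 5.1} positive / 4.2 negative, which no
# interval whatsoever can realize, so no set of three points is shattered and VC(H) = 2.
```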
7.4 (5)
- X: points on the x-y plane; H: all linear decision surfaces in the plane.
- VC(H) = 3: three non-collinear points can be shattered, but no set of four points can.
- In general, the VC dimension of linear decision surfaces in an r-dimensional space is r + 1.
7.4 (6) Sample Complexity and the VC Dimension
- Upper bound on the number of training examples (a number sufficient for successful learning): m ≥ (1/ε)(4·log₂(2/δ) + 8·VC(H)·log₂(13/ε)).
- Lower bound on sample complexity (a number necessary for successful learning): m ≥ max[(1/ε)·log(1/δ), (VC(C) − 1)/(32ε)].
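A small Python sketch of the VC-based upper bound (the parameters VC(H) = 3, ε = 0.1, δ = 0.05 are illustrative assumptions; VC(H) = 3 matches the linear-decision-surface example above):

```python
import math

def vc_upper_bound(vc_dim, eps, delta):
    """Smallest m satisfying m >= (1/eps) * (4*log2(2/delta) + 8*VC(H)*log2(13/eps))."""
    return math.ceil((4 * math.log2(2 / delta) + 8 * vc_dim * math.log2(13 / eps)) / eps)

print(vc_upper_bound(3, eps=0.1, delta=0.05))   # -> 1899 examples suffice
```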
7.5 The Mistake Bound Model of Learning
- Question: the total number of mistakes the learner makes before it converges to the correct hypothesis.
Mistake Bound for the FIND-S Algorithm
- FIND-S:
  - Initialize h to the most specific hypothesis (the conjunction of every literal and its negation).
  - For each positive training instance x, remove from h any literal that is not satisfied by x.
  - Output hypothesis h.
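A minimal runnable sketch of FIND-S for conjunctions of boolean literals; the instance encoding (tuples of booleans) and the example data are assumptions added for illustration:

```python
def find_s(examples, n):
    """examples: list of (x, label) where x is a tuple of n booleans.
    The hypothesis h is kept as a set of literals ('x1', '~x1', ...); the initial,
    most specific hypothesis contains every literal and its negation."""
    h = {f"x{i+1}" for i in range(n)} | {f"~x{i+1}" for i in range(n)}
    for x, label in examples:
        if label:                                  # only positive examples change h
            for i, value in enumerate(x):
                # drop whichever literal this instance fails to satisfy
                h.discard(f"~x{i+1}" if value else f"x{i+1}")
    return h

examples = [((True, False, True), True),
            ((True, True, True), True),
            ((False, False, True), False)]
print(sorted(find_s(examples, 3)))   # -> ['x1', 'x3'], i.e. the hypothesis x1 AND x3
```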
7.5 The Mistake Bound Model of Learning (2)
- Worst-case mistake bound for FIND-S: n + 1 (the first mistake removes n of the 2n initial literals; each subsequent mistake removes at least one more).
Mistake Bound for the HALVING Algorithm
- Halving algorithm: maintain the version space and predict by majority vote of its members, then eliminate the hypotheses that voted incorrectly.
- Worst-case mistake bound: log₂|H|, since every mistake removes at least half of the surviving hypotheses.
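A runnable sketch of the Halving algorithm; the explicit hypothesis space of threshold functions and the example stream are assumptions chosen so the bound is easy to check:

```python
def halving(hypotheses, stream):
    """hypotheses: callables x -> bool (the initial version space).
    stream: iterable of (x, true_label). Returns the number of mistakes made."""
    version_space = list(hypotheses)
    mistakes = 0
    for x, y in stream:
        votes = [h(x) for h in version_space]
        prediction = votes.count(True) >= votes.count(False)   # majority vote
        if prediction != y:
            mistakes += 1
        # keep only the hypotheses consistent with the revealed label
        version_space = [h for h, v in zip(version_space, votes) if v == y]
    return mistakes

# |H| = 8 threshold hypotheses h_t(x) = (x >= t); the labels come from t = 5.
hypotheses = [lambda x, t=t: x >= t for t in range(8)]
stream = [(x, x >= 5) for x in (3, 6, 4, 5, 7, 2)]
print(halving(hypotheses, stream))   # -> 1, within the worst-case bound log2(8) = 3
```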
7.5 The Mistake Bound Model of Learning (3)
Optimal Mistake Bounds
- M_A(C): the maximum number of mistakes algorithm A makes while learning a concept from C.
- Optimal mistake bound Opt(C): the minimum over all possible learning algorithms A of M_A(C).
7.5 (4) WEIGHTED-MAJORITY Algorithm
- Relation between the VC dimension and the mistake bound: VC(C) ≤ Opt(C) ≤ M_Halving(C) ≤ log₂(|C|).
- WEIGHTED-MAJORITY generalizes the Halving algorithm: it takes a weighted vote among a pool of prediction algorithms and down-weights, rather than eliminates, the ones that make mistakes.
- Relative mistake bound for WEIGHTED-MAJORITY: the number of mistakes is bounded in terms of the number of mistakes k made by the best predictor in the pool; with β = 1/2 the bound is 2.4(k + log₂ n) for a pool of n predictors.
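A runnable sketch of WEIGHTED-MAJORITY with β = 1/2; the pool of threshold "experts" and the data stream are assumptions added for illustration:

```python
def weighted_majority(experts, stream, beta=0.5):
    """experts: callables x -> bool. stream: iterable of (x, true_label).
    Every expert starts with weight 1; an expert that errs is multiplied by beta."""
    weights = [1.0] * len(experts)
    mistakes = 0
    for x, y in stream:
        predictions = [e(x) for e in experts]
        yes = sum(w for w, p in zip(weights, predictions) if p)
        no = sum(w for w, p in zip(weights, predictions) if not p)
        prediction = yes >= no                       # weighted vote
        if prediction != y:
            mistakes += 1
        weights = [w * beta if p != y else w         # penalize the experts that erred
                   for w, p in zip(weights, predictions)]
    return mistakes

# n = 10 threshold experts; the labels follow the expert x >= 5 exactly, so k = 0.
experts = [lambda x, t=t: x >= t for t in range(10)]
stream = [(x, x >= 5) for x in (1, 8, 5, 3, 9, 4, 6, 2)]
print(weighted_majority(experts, stream))
# -> 0 here; the relative bound guarantees at most 2.4 * (k + log2(n)) ≈ 8 mistakes.
```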