On Approximately Identifying Concept Classes in the Limit

Satoshi Kobayashi

Takashi Yokomori

Department of Computer Science and Information Mathematics, The University of Electro-Communications, 1-5-1, Chofugaoka, Chofu, Tokyo 182, Japan, e-mail:fsatoshi,[email protected]

Abstract. In this paper, we introduce various kinds of approximations of a concept and propose a framework of approximate learning for the case where a target concept may lie outside the hypothesis space. We present some characterization theorems for approximate identifiability. In particular, we show the remarkable result that upper-best approximate identifiability from complete data collapses into upper-best approximate identifiability from positive data. Further, some other characterizations of approximate identifiability from positive data are presented, where we establish a relationship between approximate identifiability and some important notions in quasi-order theory and topology. The results obtained in this paper are essentially related to the closure property of concept classes under infinite intersections (or infinite unions). We also show that there exist some interesting example concept classes with such properties (including specialized EFS's) by which an upper-best approximation of any concept can be identified in the limit from positive data.

1 Introduction

Although computational learning theory has provided various kinds of frameworks for analysing the process of learning from a computational point of view, most learning models make the strong assumption that a target concept belongs to a fixed class of concepts, called the hypothesis space. While such a limitation on the target concept permits rigorous studies of the computational complexity of learning some fixed classes of concepts, it sometimes causes the learning process to diverge in practice when a target concept lies outside the hypothesis space. One approach to overcoming such difficulties is, as studied in [Muk93a], to equip a learner with the ability to refute the hypothesis space in case a target concept is not contained in it. Another is to permit a learner to output an approximate concept, such as a minimal concept, when a target concept is outside the hypothesis space, which was recently studied in [Muk94] and [Sak91] using the framework of Gold's identification in the limit from positive data [Gol67]. This paper closely concerns the latter approach.

On the other hand, some interesting works attempting to generalize Valiant's Probably Approximately Correct (PAC) learning model [Val84] have been reported, where the assumption on the target concept (or function) is somehow weakened [PV88][Hau92][KSS92]. In [KSS92], Kearns et al. proposed an extended PAC learning model, called agnostic learning, in which no assumptions are made on the target concept (function). They provided some positive and negative results which outline the possibilities of agnostic learning. This paper can also be regarded as an attempt to investigate the possibilities of agnostic learning in the framework of identification in the limit.

In this paper, we introduce various kinds of approximations of a concept. Based on those notions, we propose a framework for analyzing approximate learnability in case a target concept may lie outside the hypothesis space. We present some characterization theorems for approximate identifiability. In particular, we show the remarkable result that upper-best approximate identifiability from complete data collapses into upper-best approximate identifiability from positive data. Further, several characterizations of approximate identifiability from positive data are presented, where we establish relationships between approximate identifiability and some important notions in quasi-order theory and topology. The results obtained in this paper are essentially related to the closure property of concept classes under infinite intersections (or infinite unions). We also show that there exist some rich and interesting example concept classes with such properties (including specialized EFS's) by which an upper-best approximation of any concept can be identified in the limit from positive data.

2 Preliminaries

2.1 Fundamental Definitions and Notations

A universal set $U$ is a recursively enumerable set. A subset of $U$ is called a concept, and a set of concepts is called a concept class. The set of all concepts over $U$ is denoted by $2^U$. For any concept $L$, we also regard $L$ itself as a characteristic function from $U$ to $\{0,1\}$ such that for any $w \in U$, $L(w) = 1$ iff $w \in L$. For any concept class $\mathcal{C}$, the complementary class of $\mathcal{C}$ is defined to be $\{U - L \mid L \in \mathcal{C}\}$ and is denoted by $Cmp(\mathcal{C})$. By $\subseteq$ ($\subset$), we denote the inclusion (proper inclusion) relation. A concept class $\mathcal{C}$ has an infinite ascending (descending) sequence iff there exists an infinite sequence of concepts $L_1, L_2, \ldots$ in $\mathcal{C}$ such that $L_1 \subset L_2 \subset \cdots$ ($L_1 \supset L_2 \supset \cdots$).

A binary relation $\leq$ is a quasi-order on $U$ iff for each $a$, $b$, $c$ in $U$, the following properties hold:

(1) $a \leq a$ (reflexivity),
(2) if $a \leq b$ and $b \leq c$, then $a \leq c$ (transitivity).

In case $a \leq b$ and $b \leq a$, we say that $a$ and $b$ are equivalent with respect to $\leq$. We write $a < b$ iff $a \leq b$ and $b \not\leq a$. In case $a \not\leq b$ and $b \not\leq a$, we say that $a$ and $b$ are incomparable. We know the following fundamental fact.

Lemma 1. Let $S$ be an infinite set of nonequivalent elements with respect to a quasi-order $\leq$ on $U$. Then, $S$ contains either an infinite linearly ordered sequence or an infinite set of pairwise incomparable elements.

We say that a quasi-order $\leq$ on $U$ is a well quasi-order ([Kru72]) iff the following conditions hold:

(1) there exists no infinite sequence $w_1, w_2, \ldots$ of elements in $U$ such that for any positive integer $i$, $w_{i+1} < w_i$ holds,
(2) there exists no infinite set of pairwise incomparable elements in $U$.

We say that a concept $L$ is a lower (an upper) set with respect to $\leq$ iff for any $w_1$ and $w_2$ in $U$ such that $w_1 \leq w_2$ ($w_2 \leq w_1$), $w_2 \in L$ implies $w_1 \in L$. The concept class consisting of all concepts that are upper sets with respect to a quasi-order $\leq$ is called an upper-closed class with respect to $\leq$, and is denoted by $Cls(\leq)$. For any concept class $\mathcal{C}$, we define a quasi-order induced by $\mathcal{C}$, denoted by $\leq_{\mathcal{C}}$, as follows: for any $w_1$, $w_2$ in $U$,

$$w_1 \leq_{\mathcal{C}} w_2 \quad\text{iff}\quad \forall L \in \mathcal{C}\ (w_1 \in L \text{ implies } w_2 \in L).$$

Example 1. Let $\Sigma$ be a finite alphabet and $\Sigma^*$ be the set of all words over $\Sigma$. In this example, we consider $\Sigma^*$ as a universal set. We define $w_1 \leq_{sp} w_2$ iff $w_2$ is a supersequence of $w_1$. The subsequence order $\leq_{sb}$ is defined by: $w_1 \leq_{sb} w_2$ iff $w_2$ is a subsequence of $w_1$. By $SPC$ and $SBC$, we denote the upper-closed classes with respect to $\leq_{sp}$ and $\leq_{sb}$, respectively. Then, $\leq_{SPC}$ and $\leq_{SBC}$ coincide with $\leq_{sp}$ and $\leq_{sb}$, respectively. □
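The coincidence claimed in Example 1 can be checked by brute force on a finite fragment of $\Sigma^*$. The following is only a sketch under stated assumptions: the universe is truncated to words of length at most 3 over $\{a, b\}$, and the names `is_subsequence`, `induced_leq`, `up_set` and `principal` are hypothetical helpers introduced here, not notation from the paper. It uses the fact that every upper set is a union of principal upper sets, so the principal ones already induce the same quasi-order.

```python
from itertools import product


def is_subsequence(u: str, w: str) -> bool:
    """Return True if u can be obtained from w by deleting letters."""
    it = iter(w)
    return all(ch in it for ch in u)


def induced_leq(concepts, w1, w2) -> bool:
    """w1 <=_C w2 iff every concept containing w1 also contains w2."""
    return all(w2 in L for L in concepts if w1 in L)


# Assumption: a truncated universe, used only to keep the check finite.
universe = ["".join(p) for n in range(4) for p in product("ab", repeat=n)]


def up_set(w):
    """Principal upper set of w under the supersequence order <=_sp."""
    return frozenset(v for v in universe if is_subsequence(w, v))


# Principal upper sets generate all upper sets by unions.
principal = [up_set(w) for w in universe]

# The induced quasi-order agrees with <=_sp on every pair of words.
agrees = all(
    induced_leq(principal, w1, w2) == is_subsequence(w1, w2)
    for w1 in universe
    for w2 in universe
)
```

Here `agrees` evaluates to `True` on the truncated universe; the check says nothing about the full infinite case, which is what the example in the text asserts.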

2.2 Topological Spaces

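Let $X$ be a non-empty set and $\mathcal{T}$ be any class of subsets of $X$. Then we define

$$\bigcup \mathcal{T} = \{x \mid x \in A \text{ for some } A \in \mathcal{T}\}, \qquad \bigcap \mathcal{T} = \{x \mid x \in A \text{ for every } A \in \mathcal{T}\}.$$

For an empty class, we define $\bigcup \emptyset = \emptyset$ and $\bigcap \emptyset = X$. A class $\mathcal{T}$ of subsets of $X$ is closed under finite unions (finite intersections) iff for any finite subclass $\mathcal{G}$ of $\mathcal{T}$, $\bigcup \mathcal{G}$ ($\bigcap \mathcal{G}$) is contained in $\mathcal{T}$ as an element. A class $\mathcal{T}$ of subsets of $X$ is closed under infinite unions (infinite intersections) iff for any (possibly infinite) subclass $\mathcal{G}$ of $\mathcal{T}$, $\bigcup \mathcal{G}$ ($\bigcap \mathcal{G}$) is contained in $\mathcal{T}$ as an element. Let $X$ be a non-empty set. A class $\mathcal{T}$ of subsets of $X$ is called a topology on $X$ iff it satisfies the following conditions: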

(1) $\mathcal{T}$ is closed under infinite unions,
(2) $\mathcal{T}$ is closed under finite intersections.

A topological space $(X, \mathcal{T})$ consists of two objects: a non-empty set $X$ and a topology $\mathcal{T}$ on $X$. The sets in the class $\mathcal{T}$ are called open sets of the topological space $(X, \mathcal{T})$. An element in $X$ is called a point. A closed set in a topological space $(X, \mathcal{T})$ is a set whose complement is an open set. If $A$ is a subset of $X$, then its closure, denoted by $Cs(A)$, is the intersection of all closed supersets of $A$. If $A$ is a subset of $X$, then its interior, denoted by $It(A)$, is the union of all open subsets of $A$. For any open set $A$ of a topological space $(X, \mathcal{T})$, a class $\mathcal{G}$ of open subsets of $X$ is called an open cover of $A$ iff each point in $A$ belongs to at least one element in $\mathcal{G}$. A subclass of an open cover which is itself an open cover is called a subcover. An open set $A$ is said to be compact iff every open cover of $A$ has a finite subcover.


2.3 Fundamentals for Identification in the Limit

Throughout this paper, let $U$ be a universal set. For a given concept $L$, a positive presentation of $L$ is an infinite sequence $\sigma = w_1, w_2, \ldots$ such that $\{w_i \mid i \geq 1\} = L$. In case $L = \emptyset$, a positive presentation of $L$ is an infinite sequence of a special symbol $\#$ such that $\# \notin U$. A negative presentation of $L$ is defined to be a positive presentation of $U - L$. For a given concept $L$, a complete presentation of $L$ is an infinite sequence $\sigma = (w_1, l_1), (w_2, l_2), \ldots$ of pairs on $U \times \{0,1\}$ such that $\{w_i \mid i \geq 1\} = U$ and $\sigma$ contains $(w, 1)$ iff $w \in L$. An element $w$ such that $(w, 1)$ (respectively, $(w, 0)$) appears in $\sigma$ is called a positive example (respectively, a negative example) in $\sigma$.

Let $N$ be the set of all positive integers. A class $\mathcal{C} = \{L_i\}_{i \in N}$ of concepts is an indexed family of recursive concepts (or indexed family, for short) iff there exists a recursive function $f : N \times U \to \{0,1\}$ such that

$$f(i, w) = \begin{cases} 1 & \text{if } w \in L_i, \\ 0 & \text{otherwise.} \end{cases}$$

Let $L$ be any concept in $\mathcal{C}$. We say that an algorithm $M$ identifies $L$ in the limit from positive data iff for any positive presentation of $L$, the infinite sequence $g_1, g_2, g_3, \ldots$ of integers produced by $M$ converges to an integer $n$ such that $L = L_n$. An indexed family $\mathcal{C}$ is identifiable in the limit from positive data iff there exists an algorithm $M$ such that $M$ identifies every concept in $\mathcal{C}$ in the limit from positive data. Identifiability from complete data and from negative data is defined in a similar manner. Note that an indexed family $\mathcal{C}$ may contain the empty concept $\emptyset$ in our framework, where the following fundamental results on identifiability from positive data hold as in the original framework [Ang80].

A finite subset $T$ of $L_i$ in $\mathcal{C}$ is a finite tell-tale of $L_i$ iff for any $L_j$ in $\mathcal{C}$, $T \subseteq L_j$ implies $L_j \not\subset L_i$. Angluin showed an important result characterizing identifiability in the limit from positive data.

Theorem 2. [Ang80] An indexed family $\mathcal{C}$ is identifiable in the limit from positive data iff there exists an effective procedure that on any input $i$ enumerates a finite tell-tale of $L_i \in \mathcal{C}$. □

We say that an indexed family $\mathcal{C}$ has infinite elasticity iff there exist an infinite sequence $w_0, w_1, w_2, \ldots$ of elements in $U$ and an infinite sequence $L_1, L_2, \ldots$ of concepts in $\mathcal{C}$ such that, for any $k \geq 1$, $\{w_0, w_1, \ldots, w_{k-1}\} \subseteq L_k$ and $w_k \notin L_k$ hold. An indexed family $\mathcal{C}$ has finite elasticity iff $\mathcal{C}$ does not have infinite elasticity. An indexed family $\mathcal{C}$ has finite thickness iff for any $w$, the cardinality of the set $\{L \in \mathcal{C} \mid w \in L\}$ is finite. Finite thickness is a sufficient condition for finite elasticity.
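To make the tell-tale condition concrete, the following sketch searches for a minimum-size tell-tale in a finite class of finite concepts. It is an illustration only: the brute-force search and the names `is_tell_tale` and `smallest_tell_tale` are assumptions introduced here, and the effective procedure required by Theorem 2 for infinite indexed families is a different matter.

```python
from itertools import combinations


def is_tell_tale(T, Li, concepts) -> bool:
    """T is a tell-tale of Li iff T ⊆ Li and no concept Lj has T ⊆ Lj ⊂ Li."""
    T = frozenset(T)
    return T <= Li and not any(T <= Lj and Lj < Li for Lj in concepts)


def smallest_tell_tale(Li, concepts):
    """Brute-force search for a minimum-size tell-tale of a finite concept Li."""
    elems = sorted(Li)
    for size in range(len(elems) + 1):
        for T in combinations(elems, size):
            if is_tell_tale(T, Li, concepts):
                return frozenset(T)
    return None  # not reached for finite Li: Li itself is always a tell-tale


# Hypothetical toy class: a chain of four concepts over {1, 2, 3}.
chain = [frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})]
tell_tales = [smallest_tell_tale(L, chain) for L in chain]
```

In this chain the tell-tale of each concept must contain an element that separates it from the concept just below it, so `tell_tales` consists of the empty set followed by `{1}`, `{2}` and `{3}`.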

Theorem 3. [Wri89][MSW90] An indexed family $\mathcal{C}$ is identifiable in the limit from positive data if $\mathcal{C}$ has finite elasticity. □

As is proved in [MS93], various kinds of operations on concept classes preserve the property of finite elasticity. Let $\mathcal{C}$ be a concept class. By $Intsct(\mathcal{C})$, we denote the smallest class that contains $\mathcal{C} \cup \{\emptyset\}$ and is closed under finite intersections. Later we will use the next lemma.

Lemma 4. [KY94a] Let $\mathcal{C}$ be an indexed family which has finite elasticity. Then, $Intsct(\mathcal{C})$ has finite elasticity. □

2.4 A Framework of Approximate Learning

Let $\mathcal{C}$ be a concept class and $X$ be a concept (not always in $\mathcal{C}$). A concept $Y \in \mathcal{C}$ is called a $\mathcal{C}$-upper approximation of a concept $X$ iff $X \subseteq Y$ and for any concept $C \in \mathcal{C}$ such that $X \subseteq C$, $C \not\subset Y$ holds. A concept $Y \in \mathcal{C}$ is called a $\mathcal{C}$-upper-best approximation of a concept $X$ iff $X \subseteq Y$ and for any concept $C \in \mathcal{C}$ such that $X \subseteq C$, $Y \subseteq C$ holds. A concept $Y \in \mathcal{C}$ is called a $\mathcal{C}$-lower approximation of a concept $X$ iff $Y \subseteq X$ and for any concept $C \in \mathcal{C}$ such that $C \subseteq X$, $Y \not\subset C$ holds. A concept $Y \in \mathcal{C}$ is called a $\mathcal{C}$-lower-best approximation of a concept $X$ iff $Y \subseteq X$ and for any concept $C \in \mathcal{C}$ such that $C \subseteq X$, $C \subseteq Y$ holds. By $\overline{\mathcal{C}}X$ ($\underline{\mathcal{C}}X$), we denote the $\mathcal{C}$-upper-best ($\mathcal{C}$-lower-best) approximation of a concept $X$.

$\mathcal{C}$ has the upper approximation property (u.a.p.) iff for any concept $X$, there exists a $\mathcal{C}$-upper approximation of $X$. $\mathcal{C}$ has the upper-best approximation property (u.b.a.p.) iff for any concept $X$, there exists a $\mathcal{C}$-upper-best approximation of $X$. Similarly, $\mathcal{C}$ has the lower approximation property (l.a.p.) iff for any concept $X$, there exists a $\mathcal{C}$-lower approximation of $X$. $\mathcal{C}$ has the lower-best approximation property (l.b.a.p.) iff for any concept $X$, there exists a $\mathcal{C}$-lower-best approximation of $X$. Then, we have the following.

Lemma 5. A concept class $\mathcal{C}$ has u.b.a.p. iff $\mathcal{C}$ is closed under infinite intersections.

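Proof. For the if part, assume that $\mathcal{C}$ is closed under infinite intersections. Let $S$ be any concept. We have $\mathcal{F} = \{C \in \mathcal{C} \mid S \subseteq C\} \neq \emptyset$, since $U = \bigcap \emptyset \in \mathcal{C}$. Let $X = \bigcap \mathcal{F}$. Then, we have $S \subseteq X$, and $X \in \mathcal{C}$ by the closure assumption. Further, for any concept $C' \in \mathcal{C}$ such that $S \subseteq C'$, $X \subseteq C'$ holds, since $C' \in \mathcal{F}$. Therefore, $X \in \mathcal{C}$ is a $\mathcal{C}$-upper-best approximation of $S$. Hence, $\mathcal{C}$ has u.b.a.p.

Conversely, let us assume that $\mathcal{C}$ has u.b.a.p., and consider any subclass $\mathcal{F}$ of $\mathcal{C}$. Let $X = \bigcap \mathcal{F}$ and define $Y = \overline{\mathcal{C}}X$. From the assumption, we have $Y \in \mathcal{C}$. By definition, $X \subseteq Y$ holds. We will show that $Y \subseteq X$ as follows. Assume that there exists an element $u$ such that $u \in Y$ and $u \notin X$. Then, by the definition of $X$, there exists a concept $A \in \mathcal{F}$ such that $u \notin A$. For the concept $A$, we have $X \subseteq A$ and $Y \not\subseteq A$, which contradicts the fact that $Y$ is the $\mathcal{C}$-upper-best approximation of $X$. Therefore, we have $Y \subseteq X$. Hence, $X = Y \in \mathcal{C}$ holds. This completes the proof. □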
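The construction in the proof of Lemma 5 can be tried out on a small finite universe, where closure under infinite intersections reduces to containing $U$ and being closed under pairwise intersections. The sketch below is an illustration under that finiteness assumption; `upper_best`, `closed_under_intersections`, `has_ubap` and the toy classes `good` and `bad` are hypothetical names introduced here.

```python
from functools import reduce
from itertools import combinations


def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    xs = list(universe)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def upper_best(X, concepts, universe):
    """Intersect all concepts containing X; the empty intersection is U."""
    X = frozenset(X)
    supersets = [C for C in concepts if X <= C]
    return reduce(frozenset.intersection, supersets, frozenset(universe))


def closed_under_intersections(concepts, universe) -> bool:
    """Finite case: U is a member and pairwise intersections stay inside."""
    cs = set(concepts)
    return frozenset(universe) in cs and all(A & B in cs for A in cs for B in cs)


def has_ubap(concepts, universe) -> bool:
    """Every X ⊆ U has an upper-best approximation iff the intersection of
    all concepts above X is itself a concept."""
    cs = set(concepts)
    return all(upper_best(X, cs, universe) in cs for X in powerset(universe))


U = {1, 2, 3}
good = [frozenset(), frozenset({1}), frozenset({1, 2}), frozenset(U)]
bad = [frozenset({1, 2}), frozenset({2, 3}), frozenset(U)]
```

For `good` both checks succeed, while for `bad` both fail, because the set `{2}` has two incomparable minimal supersets and their intersection is not a concept. This matches the equivalence stated in Lemma 5 on these two toy classes.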

In a similar manner, we have the following.

Lemma 6. A concept class $\mathcal{C}$ has l.b.a.p. iff $\mathcal{C}$ is closed under infinite unions. □

These results imply that concept classes with u.b.a.p. and l.b.a.p. satisfy the definition of a topology on $U$. We will present some topological characterizations of approximate identifiability from positive data in Section 4. Furthermore, it is easy to see the following.

Lemma 7. If a concept class $\mathcal{C}$ with $U \in \mathcal{C}$ ($\emptyset \in \mathcal{C}$) has no infinite descending (ascending) sequences, then $\mathcal{C}$ has u.a.p. (l.a.p.). □

Let $\mathcal{C}$ be an indexed family and $L$ be a (possibly non-recursively enumerable) concept. We say that an algorithm $M$ identifies a $\mathcal{C}$-upper ($\mathcal{C}$-upper-best) approximation of $L$ in the limit from positive data iff for any positive presentation of $L$, the infinite sequence $g_1, g_2, g_3, \ldots$ of integers produced by $M$ converges to some positive integer $n$ such that $L_n$ is a $\mathcal{C}$-upper ($\mathcal{C}$-upper-best) approximation of $L$. A concept class $\mathcal{C}_1$ is upper (upper-best) approximately identifiable in the limit from positive data by an indexed family $\mathcal{C}_2$ iff there exists an algorithm $M$ such that $M$ identifies a $\mathcal{C}_2$-upper ($\mathcal{C}_2$-upper-best) approximation of any concept $L \in \mathcal{C}_1$ in the limit from positive data.

We say that an algorithm $M$ identifies a $\mathcal{C}$-lower ($\mathcal{C}$-lower-best) approximation of $L$ in the limit from positive data iff for any positive presentation of $L$, the infinite sequence $g_1, g_2, g_3, \ldots$ of integers produced by $M$ converges to some positive integer $n$ such that $L_n$ is a $\mathcal{C}$-lower ($\mathcal{C}$-lower-best) approximation of $L$. A concept class $\mathcal{C}_1$ is lower (lower-best) approximately identifiable in the limit from positive data by an indexed family $\mathcal{C}_2$ iff there exists an algorithm $M$ such that $M$ identifies a $\mathcal{C}_2$-lower ($\mathcal{C}_2$-lower-best) approximation of any concept $L \in \mathcal{C}_1$ in the limit from positive data. In a similar manner, upper(-best) or lower(-best) approximate identifiability in the limit from complete data and from negative data is defined.

The following notion of M-finite thickness was introduced by Sato and Moriyama ([SM93]). An indexed family $\mathcal{C}$ satisfies the MEF-condition iff for any nonempty finite set $T \subseteq U$ and any $L_i \in \mathcal{C}$ with $T \subseteq L_i$, there exists a $\mathcal{C}$-upper approximation $L_j$ of $T$ such that $L_j \subseteq L_i$. An indexed family $\mathcal{C}$ satisfies the MFF-condition iff for any nonempty finite set $T \subseteq U$, the cardinality of $\{L_i \in \mathcal{C} \mid L_i \text{ is a } \mathcal{C}\text{-upper approximation of } T\}$ is finite. An indexed family $\mathcal{C}$ has M-finite thickness iff $\mathcal{C}$ satisfies both the MEF- and MFF-conditions.

Mukouchi presented an interesting result on a sufficient condition for upper approximate identifiability from positive data using the following lemma.

Lemma 8. [Muk94] Let $\mathcal{C}$ be an indexed family which satisfies the MEF-condition and has finite elasticity, let $L \subseteq U$ be a nonempty concept, and let $L_n \in \mathcal{C}$ be a concept.
(a) If $L \subseteq L_n$, then there exists a $\mathcal{C}$-upper approximation $L_j \in \mathcal{C}$ of $L$ such that $L_j \subseteq L_n$.
(b) If $L_n$ is a $\mathcal{C}$-upper approximation of $L$, then there exists a finite subset $T$ of $L$ such that $L_n$ is a $\mathcal{C}$-upper approximation of $T$. □

Theorem 9. [Muk94] If an indexed family $\mathcal{C}$ has M-finite thickness, finite elasticity, and $U$ as an element, then $2^U$ is upper approximately identifiable in the limit from positive data by $\mathcal{C}$. □

Further, Sato and Moriyama presented the following result.

Theorem 10. [SM93] If an indexed family $\mathcal{C}$ has M-finite thickness and every concept in $\mathcal{C}$ has a finite tell-tale, then $\mathcal{C}$ is identifiable in the limit from positive data. □

3 A Characterization of Upper-Best Approximate Identifiability from Positive Data

In the rest of the paper, we discuss the issue of upper-best approximately identifying the class $2^U$ in the limit by some indexed family. In this section, we present a characterization theorem for such approximate identifiability from positive data. By Theorem 2 and Theorem 10, we have the following.

Theorem 11. Let $\mathcal{C}$ be an indexed family with u.b.a.p. Then, the following are equivalent.
(1) $\mathcal{C}$ is identifiable in the limit from positive data.
(2) Every concept in $\mathcal{C}$ has a finite tell-tale. □

The following is a characterization of upper-best approximate identifiability from positive data.

Theorem 12. Let $\mathcal{C}$ be an indexed family with u.b.a.p. Then, the following are equivalent.
(1) $2^U$ is upper-best approximately identifiable in the limit from positive data by $\mathcal{C}$.
(2) $2^U$ is upper-best approximately identifiable in the limit from complete data by $\mathcal{C}$.
(3) $\mathcal{C}$ has finite elasticity.
(4) $\mathcal{C}$ has no infinite ascending sequences.

Proof.

(1)$\Rightarrow$(2): Immediate from the definition.

(2)$\Rightarrow$(4): Assume that $\mathcal{C}$ has an infinite ascending sequence $\tilde{L}_1 \subset \tilde{L}_2 \subset \tilde{L}_3 \subset \cdots$ and let $\mathcal{F} = \{\tilde{L}_i \mid i \geq 1\}$. Condition (2) implies that there exists an algorithm $M$ which upper-best approximately identifies $2^U$ in the limit from complete data with respect to $\mathcal{C}$. In the following, $\sigma = w_1, w_2, \ldots$ is a positive presentation of $U$ such that for any $i, j$ ($\geq 1$) with $i \neq j$, $w_i \neq w_j$ holds. We will define an infinite sequence $\tilde{L}_{p_0}, \tilde{L}_{p_1}, \ldots$ of concepts in $\mathcal{F}$ and an infinite sequence $(w_1, l_1), (w_2, l_2), \ldots$ of pairs on $U \times \{0,1\}$ fed to $M$ as follows:

Stage 0: $n_0 = 0$; $p_0 = 0$; $\tilde{L}_{p_0} = \emptyset$; initialize $M$;
Stage $i$ ($i \geq 1$):
(i) Let $\tilde{L}_{p_i}$ be a concept in $\mathcal{F}$ such that $\tilde{L}_{p_i} - (\tilde{L}_{p_{i-1}} \cup \{w_1, \ldots, w_{n_{i-1}}\}) \neq \emptyset$;
(ii) Additionally feed $M$ a sequence $(w_{n_{i-1}+1}, \tilde{L}_{p_i}(w_{n_{i-1}+1})), (w_{n_{i-1}+2}, \tilde{L}_{p_i}(w_{n_{i-1}+2})), (w_{n_{i-1}+3}, \tilde{L}_{p_i}(w_{n_{i-1}+3})), \ldots$ of pairs on $U \times \{0,1\}$ until the output of $M$ converges to some index $g_i$;
(iii) Let $n_i$ be the total number of pairs fed to $M$ up to this point;
(iv) Go to Stage $i+1$;

At the first step of each $i$th ($i \geq 1$) stage, $\tilde{L}_{p_i}$ can be defined, since $\tilde{L}_1 \subset \tilde{L}_2 \subset \tilde{L}_3 \subset \cdots$ is an infinite sequence and the cardinality of $\{w_1, \ldots, w_{n_{i-1}}\}$ is finite. At the second step of each $i$th ($i \geq 1$) stage, the output of $M$ converges to some index $g_i$, since for any concept $L$, $M$ can identify a $\mathcal{C}$-upper-best approximation of $L$ in the limit. Further, $\tilde{L}_{p_i} - \{w_1, \ldots, w_{n_{i-1}}\} \subseteq L_{g_i} \subseteq \tilde{L}_{p_i}$ holds for each $i \geq 1$, since at the $i$th stage, the set of all positive examples in a complete presentation fed to $M$ contains $\tilde{L}_{p_i} - \{w_1, \ldots, w_{n_{i-1}}\}$ and is contained in $\tilde{L}_{p_i}$. Therefore, we have for each $i \geq 1$, $L_{g_{i+1}} - L_{g_i} \supseteq \tilde{L}_{p_{i+1}} - \{w_1, \ldots, w_{n_i}\} - \tilde{L}_{p_i} \neq \emptyset$. Thus, $M$ changes its conjectures infinitely many times.

Then, we define a language $L_*$ using the infinite sequence $\sigma_\infty$ which is fed to $M$ in the definition above: $w \in L_*$ iff $(w, 1)$ belongs to $\sigma_\infty$. (Recall that $L_*$ may not be recursively enumerable.) The $\mathcal{C}$-upper-best approximation of $L_*$ cannot be identified in the limit from the complete presentation $\sigma_\infty$ by $M$, since $M$ on input $\sigma_\infty$ changes its conjectures infinitely many times. This is a contradiction.

(4)$\Rightarrow$(3): Assume that $\mathcal{C}$ has infinite elasticity. Then there exist an infinite sequence $w_0, w_1, w_2, \ldots$ of elements in $U$ and an infinite sequence $L_1, L_2, \ldots$ of concepts in $\mathcal{C}$ such that for any $k \geq 1$, $\{w_0, w_1, \ldots, w_{k-1}\} \subseteq L_k$ and $w_k \notin L_k$ hold. Let us define concepts $L'_i = \bigcap \{L_j \mid j \geq i\}$ ($i \geq 1$). We have that for each $i \geq 1$, $L'_i \in \mathcal{C}$, since $\mathcal{C}$ is closed under infinite intersections by Lemma 5. It is easy to see that $L'_1 \subset L'_2 \subset L'_3 \subset \cdots$ (the inclusions are proper because $w_i \in L'_{i+1} - L'_i$), which implies that $\mathcal{C}$ has an infinite ascending sequence. This completes the proof of this implication.

(3)$\Rightarrow$(1): Note that u.b.a.p. of $\mathcal{C}$ implies M-finite thickness of $\mathcal{C}$. Therefore, we obtain this implication immediately from Theorem 9. □

Corollary 13. $2^U$ is upper-best approximately identifiable in the limit from complete data by an indexed family $\mathcal{C}$ iff $2^U$ is upper-best approximately identifiable in the limit from positive data by $\mathcal{C}$.

Proof. Note that upper-best approximate identifiability requires the u.b.a.p. of $\mathcal{C}$. □

Thus, it is remarkable that upper-best approximate identifiability from complete data collapses into upper-best approximate identifiability from positive data. By duality, we have the following.

Theorem 14. Let $\mathcal{C}$ be an indexed family with l.b.a.p. Then, the following are equivalent.
(1) $2^U$ is lower-best approximately identifiable in the limit from negative data by $\mathcal{C}$.
(2) $2^U$ is lower-best approximately identifiable in the limit from complete data by $\mathcal{C}$.
(3) $Cmp(\mathcal{C})$ has finite elasticity.
(4) $\mathcal{C}$ has no infinite descending sequences. □

Further, by Theorem 12, Theorem 14 and Lemma 7, we have the following.

Corollary 15. If $2^U$ is upper-best (lower-best) approximately identifiable in the limit from complete data by an indexed family $\mathcal{C}$, then $\mathcal{C} \cup \{\emptyset\}$ ($\mathcal{C} \cup \{U\}$) has l.a.p. (u.a.p.). □

Example 2. Consider an alphabet $\Sigma = \{a, b\}$, and an indexed family $\mathcal{C}_1$ consisting of languages $L_0 = \emptyset$, $L_1 = \Sigma^*$, $L_2 = \{a^j \mid j \geq 1\} \cup \{b\}$ and $L_i = \{a^j \mid 1 \leq j \leq i\}$ ($i \geq 3$). Note that $\mathcal{C}_1$ is closed under infinite intersections and infinite unions, thus $\mathcal{C}_1$ has u.b.a.p. and l.b.a.p. by Lemma 5 and Lemma 6. It holds that $\mathcal{C}_1$ has an infinite ascending sequence. However, each language $L_i \in \mathcal{C}_1$ has a finite tell-tale $T_i$, where $T_0 = \emptyset$, $T_1 = \{bb\}$, $T_2 = \{b\}$ and $T_i = L_i$ ($i \geq 3$). Therefore, by Theorem 11 and Theorem 12, we have that $\mathcal{C}_1$ is identifiable in the limit from positive data, but $2^U$ is not upper-best approximately identifiable in the limit from complete data by $\mathcal{C}_1$. On the other hand, by Theorem 14, $2^U$ is lower-best approximately identifiable in the limit from negative data by $\mathcal{C}_1$, since $\mathcal{C}_1$ has no infinite descending sequences. □

Theorem 16. Let $\mathcal{C}$ be an indexed family which has M-finite thickness and finite elasticity. Then, $2^U$ is upper-best approximately identifiable in the limit from positive data by $Intsct(\mathcal{C})$.

Proof. First we will show the closure property of $Intsct(\mathcal{C})$ under infinite intersections. It suffices to show that for any infinite subclass $\mathcal{F}$ of $\mathcal{C}$ with $\bigcap \mathcal{F} \neq \emptyset$, $\bigcap \mathcal{F}$ can actually be constructed by a finite intersection of concepts in $\mathcal{C}$. Let $\mathcal{F} = \{L_1, L_2, \ldots\}$, $L = \bigcap \mathcal{F}$, and $w_1, w_2, \ldots$ be some enumeration of the elements of $L$. By Lemma 8 (a), for each $L_i \in \mathcal{F}$, there exists a concept $L'_i \in \mathcal{C}$ such that $L \subseteq L'_i \subseteq L_i$ and $L'_i$ is a $\mathcal{C}$-upper approximation of $L$. Let $\mathcal{F}' = \{L'_i \mid L_i \in \mathcal{F}\}$; then we have $L = \bigcap \mathcal{F}'$. Furthermore, by Lemma 8 (b), for each $L'_i \in \mathcal{F}'$, there exists a finite subset $F_i$ of $L$ such that $L'_i$ is a $\mathcal{C}$-upper approximation of $F_i$. Thus, we can define a function $h : N \to N$ by $h(k) = \min\{j \mid L'_k \text{ is a } \mathcal{C}\text{-upper approximation of } \{w_1, w_2, \ldots, w_j\}\}$.

In case $h(k)$ is bounded by some positive integer $n$, $L$ can be represented by some finite intersection, since $\mathcal{C}$ has M-finite thickness. Consider the case that $h(k)$ is not bounded by any positive integer. Then, there exists an infinite sequence $k_1, k_2, \ldots$ with $k_1 < k_2 < \cdots$ and $h(k_1) < h(k_2) < \cdots$. By the definition of $h(k)$, for each $L'_k$, there exists a concept $L''_k \in \mathcal{C}$ such that $\{w_1, \ldots, w_{h(k)-1}\} \subseteq L''_k$ and $L''_k \subset L'_k$. We have $w_{h(k)} \notin L''_k$, since $L'_k$ is a $\mathcal{C}$-upper approximation of $\{w_1, \ldots, w_{h(k)}\}$. Then, the infinite sequences $w_{h(k_1)}, w_{h(k_2)}, \ldots$ and $L''_{k_1}, L''_{k_2}, \ldots$ satisfy the condition of infinite elasticity, which is a contradiction. Therefore, $Intsct(\mathcal{C})$ is closed under infinite intersections.

We now complete the proof of the theorem. By Lemma 4, $Intsct(\mathcal{C})$ has finite elasticity. Hence, by Lemma 5, Theorem 12 and the closure property of $Intsct(\mathcal{C})$ under infinite intersections, the claim holds. □

Corollary 17. Let $\mathcal{C}$ be an indexed family which has finite thickness. Then, $2^U$ is upper-best approximately identifiable in the limit from positive data by $Intsct(\mathcal{C})$.

Proof. Note that the finite thickness of $\mathcal{C}$ implies the finite elasticity and the M-finite thickness of $\mathcal{C}$. □

Example 3. The class $PAT$ of pattern languages has finite thickness ([Ang80]).

Therefore, $2^U$ is upper-best approximately identifiable in the limit from positive data by $Intsct(PAT)$. In order to give another interesting example, let us consider the group of integers under $+$ (addition) and the class $Plus$ consisting of all its subgroups and $\emptyset$. By a well known result in group theory, $Plus$ is closed under infinite intersections. Further, it holds that $Plus$ has finite thickness. Therefore, $2^U$ is upper-best approximately identifiable in the limit from positive data by $Plus$. □

We show that there exist rich classes by which $2^U$ is upper-best approximately identifiable in the limit from positive data. Elementary formal systems (EFS's, for short), originally introduced by Smullyan [Smu61], are a kind of logic system where string patterns are used instead of terms in first order logic. For more detailed definitions and theoretical results on learning EFS's, refer to [ASY92], [Shi94]. By $LB$, we denote the class of all length bounded EFS's. By $LB_n$, we denote the class of length bounded EFS's with at most $n$ axioms. We denote, by $L(LB)$ ($L(LB_n)$), the class of all languages defined by EFS's in $LB$ ($LB_n$) with a fixed unary predicate symbol. It is known that $L(LB)$ coincides with the class of context sensitive languages.
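For the class $Plus$ of Example 3, the upper-best approximation of a non-empty set $X$ of integers is the subgroup generated by $X$, namely $g\mathbb{Z}$ with $g = \gcd(X)$. A learner can therefore simply track the gcd of the data seen so far. The sketch below is an illustration under assumptions introduced here: the function name `plus_learner`, the encoding of a conjecture $g\mathbb{Z}$ by its generator $g$, and `None` for the empty concept are all hypothetical, and a finite prefix stands in for an infinite presentation.

```python
from math import gcd


def plus_learner(presentation):
    """Yield, after each datum, a generator g of the conjectured subgroup gZ.

    None stands for the empty concept; '#' is the symbol used in a
    positive presentation of the empty concept.
    """
    g = None
    for w in presentation:
        if w != "#":
            # math.gcd ignores signs and satisfies gcd(0, w) == abs(w).
            g = abs(w) if g is None else gcd(g, w)
        yield g


# A finite prefix of a presentation of the set {12, 18, -30, 6}.
conjectures = list(plus_learner([12, 18, -30, 6, 12]))
# conjectures == [12, 6, 6, 6, 6]
```

Each change of conjecture replaces a positive generator by a proper divisor, so the learner changes its mind only finitely often on any presentation, which is the behaviour that convergence in the limit requires.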

Theorem 18. [Shi94] For any $n \geq 1$, $L(LB_n)$ has finite elasticity. □

Lemma 19. [SM93] For any $n \geq 1$, $L(LB_n)$ has M-finite thickness. □

Theorem 20. For any $n \geq 1$, $2^U$ is upper-best approximately identifiable in the limit from positive data by $Intsct(L(LB_n))$. □

4 Topological Characterizations for Approximate Identifiability

In this section, we focus on the concept classes which have u.b.a.p. and l.b.a.p. We give some topological characterizations of upper-best approximate identifiability from positive data for such classes. We then present a characterization theorem for concept classes by which $2^U$ is both upper-best and lower-best approximately identifiable in the limit from complete data. We begin with the next fundamental lemma.

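Lemma 21. For any concept class $\mathcal{C}$, the following are equivalent.
(a) $\mathcal{C}$ has u.b.a.p. and l.b.a.p.
(b) $\mathcal{C}$ is an upper-closed class for some quasi-order on $U$.

Proof.

(a)$\Rightarrow$(b): It suffices to show $\mathcal{C} = Cls(\leq_{\mathcal{C}})$.

Consider any concept $L \in \mathcal{C}$ and any elements $w_1$ and $w_2$ in $U$ such that $w_1 \leq_{\mathcal{C}} w_2$ and $w_1 \in L$. By the definition of $\leq_{\mathcal{C}}$, we have $w_2 \in L$. Therefore, $L$ is an upper set with respect to $\leq_{\mathcal{C}}$. Hence, $L \in Cls(\leq_{\mathcal{C}})$, which implies $\mathcal{C} \subseteq Cls(\leq_{\mathcal{C}})$.

It only remains to show $Cls(\leq_{\mathcal{C}}) \subseteq \mathcal{C}$. Consider any concept $L \in Cls(\leq_{\mathcal{C}})$ and a concept subclass $\mathcal{G}_1 = \{A \in \mathcal{C} \mid A \subseteq L\}$. From the assumption (a) and Lemma 6, $\bigcup \mathcal{G}_1 \in \mathcal{C}$ holds. Therefore, it suffices to show $L = \bigcup \mathcal{G}_1$. By definition, we have $\bigcup \mathcal{G}_1 \subseteq L$. We will prove $L \subseteq \bigcup \mathcal{G}_1$. Consider any element $w \in L$ and a concept subclass $\mathcal{G}_2 = \{A \in \mathcal{C} \mid w \in A\}$ ($\neq \emptyset$, since $U = \bigcap \emptyset \in \mathcal{C}$ holds by the assumption (a) and Lemma 5). From the assumption (a) and Lemma 5, we have $\bigcap \mathcal{G}_2 \in \mathcal{C}$. By the definition of $\mathcal{G}_2$, it is easy to see that for any $w'$ in $\bigcap \mathcal{G}_2$, $w \leq_{\mathcal{C}} w'$ holds. Therefore, we have $\bigcap \mathcal{G}_2 \subseteq L$, since $L$ is an upper set with respect to $\leq_{\mathcal{C}}$ and contains $w$. Hence, $\bigcap \mathcal{G}_2 \in \mathcal{G}_1$ holds, which implies $w \in \bigcup \mathcal{G}_1$. This completes the proof of this implication.

(b)$\Rightarrow$(a): We only show here that $\mathcal{C}$ has u.b.a.p. The dual argument can be applied to prove the existence of lower-best approximations. Let $\leq$ be the quasi-order under consideration. By Lemma 5, it suffices to show that $\mathcal{C}$ is closed under infinite intersections. It is clear that $\mathcal{C}$ contains $U$ and $\emptyset$ as elements. Let $\mathcal{F}$ be any non-empty subclass of $\mathcal{C}$ such that $\bigcap \mathcal{F} \neq \emptyset$. We can show $\bigcap \mathcal{F} \in \mathcal{C}$ as follows.

Let $x$ be an element in $\bigcap \mathcal{F}$ and $y$ be an element in $U$ such that $x \leq y$. By $x \in \bigcap \mathcal{F}$, we have $x \in A$ for each $A \in \mathcal{F}$. Therefore, it holds that $y \in A$ for each $A \in \mathcal{F}$, since each $A \in \mathcal{C}$ is an upper set with respect to $\leq$. Hence, we have $y \in \bigcap \mathcal{F}$, which implies that $\bigcap \mathcal{F}$ is an upper set with respect to $\leq$. Therefore, $\bigcap \mathcal{F} \in \mathcal{C}$ holds from the assumption (b). This completes the proof. □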
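Lemma 21 can be sanity-checked on a small finite universe. The sketch below is an illustration under assumptions chosen here: the universe $\{1, \ldots, 6\}$ ordered by divisibility, and the helper names `upper_closed_class` and `induced_leq`, are not from the paper. It enumerates $Cls(\leq)$, confirms closure under unions and intersections (which in the finite case gives l.b.a.p. and u.b.a.p. by Lemmas 5 and 6), and confirms that the induced quasi-order gives back the same class.

```python
from itertools import combinations

universe = range(1, 7)


def leq(a, b):
    """Divisibility quasi-order: a <= b iff a divides b."""
    return b % a == 0


def powerset(xs):
    xs = list(xs)
    return (frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r))


def is_upper_set(L, order, universe):
    """L is an upper set iff x in L and x <= y imply y in L."""
    return all(y in L for x in L for y in universe if order(x, y))


def upper_closed_class(order, universe):
    """Cls(<=): the class of all upper sets of the given quasi-order."""
    return {L for L in powerset(universe) if is_upper_set(L, order, universe)}


def induced_leq(concepts, w1, w2):
    """w1 <=_C w2 iff every concept containing w1 also contains w2."""
    return all(w2 in L for L in concepts if w1 in L)


cls = upper_closed_class(leq, universe)

# Closure under pairwise unions and intersections (sufficient when finite).
closed = all(A | B in cls and A & B in cls for A in cls for B in cls)

# The quasi-order induced by Cls(<=) has exactly the same upper sets.
roundtrip = upper_closed_class(lambda a, b: induced_leq(cls, a, b), universe) == cls
```

Both `closed` and `roundtrip` evaluate to `True` here; the second check corresponds to the identity $\mathcal{C} = Cls(\leq_{\mathcal{C}})$ used in the direction (a)$\Rightarrow$(b) of the proof.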

Lemma 22. Let $\mathcal{C}$ be a concept class with u.b.a.p. and l.b.a.p., and consider topological spaces $T_1 = (U, \mathcal{C})$ and $T_2 = (U, Cmp(\mathcal{C}))$. Then, the following hold.
(1) For the topological space $T_1$, $\forall X \subseteq U$, $It(X) = \underline{\mathcal{C}}X$.
(2) For the topological space $T_2$, $\forall X \subseteq U$, $Cs(X) = \overline{\mathcal{C}}X$.

Proof. Immediate from the definitions of the closure and interior operations. □

Theorem 23. Let $\mathcal{C}$ be an indexed family with u.b.a.p. and l.b.a.p., and consider a topological space $T_1 = (U, \mathcal{C})$. Then, the following are equivalent.
(1) $2^U$ is upper-best approximately identifiable in the limit from complete data by $\mathcal{C}$.
(2) $2^U$ is upper-best approximately identifiable in the limit from positive data by $\mathcal{C}$.
(3) $\mathcal{C}$ is identifiable in the limit from positive data.
(4) Each concept in $\mathcal{C}$ has a finite tell-tale.
(5) $\mathcal{C}$ has finite elasticity.
(6) $\leq_{\mathcal{C}}$ is a well quasi-order.
(7) Every open set in $T_1$ is compact.

Proof.

(1)$\Leftrightarrow$(2): From Theorem 12.
(2)$\Rightarrow$(3): From the definition of upper-best approximate identifiability.
(3)$\Rightarrow$(4): From Theorem 2.
(4)$\Rightarrow$(6): We first prove that there exists no infinite sequence $w_1, w_2, w_3, \ldots$ such that for any $i \geq 1$, $w_{i+1}$
