The Parable of Zoltan
James C. Bezdek
I learned about fuzzy sets in 1969 when I was a graduate student in Applied Mathematics at Cornell University. Subsequently, I based my PhD thesis on Fuzzy Clustering. The notion of fuzzy sets was not only novel then, but controversial. And its basic premise – that there is a type of imprecision which cannot be adequately accounted for with probability – continues to bother many engineers and scientists today. This note is about the turbulence that is still created by this division of beliefs about mathematical models of uncertainty.
The growth of the theory and applications of fuzzy sets in the 1970s-1980s created a demand at conferences for tutorials about fuzzy models, and I sometimes gave such lectures on the use of fuzzy sets in pattern recognition. A common question then, one that persists to the present day, was "can you give us an example that shows a real difference between fuzzy and probabilistic uncertainty?". My response to that question led me to propose an example called the "potable drinks" example. I often used this example in the late 1980s, and finally published it, first in Bezdek and Pal [1, 1992], and then again in my introduction to fuzzy models that served as a preamble to the inaugural issue of the IEEE Transactions on Fuzzy Systems [2, 1993]. Here is the example:
The Potable Drinks Example (circa 1985; cf. [1, 2])
One of the first questions asked about this scheme [fuzzy sets], and the one that is still asked most often, concerns the relationship of fuzziness to probability. Are fuzzy sets just a clever disguise for statistical models? Well, in a word, NO. Perhaps an example will help. Let the set of all liquids be the universe of objects, and let fuzzy subset L = {all potable (= "suitable for drinking") liquids}. Suppose you had been in the desert for a week without drink and you came upon two glasses labeled A and B as in the left half of Figure 1 (memb = "membership", and prob = "probability").
Figure 1. Glasses for the weary traveler - disguised and unmasked!
Confronted with this pair of glasses, and assuming that you will drink from the one that you choose, which one would you choose to drink from? Most readers familiar with the basic ideas of fuzzy sets, when presented with this experiment, immediately see that while A could contain, say, swamp water, it would not (discounting the possibility of a Machiavellian fuzzy modeler) contain liquids such as leaded gasoline. That is, they would know that a membership of 0.91 in L means that the contents of A are "fairly similar" to perfectly potable liquids (e.g., pure water). On the other hand, the probability that B is potable = 0.91 means that over a long run of experiments, the contents of B are expected to be potable in about 91% of the trials. And the other 9%? In these cases the contents will be unsavory (indeed, possibly deadly) - about 1 chance in 10. Thus, most observers will opt for a chance to drink swamp water, and will choose A.
THE PARABLE OF ZOLTAN: BEZDEK : FEBRUARY 11, 2012 : PAGE 1 OF 6
Another facet of this example concerns the idea of observation. Continuing then, suppose that we examine the contents of A and B, and discover them to be as shown in the right half of Figure 1 - that is, A contains beer, while B contains hydrochloric acid. After observation then, the membership value for A will be unchanged (well, this being beer, you might upgrade the membership value to 0.98 or so), whilst the probability value for B clearly drops from 0.91 to 0.0. Finally, what would be the effect of changing the numerical information in this example? Suppose that the membership and probability values were both 0.50 - would this influence your choice? Almost certainly it would. In this case many observers would switch to a swig of the liquid in B, since it offers a 50% chance of being drinkable, whereas a membership value this low would presumably indicate a liquid unsuitable for drinking (this depends, of course, entirely on the membership function of the fuzzy set L). In summary, my example shows that these two types of models possess philosophically different kinds of information; fuzzy memberships, which represent similarities of objects to imprecisely defined properties; and probabilities, which convey information about relative frequencies. Moreover, interpretations about and decisions based on these values also depend on the actual numerical magnitudes assigned to particular objects and events. See [3] for an amusing contrary view with lots of respondents and arguments.
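A minimal Python sketch of the two glasses may make the contrast concrete. Everything except the 0.91 values and the beer and hydrochloric acid contents is an assumption made for illustration: the membership table, the function names, the crisp set of potable liquids, and the choice of pure water as the potable outcome of a random draw are hypothetical and not part of the original example.

```python
import random

# Assumed memberships in L = {potable liquids}. Only the 0.91 for glass A
# comes from the example; the other values are illustrative guesses.
MEMBERSHIP = {
    "pure water": 1.00,
    "beer": 0.91,
    "hydrochloric acid": 0.00,
}


def fill_glass_b(rng, p_potable=0.91):
    """Glass B: a random draw whose outcome is crisply potable or not."""
    return "pure water" if rng.random() < p_potable else "hydrochloric acid"


def labels_after_observation(liquid, crisply_potable):
    """Return (membership, probability of potability) once the contents are known.

    The membership is a property of the liquid and does not move; the
    probability collapses to 1.0 or 0.0 because the event has been observed.
    """
    probability = 1.0 if liquid in crisply_potable else 0.0
    return MEMBERSHIP[liquid], probability


def potable_fraction(trials=10_000, seed=0):
    """Long-run share of trials in which glass B turns out to be potable."""
    rng = random.Random(seed)
    hits = sum(fill_glass_b(rng) == "pure water" for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    potable = {"pure water", "beer"}  # assumed crisp boundary
    print(labels_after_observation("beer", potable))
    print(labels_after_observation("hydrochloric acid", potable))
    print(potable_fraction())
```

In this sketch the label on glass A describes the liquid itself, so observing beer leaves it at 0.91, while the label on glass B describes a long-run frequency and is replaced by 0 or 1 as soon as the contents are seen.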
Response from the probabilistic community to the potable drinks example was immediate – and predictable. Woodall and Davis [4] sent me a letter of comments on the example that was published in IEEE TFS 2(1). Here is their general summary from that letter: "We have found that many of those advocating the use of fuzzy logic have justified their methods by offering very limited views of probability. In our opinion, probability can be used to represent the information claimed to be provided only by memberships."
I published my response to them in a note titled "The Thirsty Traveler visits Gamont: A Rejoinder to 'Comments on "Fuzzy Models - What are They and Why?"'" [5]. As we approach the 50th anniversary of the first paper on fuzzy sets [6], I think it appropriate to revisit Woodall and Davis' comments and my response to them (retitled, and slightly updated here and there to account for events that have happened during the ensuing 20 years).
The Gamont Chronicles
Does the Woodall and Davis letter offer anything new? Have they finally seen through us? I don't think so. Quoting their letter: "In our opinion, probability can be used to represent the information claimed to be provided only by memberships". Woodall and Davis suggest altering my example so that a probability model behaves more like my fuzzy model of the liquids in the glasses. This sounds like - "OK, maybe fuzzy uncertainty exists, but I can still handle it at least as well, if not better, with a probability model". Let me recount for you the Gamont Chronicles, a short adventure that illustrates what I think their construction really means.
Figure 2 is a map of the region known as Gamont (see footnote 1). In the West, Data City is a bustling place; its members represent various populations that are distributed across the metropolitan area in many different ways. Like most big cities, it has bad neighborhoods; Imprecise Alley is one. You have been in neighborhoods like it - nothing there is ever really certain.
At the Eastern end of Gamont is a municipality known as Some Solution (it may be near Truth or Consequences, NM, but I'm not sure of this). Data City is connected to Some Solution by a modern superhighway named Statistics Parkway that passes through Chancetown. There is another way to get to Some Solution from Data City on a very rugged dirt path called Roughly Right Road, which passes through the mountain village of Vagueville. Finally, there's a hamlet (really, just a piglet in this tale) in Gamont - we'll get to it later.
Footnote 1. You can find some pretty interesting definitions for the word Gamont on the internet. I used to play the board game "DUNE" with my kids in the 1970s, and there were 3 or 4 "worthless cards" that could be drawn. One of them was titled "Trip to Gamont". Another was the "Jubba Cloak". And so on.
Figure 2. A Trip through Gamont
Data City was established centuries ago, and its residents began traveling to Some Solution along the route that is now Statistics Parkway when it was still unpaved - that is, a path without much real foundation. You know progress. Both the parkway and the means for using it improved steadily, and an enterprising businessman named Mr. Probability opened a (Mercedes) dealership there about 1620. Many residents of Data City were delighted, for their families just fit into Probability's current models. As new models came out, inhabitants enjoyed decking one out with all the latest parameters, loading it up with little data sets, and zipping over to their favorite neighborhood in Some Solution, a down-to-earth place known as Useful Fit. It was a long trip, so they often stopped at the Inferential Principle Cafe in Chancetown. There were lots of menu choices, such as method-of-moments meatloaf, least-squares soup, maximum likelihood pie, entropy eggplant (Parmesan) and Bayesian pudding. Each family had its favorites, and sometimes they bickered a little about the confidence they should place on a particular selection, but it never caused real problems, because most choices led to pretty much the same results.
But alas, some residents of Data City just could not fit their families into any model offered by the dealership. These were, in the main, that wretched clan that lived in Imprecise Alley. Mr. Probability felt that the Imprecise Aliens (you might have expected them to call themselves Alleyans, but they had very little formal training) could and should MAKE their families fit into one of his many models. And, since Mr. Probability had the only dealership in Data City, sometimes Imprecise Aliens did just that. But they were uncomfortable, and they spent most of their time trying to interpret the rules for traveling on Statistics Parkway, which by that time had become very complicated indeed.
Since they only rarely were able to use Mr. Probability's models, Imprecise Aliens hardly ever got out of Data City. Sometimes they packed up picnic lunches, and hiked to Some Solution by taking the mountain shortcut through Vagueville (it was quite a bit more direct, and fun too, but of course Roughly Right Road was no place for a Mercedes). These were a merry and hardy people, thick of skin and bright of eye, but they were the object of much
scorn by most of the folks in Data City, who felt that anyone who really needed to get to Some Solution could always use Statistics Parkway (even if they had to hitchhike).
Then, an incredible thing happened in 1965. A stranger - a man from another time and place I guess, because he had an odd name like Zoltan, or something like that - moved to Data City and opened up a second dealership. Zoltan's ideas about travel were pretty radical. He sold Land Rovers. The Imprecise Aliens quickly learned that these new vehicles could easily bounce along Roughly Right Road, pass through Vagueville (taking in sights never seen by those who traveled only by Mercedes), and arrive at Some Solution with plenty of time to find Useful Fit. Families with unusually imprecise children - you know, the kind that never fitted into Probability's models at all - seemed especially comfortable in this new vehicle. Like all new models, those sold by Zoltan had some design flaws and manufacturing glitches, but after these were worked out, they became very reliable indeed.
So, Land Rovers quickly multiplied, and this had some consequences. Mr. Probability lost a little business. Not because the Mercedes was outdated; rather, it seemed more natural to sometimes take the direct route. Moreover, opening the new route to Some Solution led to the discovery of Fuzzy Controlton, a tiny hamlet tucked away deep in the Ambiguous Mountains. The residents of Fuzzy Controlton were considered inverted - their lives swung back and forth like pendulums. But they seemed to know a lot about Useful Fit, and even though they had no formal training in Cartography, they helped the Imprecise Aliens prepare a very detailed map of it. Now the Imprecise Aliens knew that the Land Rover would run on Statistics Parkway, but they also knew it was silly to drive to Some Solution in one via the Parkway, since the Mercedes was much better suited to this task.
So, they saved part of their meager incomes, and most of them finally owned both a Mercedes and a Land Rover; each was used for the trips it was most well suited to. This maximized the resale value of both vehicles, and the Imprecise Aliens were a pretty happy lot.
Mr. Probability's dealership was not in trouble; his models worked, and worked well indeed, for many families in Data City - especially for the normally distributed ones who lived in the central limit district. But he was worried. As he saw it, there were only two choices. First, he could encourage other Data Citizens to have a Land Rover and a Mercedes. This made sense, for then every family would have the correct model for every trip. But he felt that not every resident wanted both, or even needed both. It would be better, he thought, to show the Data Citizenry that he could modify any Mercedes so that it, too, could make the trip from Data City to Some Solution via Vagueville. He even did it, converting a 300SL so that it had an extra gas tank, four wheel drive, knobby RV tires, KC lights and the like - it was quite a sight! Prospective buyers tried it out sometimes. Oh, it made the trip alright, but it wasn't nearly as easy or comfortable to get to Useful Fit this way, and the resale value? You decide.
There was a third choice: opening an entirely new route - one that incorporated all the good features of both routes and both vehicles. This really seemed like the best choice of all, and Zoltan happily agreed to do whatever he could to expedite it. But Mr. Probability longed for the old days - the days when the only way to Some Solution was through Chancetown. So his engineers didn't spend much time on this intriguing and eminently sensible idea.
As we leave Gamont, Zoltan and the Imprecise Aliens were last seen headed for Fuzzy Controlton - they said something about a parade. And the older, more respected dealer - really, the dealer with a much more solid foundation?
Well, he was arguing with his sales force about new ways to make the Imprecise Aliens fit into his latest models. Maybe he always will - I don't know.
The use of mathematical models at cross purposes.
Woodall and Davis state that "the fuzzy modeler has had the opportunity to sample the liquid in glass A, while the person evaluating glass B knows only that its contents were randomly selected from some population of liquids, 91% of which are potable". Then they say that, in their "alternative use of probability, the contents of glass B are examined in the same manner as those in glass A". The implication of this is that the evaluator (the one choosing the glass) has somehow seen A, but not B. Not so. The evaluator has seen only the labels of A and B, and has not (yet) sampled their contents. If you want to know how these labels might arise, consider two bottling plants. One, run by the Imprecise Aliens, assigns a membership to every liquid they intend to bottle, based on tests, and the memberships are "tuned" until a membership function satisfying the objectives of the modelers is found. Then, whenever a run of any liquid is
bottled, every bottle gets a label showing the membership value for the liquid being processed. The other plant has the same information about the liquids. However, in order to assign probabilities, it will be necessary for the second plant to decide somehow the exact boundary between the liquids that are potable, and those that are not. Why? Because probabilities refer to crisp events having hard boundaries in the sample space. Woodall and Davis may have missed this point: in probability, every liquid either is or is not potable, but in the fuzzy model, this determination is not needed, and is never made. When this second plant bottles, they let the machine randomly select, with known prior probabilities of selection (not of being potable - that is already done, and has no further bearing on the label a particular bottle receives), liquids from different storage tanks. Labels for these bottles are assigned accordingly (the manager of the plant, a Mr. Probability, was heard to say - "they can take their chances"). When I give this example, some people think I dislike probability, or don't understand it, or I am trying to discredit, or belittle it, or replace it. These people are wrong - they have missed the point. My example merely illustrates that modelers of processes may have choices. If there is no choice - fine; use what you have. But if there is a choice, pick the model that solves your problem best. The Gamont Chronicles advertise this point, and this point alone: drive nails with hammers, and screws with screwdrivers. I can drive screws with a hammer, and, with a little more effort, nails with a screwdriver. But is this the best use of these tools? Woodall and Davis suggest a new scheme which enables probability to be used "in a way which is much more closely analogous to this use of membership", and that "this use of probability makes the argument for the usefulness of memberships much less convincing". 
Apparently they agree that potability is an imprecisely defined, non-random property. I fully agree with their conclusion, viz., that "probability can be used to represent the information claimed to be provided only by memberships". To me, this amounts to using probability as a means for getting good estimates of membership functions for non-random processes. I guess this is a step in the right direction.
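The labelling procedures of the two bottling plants described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tank contents, their memberships, the selection priors, and the 0.5 cutoff that the second plant uses to draw its crisp boundary are all hypothetical, as are the function names.

```python
# Shared (assumed) information about the liquids held in the storage tanks.
MEMBERSHIP = {
    "spring water": 1.00,
    "beer": 0.91,
    "swamp water": 0.40,
    "acid": 0.00,
}


def fuzzy_plant_label(liquid):
    """First plant: every bottle carries the membership of the liquid in it."""
    return MEMBERSHIP[liquid]


def probability_plant_label(selection_priors, cutoff=0.5):
    """Second plant: one label for every bottle in a randomly filled run.

    A hard boundary must be chosen first (here: potable iff membership >= cutoff,
    an assumed rule). The label is then the prior probability that the machine
    selects a tank on the potable side of that boundary.
    """
    return sum(
        prior
        for liquid, prior in selection_priors.items()
        if MEMBERSHIP[liquid] >= cutoff
    )


if __name__ == "__main__":
    # Assumed prior probabilities with which the machine selects each tank.
    priors = {"spring water": 0.50, "beer": 0.41, "swamp water": 0.05, "acid": 0.04}
    print(fuzzy_plant_label("beer"))
    print(round(probability_plant_label(priors), 2))
    print(round(probability_plant_label(priors, cutoff=0.3), 2))
```

With these made-up numbers both plants can print the same figure on a label, yet the second plant's figure moves when the cutoff moves, whereas the first plant never has to choose a cutoff at all.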
Now it's 2012 instead of 1992. Has anything changed about this debate in the past two decades? Not really. There is still a large, loud and staunchly resistant part of our scientific community that derides the notion that fuzzy models can ever be useful. Someone asked Jim Keller recently – "aren't fuzzy sets just a cult of personality?" What would convince these folks? Nothing, I suspect. I could ask them to google up any of a hundred topics such as "fuzzy sets" (about 1,790,000 results, October 15, 2011), "fuzzy clustering" (1,620,000 results), "fuzzy control" (5,400,000 results – Fuzzy Controlton is very well populated now!), or "fuzzy patents" (3,470,000 results). Discounting the multiple hits, the false hits, and so on, this represents a substantial and reliable literature – one that guarantees us that fuzzy models are not going away anytime soon.
But, if you are determined to use probability, Glenn Shafer [7] admonishes you to remember that "The interpretation of belief functions is controversial because the interpretation of probability is controversial."
And F. R. Moulton [8] still offers the best advice for all of us: "...every set of phenomena can be interpreted consistently in various ways, in fact, in infinitely many ways. It is our privilege to choose among the possible interpretations the ones that appear to us most satisfactory, whatever may be the reasons for our choice. If scientists would remember that various equally consistent interpretations of every set of observational data can be made, they would be much less dogmatic than they often are, and their beliefs in a possible ultimate finality of scientific theories would vanish."
Zen maxim: "Great Doubt: great awakening. Little Doubt: little awakening. No Doubt: no awakening."
References
[1] Fuzzy Models for Pattern Recognition, eds. J. C. Bezdek and S. K. Pal, IEEE Press, Piscataway, NJ, 1992.
[2] J. C. Bezdek, Fuzzy Models - What are They and Why?, IEEE Trans. Fuzzy Systems, 1(1), 1-6, 1993.
[3] P. Cheeseman, An Inquiry into Computer Understanding, Comp. Intell., 4, 57-142, 1988 (with 22 commentaries/replies).
[4] W. H. Woodall and R. E. Davis, Comments on "Editorial: Fuzzy Models - what are they and why?", IEEE Trans. Fuzzy Systems, 2(1), 43, 1994.
[5] J. C. Bezdek, The thirsty traveller visits Gamont: A rejoinder to "Comments on 'Editorial: Fuzzy Models - what are they and why?'", IEEE Trans. Fuzzy Systems, 2(1), 43, 1994.
[6] L. A. Zadeh, Fuzzy Sets, Information and Control, 8, 338-352, 1965.
[7] G. Shafer, Rejoinder to comments on "Perspectives on the theory and practice of belief functions", International Journal of Approximate Reasoning, 6, 445-480, 1992.
[8] F. R. Moulton, The Velocity of Light, Scientific Monthly, 48, 481-484, 1939.