Acta Psychologica 138 (2011) 337–346


Bayesian confusions surrounding simplicity and likelihood in perceptual organization

Peter A. van der Helm
Radboud University Nijmegen, Donders Institute for Brain, Cognition, and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
E-mail: [email protected]; URL: http://www.socsci.ru.nl/~peterh

Article history: Received 23 July 2011; Received in revised form 9 September 2011; Accepted 12 September 2011; Available online 5 October 2011.
PsycINFO classification: 2323 Visual Perception.
Keywords: Bayes' rule; Likelihood; Occam's razor; Perceptual organization; Simplicity; Veridicality.
doi:10.1016/j.actpsy.2011.09.007

Abstract

In the study of perceptual organization, the Occamian simplicity principle (which promotes efficiency) and the Helmholtzian likelihood principle (which promotes veridicality) have been claimed to be equivalent. Proposed models of these principles may well yield similar outcomes (especially in everyday situations), but as argued here, claims that the principles are equivalent confused subjective probabilities (which are used in Bayesian models of the Occamian simplicity principle) and objective probabilities (which are needed in Bayesian models of the Helmholtzian likelihood principle). Furthermore, Occamian counterparts of Bayesian priors and conditionals have led to another confusion, which seems to have been triggered by a dual role of regularity in perception. This confusion is discussed by contrasting complete and incomplete Occamian approaches to perceptual organization.

1. Introduction

Bayes' rule (Bayes, 1763/1958) is a powerful mathematical tool to model all kinds of things in terms of probabilities. In this article, I discuss two separate sources of confusion related to Bayes' rule. One is the distinction between subjective and objective probabilities, and the other is the distinction between priors (or unconditionals) and conditionals (or likelihoods). I show that they have led to conflated lines of reasoning, and I show what unconflated lines of reasoning look like. I discuss these issues in the context of research on perceptual organization, which is the process by which the visual system structures incoming proximal stimuli into interpretations in terms of wholes and parts, that is, into hypotheses about the organization of the distal scenes. The issue of subjective versus objective probabilities is introduced in Section 2 and discussed in Section 3, and the issue of priors versus conditionals is introduced in Section 4 and discussed in Section 5.

2. Subjective versus objective probabilities

Imagine one wants to model the outcome of randomly selecting a letter in a randomly selected English text.


To this end, one needs the objective (i.e., the actual, or the right) frequencies of occurrence of letters in English texts. For instance, in English, the most frequently occurring letter is E so that, objectively, E has the highest probability of being selected. Such objective probabilities also underlie the Morse Code and Shannon's (1948) classical information theory, for instance.

Notice that these objective probabilities may not be suited to model the outcome of an experiment in which participants are asked to guess which letter is most likely to be selected. Participants invoke their own, subjective, ideas about frequencies of occurrence of letters and these may well disagree with the objective frequencies of occurrence. In other words, they use subjective probabilities, that is, probabilities which reflect a person's beliefs regarding the occurrence of things — irrespective of whether these beliefs are veridical (i.e., truthful).

By the same token, in perception research, one might test people to assess the probabilities that they give certain interpretations for certain proximal stimuli. This way, one might model the outcome of the human perceptual organization process in terms of the probabilities people assign subjectively to interpretations. Notice that these subjective probabilities primarily reflect how likely humans are to give certain interpretations, that is, they do not necessarily reflect how likely these interpretations are to agree with the actual distal scenes. To assess the latter, one would also need the actual frequencies of occurrence of distal scenes in the world.

This distinction is crucial, for instance regarding amodal completion, that is, regarding the question of how the visual system deals with everyday scenes yielding proximal stimuli that may be interpreted as objects partly occluding themselves or others. After all, for such proximal stimuli, the visual system arrives at interpretations without knowing what the distal scenes actually comprise.


In many domains, including perception research, a problem is that the objective probabilities are unknown, if not unknowable. That is, despite suggestions (e.g., Brunswik, 1956), it seems impossible to establish objectively the frequencies of occurrence of distal scenes in the world. The point is that counting requires categorization and that any categorization of distal scenes is a subjective one (Hoffman, 1996). This fundamental problem may be exemplified by way of Bertrand's paradox (Bertrand, 1889). In Fig. 1, this paradox is illustrated for the question of what the probability is that a randomly picked outer-circle chord crosses the inner disk (see Fig. 1a). As illustrated in Fig. 1b,c, the chords can be categorized (or parameterized) in different ways — yielding different assessments of this probability. In this case, as well as in perceptual organization, one may have compelling arguments to choose a specific categorization, but the point is that it remains a subjective categorization which, therefore, yields subjective probabilities.

Hence, to be clear, by objective probabilities I mean probabilities reflecting the actual or right frequencies of occurrence of things in the world, and by subjective probabilities I mean any other choice of probabilities. For instance, however compelling they may be, not only probabilities based on intuition or on outcomes of perception experiments but also artificially designed probabilities (see next section) are subjective probabilities — simply because they do not necessarily agree with objective probabilities in the world.

Bayesian models, for instance, usually start from subjective probabilities. In some cases, this is simply because the very objective is to model subjective judgements, but in other cases, it is because the required objective probabilities are unknown. As said, in perceptual organization too, the objective probabilities are unknown. Yet, for the sake of the argument, let us assume that they can be established. I do not think this is possible, but as I discuss next, this assumption does underlie one of the principles that has been proposed to guide the perceptual organization process.
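To make Bertrand's paradox concrete, the following minimal sketch (in Python, with a made-up sample size) estimates the probability in Fig. 1 under the two categorizations described in its caption; the two parameterizations of "a random chord" yield the two different answers.

```python
import math
import random

R = 1.0          # outer-circle radius; the inner disk has radius R/2
N = 100_000

# Answer 1: chords orthogonal to a fixed diameter, parameterized by their
# signed distance d to the centre (d uniform in (-R, R)); the chord crosses
# the inner disk iff |d| < R/2.
hits = sum(abs(random.uniform(-R, R)) < R / 2 for _ in range(N))
print(f"orthogonal-to-diameter categorization: {hits / N:.3f}")  # ~0.50

# Answer 2: chords starting at a fixed point on the circle, parameterized by
# the angle of the second endpoint (uniform on the circle).
def crosses(theta):
    # distance from the centre to a chord subtending central angle theta
    return R * math.cos(abs(theta) / 2) < R / 2

hits = sum(crosses(random.uniform(-math.pi, math.pi)) for _ in range(N))
print(f"fixed-endpoint categorization: {hits / N:.3f}")  # ~0.33
```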

3. Perceptual organization

Perceptual organization is the process by which the visual system structures incoming proximal stimuli into interpretations in terms of wholes and parts. It is unclear exactly how it achieves this amazing feat, but a long-standing debate concerns the question of whether this process is guided by the Helmholtzian likelihood principle or by the Occamian simplicity principle (for an extensive review, see van der Helm, 2000).


Fig. 1. Bertrand's paradox for chords (straight lines between two points on a circle). (a) Question: if the radius of the inner disk is half the radius of the outer circle, then what is the probability that a randomly picked outer-circle chord crosses the inner disk? (b) Answer 1: if chords orthogonal to a specific outer-circle diameter are taken to form a category then, within every category, half the chords cross the inner disk, so, picking such a chord has a probability of 0.50. (c) Answer 2: if chords starting at a specific outer-circle point are taken to form a category then, within every category, one-third of the chords cross the inner disk, so, picking such a chord has a probability of 0.33. Hence, the probability depends on how the chords are categorized.

The Helmholtzian likelihood principle, on the one hand, holds that, for a proximal stimulus, the visual system chooses the interpretation most likely to be true (von Helmholtz, 1909/1962). Feldman (2009), for instance, characterized this principle as follows: “Choose the interpretation most likely to be true. The rationale behind this idea seems relatively self-evident, in that it is clearly desirable (say, from an evolutionary point of view) for an organism to achieve veridical percepts of the world.” (p. 875) Hence, models of this principle assume that the visual system has access to candidate interpretations as well as to their objective probabilities in the world.

The Occamian simplicity principle, on the other hand, holds that the visual system chooses the simplest interpretation, that is, the one that due to regularities can be defined by the least amount of information in terms of descriptive parameters. Hochberg and McAlister (1953) introduced this principle as follows (see also Attneave, 1954): “The less the amount of information needed to define a given organization as compared to the other alternatives, the more likely that the figure will be so perceived.” (p. 361) To specify this further, they defined information loads (or complexities) by: “The number of different items we must be given, in order to specify or reproduce a given pattern.” (p. 361) Hence, models of this principle need a formal coding language to describe and thereby categorize candidate interpretations, and a metric to quantify their complexities.

Notice that the Helmholtzian likelihood principle is about unconscious inference and holds that the visual system chooses the interpretation which objectively is most likely to be true, that is, not that it chooses the one which persons subjectively believe is most likely to be true. It is true that such subjective beliefs result from unconscious inference, but this is also what the Occamian simplicity principle implies, and the central question is which principle drives the unconscious inference leading to such subjective beliefs. In this respect, the Helmholtzian likelihood principle is appealing because it suggests that the visual system is highly veridical in terms of the external world, and the Occamian simplicity principle is appealing because it suggests that the visual system is highly efficient in terms of internal resources.

The debate between proponents of these two perceptual principles peaked in the 1980s (see, e.g., Boselie & Leeuwenberg's, 1986, reaction to Rock, 1983, and to Pomerantz & Kubovy, 1986; Sutherland's, 1988, reaction to Leeuwenberg & Boselie, 1988; Leeuwenberg, van der Helm, & van Lier's, 1994, reaction to Biederman, 1987). Later, Chater (1996) refueled the debate with an intriguing stance: he argued that the whole debate was misguided because, as he claimed, the two principles are formally equivalent. Though his proof of this claim has been refuted (van der Helm, 2000), the claim did find followers. Therefore, in the next subsections, I discuss it from a different and less technical angle. That is, I argue that it confused Bayesian approaches using subjective probabilities and Bayesian approaches using objective probabilities. To set the stage, I first give an overview of several issues and developments relevant to the simplicity versus likelihood debate in perception.

3.1. Simplicity versus likelihood

Most people will agree that some degree of veridicality is a prerequisite of the human perceptual organization process, simply because it has to guide us through the world.


This does not mean that it must be highly veridical (cf. Mark, Marion, & Hoffman, 2010), but if it were not fairly veridical, it would probably not have survived evolution. Furthermore, one cannot exclude that, over time, the human visual system has somehow adapted to the statistics of the world. However, it is unclear how this could be verified. That is, the Helmholtzian likelihood principle may be appealing because of its high degree of veridicality, but it is unclear how scientists might establish objective probabilities of objective categories of distal scenes. Notice that this cannot be done by relying on the (to be explained) outcomes of perception experiments, because that would lead to circular “we see what we see” arguments (Hoffman, 1996).

The Occamian simplicity principle promises an alternative which does not suffer from this problem. It is a modern version of Occam's razor (William of Occam, ±1290–1349) and a descendant of the Gestalt law of Prägnanz, which expresses the idea that the brain, like any physical system, tends to settle in stable states (in dynamic-systems theory, such states are called attractors). For perception, Koffka (1935) formulated this idea by: “Of several geometrically possible organizations that one will actually occur which possesses the best, the most stable shape.” (p. 138) As said, to model this by simplest interpretations in information-theoretic terms, Occamian simplicity approaches adopt descriptive coding languages and complexity metrics.

Because of its high degree of efficiency, the Occamian simplicity principle is appealing, but since it does not aim specifically at veridicality, a relevant question of course is whether it is sufficiently veridical to guide us through the world — I address this question in the next subsections (arguing that it is fairly veridical). Furthermore, initially, it raised the question of which coding scheme is to be used to establish categories and their complexities. Important in this respect has been that both theoretical findings in mathematics (Chaitin, 1969; Kolmogorov, 1965; Solomonoff, 1964a, b) and empirical findings in psychology (Simon, 1972) showed that simplicity is a fairly stable concept. That is, regarding complexity rankings, it may matter which coding scheme is used, but not much. To give a gist: if book A is thicker than book B in, say, English, then it will also be the thicker one in nearly every other language.

In mathematics, this finding formed the basis of the flourishing domain of algorithmic information theory (AIT), also known as the theory of Kolmogorov complexity or as the minimum description length theory (see Li & Vitányi, 1997). Notice that AIT relies on the abstract notion of Kolmogorov complexity, without proposing a concrete coding scheme. In psychology, however, Simon (1972) rightfully demanded: “If an index of complexity is to have significance for psychology, then the encoding scheme itself must have some kind of psychological basis.” (p. 371) At the time, perceptual coding languages lacked such a psychological basis, but later, such a basis was provided for the coding language used nowadays in structural information theory (SIT), which is a formal theory initiated by Leeuwenberg (1969, 1971). Here, the existence of this basis is more relevant than its specifics, but the following may give a gist of how it was established (details can be found in the given references).
van der Helm and Leeuwenberg (1991) first presented a formalization of visual regularity, revealing the unique “transparent holographic” nature of the regularities SIT's coding scheme exploits to obtain simplest interpretations. This formalization also led to a theoretically compelling and empirically successful complexity metric (van der Helm, 1994; van der Helm, van Lier, & Leeuwenberg, 1992). The holographic approach (van der Helm & Leeuwenberg, 1996, 1999, 2004) then provided evidence that this transparent holographic nature is indeed pertinent to the human detection and detectability of single and combined regularities, whether or not perturbed by noise (see also Csathó, van der Vloed, & van der Helm, 2003, 2004; Nucci & Wagemans, 2007; Treder & van der Helm, 2007; van der Helm, 2010, 2011; van der Helm & Treder, 2009; Wenderoth & Welsh, 1998).

Of course, like any theory, SIT is work in progress and still has limitations and open ends. It does not claim to capture the full richness of perception, but it does claim to capture basic principles in perceptual organization. Like any formal theory, SIT formulates these principles using symbols — not for the sake of using symbols, but to capture potentially relevant relationships between the things the symbols stand for. It also provides a computer model which applies these principles to patterned sequences of symbols (van der Helm, 2004). It would be a misunderstanding, however, to think that SIT assumes that the visual system converts visual stimuli into symbol strings. It is true that, in the SIT literature, considerable attention is paid to how symbol strings might represent visual stimuli, but this merely serves to indicate how, in empirical practice, the formal principles might be applied to visual stimuli in order to get testable quantitative predictions.

All in all, the foregoing shows that the starting points of the Helmholtzian likelihood principle and the Occamian simplicity principle differ fundamentally. Yet, proposed models of these principles often yield similar predictions, which raises the question of how different they are in practice. For instance, Feldman (1997, 2003, 2009) presented a simplicity approach, called minimal model theory, and referring to Chater's (1996) claim, he argued that this approach is highly veridical. To explore this question further, I next turn to Bayesian formulations of both principles.

3.2. Bayesian formulations

Bayes' rule is a powerful mathematical tool to model all kinds of things in terms of probabilities. It is given by:

p(H|D) = p(H) * p(D|H) / p(D)    (1)

In words, Bayes' rule holds that, for data D to be explained, the posterior probability p(H|D) of hypothesis H is proportional to the prior probability p(H) that H occurs, multiplied by the conditional probability p(D|H) that D occurs if H were true. The probability p(D) that D occurs is the normalization factor.

Of course, Bayes' rule is just a formula and it only gets meaning by assigning meaningful things to the symbols in it. In perceptual organization, it can be applied to determine the posterior probability p(H|D) of a candidate interpretation H of a proximal stimulus D. Such an interpretation, or scene model, comprises a hypothesized organization of the distal stimulus, that is, it comprises hypothesized distal objects which could fit the proximal stimulus. The prior p(H) then is the probability that interpretation H occurs independently of proximal stimulus D (it is therefore said to account for viewpoint-independent properties of H). Furthermore, the conditional p(D|H) then is the probability that proximal stimulus D occurs if interpretation H were true (it is therefore said to account for viewpoint-dependent properties of H). To be clear, this is the standard way of formulating things (see also Chater, 1996; Feldman, 2009; Gigerenzer & Murray, 1987). Hence, the prior p(H) indicates how likely H is in itself, and the conditional p(D|H) indicates how likely D is under H. Bayes' rule then yields the posterior p(H|D) which indicates how likely H is for given D.

In general, Bayesian approaches aim at establishing a posterior probability distribution over the hypotheses, but a specific goal is to select the most likely hypothesis, that is, the one with the highest posterior probability under the employed prior and conditional probabilities.


To formulate this specific goal, the normalization factor can be omitted, yielding:

Select the H that maximizes p(H|D) = p(H) * p(D|H).    (2)
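As a minimal illustration of Eq. (2), the following sketch (with made-up numbers, not probabilities from any actual model) selects the hypothesis with the highest product of prior and conditional:

```python
# Hypothetical priors and conditionals for three candidate interpretations
# H1..H3 of one proximal stimulus D; the numbers are made up.
p_H = {"H1": 0.5, "H2": 0.3, "H3": 0.2}          # priors p(H)
p_D_given_H = {"H1": 0.1, "H2": 0.6, "H3": 0.4}  # conditionals p(D|H)

# Eq. (2): the normalization factor p(D) is omitted because it is the same
# for every H and so does not affect which H wins.
best = max(p_H, key=lambda H: p_H[H] * p_D_given_H[H])
print(best)  # H2 (0.18, versus 0.05 for H1 and 0.08 for H3)
```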

So far so good, but for a modeler who wants to model something in this way, the key question now is: where do I get the priors and conditionals from? To this end, as is customary in Bayesian approaches, one might subjectively choose certain probabilities, whether or not backed up by compelling arguments. This might also be useful to model the outcome of the perceptual organization process (for fine examples, see Knill & Richards, 1996). Notice, however, that this is not guaranteed to yield compliance with either the Helmholtzian likelihood principle or the Occamian simplicity principle.

To achieve compliance with the Helmholtzian likelihood principle, on the one hand, a Bayesian modeler would need the objective prior and conditional probabilities for the candidate interpretations. Though it is unknown what these probabilities might be, let us suppose they are available. Referring to these objective “worldly” probabilities by pw, the Helmholtzian likelihood principle can be formulated in Bayesian terms by:

Select the H that maximizes pw(H|D) = pw(H) * pw(D|H).    (3)

This Bayesian formulation indicates that, according to the Helmholtzian likelihood principle, the objectively most likely interpretation is also the one that is most likely to result from the perceptual organization process.

To achieve compliance with the Occamian simplicity principle, on the other hand, a Bayesian modeler may start from complexities c as yielded by some descriptive coding scheme, and then artificially assign higher probabilities to interpretations with lower complexities (as said, regarding complexity rankings, it does not seem to matter much which coding scheme is used). In particular, apart from normalization, one might assign the artificial probabilities pa = 2^(−c) to interpretations with complexity c. These complexity-based subjective probabilities are called algorithmic probabilities in AIT (Li & Vitányi, 1997), and precisals in SIT (van der Helm, 2000). Their usefulness is explicated next.

The Occamian simplicity principle holds that the interpretation with the simplest descriptive code is selected, that is, the one with the lowest complexity c. Analogous to the Bayesian terminology, the prior complexity c(H) refers to the complexity of interpretation H independently of proximal stimulus D, and the conditional complexity c(D|H) refers to the complexity of proximal stimulus D starting from interpretation H. In other words, the prior c(H) indicates how good H is in itself, and the conditional c(D|H) indicates how well D fits H. The total, posterior, complexity c(H|D) of H then is given by the sum of the prior and conditional complexities. This sum indicates how well H fits D, so that the Occamian simplicity principle can be formulated by:

Select the H that minimizes c(H|D) = c(H) + c(D|H).    (4)

Now, notice that under the conversion pa = 2^(−c), this minimization formula is equivalent to the following maximization formula:

Select the H that maximizes pa(H|D) = pa(H) * pa(D|H).    (5)

Hence, whereas the minimization formula in Eq. (4) gives an information-theoretic formulation of the Occamian simplicity principle, the maximization formula in Eq. (5) gives a Bayesian formulation of this same Occamian simplicity principle. This Bayesian formulation indicates that, according to this principle, the simplest interpretation is the one that is most likely to result from the perceptual organization process.
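The equivalence of Eqs. (4) and (5) can be verified directly: under pa = 2^(−c), minimizing a sum of complexities is the same as maximizing a product of the corresponding probabilities. A minimal sketch, with made-up complexities:

```python
# Hypothetical prior and conditional complexities (in arbitrary units)
# for three candidate interpretations of one proximal stimulus.
c_H = {"H1": 6, "H2": 3, "H3": 4}          # prior complexities c(H)
c_D_given_H = {"H1": 1, "H2": 4, "H3": 2}  # conditional complexities c(D|H)

# Eq. (4): minimize c(H|D) = c(H) + c(D|H).
h_min = min(c_H, key=lambda H: c_H[H] + c_D_given_H[H])

# Eq. (5): maximize pa(H|D) = pa(H) * pa(D|H), with pa = 2**(-c).
h_max = max(c_H, key=lambda H: 2 ** -c_H[H] * 2 ** -c_D_given_H[H])

print(h_min, h_max)  # both select H3 (total complexity 6, versus 7 and 7)
```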

3.3. Simplicity is not equivalent to likelihood

The equivalence of the Occamian Eqs. (4) and (5) has also been put forward by Chater (1996). However, Chater (1996) claimed further, and others followed suit with or without minor reserves, that the Occamian simplicity principle as formulated in Eq. (5) is equivalent to the Helmholtzian likelihood principle as formulated in Eq. (3). This would imply that simplicity is highly veridical, but this obviously holds if and only if pa = pw, and there is no evidence that this might be true in this world. In fact, the error in Chater's (1996) claim is that it mistook the Bayesian formulation of the Occamian simplicity principle for the Helmholtzian likelihood principle (van der Helm, 2000). This is perhaps an easily made mistake, but notice that the former relies on fairly stable, quantifiable, subjective probabilities pa, whereas the latter relies on unknown objective probabilities pw. So, a proof of equivalence of the principles is fundamentally out of the question.

Interestingly, Feldman (2009) noted that a bias towards simplicity can be discerned in Bayesian approaches. He referred to MacKay (2003) who argued that a category of more complex instances spreads probability mass over more instances than a category of simpler instances does, so that individual instances in such a smaller category tend to get higher probabilities. Notice that this presupposes (a) a correlation between complexity and category size, and (b) that every category gets an equal probability mass. These assumptions cannot be justified within the Helmholtzian likelihood paradigm. In fact, they rather stem from the intuition of Bayesian modelers who, thereby, actually implement the Bayesian formulation of the Occamian simplicity principle — even if they do so without using a concrete descriptive coding language.

This may be exemplified as follows. Imagine a world with objects generated by, each time, first selecting randomly a complexity category, and then selecting randomly an instance from that category. Thus, in the first step, every category has the same probability of being selected, and in the second step, every instance in the selected category has again the same probability of being selected. The instances in a category of complexity c are described by c parameters, so that the category size is proportional to 2^c; this implies that the probability that a particular instance is selected is proportional to pa = 2^(−c). This is the kind of world MacKay (2003) seemed to have in mind. Only in such a very specific world would the Occamian simplicity principle be highly veridical and equivalent to the Helmholtzian likelihood principle. This, however, does not answer the question of how the two principles are related in other imaginable or actual worlds. As I discuss next, this question has been at the center of AIT research.
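The two-step world just imagined is easy to simulate. In the following minimal sketch (sample size and category range made up), instances from simpler categories indeed turn up with frequencies proportional to 2^(−c):

```python
import random
from collections import Counter

# The imagined world: first pick a complexity category c uniformly, then
# pick one of its 2**c instances uniformly.
categories = [1, 2, 3, 4]
N = 400_000
counts = Counter()
for _ in range(N):
    c = random.choice(categories)
    counts[(c, random.randrange(2 ** c))] += 1

# A given instance of complexity c should occur with probability
# (1 / number of categories) * 2**(-c), i.e. proportional to pa = 2**(-c).
for c in categories:
    observed = counts[(c, 0)] / N          # frequency of one fixed instance
    expected = (1 / len(categories)) * 2 ** -c
    print(c, round(observed, 4), round(expected, 4))
```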

3.4. Simplicity is a general-purpose principle

In many domains, objective probabilities are unknown. This is a troublesome problem, because it obstructs reliable inference. In fact, the desire to circumvent this problem was what drove AIT in the first place. As mentioned, AIT was founded on the finding that, regarding complexity rankings, it does not seem to matter much which coding scheme is used. Starting from this finding, AIT elaborated Solomonoff's (1964a, b) idea to design artificial probabilities, which are subjective by nature, but which can yet be said to be universal in that they might be used to make fairly reliable inferences in many different situations. To this end, AIT used Solomonoff's abstract and incomputable notion of the Kolmogorov complexity K(x) of things x (concrete and computable complexity metrics, like the one in SIT, can be seen as domain-specific approximations thereof). Then, AIT compared the artificial probabilities pa(x) = 2^(−K(x)) to probabilities drawn from any so-called enumerable probability distribution P(x) (see Li & Vitányi, 1997).


It is unknown if the objective probability distribution over things in the world is enumerable, but this way, AIT was able to draw conclusions for the infinite number of enumerable probability distributions, that is, for an infinite number of imaginable worlds. The result of AIT's quest was the so-called fundamental inequality:

2^(−K(x)) ≤ P(x) ≤ 2^(−K(x)+K(P))    (6)

In words, for any enumerable probability distribution P(x) over things x, the maximal difference between P(x) and the artificial probability pa(x) = 2^(−K(x)) is determined by the complexity K(P), and this maximal difference is smaller the simpler P is. Equivalently, in logarithmic form, −log2 P(x) ≤ K(x) ≤ −log2 P(x) + K(P): the complexity K(x) deviates from the Shannon-optimal code length −log2 P(x) by at most K(P). Roughly, this complexity K(P) is given by the number of categories to which P assigns probabilities, that is, the more different categories to be considered, the more different probabilities to be assigned, the more complex the probability distribution is.

Informally, the foregoing suggests that the simpler a world at hand is, the more veridical the Occamian simplicity principle promises to be in that world. If the visual system is indeed guided by the Occamian simplicity principle, then this might well influence how human-made worlds are arranged. That is, human-made worlds like cities tend to be visually simpler than natural worlds like jungles, and this might well be because humans tend to arrange their environment such that their visual system yields more reliable percepts (cf. Allen, 1879; van der Helm, 2011). In this respect, notice that human and animal jungle inhabitants indeed tend to rely more on smells and sounds than on vision.

To be clear, the foregoing does not imply that Occamian simplicity is highly veridical, that is, it does not imply that Occamian simplicity and Helmholtzian likelihood are close (let alone equivalent). It does imply, however, that they might be close — depending on the objective probability distribution in a world at hand. That is, the significance of the foregoing is that it suggests that Occamian simplicity might provide a fair degree of veridicality in many different worlds — possibly including the world at hand. Notice that this applies to simplicity-guided systems entering those worlds, that is, not to other information-processing systems that might be present in those worlds. Furthermore, in different worlds, different regularities may dominate and this may affect the veridicality of a simplicity-guided system focusing on specific regularities, but nearly any set of specific regularities is bound to capture a substantial part of other regularities too.

In sum, whereas the Helmholtzian likelihood principle can be said to be a special-purpose principle in that it is highly adapted to one world with a supposedly known objective probability distribution, the Occamian simplicity principle can be said to be a general-purpose principle in that it promises to be fairly (possibly sufficiently) adaptive to many different worlds without having to know their objective probability distributions. In vision, the Helmholtzian likelihood principle may be evolutionarily appealing, but the foregoing implies that the Occamian simplicity principle is a serious contender. That is, considering the survival value of adaptability to changing environments, evolution may well have favoured a general-purpose principle over a special-purpose one.

4. Priors versus conditionals

Whereas the previous section dealt with the confusion between subjective and objective probabilities, the next section deals with confusions surrounding priors and conditionals. These issues stand apart from each other in the sense that the former is a theoretical issue whereas the latter rather is a modeling issue. Yet, both are related to Bayes' rule — after all, the distinction between priors and conditionals is typically a Bayesian distinction.
As discussed in Section 3.2, priors and conditionals can not only be expressed in terms of probabilities but, within the Occamian simplicity paradigm, also in terms of complexities.


In the next section, I show that the distinction between priors and conditionals may lead to confusions in terms of probabilities, but I focus on a confusion in terms of complexities. In this section, I set the stage by discussing the distinction between priors and conditionals in a way that also bears relevance on the issues discussed in the previous section.

To recall, for a hypothesis H that fits proximal data D, the prior indicates how good H is in itself, and the conditional indicates how well D fits H. To express this in terms of probabilities, one needs both a (prior) categorization of distal scenes and (conditional) categorizations of views of distal scenes. Whereas establishing such a prior categorization is troublesome (see previous sections), it is relatively easy to give plausible conditional categorizations, that is, categorizations into qualitatively different views of distal scenes (see, e.g., Burns, 2001). One way to exploit this in models would be to assume uniform priors — this is what some likelihood approaches to perception did back in the 1980s, for instance. Another way would be to make an educated guess about the priors, for instance by using complexity-based priors pa as introduced above — this is what simplicity approaches to perception implicitly do. Neither way guarantees compliance with objective prior probabilities, but either way might actually work in the everyday situation of a moving observer who gets a growing sample D of different views of the same distal scene. That is, as I discuss next, the interpretation of such a growing sample D of views can be modeled by means of a recursive application of Bayes' rule.

Suppose the sample D consists, at first, of only one view, with Hi (i = 1, 2,…) as candidate interpretations and with prior and conditional probabilities p(Hi) and p(D|Hi), so that the posterior probabilities p(Hi|D) can be determined by applying Bayes' rule. Then, each time an additional view enters the sample D, the previously computed posterior probabilities p(Hi|D) can be taken as the new prior probabilities p(Hi) which, together with the conditional probabilities p(D|Hi) for the expanded sample D, can be used to determine new posterior probabilities by again applying Bayes' rule. This recursive application of Bayes' rule is not guaranteed to converge always on one interpretation (cf. Diaconis & Freedman, 1986), but this is actually good because, in perception, it may therefore also account for visual ambiguity. Generally, however, it converges on one interpretation, that is, the interpretation that, under the employed conditionals, will continue to get the highest posterior when sample D is expanded further (cf. Li & Vitányi, 1997).

Hence, if one has (approximately) the right conditional probabilities, then several (not too atypical) views of a distal scene suffice to make a (fairly) reliable inference about what the distal scene comprises and, thereby, what subsequent views will show. That is, the trick of the recursive application of Bayes' rule is that, after several recursions, the effect of the first priors fades away because the priors are continuously updated on the basis of the conditionals which, thereby, become the decisive entities (for more on visual updating, see, e.g., Moore, Mordkoff, & Enns, 2007). However, two remarks are in order.

First, though the foregoing is convenient to model perception in everyday situations, it is not indicative of whether the visual system works with the Helmholtzian likelihood principle or with the Occamian simplicity principle.
That is, as mentioned in Section 3.4, the margin between these two principles seems to be given roughly by the number of different categories to be considered. This holds for both priors and conditionals. For the priors, clearly many different scene categories are to be considered, but for the conditionals, generally only a few categories have to be considered because a specific scene gives rise to only a few qualitatively different views. This suggests that the two principles may be far apart regarding the priors but probably close regarding the conditionals, which in turn suggests that they perform about equally well in the just-discussed everyday situations of moving observers (van der Helm, 2000).

Second, the first priors are perhaps not decisive in such everyday situations, but they do affect the speed of convergence.


Besides, perception research is about the workings of the visual system, so that it remains relevant to assess which first priors it uses. In fact, the first priors are as decisive as the conditionals in the case of relatively static situations — which are also part of everyday life. In the next section, I argue among other things that the visual system does not use uniform first priors but, rather, Occamian ones.
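The recursive application of Bayes' rule sketched above is straightforward to implement. The following is a minimal sketch with made-up conditionals for two candidate interpretations; each view's posterior serves as the prior for the next view, so the first priors fade while the conditionals become decisive:

```python
# Recursive Bayesian updating over a growing sample of views.
def update(prior, cond):
    """One application of Bayes' rule; cond holds p(view|H) per hypothesis."""
    post = {H: prior[H] * cond[H] for H in prior}
    z = sum(post.values())          # normalization factor p(view)
    return {H: p / z for H, p in post.items()}

priors = {"H1": 0.5, "H2": 0.5}     # first priors (here: uniform)
views = [                           # made-up conditionals for three views
    {"H1": 0.6, "H2": 0.4},
    {"H1": 0.7, "H2": 0.3},
    {"H1": 0.8, "H2": 0.2},
]
for cond in views:
    priors = update(priors, cond)   # the posterior becomes the new prior
    print(priors)                   # posterior mass accumulates on H1
```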

5. The interplay between priors and conditionals

In this section, I discuss confusions surrounding priors and conditionals in terms of probabilities, but I focus on a confusion in terms of complexities. The leitmotif in this section is that, in my view, nonuniform conditionals must be complemented with nonuniform priors in order to account for perceptual organization. To this end, I begin by discussing Rock's (1983) avoidance-of-coincidences principle.

In a proximal stimulus yielded by a specific view of a distal scene, coincidences occur when, for instance, edges or junctions in one distal object accidentally coincide with edges or junctions in another distal object — at least, according to some interpretation of what the objects in the scene are. Such coincidences are unlikely to occur, and Rock therefore proposed that the visual system tends to avoid interpretations according to which they do occur.

Rock's proposal reappeared in various shapes and forms, mostly in likelihood approaches (but, as I discuss later on, also in simplicity approaches). For instance, Biederman (1987), Binford (1981), and Witkin and Tenenbaum (1983) argued that a straight proximal line can safely be interpreted as a straight distal edge, because it can be caused by a curved distal edge only from an accidental viewpoint. They therefore referred to straightness by the term nonaccidental property: if such a property is present in the proximal stimulus, then it is most likely also present in the distal stimulus. Like Rock's principle, this idea reflects the general-viewpoint assumption, which holds that a proximal stimulus is interpreted assuming it does not contain features that would arise only in an accidental view of the distal scene.

The general-viewpoint assumption is indeed plausible, but notice that general-viewpoint positions vary with the distal scene at hand. For instance, a straight needle gives rise to only two nongeneral viewpoints (i.e., those yielding a proximal dot), whereas a solid cube gives rise to at least six nongeneral viewpoints (i.e., those yielding a proximal square). The general-viewpoint assumption can therefore be formulated more precisely in terms of conditional probabilities, which quantify how likely proximal data are under specific hypotheses. For instance, a curved distal edge yields a straight proximal line from hardly any viewpoint, so that a straight proximal line has a low probability under the hypothesis that it is caused by a curved distal edge. By the same token, a straight distal edge yields a straight proximal line from nearly any viewpoint, so that a straight proximal line has a high probability under the hypothesis that it is caused by a straight distal edge. This shows that the general-viewpoint assumption derives its plausibility from favouring interpretations involving high conditional probabilities.

To show how the foregoing may give rise to confusion, I again consider the idea that a straight proximal line can safely be interpreted as a straight distal edge. Pomerantz and Kubovy (1986) argued that this heuristic should be justified by showing that, in the world, straight edges occur more frequently than curved edges. This, however, would be a Helmholtzian justification in terms of prior probabilities, whereas, as argued above, the heuristic actually derives its plausibility from the fact that it favours high conditional probabilities.
Yet, in a sense, Pomerantz and Kubovy were right because, by Bayes' rule, a high conditional probability may well be suppressed by a low prior probability. That is, the straight edge hypothesis may be likely, namely, due to a high conditional probability, but it remains to be seen whether the objective prior probability is high enough to allow for a Helmholtzian justification (see also Leeuwenberg et al., 1994).
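A minimal sketch of this interplay, with made-up numbers (not measured probabilities): the straight-edge hypothesis enjoys a high conditional, but Bayes' rule lets a sufficiently low prior overturn it.

```python
# Conditionals p(straight proximal line | H), hypothetical but plausible:
cond = {"straight edge": 0.95, "curved edge": 0.01}

# Compare two hypothetical worlds with different priors for straight edges.
for prior_straight in (0.5, 0.001):
    prior = {"straight edge": prior_straight, "curved edge": 1 - prior_straight}
    posterior = {H: prior[H] * cond[H] for H in cond}  # Eq. (2), unnormalized
    best = max(posterior, key=posterior.get)
    print(f"prior(straight edge) = {prior_straight}: prefer {best!r}")
# With prior 0.5 the straight edge wins; with prior 0.001 the high
# conditional is suppressed and the curved edge wins.
```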

Be that as it may, the just-discussed focus on conditionals — and then assuming uniform priors — reflects the line of thinking in likelihood approaches in the 1980s, and it is also the tenet Feldman (2009) put forward in a Bayesian reformulation of his minimal model theory. In the next subsections, I discuss this theory to exemplify another confusion between priors and conditionals. This theory relies on the plausible notion of codimensions, and as I discuss next, it does consider what I call prior and conditional codimensions but it does not recognize that prior and conditional codimensions play opposite perceptual roles.

5.1. Prior codimensions

Distal scenes can be categorized such that the categories form supercategories and subcategories of one another (Collard & Buffart, 1983; Garner, 1970; van der Helm, 2000). For instance, squares form a subcategory of rectangles, which form a subcategory of both trapezoids and parallelograms, which in turn are subcategories of the supercategory of quadrangles (see Fig. 2). Feldman (2009, Fig. 1) argued that this implies that the categories can be given a partial order with, as in Fig. 2, the supercategory at the top, and below that, other categories ordered such that nodes lower in the partial order represent subcategories of nodes higher in the partial order. For a given partial order, he then defined the codimension of a category by the number of steps in the partial order from the supercategory to the category at hand. Thus, going from top to bottom in Fig. 2, the categories of objects get codimensions 0, 1, 2, and 3, respectively.

I agree with this argument, except that I would speak of prior codimensions because they apply to distal scenes as such (i.e., independent of viewpoint). Furthermore, notice that, going from top to bottom in Fig. 2, the categories comprise increasingly regular distal scenes. So, a category with a higher codimension is a category of simpler distal scenes (e.g., squares are simpler than rectangles). Hence, apart from the modulating role of conditionals, the Occamian simplicity principle can be said to favour interpretations with high prior codimensions. In Feldman's approach, however, these prior codimensions play no role because he assumed uniform priors.

[Fig. 2: a partial order of quadrangle categories — Quadrangles at the top; Trapezoids and Parallelograms below it; then Rectangles; then Squares.]

Fig. 2. Five categories of quadrangles ordered such that the one at the top represents the most complex (super)category, while below that, the others are ordered such that, as indicated by arrows, less complex lower ones represent subcategories of more complex higher ones.
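A minimal sketch of this codimension bookkeeping, assuming the hierarchy of Fig. 2 (the dictionary below encodes one upward path per category; rectangles are equally reachable via trapezoids):

```python
# Partial order of Fig. 2, encoded as child -> (one of its) supercategories.
parents = {
    "quadrangles": None,             # the supercategory
    "trapezoids": "quadrangles",
    "parallelograms": "quadrangles",
    "rectangles": "parallelograms",  # also a subcategory of trapezoids
    "squares": "rectangles",
}

def codimension(category):
    """Number of steps from the supercategory down to this category."""
    steps = 0
    while parents[category] is not None:
        category = parents[category]
        steps += 1
    return steps

for cat in parents:
    print(cat, codimension(cat))  # 0, 1, 1, 2, 3 (higher means simpler)
```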


In his approach, everything depends on the conditionals, which he, as I discuss next, also cast in terms of codimensions.

5.2. Conditional codimensions

The notion of what I would call conditional codimensions is related to ideas about avoidance of coincidences, nonaccidental properties, general viewpoints, and so on. In the literature, the whole discussion about these ideas mostly serves to get a better understanding of the role of an observer's viewpoint position in amodal completion (see also Section 2). The crux of this discussion, however, can conveniently be cast in terms of two sticks hanging motionless in space (see Fig. 3), focusing on the question of how likely they are to be seen in a specific configuration. Hence, for the moment, the interpretation “two sticks” is presupposed, and the question is how likely the two sticks are to yield a specific spatial relationship in the resulting proximal stimulus.

Clearly, some configurations in Fig. 3 seem more likely than others, that is, their likelihood seems to decrease from top to bottom. Notice, however, that this impression is based on a categorization of “similar” configurations — without such a categorization, all configurations would actually be equally likely to result. Also notice that some categories are subcategories of others. For instance, the category in Fig. 3d is a subcategory of the one in Fig. 3c. Hence, just as the prior categories in Fig. 2, also these conditional categories can be given a partial order with the supercategory at the top and the other categories ordered below that. This is also what Feldman (2009, Fig. 2) argued.

[Fig. 3 panels: (a) General position; (b) T-junction; (c) Cotermination; (d) Cotermination and collinearity.]

Fig. 3. Four configurations in which two motionless hanging sticks might end up proximally. Taken as representatives of categories of “similar” configurations, the one at the top represents the (most likely) supercategory, and below that, the others are ordered such that (less likely) lower ones represent subcategories of (more likely) higher ones. Taken as proximal stimuli to be interpreted, each configuration can, for instance, be interpreted as consisting of one object or as consisting of two objects. Going from top to bottom, the two-object interpretation (definitely preferred in a) gradually makes way for the one-object interpretation (definitely preferred in d).


As before, he defined the codimension of a category by the number of steps from the supercategory to the category at hand. Thus, going from top to bottom in Fig. 3, also these categories get codimensions 0, 1, 2, and 3, respectively. I again agree with this argument, except that this time, I would speak of conditional codimensions because they apply to views of specific distal scenes.

These conditional codimensions are appealing because, intuitively, they can be said to correspond to the number of proximal coincidences. For instance, the supercategory in Fig. 3a represents a general position of the two sticks, that is, a position without coincidences. Compared to that, one coincidence can intuitively be said to occur in the category in Fig. 3b, two in the one in Fig. 3c, and three in the one in Fig. 3d.

Now, let us reverse the situation, and let us again consider the configurations in Fig. 3, but this time, as proximal stimuli that are still to be interpreted. Then, each configuration can, in principle, be interpreted as consisting of one object or as consisting of two objects. Clearly, going from top to bottom in Fig. 3, the two-object interpretation (definitely preferred in Fig. 3a) gradually makes way for the one-object interpretation (definitely preferred in Fig. 3d). This illustrates Rock's (1983) idea that the visual system tends to prefer interpretations in which coincidences do not occur. For instance, for Fig. 3d, the two-object interpretation does imply coincidences, whereas the one-object interpretation does not, so that the latter is preferred over the former.

Hence, whereas a high prior codimension is an asset for a candidate interpretation, a high conditional codimension is a liability. Notice that these opposite roles arise only if priors and conditionals are cast in terms of codimensions, that is, not if they are cast in terms of probabilities or complexities. This difference seems to have played a trick on Feldman's minimal model theory, and thereby, on his Bayesian translation thereof.

5.3. Minimal model theory

Feldman (2009) formulated his minimal model theory as holding that the preferred interpretation of a given proximal stimulus is the one with the highest codimension. Considering the opposite roles of what I called prior and conditional codimensions, this would make sense if it applied to prior codimensions, that is, to the codimensions in the partial order of categories of distal scenes — after all, a high prior codimension is an asset. However, he assumed uniform priors, so, it can only apply to conditional codimensions, that is, to the codimensions in the partial orders of categories of views of specific scenes. This, however, does not seem to make sense — after all, a high conditional codimension is a liability. In other words, his prediction rule (whether or not translated in terms of probabilities) makes sense if applied to priors but not if applied to conditionals.

If his formulation had applied to priors instead of conditionals, then it would have corresponded to the simplicity approach proposed by Buffart, Leeuwenberg, and Restle (1981, 1983). The latter made predictions (concerning amodal completion) solely on the basis of prior complexities, which correlate negatively with prior codimensions (i.e., simpler objects have a higher prior codimension; see Section 5.1). Rock (1983), however, showed that this is insufficient and that the conditionals should not be ignored (see also, e.g., Boselie, 1988, 1994; Boselie & Wouterlood, 1989; Kanizsa, 1985).
In view of this, it is not so surprising that Feldman assumed uniform priors, but it is surprising that he took a high conditional codimension to be an asset instead of a liability. After all, he claimed to promote a simplicity approach, but conditional codimensions in fact correlate positively with conditional complexities (i.e., objects in a simpler proximal position imply a lower conditional codimension). This is explicated further in the next subsection, and to this end, I conclude this subsection with two remarks.


First, considering that Feldman took a high conditional codimension to be an asset, his approach in fact favours interpretations involving a high number of coincidences (see Section 5.2). Thereby, contrary to what he claimed, his approach is actually the opposite of Rock's (1983) avoidance-of-coincidences principle.

Second, Rock (1983) and others may have shown that the conditionals should not be ignored, but in the 1990s, it became clear that the priors should not be ignored either. This led, for instance, to Bayesian approaches combining nonuniform priors and nonuniform conditionals (cf. Gigerenzer & Murray, 1987; Knill & Richards, 1996). In SIT, it led to van Lier, van der Helm, and Leeuwenberg's (1994) empirically successful model of amodal completion (see also de Wit & van Lier, 2002; van Lier, 1999; van Lier, Leeuwenberg, & van der Helm, 1995; van Lier, van der Helm, & Leeuwenberg, 1995; van Lier & Wagemans, 1999). In the next subsections, this model is compared with Feldman's approach.

5.4. The dual role of regularity

In van Lier et al.'s (1994) model of amodal completion, both priors and conditionals are quantified in terms of complexities — using the sip, short for structural information parameter, as the unit of complexity. Accordingly, it makes predictions using the information-theoretic formulation of the Occamian simplicity principle in Eq. (4). Here, I focus on its conceptual arguments (further details can be found in the literature).

In the Occamian simplicity paradigm, the descriptive complexity of things is associated with the effort needed to construct these things. Accordingly, van Lier et al. (1994) quantified the prior complexity of an interpretation by the number of descriptive parameters needed to construct the hypothesized objects. Furthermore, they quantified the conditional complexity by the effort needed to bring these objects in the given proximal position, starting from a general position (i.e., a position without coincidences). More specifically, they quantified the conditional complexity by the difference in complexity between a general position of the objects and the possibly coincidental proximal position of these objects. This resembles the way in which Feldman (1997, 2003, 2009) determined conditional codimensions (see Fig. 3). The latter are indeed quantitatively equal to van Lier et al.'s (1994) conditional complexities, but Feldman used them differently. That is, Feldman (2009) also promoted a simplicity approach and he rightly assessed that complexities and codimensions correlate, but his approach differs from van Lier et al.'s not only regarding the priors but also regarding the conditionals.

In van Lier et al.'s model, both low prior complexities and low conditional complexities are considered to be assets for interpretations (see Eq. 4). Cast in terms of codimensions, this means that it favours not only high prior codimensions (which Feldman assumed to be uniform) but also low conditional codimensions (which is opposite to what Feldman did, but which does honour Rock's avoidance-of-coincidences principle). This difference between complexities and codimensions may be confusing, but it can be clarified as follows.

First, notice that, for both prior and conditional categories, amounts of regularity correlate positively with codimensions. For instance, a square exhibits more regularity than a rectangle, which is expressed in a higher prior codimension (see Fig. 2). Furthermore, under the “two sticks” interpretation, Fig. 3d exhibits a very regular relative position of the two sticks, which is expressed in a high conditional codimension. Second, notice that, perceptually, regularity between stimulus elements tends to bind these elements into wholes.

Now, reconsider the configuration in Fig. 3d as a proximal stimulus that is still to be interpreted. It can be interpreted as consisting of one object (“one stick”) which involves no coincidences and which is bound strongly into a whole by its highly regular internal structure. It could also be interpreted as consisting of two objects (“two sticks”), but then, it involves coincidences forming a positional regularity that exerts a strong binding force which goes against the segmentation into two objects. This illustrates the dual role of regularity: For an interpretation, regularity within hypothesized objects is a prior asset, but regularity between hypothesized objects is a conditional liability. Hence, because amounts of regularity correlate positively with codimensions, this implies that a high prior codimension is an asset while a high conditional codimension is a liability. In terms of complexities, however, the dual role of regularity implies that both low prior complexities and low conditional complexities are assets. That is, the construction of a hypothesized distal scene as such, on the one hand, requires less effort as the hypothesized objects exhibit more internal regularity, while the construction of a view of a distal scene, on the other hand, requires less effort as the proximal position of the hypothesized objects exhibits less regularity in the form of coincidences.
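A minimal sketch of Eq. (4) at work in this model, using the sip values that Section 5.5 below reports for the four configurations of Fig. 3 (the code itself is illustrative, not van Lier et al.'s implementation):

```python
# Prior and conditional complexities (in sip) for the configurations of
# Fig. 3, as reported in Section 5.5.
configs = ["a (general)", "b (T-junction)", "c (hook)", "d (collinear)"]
one_object = {"prior": [5, 4, 3, 1], "conditional": [0, 0, 0, 0]}
two_objects = {"prior": [2, 2, 2, 2], "conditional": [0, 1, 2, 3]}

# Eq. (4): prefer the interpretation with the lowest c(H) + c(D|H).
for i, name in enumerate(configs):
    c_one = one_object["prior"][i] + one_object["conditional"][i]
    c_two = two_objects["prior"][i] + two_objects["conditional"][i]
    winner = "one object" if c_one < c_two else "two objects"
    print(f"{name}: one-object {c_one} sip, two-object {c_two} sip -> {winner}")
# Output: two objects for (a) and (b), one object for (c) and (d); that is,
# the two-object reading prevails for T-junctions but not for hooks.
```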


5.5. An application to perceptual organization

As indicated, in the 1980s, it became clear that the conditionals should not be ignored, and in the 1990s, it became clear that the priors should not be ignored either. In fact, by way of a clever experiment, Feldman (2007) provided excellent evidence that perceptual organization is guided by an interplay between priors and conditionals. To test if configurations of two lines (as in Fig. 3) are perceived as one object or as two objects, he exploited the phenomenon that within-object comparisons are faster than between-objects comparisons (Behrmann, Zemel, & Mozer, 1998). His data, for twelve configurations, nicely show that the two lines are gradually bound more strongly into one perceived object as the conditional codimension for the two-object interpretation increases. For instance, he found that, like the configuration in Fig. 3a, the one in Fig. 3b (a T-junction) is still perceived as two objects, but that, like the configuration in Fig. 3d, the one in Fig. 3c is perceived as one object (a hook).

T-junctions are particularly interesting, because in many models of amodal completion, they are considered to be cues for occlusion: if a proximal stimulus contains a T-junction, then this is taken as a strong cue that the distal scene comprises one surface partly occluded by the other. To this end, however, the proximal stimulus first has to be segmented into the visible parts of those two surfaces, and I fully agree with Feldman (2007, pp. 817/824) that T-junctions are cues for segmentation rather than for occlusion. That is, T-junctions are cues for segmentation even if occlusion is not at hand.

To analyse this further, notice that, if one assumes nonuniform prior complexities and uniform conditional complexities, then the hook in Fig. 3c is not predicted to be perceived as one object. After all, two separate lines are simpler than one hook, so, the two-object interpretation has a lower prior complexity and would therefore be predicted. On the other hand, if one assumes uniform prior complexities and nonuniform conditional complexities, then the T-junction in Fig. 3b is not predicted to be perceived as two objects. After all, the conditional complexity (in sips) is one for the two-object interpretation and zero for the one-object interpretation, so, the latter would be predicted.

Furthermore, notice that Feldman arrived at a codimension of one in the case of a T-junction and a codimension of two in the case of a hook. He assumed uniform priors, so that these codimensions can only refer to the conditional codimension for the two-object interpretation. Considering that he took a high conditional codimension to be an asset, this can only imply that he predicts the two-object interpretation in both cases — because the conditional codimension for the one-object interpretation is zero in both cases. This prediction would be right for T-junctions but wrong for hooks. By the way, one would get a prediction which is right for hooks but wrong for T-junctions if one takes a high conditional codimension to be a liability but still assumes uniform priors. Hence, the foregoing confirms the need for both nonuniform priors and nonuniform conditionals. In this respect, notice that Feldman (2007) concluded by:


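This contrast can be made explicit with a small computational check. The following sketch is my own illustration (it is not part of Feldman's, 2007, or van Lier et al.'s, 1994, materials) and uses the sip values that are spelled out further below: for the one-object interpretation, priors of 4 sip (T-junction) and 3 sip (hook) with zero conditional complexity; for the two-object interpretation, a prior of 2 sip with conditional complexities of 1 sip (T-junction) and 2 sip (hook).

```python
# Sketch: predictions for the T-junction (Fig. 3b) and the hook (Fig. 3c)
# under different assumptions about priors and conditionals. A uniform term
# is constant across candidate interpretations, so it can simply be dropped
# from the comparison. Complexities are in sips; the lowest posterior wins.

stimuli = {
    "T-junction": {"one object": (4, 0), "two objects": (2, 1)},
    "hook":       {"one object": (3, 0), "two objects": (2, 2)},
}
observed = {"T-junction": "two objects", "hook": "one object"}  # Feldman (2007)

def predict(interps, use_priors, use_conditionals):
    score = lambda pc: (pc[0] if use_priors else 0) + (pc[1] if use_conditionals else 0)
    return min(interps, key=lambda h: score(interps[h]))

for use_p, use_c, label in [(True, False, "nonuniform priors only"),
                            (False, True, "nonuniform conditionals only"),
                            (True, True, "both nonuniform")]:
    for stim, interps in stimuli.items():
        pred = predict(interps, use_p, use_c)
        verdict = "right" if pred == observed[stim] else "wrong"
        print(f"{label:30s} {stim:10s} -> {pred:11s} ({verdict})")
```

Only the variant with both nonuniform priors and nonuniform conditionals reproduces Feldman's (2007) data for both configurations.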
In this respect, notice that Feldman (2007) concluded as follows:

"The interplay of cues across time is revealed particularly vividly by the data for T-junctions, which suggest that local cues dominate early while configural cues eventually override them. A more complete computational theory of grouping will be required before we can understand more fully the mechanisms underlying this competition, especially with more complex configurations than those tested here." (p. 826)

I agree with this, but notice that a "more complete" approach was already available, namely, van Lier et al.'s (1994) model of amodal completion. For a large set of critical and complex stimuli taken from the literature, this model proved highly successful in predicting whether occlusion is perceived and, if so, what the occluded shape is taken to be. Admittedly, it does not specify the exact time course of the interplay between cues, but it does specify both the players in this interplay and its outcome.

As said, this model promotes an Occamian interplay between nonuniform prior complexities and nonuniform conditional complexities. This implies the following for Fig. 3. Going from top to bottom, the prior complexities for the one-object interpretation are 5, 4, 3, and 1 sip, respectively; this interpretation involves no coincidences, so its conditional complexity is always zero. For the two-object interpretation, conversely, the prior complexity is always 2 sip, but the conditional complexities are 0, 1, 2, and 3 sip, respectively. The latter yields posterior complexities of 2, 3, 4, and 5 sip, respectively, so that, compared to this two-object interpretation, the one-object interpretation is predicted to become gradually stronger (details of this calculation can be found in the literature and on www.socsci.ru.nl/~peterh/doc/t_junction.html). This gradual increase in strength means that the two-object interpretation is still predicted to prevail for T-junctions, but that the one-object interpretation is predicted to prevail for hooks. These predictions are not implied by Feldman's (2009) approach but do agree with Feldman's (2007) data.
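To verify this arithmetic, the following sketch (my own, simply recomputing the numbers in the preceding paragraph) tabulates the competing posterior complexities for the four configurations in Fig. 3:

```python
# Sketch: the sip arithmetic of the preceding paragraph, for Figs. 3a-3d.
# One-object reading: priors 5, 4, 3, 1 sip (top to bottom), conditional 0.
# Two-object reading: prior always 2 sip, conditionals 0, 1, 2, 3 sip.

one_prior = [5, 4, 3, 1]
two_prior = [2, 2, 2, 2]
two_cond  = [0, 1, 2, 3]

for fig, p1, p2, c2 in zip("abcd", one_prior, two_prior, two_cond):
    post_one = p1 + 0   # no coincidences, so conditional complexity is zero
    post_two = p2 + c2
    winner = "one object" if post_one < post_two else "two objects"
    print(f"Fig. 3{fig}: one object {post_one} sip vs two objects {post_two} sip -> {winner}")
```

The output shows the two-object reading prevailing for Figs. 3a and 3b (the T-junction) and the one-object reading prevailing for Figs. 3c (the hook) and 3d.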
6. Conclusion

In this theoretical article on perceptual organization, I argued that both the priors and the conditionals for candidate interpretations of a proximal stimulus must be taken into account in predicting the preferred interpretation. The prior indicates how good an interpretation is in itself (i.e., independently of the proximal stimulus), and the conditional indicates how good the proximal stimulus is if the interpretation were true. Priors and conditionals can be cast in terms of probabilities, complexities, or codimensions, and precisely this seems to have confused some approaches to perceptual organization.

If priors and conditionals are cast in terms of probabilities (obtained one way or another, e.g., by deriving them from complexities or codimensions), then one gets a Bayesian model. Contrary to some claims in the literature, however, this does not automatically imply compliance with the Helmholtzian likelihood principle, which promotes veridicality in the external world. After all, to comply with this principle, one would have to know the objective probabilities of distal scenes in the world, but these are unknown. If priors and conditionals are cast in terms of complexities or codimensions (and also after a Bayesian translation thereof into probabilities), then one complies with the Occamian simplicity principle, which promotes efficiency in terms of internal resources. Then, however, one has to reckon with the dual role of regularity in perception. This dual role implies, for interpretations, that both low prior complexities and low conditional complexities are assets, which means, contrary to some claims in the literature, that high prior codimensions are assets whereas high conditional codimensions are liabilities.

Finally, it may be unclear exactly how the visual system achieves the amazing feat of perceptual organization, but contrary to some claims in the literature, it does not seem sufficient to focus on only nonuniform priors or only nonuniform conditionals. As I argued, the outcome of the perceptual organization process can be modeled and explained by an Occamian interplay between both nonuniform priors and nonuniform conditionals. That is, such an Occamian interplay successfully predicts how likely humans are to give certain interpretations. The predicted interpretations do not necessarily agree with what the actual distal scene is most likely to be but, as I also argued, they seem to be sufficiently veridical to guide us through many everyday situations.

References

Allen, G. (1879). The origin of the sense of symmetry. Mind, 4, 301–316.
Attneave, F. (1954). Some informational aspects of visual perception. Psychological Review, 61, 183–193.
Bayes, T. (1958). Studies in the history of probability and statistics: IX. Thomas Bayes' (1763) essay "Towards solving a problem in the doctrine of chances" in modernized notation. Biometrika, 45, 296–315.
Behrmann, M., Zemel, R. S., & Mozer, M. C. (1998). Object-based attention and occlusion: Evidence from normal participants and a computational model. Journal of Experimental Psychology: Human Perception and Performance, 24, 1011–1036.
Bertrand, J. L. (1889). Calcul des probabilités. Paris: Gauthier-Villars.
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115–147.
Binford, T. (1981). Inferring surfaces from images. Artificial Intelligence, 17, 205–244.
Boselie, F. (1988). Local versus global minima in visual pattern completion. Perception & Psychophysics, 43, 431–445.
Boselie, F. (1994). Local and global factors in visual occlusion. Perception, 23, 517–528.
Boselie, F., & Leeuwenberg, E. L. J. (1986). A test of the minimum principle requires a perceptual coding system. Perception, 15, 331–354.
Boselie, F., & Wouterlood, D. (1989). The minimum principle and visual pattern completion. Psychological Research, 51, 93–101.
Brunswik, E. (1956). Perception and the representative design of psychological experiments. Berkeley, CA: University of California Press.
Buffart, H. F. J. M., Leeuwenberg, E. L. J., & Restle, F. (1981). Coding theory of visual pattern completion. Journal of Experimental Psychology: Human Perception and Performance, 7, 241–274.
Buffart, H. F. J. M., Leeuwenberg, E. L. J., & Restle, F. (1983). Analysis of ambiguity in visual pattern completion. Journal of Experimental Psychology: Human Perception and Performance, 9, 980–1000.
Burns, K. J. (2001). Mental models of line drawings. Perception, 30, 1249–1261.
Chaitin, G. J. (1969). On the length of programs for computing finite binary sequences: Statistical considerations. Journal of the Association for Computing Machinery, 16, 145–159.
Chater, N. (1996). Reconciling simplicity and likelihood principles in perceptual organization. Psychological Review, 103, 566–581.
Collard, R. F. A., & Buffart, H. F. J. M. (1983). Minimization of structural information: A set-theoretical approach. Pattern Recognition, 16, 231–242.
Csathó, Á., van der Vloed, G., & van der Helm, P. A. (2003). Blobs strengthen repetition but weaken symmetry. Vision Research, 43, 993–1007.
Csathó, Á., van der Vloed, G., & van der Helm, P. A. (2004). The force of symmetry revisited: Symmetry-to-noise ratios regulate (a)symmetry effects. Acta Psychologica, 117, 233–250.
de Wit, T., & van Lier, R. (2002). Global visual completion of quasi-regular shapes. Perception, 31, 969–984.
Diaconis, P., & Freedman, D. (1986). On the consistency of Bayes estimates. The Annals of Statistics, 14, 1–26.
Feldman, J. (1997). Regularity-based perceptual grouping. Computational Intelligence, 13, 582–623.
Feldman, J. (2003). Perceptual grouping by selection of a logically minimal model. International Journal of Computer Vision, 55, 5–25.
Feldman, J. (2007). Formation of visual "objects" in the early computation of spatial relations. Perception & Psychophysics, 69, 816–827.
Feldman, J. (2009). Bayes and the simplicity principle in perception. Psychological Review, 116, 875–887.
Garner, W. R. (1970). Good patterns have few alternatives. American Scientist, 58, 34–42.
Gigerenzer, G., & Murray, D. J. (1987). Cognition as intuitive statistics. Hillsdale, NJ: Erlbaum.
Hochberg, J. E., & McAlister, E. (1953). A quantitative approach to figural "goodness". Journal of Experimental Psychology, 46, 361–364.
Hoffman, D. D. (1996). What do we mean by "The structure of the world"? In D. K. Knill & W. Richards (Eds.), Perception as Bayesian inference (pp. 219–221). Cambridge, England: Cambridge University Press.
Kanizsa, G. (1985). Seeing and thinking. Acta Psychologica, 59, 23–33.
Knill, D. K., & Richards, W. (Eds.). (1996). Perception as Bayesian inference. Cambridge, England: Cambridge University Press.
Koffka, K. (1935). Principles of Gestalt psychology. London: Routledge & Kegan Paul.
Kolmogorov, A. N. (1965). Three approaches to the quantitative definition of information. Problems of Information Transmission, 1, 1–7.
Leeuwenberg, E. L. J. (1969). Quantitative specification of information in sequential patterns. Psychological Review, 76, 216–220.
Leeuwenberg, E. L. J. (1971). A perceptual coding language for visual and auditory patterns. The American Journal of Psychology, 84, 307–349.
Leeuwenberg, E. L. J., & Boselie, F. (1988). Against the likelihood principle in visual form perception. Psychological Review, 95, 485–491.
Leeuwenberg, E. L. J., van der Helm, P. A., & van Lier, R. J. (1994). From geons to structure: A note on object classification. Perception, 23, 505–515.
Li, M., & Vitányi, P. (1997). An introduction to Kolmogorov complexity and its applications (2nd ed.). New York: Springer-Verlag.
MacKay, D. J. C. (2003). Information theory, inference, and learning algorithms. Cambridge, England: Cambridge University Press.
Mark, J. T., Marion, B. B., & Hoffman, D. D. (2010). Natural selection and veridical perception. Journal of Theoretical Biology, 266, 504–515.
Moore, C. M., Mordkoff, J. T., & Enns, J. T. (2007). The path of least persistence: Evidence of object-mediated visual updating. Vision Research, 47, 1624–1630.
Nucci, M., & Wagemans, J. (2007). Goodness of regularity in dot patterns: Global symmetry, local symmetry, and their interactions. Perception, 36, 1305–1319.
Pomerantz, J., & Kubovy, M. (1986). Theoretical approaches to perceptual organization: Simplicity and likelihood principles. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.), Handbook of perception and human performance: Vol. 2. Cognitive processes and performance (pp. 36-1–36-46). New York: Wiley.
Rock, I. (1983). The logic of perception. Cambridge, MA: MIT Press.
Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423, 623–656.
Simon, H. A. (1972). Complexity and the representation of patterned sequences of symbols. Psychological Review, 79, 369–382.
Solomonoff, R. J. (1964). A formal theory of inductive inference, Part 1. Information and Control, 7, 1–22.
Solomonoff, R. J. (1964). A formal theory of inductive inference, Part 2. Information and Control, 7, 224–254.
Sutherland, S. (1988). Simplicity is not enough. In B. A. G. Elsendoorn & H. Bouma (Eds.), Working models of human perception (pp. 381–390). London: Academic Press.
Treder, M. S., & van der Helm, P. A. (2007). Symmetry versus repetition in cyclopean vision: A microgenetic analysis. Vision Research, 47, 2956–2967.
van der Helm, P. A. (1994). The dynamics of Prägnanz. Psychological Research, 56, 224–236.
van der Helm, P. A. (2000). Simplicity versus likelihood in visual perception: From surprisals to precisals. Psychological Bulletin, 126, 770–800.
van der Helm, P. A. (2004). Transparallel processing by hyperstrings. Proceedings of the National Academy of Sciences of the United States of America, 101(30), 10862–10867.
van der Helm, P. A. (2010). Weber–Fechner behaviour in symmetry perception? Attention, Perception, & Psychophysics, 72, 1854–1864.

van der Helm, P. A. (2011). The influence of perception on the distribution of multiple symmetries in nature and art. Symmetry, 3, 54–71.
van der Helm, P. A., & Leeuwenberg, E. L. J. (1991). Accessibility, a criterion for regularity and hierarchy in visual pattern codes. Journal of Mathematical Psychology, 35, 151–213.
van der Helm, P. A., & Leeuwenberg, E. L. J. (1996). Goodness of visual regularities: A nontransformational approach. Psychological Review, 103, 429–456.
van der Helm, P. A., & Leeuwenberg, E. L. J. (1999). A better approach to goodness: Reply to Wagemans (1999). Psychological Review, 106, 622–630.
van der Helm, P. A., & Leeuwenberg, E. L. J. (2004). Holographic goodness is not that bad: Reply to Olivers, Chater, and Watson (2004). Psychological Review, 111, 261–273.
van der Helm, P. A., & Treder, M. S. (2009). Detection of (anti)symmetry and (anti)repetition: Perceptual mechanisms versus cognitive strategies. Vision Research, 49, 2754–2763.
van der Helm, P. A., van Lier, R. J., & Leeuwenberg, E. L. J. (1992). Serial pattern complexity: Irregularity and hierarchy. Perception, 21, 517–544.
van Lier, R. (1999). Investigating global effects in visual occlusion: From a partly occluded square to a tree-trunk's rear. Acta Psychologica, 102, 203–220.
van Lier, R. J., Leeuwenberg, E. L. J., & van der Helm, P. A. (1995). Multiple completions primed by occlusion patterns. Perception, 24, 727–740.
van Lier, R. J., van der Helm, P. A., & Leeuwenberg, E. L. J. (1994). Integrating global and local aspects of visual occlusion. Perception, 23, 883–903.
van Lier, R. J., van der Helm, P. A., & Leeuwenberg, E. L. J. (1995). Competing global and local completions in visual occlusion. Journal of Experimental Psychology: Human Perception and Performance, 21, 571–583.
van Lier, R., & Wagemans, J. (1999). From images to objects: Global and local completions of self-occluded parts. Journal of Experimental Psychology: Human Perception and Performance, 25, 1721–1741.
von Helmholtz, H. L. F. (1962). Treatise on physiological optics (J. P. C. Southall, Ed. & Trans.). New York: Dover. (Original work published 1909)
Wenderoth, P., & Welsh, S. (1998). Effects of pattern orientation and number of symmetry axes on the detection of mirror symmetry in dot and solid patterns. Perception, 27, 965–976.
Witkin, A. P., & Tenenbaum, J. M. (1983). On the role of structure in vision. In J. Beck, B. Hope, & A. Rosenfeld (Eds.), Human and machine vision (pp. 481–543). New York: Academic Press.
