Linking simulation argument to the AI risk

Futures 72 (2015) 27–31


Milan M. Ćirković a,b,*

a Astronomical Observatory of Belgrade, Volgina 7, 11000 Belgrade, Serbia
b Future of Humanity Institute, Faculty of Philosophy, University of Oxford, Suite 8, Littlegate House, 16/17 St Ebbe's Street, Oxford OX1 1PT, UK

Article history: Available online 3 June 2015

Keywords: Existential risk; Artificial intelligence; Simulations; Future of humanity; Risk analysis

Abstract

Metaphysics, future studies, and artificial intelligence (AI) are usually regarded as rather distant, non-intersecting fields. There are, however, interesting points of contact which might highlight some potentially risky aspects of advanced computing technologies. While the original simulation argument of Nick Bostrom was formulated without reference to the enabling AI technologies and the accompanying existential risks, I argue that there is an important generic link between the two, whose net effect under a range of plausible scenarios is to reduce the likelihood of our living in a simulation. This has several consequences for risk analysis and risk management, the most important being a higher priority for confronting "traditional" existential risks, such as those following from the misuse of biotechnology, nuclear winter, or supervolcanism. In addition, the present argument demonstrates how – rather counterintuitively – seemingly abstract ontological speculations could, in principle, influence practical decisions on risk mitigation policies.

© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

Bostrom (2003) has suggested that the posterior probability of our living in a computer simulation might be larger than naively expected. This conclusion rests on reasonable assumptions about advances in information processing and simulation technology, as well as on important philosophical principles, such as Leibniz's principle of indifference. If we accept – under assumptions such as physicalism regarding minds – that a sufficiently advanced simulation of an observer is another observer in her own right, we will in the fullness of time have observers in two categories: baseline physical, evolved ones and simulated ones. Even our very limited experience of simulating physical objects such as bridges, airplanes, or stars tells us that it is much cheaper, in terms of resources, to simulate an object than to construct it.¹ So it is reasonable to allow for a possible future with cheap simulations, in which simulated observers outnumber the evolved ones by a large margin.

* Corresponding author at: Astronomical Observatory of Belgrade, Volgina 7, 11000 Belgrade, Serbia. Tel.: +381 69 1687200. E-mail address: [email protected]
¹ Even if we are not in a position to construct the relevant objects, such as stars, it is still possible to try to replicate some aspects of the relevant processes – like nuclear reactions in stellar cores – in both computer simulations and laboratory analogs (e.g., thermonuclear fusion reactors). The former is clearly and immensely cheaper.
http://dx.doi.org/10.1016/j.futures.2015.05.003
0016-3287/© 2015 Elsevier Ltd. All rights reserved.


There are three possible conclusions: either (1) the human species is likely to go extinct before reaching a stage of capability for large-scale simulations of intelligent observers; or (2) any advanced civilization (human or posthuman) is extremely unlikely to run a significant number of simulations of its evolutionary history (or variations thereof – hereafter the "ancestor-simulations"); or (3) we are almost certainly living in a computer simulation. Obviously, accepting (3) would mean massive changes in our metaphysical outlook, although it might not, at first glance, present us with any new practical challenges.

However, the reasoning employed by Bostrom in reaching the trilemma does not take into account the possibly risky consequences of the very existence of the technologies necessary for running ancestor-simulations. The explosive growth of our computing power, expressed through Moore's Law and similar generalizations (Kurzweil, 2005), as well as our capacity for simulating more and more complex systems, are facts of everyday life, and it is not easy to perceive them as large, probably even existential, risk factors. However, there are multiple indications that, as far as increases in computing power and complexity go, we are dealing with threshold phenomena, in which reaching a range of critical values might result in large, possibly catastrophic shifts in the outcome. This is the major concern underlying contemporary discussions of the risk associated with artificial intelligence (henceforth AI risk). The enormously increased computing capacities of future AI systems are at the core of several high-risk scenarios, which involve both the intrinsic unpredictability of the behavior of such systems and their simulating powers, far in excess of our present-day ones. Usually, these two aspects are dealt with separately, which might not be entirely justified.

The present note deals with the risk aspect of the simulation argument, while showing how the central argument about enabling technologies could be further generalized. Once it is accepted that the enabling technologies carry a load of risk quite independently of the issue of simulations and observer-counting, there is a feedback effect on the distribution of probabilities between the three possible outcomes of Bostrom's argument. This, in turn, brings about a rearrangement of our priorities in dealing with the "traditional" existential risks vs. the risks following from AI and the possibility of our living in a simulation. The present argument deals with the future of humanity, but it could be generalized to any set of technological civilizations in the universe at any given epoch.

2. A scenario

Consider the following scenario: the increase in computing power leads to viable whole-brain emulation and the running of human uploads. As far as the complexity of both hardware and software goes, this is an intermediate stage between the best present-day AI systems and envisioned superintelligent AI systems (denoted in what follows as AI++, following Chalmers, 2010²). AI++ systems are clearly a source of existential risk, for with their great power comes a lack of predictability that follows from their superior cognition (Müller, 2014 and references therein). Therefore, efforts have been made to enable the design of safe or "friendly" AI++ (e.g., Yudkowsky, 2008). The main difficulty stems from the fact that the conventional road to AI++ systems goes through self-improvement of lower-level AI systems, notably those equivalent to present-day human intelligence, and possibly even much lower.
This iterative procedure might occur in a self-accelerating mode and end up with AI++ "in a flash", i.e., before researchers, risk analysts, and policy-makers are able to ascertain the situation and gauge the relevant risks.

In order to highlight the complexity of the situation, let us first compare two extreme cases: (i) a world in which all human-level AI, as well as AI++, is designed to be completely safe and sound. In such a world, there would be a huge amount of computing power available to everyone, including individual actors, and running detailed simulations of individual humans, as well as large-scale ancestor-simulations, would be cheap and easy. In this world, it is hard to avoid the conclusion that we are indeed living in a simulation, since the number of simulated observers would vastly dominate the total tally of all observers. In contrast, we might wish to consider (ii) a world in which AI++ emerges rapidly, is extremely dangerous, and the probability of (post)humanity surviving its emergence is zero or sufficiently close to zero. In such a case, there will be only evolved observers (up to the moment of AI++ emergence) plus those simulated observers created prior to the moment of AI++ emergence. In order to estimate the probability of our living in a simulation, we need to know the ratio between the two, or at least to gauge whether the interval between the advent of the technology of ancestor-simulations and the advent of AI++ is short or long. If that interval is very short, as suggested by the rapid emergence of AI++, the measure of simulated observers will be small and, consequently, the probability of our living in a simulation will tend to zero. The realistic case lies somewhere between these two extremes. But the very fact that the magnitude of AI++ risk is related to the feasibility and number of ancestor-simulations should impose some constraints on the original simulation argument.
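To make the comparison of worlds (i) and (ii) more concrete, the following toy calculation is a minimal sketch of the observer-counting involved; every parameter value is a purely illustrative assumption and is not taken from the text.

```python
# Toy observer-counting model for the two extreme worlds discussed above.
# Every number here is a purely illustrative assumption, not an estimate.

def fraction_simulated(n_evolved, sims_per_year, observers_per_sim, window_years):
    """Fraction of all observers that are simulated, given a window (in years)
    between the advent of ancestor-simulation technology and AI++ emergence."""
    n_simulated = sims_per_year * observers_per_sim * window_years
    return n_simulated / (n_simulated + n_evolved)

N_EVOLVED = 1e11          # order of magnitude of evolved humans who have ever lived
SIMS_PER_YEAR = 1         # hypothetical rate at which ancestor-simulations are run
OBSERVERS_PER_SIM = 1e10  # hypothetical human-level observers per simulation

for window in (0, 1, 10, 100, 10_000):  # years between simulation capability and AI++
    p_sim = fraction_simulated(N_EVOLVED, SIMS_PER_YEAR, OBSERVERS_PER_SIM, window)
    print(f"window = {window:>6} yr  ->  P(simulated) ~ {p_sim:.3f}")

# Under the indifference principle, P(simulated) tracks the fraction of simulated
# observers: a vanishing window (world (ii)) drives it toward 0, while a long,
# safe window (world (i)) drives it toward 1.
```

The point of the sketch is only the qualitative dependence: the shorter the window between ancestor-simulation capability and dangerous AI++, the smaller the measure of simulated observers.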

² For present purposes, AI++ is equivalent to what Bostrom (2014) dubs superintelligence.


3. The argument

Consider the following set of premises:

1. Running ancestor-simulations will require computing resources of some minimal complexity C_as, to be conceived and executed on characteristic timescales t_as.
2. Such computing resources will exceed those required to successfully run and optimize individual human uploads (having complexity C_up ≪ C_as).
3. Optimization of individual human uploads is one road leading to superintelligent AI (AI++).
4. AI++ will present a high level of existential risk on characteristic timescales t_AI++.
5. t_AI++ ≪ t_as.

Hence,

6. Running ancestor-simulations is contingent on successfully managing AI risks.

(Here, I neglect for the moment those scenarios in which a malevolent AI++ runs its own ancestor-simulations, presumably after enslaving or exterminating humanity; while it would make no sense to assume that it would not run some simulations, they are quite unlikely to be ancestor-simulations as suggested by Bostrom.³ However, I shall reconsider this stance at the very end of this section.)

In particular, if management of AI risk effectively requires the formation of a singleton (Bostrom, 2006), at least in this particular respect, then it would entail:

7. The ancestor-simulations are mostly run by singletons, if at all.

It is conceivable that there are extraneous reasons against running ancestor-simulations. Some of these reasons might be economic (too high resource costs) or ethical (suffering of sentient beings inside the simulation). Note that all of these could be translated into increases in t_as (complete prohibition being equivalent to an infinite characteristic timescale). Expensive projects require, on average, more time for completion than cheap ones; ethically controversial projects require, on average, more time for completion than those which are obviously acceptable from the moral point of view. Therefore, it might be the case that

A. There are valid reasons for prohibiting the running of ancestor-simulations.
B. Singletons will be uniquely capable of enforcing any such prohibition.

Which, together with 7 and the original simulation argument of Bostrom (2003), gives the conclusion that

C. The a posteriori probability of our living in a simulation is decreased.

(The a posteriori qualification pertains to the current argument, in which we take the AI++ risk and the possibility of singletons into account, in contrast to the a priori simulation argument, which does not consider those factors.)

A related and potentially interesting variation on this theme arises when we consider being located in a malevolent AI++ simulation as an existential risk in itself. While this type of scenario may not look intuitive, our ignorance about AI motivations and drives should justify such considerations as well. For example, human history thus far is so permeated by the suffering of sentient beings that one might find it an appealing subject for simulation from the point of view of a sadistic AI++ system (insofar as one can intelligibly apply such attributes to a superintelligence). The argument given above then justifies a weaker, but still interesting, conclusion: as long as running ancestor-simulations requires successful management of AI risk, logically and chronologically prior to the emergence of AI++, the a posteriori probability of our living in a malevolent AI++ simulation is decreased.⁴ This is somewhat weaker than the conclusion in C, but it does not depend on the validity of A and B, and it still gives some grounds for optimism.
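Schematically, the shift claimed in C can be written in the notation of Bostrom (2003), where f_P is the fraction of civilizations that reach a simulation-capable stage, N̄ is the average number of ancestor-simulations run by such a civilization, and H̄ is the average number of evolved observers in its history. The "effective" factors below are my own illustrative shorthand for the present argument, not quantities discussed in the text.

```latex
% Bostrom's (2003) indifference-based estimate of the fraction of simulated observers:
\[
  f_{\mathrm{sim}} \;=\; \frac{f_P \bar{N} \bar{H}}{f_P \bar{N} \bar{H} + \bar{H}}
                   \;=\; \frac{f_P \bar{N}}{f_P \bar{N} + 1}.
\]
% Illustrative shorthand for the present argument: replace \bar{N} by an effective value
%   \bar{N}_{\mathrm{eff}} = p_{\mathrm{AI}} \, p_{\mathrm{run}} \, \bar{N},
% where p_{\mathrm{AI}} is the probability that AI++ risk is successfully managed before
% ancestor-simulations become feasible (premises 4-6), and p_{\mathrm{run}} is the
% probability that the resulting singleton chooses to run, rather than prohibit, such
% simulations (7, A, B). Since p_{\mathrm{AI}} p_{\mathrm{run}} \le 1,
\[
  f_{\mathrm{sim}}^{\mathrm{post}}
  \;=\; \frac{p_{\mathrm{AI}}\, p_{\mathrm{run}}\, f_P \bar{N}}
             {p_{\mathrm{AI}}\, p_{\mathrm{run}}\, f_P \bar{N} + 1}
  \;\le\; f_{\mathrm{sim}},
\]
% with equality only if AI risk and singleton prohibition pose no obstacle at all.
```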
4. Discussion

Premises 1, 2, and 4 above do not seem questionable according to our best present insights. Possible weaknesses of the argument leading to 6 may be located in premises 3 or 5.

While it might be that the technologies of uploading and of optimizing uploaded minds are in fact irrelevant to the feasibility and development of AI++, this does not seem likely, for several interrelated reasons.

³ I also neglect "slow fuse" AI++ risks, in which the complexity of AI++ is reached at some point but the temporal delay until the adverse consequences are manifested is long enough to run many ancestor-simulations.
⁴ I am deeply grateful to an anonymous referee for bringing my attention to this important point.


Those ancestor-simulations containing observers in our reference class – those to which the simulation argument applies – will contain several times 10⁹ human-level intelligences (many more if they simulate large segments of past and future), plus an arbitrary but huge number of other high-complexity systems, notably cognitively advanced animals. The same insight into the physical structure of human brains that allows an ancestor-simulation will allow for running uploads, and the same methods and resources required for optimizing ancestor-simulations can reasonably be used in optimizing uploads (Sandberg & Bostrom, 2008). If uploads are a viable road to conventional AI, and conventional AI leads via subsequent optimization to AI++, then premise 3 holds; this would be the case even if uploads are not the most efficient or desirable road to that goal. It would fail only if the road to AI++ contains inherently unpredictable elements: for instance, if quantum computation is necessary for AI++ while not being necessary for human-level AI. While we cannot confidently judge such scenarios at present, most AI researchers would dismiss this possibility as unrealistic. The impending advances in elaborating the pathways to AI will hopefully resolve this issue soon (Bostrom, 2014; Rayhawk, Salamon, McCabe, Anissimov, & Nelson, 2009).

What about the timescales in assumption 5? While the relevant timescales might indeed vary wildly from one ancestor-simulation to another, it seems that each of them would include multiple human-level intelligences simulated over long stretches of time. While it is impossible to know in advance the timescales for their development, setting up, running, analysis of results, etc., the very existence of all these (and other) stages suggests that the whole enterprise will take a considerable time interval in the director's reference frame. In contrast, the main factor in the AI risk is precisely the potentially extremely rapid AI → AI++ transition via a self-improving loop, which can lead to superintelligence on timescales short by everyday human standards. Even if it cannot be proven in advance, an assumption such as 5 above should be accepted, pending further insight, as a precaution. We might, in principle, conceive of situations in which the timescale for AI++ is approximately equal to or longer than the timescale for ancestor-simulations. However, those seem unlikely and contrived as long as we maintain that AI++ corresponds to a well-defined and circumscribed region of the relevant design space. Ancestor-simulations, on the other hand, are not so constrained: there could be very many of them, and very different ones, and those containing most observers will, for basic computational reasons, tend to consume most resources and last longest (again in the director's reference frame).

The second line of argument leading to C is admittedly more speculative and imagination-limited at present. Both premises A and B could be criticized on the grounds of presupposing hardly knowable motivations and values of the directors of ancestor-simulations. One can rather easily conceive of scenarios avoiding the specifics of both premises. Notably, suffering in the simulated world could indeed be compensated for in a "virtual afterlife" of relatively low complexity and computing cost (eerily similar to the doctrines of most historical religions, thus avoiding at least some ethical obstacles to running ancestor-simulations). Also, the directors could be immensely richer than we currently imagine, so any computing cost might be negligible for them, negating the hypothetical economic constraint on running ancestor-simulations.
In the case, however, that some constraints remain, it is clear that their enforcement will be very hard in the face of the ongoing empowerment of individuals and other small actors in society. In fact, this problem is relevant to other global risks as well, such as the threat of increasingly accessible weapons of mass destruction or climate-change quick fixes via cheap and risky geo-engineering procedures. If the only or predominant solution for enforcement in such cases is the formation of a singleton – or something similar to a singleton – then it should a fortiori be the case with harder-to-manage threats such as the risk of hostile AI++.

While it would certainly be desirable to obtain a more precise estimate of the a posteriori decrease in the probability that we are living in a simulation, as concluded in C, this does not seem possible unless some further insights are provided. In particular, the timescale for AI++ development and the efficiency of singletons in managing AI risks are the two key parameters which control the magnitude of the probability shift. At present, both are quite poorly constrained, and much further work is necessary in this respect. In the extreme pessimistic case of quick AI++ emergence and only marginal efficiency of singletons in risk management, the probability shift would be negligible and the argument given above would make no practical sense. In other parts of the parameter space, however, the shift could be large and add strong support to the particular ontological view of observed physical reality as the baseline case.

Insofar as we regard being in a simulation as an existential risk per se (e.g., Bostrom, 2002), the present argument could influence decision-making on the mitigation of other risks. It is reasonable to conclude that if one finds – for whatever exact reasons – that the probability of our living in a simulation is high, then the effort to be spent on reducing other existential and global catastrophic risks should be lower than if there were no chance of our being in a simulation. Complementarily, if we have reasons to rationally believe that the probability of our living in a simulation is low, we should assign more resources to reducing the chances of, say, a catastrophic asteroid impact or a global nuclear war. In this manner, the present argument contends that, under a plausible range of assumptions, we actually have more reason to invest resources in mitigating the latter, well-known large risks. This agrees well with our intuitions regarding mitigation priorities.
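As a crude illustration of this dependence – a sketch only, with arbitrary numbers and under the simplifying assumption that mitigating a "traditional" risk pays off only if we are not living in a simulation – one might weight the expected value of a mitigation effort by the credence assigned to baseline reality:

```python
# Crude, purely illustrative sketch of the dependence described above. It assumes,
# simplistically, that mitigating a "traditional" risk (asteroid impact, nuclear war)
# only pays off if we are not living in a simulation. All numbers are arbitrary.

def mitigation_value(p_sim, risk_prob, damage, risk_reduction):
    """Expected value of a mitigation effort, weighted by the credence that we
    inhabit baseline (non-simulated) physical reality."""
    return (1.0 - p_sim) * risk_prob * damage * risk_reduction

RISK_PROB = 1e-4      # hypothetical probability of the catastrophe (per century)
DAMAGE = 1.0          # loss if it occurs (arbitrary units)
RISK_REDUCTION = 0.5  # fraction of the risk removed by the mitigation effort

for p_sim in (0.9, 0.5, 0.1, 0.01):
    value = mitigation_value(p_sim, RISK_PROB, DAMAGE, RISK_REDUCTION)
    print(f"P(sim) = {p_sim:<4} ->  expected value of mitigation ~ {value:.2e}")
```

The lower the probability of being in a simulation, the larger the expected payoff of conventional mitigation, which is the direction of the conclusion drawn above.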


5. Conclusions

The original simulation argument of Bostrom assumes the availability, safety, and societal acceptability of ancestor-simulation-relevant technology. The argument maintains its separation from the cluster of safety problems surrounding superintelligent AI++. This separation is, I have argued, unjustified. Instead, the same technologies that enable the running of ancestor-simulations are key enablers of AI++ as well – and, therefore, key factors in the existential risks generated by AI++. By the same token, any measures undertaken to regulate the existential risk stemming from AI++ are likely to influence the Bayesian calculus expressing the likelihood of our living in a simulation. As I have shown, if those measures are to be effective, the net result is a decrease in the likelihood of our living in a simulation and support for "realism" in this respect. Of course, the degree of such support depends on the details of AI++ risk mitigation, which is a large and mostly unexplored issue thus far – but one generating lively research activity in recent years (Bostrom, 2014; Eden, Moor, Søraker, & Steinhart, 2012; Müller, 2014).

Note that the current argument does not suggest any specific axiological views. In particular, I do not wish to argue that it is "better" or "more palatable" that the probability of our living in a simulation is decreased if we take the AI/AI++ risk more seriously. Such moral conclusions are still, unfortunately, based to a large extent on subjective intuitions and cultural and psychological preferences. It would also be quite misleading to read the present argument as devaluing the AI risk through association with a radical metaphysical hypothesis; quite to the contrary, I regard the AI risk as a given threat, not yet a "real and present danger" in the legal sense, but very real and forthcoming nonetheless. The purpose of the argument given above could be far better construed as flowing in the opposite direction: to show that even traditional ontological issues deserve more respectability and serious discussion in light of their connection to the rational assessment of a very real risk. It could be regarded as a particular instance of the general lesson about the renewed relevance of philosophy in future studies.

The same argument can easily be generalized to a statistical conclusion about the properties of the whole set of civilizations successfully overcoming the AI++ risk. Therefore, unless malevolent AI++ systems which have destroyed their biological predecessors engage in massive simulations containing many individual observers, we may wish to increase our posterior confidence that evolved observers outnumber the simulated ones in our universe. Obviously, this has a wide set of consequences for astrobiology and SETI studies, especially regarding the issue of whether most relevant SETI targets are biological or post-biological (e.g., Ćirković, 2012; Dick, 2003).

Conversely, if we had some independent reasons to believe we are not living in a simulation, the line of reasoning presented here would constitute a weak probabilistic argument that a singleton is the most likely form of organization of any intelligent community of observers. While it is not obvious what those independent reasons might be, unless a radical move like the rejection of physicalism about minds is considered, the possibility should be kept in mind. At the very least, it might be useful to consider it in the more conservative context of public outreach or policy-making debates, where radical metaphysical ideas like the simulation hypothesis might encounter gut-level resistance. An additional practical benefit which might follow from these speculative philosophical considerations is taking more seriously into account the trajectories leading to a possible future human singleton, as a regulating factor in the overall risk landscape.
While we may yet conclude on independent grounds that a singleton is an attractor in the space of evolutionary trajectories of advanced technological civilizations, it is worthwhile to bear in mind that these trajectories are, by definition, outcomes of evolutionary processes which are morally indifferent and which include extinction as an entirely naturalistic and often expected outcome of the underlying regularities of physics and biology. It is at the level of applied science and philosophy – including future studies as the paramount applied multidisciplinary field – that considerations of intentionality, morality, or desirability come into play, even if some level of anthropocentrism is inescapable in the process. Insofar as we can speak about the formation of a singleton as a form of risk mitigation (and not, obviously, just in the narrow context of the AI risk, but even more in the wider context of threats from climate change, biowarfare/bioterrorism, etc.), there is no doubt that more work on the elaboration and quantitative modeling of the timing arguments would be very welcome.

Acknowledgments

It is a pleasure to thank the guest editor, Seth Baum, for his kind help, encouragement, and diligent work in improving previous versions of this manuscript. Two anonymous referees are acknowledged for important suggestions and criticisms. I wish to thank Jelena Dimitrijević, Anders Sandberg, Branislav Vukotić, Nick Bostrom, Slobodan Popović, Slobodan Perović, Ivana Kojadinović, Karl Schroeder, Ana Eraković, Aleksandar Obradović, Goran Milovanović, Ana Vlajković, George Dvorsky, and the late Robert Bradbury for many pleasant and useful discussions on the topics related to the subject matter of this study. This research has been supported by the Ministry of Education and Science of the Republic of Serbia through the project ON176021.

References

Bostrom, N. (2002). Existential risks: Analyzing human extinction scenarios and related hazards. Journal of Evolution and Technology, 9.
Bostrom, N. (2003). Are you living in a computer simulation? Philosophical Quarterly, 53, 243–255.
Bostrom, N. (2006). What is a singleton? Linguistic and Philosophical Investigations, 5, 48–54.
Bostrom, N. (2014). Superintelligence: Paths, dangers, strategies. Oxford: Oxford University Press.
Chalmers, D. J. (2010). The singularity: A philosophical analysis. Journal of Consciousness Studies, 17, 7–65.
Ćirković, M. M. (2012). The astrobiological landscape: Philosophical foundations of the study of cosmic life. Cambridge: Cambridge University Press.
Dick, S. J. (2003). Cultural evolution, the postbiological universe and SETI. International Journal of Astrobiology, 2, 65–74.
Eden, A., Moor, J., Søraker, J., & Steinhart, E. (Eds.). (2012). Singularity hypotheses: A scientific and philosophical assessment. Berlin: Springer-Verlag.
Kurzweil, R. (2005). The singularity is near: When humans transcend biology. New York: Viking Penguin.
Müller, V. C. (2014). Editorial: Risks of general artificial intelligence. Journal of Experimental and Theoretical Artificial Intelligence, 26, 297–301.
Rayhawk, S., Salamon, A., McCabe, T., Anissimov, M., & Nelson, R. (2009). Changing the frame of AI futurism: From story-telling to heavy-tailed, high-dimensional probability distributions. Proceedings of the European Conference on Computing and Philosophy.
Sandberg, A., & Bostrom, N. (2008). Whole brain emulation: A roadmap. Technical report #2008-3. Future of Humanity Institute, Oxford University.
Yudkowsky, E. (2008). Artificial intelligence as a positive and negative factor in global risk. In N. Bostrom & M. M. Ćirković (Eds.), Global catastrophic risks (pp. 308–345). Oxford: Oxford University Press.
