Computation and Logic in the Real World

S Barry Cooper Thomas F. Kent Benedikt Löwe Andrea Sorbi (Eds.)

Computation and Logic in the Real World Third Conference on Computability in Europe, CiE 2007 Siena, Italy, June 18–23, 2007 Local Proceedings

Preface

CiE 2007: Computation and Logic in the Real World Siena, Italy, June 18–23, 2007

Computability in Europe (CiE) is an informal network of European scientists working on computability theory, including its foundations, technical development, and applications. Among the aims of the network is to advance our theoretical understanding of what can and cannot be computed, by any means of computation. Its scientific vision is broad: computations may be performed with discrete or continuous data by all kinds of algorithms, programs, and machines. Computations may be made by experimenting with any sort of physical system obeying the laws of a physical theory such as Newtonian mechanics, quantum theory or relativity. Computations may be very general, depending upon the foundations of set theory, or very specific, using the combinatorics of finite structures. CiE also works on subjects intimately related to computation, especially theories of data and information, and methods for formal reasoning about computations. The sources of new ideas and methods include practical developments in areas such as neural networks, quantum computation, natural computation, molecular computation, and computational learning. Applications are everywhere, especially in algebra, analysis and geometry, or data types and programming. Within CiE there is general recognition of the underlying relevance of computability to physics and a broad range of other sciences, providing as it does a basic analysis of the causal structure of dynamical systems.

This volume, Computation and Logic in the Real World, is the local proceedings of the third in a series of CiE conferences, held at the Dipartimento di Scienze Matematiche e Informatiche “Roberto Magari”, University of Siena, June 18–23, 2007. The first two meetings of CiE were at the University of Amsterdam, in 2005, and at the University of Wales Swansea, in 2006.
Their proceedings, edited in 2005 by S Barry Cooper, Benedikt Löwe and Leen Torenvliet, and in 2006 by Arnold Beckmann, Ulrich Berger, Benedikt Löwe and John V Tucker, were published as Springer Lecture Notes in Computer Science, Volumes 3526 and 3988, respectively. As the editors noted in last year’s proceedings, CiE and its conferences have changed our perceptions of computability and its interface with other areas of knowledge. The large number of mathematicians and computer scientists attending those conferences had their view of computability theory enlarged and transformed: they discovered that its foundations were deeper and more mysterious, its technical development more vigorous, its applications wider and more challenging than they had known. The Siena meeting promises to extend and enrich that process.

The annual CiE conference, based on the Computability in Europe network, has become a major event, and is the largest international meeting focused on computability-theoretic issues. The series is coordinated by the CiE Conference Series Steering Committee:


Paola Bonizzoni (Milan)
Barry Cooper (Leeds)
Benedikt Löwe (Amsterdam, Chair)
Elvira Mayordomo (Zaragoza)
Dag Normann (Oslo)
Andrea Sorbi (Siena)
Peter van Emde Boas (Amsterdam)

We will reconvene in 2008 in Athens, 2009 in Heidelberg and 2010 in Lisbon.

Structure and Programme of the Conference

The conference was based on invited tutorials and lectures, and a set of special sessions on a range of subjects; there were also many contributed papers and informal presentations. This volume, together with the accompanying LNCS proceedings, contains all the accepted papers from those submitted and invited, and the abstracts of the informal presentations. There will be a number of post-proceedings publications, including special issues of Theoretical Computer Science, Theory of Computing Systems, Annals of Pure and Applied Logic, and the Journal of Logic and Computation.

Tutorials

Pieter Adriaans (Amsterdam), Learning as Data Compression
Yaakov Benenson (Cambridge, Massachusetts), Biological Computing

Invited Plenary Talks

Anne Condon (Vancouver), Computational Challenges in Prediction and Design of Nucleic Acid Structure
Stephen Cook (Toronto), Low Level Reverse Mathematics
Yuri Ershov (Novosibirsk), HF-Computability
Sophie Laplante (Paris), Using Kolmogorov Complexity to Define Individual Security of Cryptographic Systems
Wolfgang Maass (Graz), Liquid Computing
Anil Nerode (Cornell), Logic and Control
Piergiorgio Odifreddi (Turin), Conference Introductory Lecture
Roger Penrose (Oxford), A talk on Aspects of Physics and Mathematics
Michael Rathjen (Leeds), Theories and Ordinals in Proof Theory
Dana Scott (Pittsburgh), Two Categories for Computability (lecture sponsored by the European Association for Computer Science Logic)
Robert I. Soare (Chicago), Computability and Incomputability
Philip Welch (Bristol), Turing Unbound: Transfinite Computation

Special Sessions

Doing without Turing Machines: Constructivism and Formal Topology, organised by Giovanni Sambin and Dieter Spreen
Giovanni Sambin (Padova), Doing without Turing Machines: Constructivism and Formal Topology
Andrej Bauer (Ljubljana), RZ: a Tool for Bringing Constructive and Computable Mathematics Closer to Programming Practice
Douglas Bridges (Canterbury, NZ), Apartness on Lattices


Thierry Coquand (Göteborg), A Constructive Version of Riesz Representation Theorem
Maria Emilia Maietti (Padova), Constructive Foundation for Mathematics as a Two Level Theory: An Example

Approaches to Computational Learning, organised by Marco Gori and Franco Montagna
John Case (Newark, Delaware), Resource Restricted Computability Theoretic Learning: Illustrative Topics and Problems
Klaus Meer (Odense), Some Aspects of a Complexity Theory for Continuous Time Systems
Frank Stephan (Singapore), Input-Dependence in Function-Learning
Osamu Watanabe (Tokyo), Finding Most Likely Solutions

Real Computation, organised by Vasco Brattka and Pietro Di Gianantonio
Pieter Collins (Amsterdam), Effective Computation for Nonlinear Systems
Abbas Edalat (London), A Continuous Derivative for Real-Valued Functions
Hajime Ishihara (Tokyo), Unique Existence and Computability in Constructive Reverse Mathematics
Robert Rettinger (Hagen), Computable Riemann Surfaces
Martin Ziegler (Paderborn), Real Hypercomputation

Computability and Mathematical Structure, organised by Serikzhan Badaev and Marat Arslanov
Vasco Brattka (Cape Town), Computable Compactness
Barbara F. Csima (Waterloo), Properties of the Settling-Time Reducibility Ordering
Sergey S. Goncharov (Novosibirsk), Computable Numberings Relative to Hierarchies
Jiří Wiedermann (Prague), Complexity Issues in Amorphous Computing
Chi Tat Chong (Singapore), Maximal Antichains in the Turing Degrees

Complexity of Algorithms and Proofs, organised by Elvira Mayordomo and Jan Johannsen
Eric Allender (Piscataway, New Jersey), Reachability Problems: An Update
Jörg Flum (Freiburg), Parameterized Complexity and Logic
Michal Koucký (Prague), Circuit Complexity of Regular Languages
Neil Thapen (Prague), The Polynomial and Linear Hierarchies in Weak Theories of Bounded Arithmetic
Heribert Vollmer (Hannover), Computational Complexity of Constraint Satisfaction

Logic and New Paradigms of Computability, organised by Paola Bonizzoni and Olivier Bournez
Felix Costa (Lisbon), The New Promise of Analog Computation
Natasha Jonoska (Tampa, Florida), Computing by Self-assembly
Giancarlo Mauri (Milan), Membrane Systems and Their Applications to Systems Biology
Grzegorz Rozenberg (Leiden), Biochemical Reactions as Computations
Damien Woods (Cork), with Turlough Neary, The Complexity of Small Universal Turing Machines

Computational Foundations of Physics and Biology, organised by Guglielmo Tamburrini and Christopher Timpson
James Ladyman (Bristol), Physics and Computation: The Status of Landauer’s Principle
Itamar Pitowsky (Jerusalem), From Logic to Physics: How the Meaning of Computation Changed Over Time
Grzegorz Rozenberg (Leiden), Natural Computing: A Natural and Timely Trend for Natural Sciences and Science of Computation
Christopher Timpson (Leeds), What’s the Lesson of Quantum Computing?


Giuseppe Trautteur (Naples), Does the Cell Compute?

Women in Computability Workshop, organised by Paola Bonizzoni and Elvira Mayordomo

A new initiative at CiE 2007 is the addition of the Women in Computability workshop to the programme. Women in computer science and mathematics face particular challenges in pursuing and maintaining academic and scientific careers. The Women in Computability workshop brings together women in computing and mathematical research to present and exchange their academic and scientific experiences with young researchers. The speakers were:

Anne Condon (British Columbia)
Natasha Jonoska (Tampa, Florida)
Carmen Leccardi (Milan)
Andrea Cerroni (Milan)

Organisation and Acknowledgements

The conference CiE 2007 was organised by the logicians at Siena: Andrea Sorbi, Thomas Kent, Franco Montagna, Tommaso Flaminio, Luca Spada, Andrew Lewis, Maria Libera Affatato and Guido Gherardi; with the help of the Leeds computability theorists S Barry Cooper, Charles Harris and George Barmpalias, and of Benedikt Löwe (Amsterdam). The CiE Conference Series Steering Committee also played an essential role. The Programme Committee was chaired by Andrea Sorbi and Barry Cooper and consisted of:

Manindra Agrawal (Kanpur)
Marat M. Arslanov (Kazan)
Giorgio Ausiello (Rome)
Andrej Bauer (Ljubljana)
Arnold Beckmann (Swansea)
Ulrich Berger (Swansea)
Paola Bonizzoni (Milan)
Andrea Cantini (Florence)
S. Barry Cooper (Leeds, Co-Chair)
Laura Crosilla (Leeds)
Josep Diaz (Barcelona)
Costas Dimitracopoulos (Athens)
Fernando Ferreira (Lisbon)
Sergei S. Goncharov (Novosibirsk)
Peter Grünwald (Amsterdam)
David Harel (Jerusalem)
Andrew Hodges (Oxford)

Julia Kempe (Paris)
Giuseppe Longo (Paris)
Benedikt Löwe (Amsterdam)
Johann A. Makowsky (Haifa)
Elvira Mayordomo Cámara (Zaragoza)
Wolfgang Merkle (Heidelberg)
Franco Montagna (Siena)
Dag Normann (Oslo)
Thanases C. Pheidas (Iraklio, Crete)
Grzegorz Rozenberg (Leiden)
Giovanni Sambin (Padova)
Helmut Schwichtenberg (Munich)
Wilfried Sieg (Pittsburgh)
Andrea Sorbi (Siena, Co-Chair)
Ivan N. Soskov (Sofia)
Peter van Emde Boas (Amsterdam)

We are delighted to acknowledge and thank the following for their essential financial support: the Department of Mathematics and Computer Science “Roberto Magari” at Siena; the Fondazione del Monte dei Paschi di Siena; the Istituto Nazionale di Alta Matematica: Gruppo Nazionale per le Strutture Algebriche, Geometriche e le loro Applicazioni (INDAM-GNSAGA); the University of Siena; the Associazione Italiana di Logica e sue Applicazioni (AILA); the Association for Symbolic Logic (ASL); the European Association for Computer Science Logic (EACSL). We would also like to thank our sponsors: the European Association for Theoretical Computer Science (EATCS); the Association for Logic, Language and Information (FoLLI);


the Committee on the Status of Women in Computing Research (CRA-W). We are pleased to thank our colleagues on the Organising Committee for their many contributions and our research students for practical help at the conference. Special thanks are due to Thomas Kent, Tommaso Flaminio, Luca Spada, Andy Lewis, and Franco Montagna for their invaluable collaboration, and to the Congress Service of the University of Siena for handling the administrative aspects of the conference. The high scientific quality of the conference was made possible by the conscientious work of the Programme Committee, the special session organisers and the referees. We are grateful to all members of the Programme Committee for their efficient evaluations and extensive debates, which established the final programme. We also thank the following referees:

Klaus Aehlig, Pilar Albert, Klaus Ambos-Spies, Andrea Asperti, Luís Antunes, Albert Atserias, George Barmpalias, Freiric Barral, Sebastian Bauer, Almut Beige, Josef Berger, Luca Bernardinello, Daniela Besozzi, Laurent Bienvenu, Christian Blum, Markus Bläser, Thomas Bolander, Roberto Bonato, Lars Borner, Abraham P. Bos, Malte Braack, Vasco Brattka, Andries E. Brouwer, Joshua Buresh-Oppenheim, Nadia Busi, Cristian S. Calude, Riccardo Camerlo, John Case, Orestes Cerdeira, Yijia Chen, Luca Chiarabini, Jose Félix Costa, Ronald Cramer, Paola D’Aquino, Ugo Dal Lago, Victor Dalmau, Tijmen Daniëls, Anuj Dawar, Barnaby Dawson, Ronald de Wolf, José del Campo

Karim Djemame, David Doty, Rod Downey, Martin Escardo, Antonio Fernandes, Claudio Ferretti, Eric Filiol, Eldar Fischer, Daniel García, Parmenides Garcia Cornejo, William Gasarch, Ricard Gavaldà, Giangiacomo Gerla, Eugene Goldberg, Massimiliano Goldwurm, Johan Granström, Phil Grant, Dima Grigoriev, Barbara Hammer, Tero Harju, Montserrat Hermo, Peter Hertling, Thomas Hildebrandt, John Hitchcock, Pascal Hitzler, Steffen Hölldobler, Mathieu Hoyrup, Simon Huber, Jim Hurford, Carl Jockusch, Jan Johannsen, Michael Kaminski, Vladimir Kanovei, Basil Karadais, Vassilios Karakostas, Iztok Kavkler, Thomas Kent, Hans Kleine Büning, Sven Kosub, Bogomil Kovachev, Evangelos Kranakis

S. N. Krishna, Oleg Kudinov, Petr Kurka, Eyal Kushilevitz, Akhlesh Lakhtakia, Jérôme Lang, Hans Leiss, Stephane Lengrand, Alberto Leporati, Andy Lewis, Maria Lopez-Valdes, Michele Loreti, Alejandro Maass, Vincenzo Manca, Edwin Mares, Luciano Margara, Maurice Margenstern, Simone Martini, Ralph Matthes, Andrea Maurino, Klaus Meer, Nenad Mihailovic, Russell Miller, Pierluigi Minari, Nikola Mitrovic, Tal Mor, Philippe Moser, Mioara Mugu-Schachter, Thomas Müller, Nguyen Hoang Nga, Ray Nickson, Karl-Heinz Niggl, Martin Otto, Jiannis Pachos, Aris Pagourtzis, Dimitrii Palchunov, Francesco Paoli, Dirk Pattinson, George Paun, Andrea Pietracaprina, Sergey Podzorov


Chris Pollett, Pavel Pudlak, Diana Ratiu, Jan Reimann, Paul Ruet, Markus Sauerman, Stefan Schimanski, Wolfgang Schönfeld, Jeremy Seligman, Peter Selinger, Mariya Soskova, Bas Spitters, Frank Stephan, Mario Szegedy

Wouter Teepe, Balder ten Cate, Sebastiaan Terwijn, Christof Teuscher, Neil Thapen, Klaus Thomsen, Christopher Timpson, Michael Tiomkin, Edmondo Trentin, Trifon Trifonov, José Triviño-Rodriguez, Reut Tsarfaty, John Tucker, Sara Uckelman


Christian Urban, Tullio Vardanega, Sergey Verlan, Thomas Vidick, Heribert Vollmer, Rebecca Weber, Philip Welch, Guohua Wu, Reem Yassawi, Martin Ziegler, Jeffery Zucker, Dragisa Zunic

We thank Andrej Voronkov for his EasyChair system, which facilitated the work of the Programme Committee and the editors considerably.

Siena, Leeds and Amsterdam, May 2007

S Barry Cooper Thomas F. Kent Benedikt Löwe Andrea Sorbi

Table of Contents

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

II

Contributed Talks

Trust-moderated Information-Likelihood. An MVL Approach . . . . . . . . . . . . . . . . . . . . . . . Adrien Revault d’Allonnes, Herman Akdag, and Olivier Poirel

1

Very Primitive Recursive Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sandra Alves, Maribel Fernández, Mário Florido, and Ian Mackie

7

Mobile Ambients and Mobile Membranes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bogdan Aman and Gabriel Ciobanu

16

Natural Deduction and Normalisation for Partially Commutative Linear Logic and Lambek Calculus with Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Maxime Amblard and Christian Retoré

28

Computational Depth of Infinite Strings Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Luís Antunes, Armindo Costa, Armando Matos, and Paul Vitányi

36

A New Approach to the Uncounting Problem for Regular Languages . . . . . . . . . . . . . . . . Kostyantyn Archangelsky

45

Paraconsistent Reasoning and Distance Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ofer Arieli

53

Syntactic Approximations to Computational Complexity Classes . . . . . . . . . . . . . . . . . . . . Argimiro Arratia, Carlos E. Ortiz

62

Nondeterminism without Turing Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mathias Barra, Lars Kristiansen, and Paul J. Voda

71

Exact Direct Cover Minimization Algorithm for a Single Output of Boolean Functions . Fatih Basçiftçi

79

Weak König’s Lemma Implies the Fan Theorem for C-bars . . . . . . . . . . . . . . . . . . . . . . . . . Josef Berger

86

Semantics of Sub-Probabilistic Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yixiang Chen and Hengyang Wu

89

Concrete and Abstract Quantum Computational Logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . Maria Luisa Dalla Chiara, Roberto Giuntini, and Roberto Leporini

97

Pseudorandom Number Generation Based on 90/150 Linear Nongroup Cellular Automata . . . Sung-Jin Cho, Un-Sook Choi, Han-Doo Kim, Yoon-Hee Hwang, and Jin-Gyoung Kim

106

Towards a Domain-theoretic Model of Developmental Machines . . . . . . . . . . . . . . . . . . . . . Graçaliz Pereira Dimuro, Antônio Carlos da Rocha Costa

114


Quantum Algorithms for Graph Traversals and Related Problems . . . . . . . . . . . . . . . . . . . Sebastian Dörn

123

Four Ways of Logical Reasoning in Theoretical Physics and their Relationship with both Kinds of Mathematics and with Computer Science . . . . . . . . . . . . . . . . . . . . . . . . . . . Antonino Drago

132

Totally d-c.e. Real Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yun Fan, Decheng Ding, and Xizhong Zheng

143

Why is P Not Equal to NP? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Michael Fellows and Frances Rosamond

151

P = NP for Expansions Derived from Some Oracles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Christine Gaßner

161

Resolution Proofs Hidden in Mathematical and Physical Structures and Complexity . . . Annelies Gerber

170

Hybrid Finite Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Luís Mendes Gomes and José Félix Costa

178

Learnability of Recursively Enumerable Sets of Recursive Real-Valued Functions . . . . . . Eiju Hirowatari, Kouichi Hirata, and Tetsuhiro Miyahara

186

A Logic for Probabilistic XML documents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Robin Hirsch and Evan Tzanis

194

Computational Power of Intramolecular Gene Assembly . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tseren-Onolt Ishdorj, Ion Petre, and Vladimir Rogojin

202

Turing Degrees & Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Iraj Kalantari and Larry Welch

210

The Fabric of Small Turing Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Gregory Lafitte and Christophe Papazian

219

On a Constructive Proof of Completeness of the Implicational Propositional Calculus . . Domenico Lenzi

228

On the Computational Complexity of the Theory of Complete Binary Trees . . . . . . . . . . Zhimin Li, Libo Lo, and Xiang Li

233

A Multiple-Evaluation Genetic Algorithm for Numerical Optimization Problems . . . . . . Chih-Hao Lin and Jiun-De He

239

A Functional Characterisation of the Analytical Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . Bruno Loff

247

How can Natural Brains Help us Compute? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Carlos Lourenço

257

A Purely Arithmetical, yet Empirically Falsifiable, Interpretation of Plotinus’ Theory of Matter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bruno Marchal

263


On a Relationship between Non-Deterministic Communication Complexity and Instance Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Armando B. Matos, Andreia C. Teixeira, and André C. Souto

274

Parallelism in DNA and Membrane Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Benedek Nagy and Remco Loos

283

On the Complexity of Matching Non-injective General Episodes . . . . . . . . . . . . . . . . . . . Elżbieta Nowicka and Marcin Zawada

288

Elementary Complexity into the Hyperfinite II1 Factor . . . . . . . . . . . . . . . . . . . . . . . . . . . . Marco Pedicini and Mario Piazza

297

Self-similar Carpets Associated to the Odd Primes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mihai Prunescu

306

The Intended Model of Arithmetic. An Argument from Tennenbaum’s Theorem . . . . . . Paula Quinon and Konrad Zdanowski

313

Induction vs Collection over the Weak Pigeon-hole Principle with Counting . . . . . . . . . . A. Sirokofskich

318

Logical and Complexity-theoretic Aspects of Models of Computation with Restricted Access to Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Iain A. Stewart

324

Effective Reducibilities on Structures and Degrees of Presentability . . . . . . . . . . . . . . . . . . Alexey Stukachev

332

On Computability in Ershov’s Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zh. T. Talasbayeva

340

Universal Quantum Computation via Yang-Baxterization of the Two-Colour Birman-Wenzl-Murakami Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mario Vélez and Juan Ospina

346

Turning the Liar Paradox into a Metatheorem of Basic Logic . . . . . . . . . . . . . . . . . . . . . . . P. A. Zizzi

354

On η-Representable Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Maxim Zubkov

364

Abstracts of Informal Presentations

Information Content and Computability in the n-C.E. Hierarchy, II . . . . . . . . . . . . . . . . . Bahareh Afshari

367

ω-rules and Learnability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yohji Akama

368

The Deduction Theorem, Optimal Proof Systems, and Complete Disjoint NP-Pairs . . . . Olaf Beyersdorff

369

Some Results and Questions in Effective Measure Theory . . . . . . . . . . . . . . . . . . . . . . . . . . Laurent Bienvenu, Wolfgang Merkle, and Alexander Shen

370


Framework for Comparing Domain Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . J. Blanck

371

Black Box Groups: Guesswork, Computation, Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Alexandre V. Borovik

372

Strong Degree Spectra of Initial Segments of Scattered Linear Orders . . . . . . . . . . . . . . . . John Chisholm, Jennifer Chubb, Valentina S. Harizanov, Denis R. Hirschfeldt, Carl G. Jockusch, Jr., Timothy McNicholl, and Sarah Pingrey

373

Computability of the Evolution of Hybrid Dynamical Systems . . . . . . . . . . . . . . . . . . . . . . Pieter Collins

374

Effective Structure Theorems for Spaces of Compactly Supported Distributions . . . . . . . Fredrik Dahlgren

375

Towards a Logical Characterisation of and Complexity Gap Theorem(s) for Resolution-based Propositional Proof Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stefan Dantchev

376

Generating Number Sequences Through Periodic Structures in Tag Systems . . . . . . . . . . Liesbeth De Mol

377

About Quantum Graph Colouring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ellie D’Hondt

378

Risk-Aware Grid Resource Brokering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Karim Djemame, Iain Gourlay, and James Padgett

379

There Exist Some ω-Powers of any Borel Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Olivier Finkel and Dominique Lecomte

380

Membrane Dynamical Systems: Open Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Giuditta Franco

381

Computable Measures and Dynamical Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stefano Galatolo

382

Complexity of Proofs in Extensions of the Polynomial Calculus . . . . . . . . . . . . . . . . . . . . . Nicola Galesi and Massimo Lauria

383

Automated Abstraction-Refinement of Hybrid Automata for Monotonic CTL Model Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . R. Gentilini, K. Schneider, and B. Mishra

384

Belief Flow Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sujata Ghosh, Benedikt Löwe, Erik Scorelle, and Fernando Velazquez-Quesada

385

Classification Problems for Computable Structures and Relations on Computable Models . . . Sergey S. Goncharov

386

Lambda Types on the Lambda Calculus with Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . Ferruccio Guidi

387

Continuous Stream Processing and a Generalisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Peter Hancock

388


What if Computers could Count to Infinity? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chris Impens and Sam Sanders

389

Doing α-Recursion Theory with Ordinal Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Peter Koepke and Benjamin Seyfferth

390

Acyclicity of Preferences, Nash Equilibria, and Subgame Perfect Equilibria: a Formal and Constructive Equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stéfane le Roux

391

Limit Computable Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mashudu Makananise and Vasco Brattka

392

The Strength of Fraïssé’s Conjecture for Finite Hausdorff Rank . . . . . . . . . . . . . . . . . . . . . Alberto Marcone

393

Weak Lowness Properties and Randomness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Joseph S. Miller

394

Solution to a Problem of Vardi on Ordinal Heights of Well-founded Automatic Relations . . . Mia Minnes and Bakhadyr Khoussainov

395

Almost-Everywhere Domination, Non-cupping and LR-reducibility . . . . . . . . . . . . . . . . . Anthony Morphett

396

Some New Approaches to Characterizing Computable Analysis by Analog Computation . . . Kerry Ojakian

397

On Isomorphism Type of Some Structures in Many-one Degrees . . . . . . . . . . . . . . . . . . . . Sergei Podzorov

398

A Fixed-Parameter Approach to the Convex Recolouring Problem . . . . . . . . . . . . . . . . . . Oriana Ponta

399

Gödel’s Incompleteness Theorem and Man-Machine Non-Equivalence . . . . . . . . . . . . . . . . Zvonimir Šikić

400

Co-total enumeration degrees. Part 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Boris Solon

401

The Jump Operator on the ω-Enumeration Degrees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ivan N. Soskov and Hristo Ganchev

402

A Non-Splitting Theorem in the Enumeration Degrees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mariya Ivanova Soskova

403

Universal Match Gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mingji Xia

404

Trust-moderated Information-Likelihood. A Multi-valued Logics Approach

Adrien Revault d’Allonnes¹,², Herman Akdag¹, and Olivier Poirel²

¹ Laboratoire d’Informatique de Paris 6 - LIP6
{Adrien.Revault-d’Allonnes,Herman.Akdag}@lip6.fr
http://webia.lip6.fr/~allonnes/?lg=en; http://webia.lip6.fr/~akdag

² Office National d’Études et de Recherches Aérospatiales - ONERA
{Adrien.Revault_dAllonnes,Olivier.Poirel}@onera.fr
http://www.onera.fr/english.php

Abstract. This work’s motivation is to evaluate the certainty of a piece of information based on a confirmation criterion, weighted by its source’s credibility. To keep this paper legible, we suppose we have an incoming flow of information, each item of which either confirms a known piece of information, contradicts one, or is as yet unknown. With each piece of information, an external estimation of its source’s credibility is given. Because of the uncertainty surrounding the evolution of an information’s certainty and that of its source’s credibility, we have chosen to represent both in a multi-valued logics formalism. In this way, the constraints we place on credibility evolution are expressed in the same formalism as the evolution itself.

1 Introduction

The application we are working on aims at giving a certainty score to information of different types and from various sources. The premise for calculating an information’s certainty is that if it has been confirmed by other sources, it is probably more likely to be true. However, we would like to refine this idea and moderate it with respect to the estimated trustworthiness of the information’s source. This trustworthiness is an existing piece of information, and we believe that the more you trust someone, the more likely you are to believe what that person tells you. From this basic human tendency, and using the information at hand, we wish to build a model that favours trusted sources yet considers others as well.

Since we will, in effect, be calculating a confirmation score, we will need to compare incoming information. The actual comparison is beyond the scope of this paper; but if one considers that the sources are different and differently rated, that each source may give information of a different type than that already known, and that a given piece of information may confirm or contradict the information of interest only ‘to some extent’, one immediately sees how uncertain the data we wish to evaluate is, to say nothing of the evaluation itself. In addition, evaluating a source’s trustworthiness is itself another potential cause of uncertainty. This explains why we have chosen to express our work in a multi-valued logics framework, using the existing interpretation [2, 3] which states that x is vα A ⇔ “x is A” is τα–true. In our problem this will be interpreted both as ‘information i is likely’ is τα–true and as ‘information i’s source is trustworthy’ is τα–true.

The actual meaning of ‘τα–true’ will be briefly recalled in the next section, which also introduces the notation we have chosen to express our model. In section 3 we introduce the principles of our algorithm (§3.1) and give a more formal description (§3.2). Section 4 discusses what we think our formalism represents and allows in terms of cognitive posture. Finally, in section 5 we conclude this paper and offer some thoughts on future work.

2 Multi-valued formalism

We will be reasoning in a multi-valued logics formalism. This means that we give ourselves a totally ordered scale of truth degrees, LM = {τ0, ..., τM−1}. LM is said to be totally ordered because τα ≤ τβ ⇔ α ≤ β. The scale ranges from τ0, which is considered ‘false’, to τM−1, or ‘true’, with intermediate values τα in between, such as ‘possibly true’ and the like. Multi-valued scales, and hence qualitative degrees, offer a good way of modelling uncertain, ill-defined or poorly appraisable knowledge, much as Zadeh’s linguistic variables did [1]. Other works have added tools to reason on these uncertain values. Among them, Darwiche and Ginsberg, and Seridi and Akdag [4, 5] have built operators to combine intermediate truth values, and therefore to model evolving cognitive processes. Following in Zadeh’s footsteps, Truck, Akdag and others [6, 8, 9] construct symbolic modifiers and other generalisations of useful operators.

To model the evolution of our belief scores, we need operations on truth degrees. We will use those defined by Seridi and Akdag in [5] using Łukasiewicz’s implication, defined in a multi-valued context by:

τα →L τβ = min(τM−1, τM−1−(α−β))

Notation. K, in figure 1, represents the set of all known information. i ∈ K represents information i and any later information confirming it. τ(i) ∈ LM is the current evaluation of information i’s certainty. Terms denoted by the letter κ generally refer to the evaluation of a source’s trustworthiness. In particular, κ+(i) ∈ LM is the current advancement of τ(i)’s progression towards the next level up, while κ−(i) ∈ LM is its progression towards the next level down, if any. Finally, κ_τα^τβ represents the threshold this accumulated credibility must reach for i’s likelihood to move from τα to τβ.
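As a concrete illustration (ours, not the authors’), the implication can be computed directly on the degree indices; a minimal Python sketch, with the function name our own:

```python
M = 5  # a scale L_M = {tau_0, ..., tau_{M-1}}, e.g. L_5 as used later in the paper

def lukasiewicz_impl(alpha, beta, m=M):
    """Index form of tau_alpha ->_L tau_beta = min(tau_{m-1}, tau_{m-1-(alpha-beta)})."""
    return min(m - 1, m - 1 - (alpha - beta))

# The implication is fully 'true' (tau_4) whenever alpha <= beta, and
# otherwise loses exactly the gap between premise and conclusion:
print(lukasiewicz_impl(1, 3))  # -> 4
print(lukasiewicz_impl(4, 2))  # -> 2
```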

3 Algorithm

We want the combined source-credibility to moderate the evolution of the information-score. The evolution of the truth degree of an information i will therefore be of the form illustrated in Figure 1 and detailed hereafter.

3.1 Principle

We suppose, for legibility's sake, that we have a flow of information in which each item either confirms a previously known information, contradicts it, or is as yet unknown. Any such information will be denoted i hereafter, whether it is the original information or any other confirming it; any contradicting information will be noted ¬i. We suppose, for the time being, that an information confirms – or contradicts – another fully. Now, suppose a κj-trustworthy (κj ∈ LM) source gives us a new information i. As would any new information, i is initially rated at the middle of our scale, to represent uncertainty about its likelihood. What we want the process to do next is to require sufficient confirmation before moving to the next level of likelihood; we also wish to favour trusted sources over unknown or untrustworthy ones. For τ(i) to move to ADD(τ(i), τ1) = τ(i) ⊕ τ1, i.e. a one-step increase in i's likelihood (resp. SUB(τ(i), τ1) = τ(i) ⊖ τ1, a one-step decrease in i's likelihood) in the case of a confirmation (resp. contradiction), we therefore require κ+(i) (resp. κ−(i)) to reach a certain threshold κ_{τ(i)}^{ADD(τ(i),τ1)} (resp. κ_{τ(i)}^{SUB(τ(i),τ1)}). The choice of parameters and its implications is discussed in Section 4.

Here ADD(τα, τβ) = ¬τα →L τβ and SUB(τα, τβ) = ¬(τα →L τβ).

Trust-moderated Information-Likelihood. An MVL Approach


[Figure 1 here: a state diagram over the five degrees τ0, ..., τ4. A new information i ∉ K enters at τ2; each transition from τα up to τα+1 fires when κ+(i) ≥ κ_{τα}^{τα+1}, and each transition from τα down to τα−1 fires when κ−(i) ≥ κ_{τα}^{τα−1}.]

Fig. 1. UML-like state diagram, representing the evolution of i's likelihood and the various thresholds on the way in L5. Note that there is no end state.

3.2 Formal representation

To clarify the algorithm described above, the following representation details each step along the way. Note that as long as we receive either a confirmation or a negation of a given piece of information i, we loop through this algorithm. Obviously, if either end of the scale has been reached, the progression in the corresponding direction will not evolve; but no information is permanently rated.

Suppose we learn information i from a source whose trustworthiness is estimated at κj ∈ LM:

  if i ∉ K then
    τ(i) ← τ_{M/2}
    κ+(i) ← κj
  else
    κ+(i) ← ADD(κ+(i), κj)
  end if
  if κ+(i) ≥ κ_{τ(i)}^{τ(i)⊕τ1} then
    τ(i) ← ADD(τ(i), τ1)
    κ+(i) ← null
    κ−(i) ← null
  end if

Suppose, now, we learn information ¬i with a given source-credibility κj ∈ LM. Since i ∈ K:

  κ−(i) ← ADD(κ−(i), κj)
  if κ−(i) ≥ κ_{τ(i)}^{τ(i)⊖τ1} then
    τ(i) ← SUB(τ(i), τ1)
    κ−(i) ← null
    κ+(i) ← null
  end if
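The update loop can be sketched in executable form. The following Python sketch is ours, not the paper's: the class and parameter names (`Belief`, `up`, `down`) are hypothetical, and ADD is instantiated with the Łukasiewicz-based definition of Section 2, which simplifies to ADD(τα, τβ) = τ_min(M−1, α+β) when degrees are read as integers.

```python
M = 5  # L5: truth degrees tau_0..tau_4, represented here as integers 0..4

def lukasiewicz_add(a, b):
    """ADD(tau_a, tau_b) = (not tau_a) ->_L tau_b = tau_min(M-1, a+b)."""
    return min(M - 1, a + b)

class Belief:
    """Trust-moderated likelihood of one piece of information.
    up[t] / down[t] give the threshold kappa needed to move one step
    up / down from degree t (all names here are ours)."""

    def __init__(self, up, down):
        self.up, self.down = up, down
        self.tau = None                    # i is not yet in K
        self.k_plus = self.k_minus = None  # accumulated source credibility

    def confirm(self, kj):
        if self.tau is None:               # first occurrence: enter mid-scale
            self.tau, self.k_plus = M // 2, kj
        else:
            self.k_plus = lukasiewicz_add(self.k_plus or 0, kj)
        if self.tau < M - 1 and self.k_plus >= self.up[self.tau]:
            self.tau += 1                  # tau(i) <- ADD(tau(i), tau_1)
            self.k_plus = self.k_minus = None

    def contradict(self, kj):              # assumes i is already in K
        self.k_minus = lukasiewicz_add(self.k_minus or 0, kj)
        if self.tau > 0 and self.k_minus >= self.down[self.tau]:
            self.tau -= 1                  # tau(i) <- SUB(tau(i), tau_1)
            self.k_plus = self.k_minus = None
```

Replaying the flow of Table 3 with uniform thresholds τ3 (User 1 of Table 2) ends at τ(i) = τ4, in agreement with the table.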

4 Discussion

In this section we discuss the different parameters of our algorithm, their respective influence and range, and then explain what we believe these parameters allow us to model, and how. First, note that we propose to rate both information likelihood and source trustworthiness on the same scale LM. Obviously, this is possible only if the steps needed to distinguish different levels of trust are compatible with those required by the plausibility rating; that is, if we decide to rate source credibility on, say, a seven-level scale (i.e. L7), then there have to be seven steps in likelihood as well. The same constraint applies to the thresholds κ_{τα}^{τβ}. We could, of course, separate the two scales, since they only fix the granularity of truth we are allowing, and since likelihood and trustworthiness are never compared nor combined. We have chosen to express both on the same scale to insist on the fact that they are truth values, distinct from the objects they relate to. However, whether both scores are evaluated on the same scale or not, it is important to note that they differ in interpretation; in fact, the two factors differ in nature. κ±(i) is an accumulated local evaluation of source credibility, whereas τ(i) is an evolved evaluation of one information. When an information is rated from 'impossible' to 'certain' through 'highly unlikely' and the like, the aggregated credibilities of different sources cannot really be read as anything; yet each credibility score, taken on its own, may well be anything from 'untrustworthy' to 'completely trustworthy', scaling through many degrees on the way. The most important part of our method is the setting of the thresholds. By not imposing any symmetry between κ_{τα}^{τβ} and κ_{τβ}^{τα}, we can model different psychological postures. Indeed, one way of modelling a suspicious character is to favour downward evolution over upward progression.
Suppose it takes three confirmations to get from 'possibly true' to 'quite likely', the next step up, and only two to go the other way: this, we think, is typical of a mistrustful psychology. If, in addition, we fix the step down from 'possibly true' to 'probably not true' at two contradictions, we have a very hard to convince person. The fact that this probably increases the potential number of cycles in the evaluation of an information's likelihood is not a problem, since, by construction, there is no end state: as long as confirmations or contradictions keep coming, the credibility keeps evolving, whatever the settings. The main consequence of having unequal thresholds is that the order of arrival of the information matters. We also think that, in a debate for instance, the order and timing of arguments are of primary importance. Besides, this effect can be eliminated by setting all thresholds to the same value. Note that we have chosen to reset κ±(i) to null when τ(i) changes; we could instead have kept the accumulated value, to indicate which way we were leaning.

Example. To make the above discussion clearer, we consider a simple situation and look at its consequences. Suppose first of all that we use L5 to judge the truth degrees. Table 1 gives the interpretations associated with each degree, both for the likelihood of an information and for the trustworthiness of a source. Now suppose we have two different readers, described in Table 2 and Figure 2, who will rate the same flow of information, specified in Table 3. By a flow of information we mean a time-ordered list of information/source-rating pairs. To keep things handy, we only list related informations, i.e. pieces of information either confirming or contradicting the original one. Table 3 shows the flow of incoming information and the consequent evolution of both users' ratings.
User 1 is balanced and regular, but should not necessarily be seen as exceedingly trusting: he will only trust a single source if it is rated 'quite trustworthy'. User 2, on the other hand, is rather mistrustful. Not only is he hard to convince, but changes of heart will,

  Degree   Likelihood          Trustworthiness
  τ0       Totally unlikely    Absolutely untrustworthy
  τ1       Rather unlikely     Rather untrustworthy
  τ2       Possible            Possibly trustworthy
  τ3       Rather likely       Quite trustworthy
  τ4       Extremely likely    Completely trustworthy

Table 1. An example of possible truth-values in L5

  Upward thresholds      User 1   User 2
  κ_{τ0}^{τ1}              τ3       τ4
  κ_{τ1}^{τ2}              τ3       τ3
  κ_{τ2}^{τ3}              τ3       τ2
  κ_{τ3}^{τ4}              τ3       τ3

  Downward thresholds    User 1   User 2
  κ_{τ1}^{τ0}              τ3       τ2
  κ_{τ2}^{τ1}              τ3       τ2
  κ_{τ3}^{τ2}              τ3       τ1
  κ_{τ4}^{τ3}              τ3       τ1

Table 2. Two different perspectives on persuasion

[Figure 2 here: the state diagram of each of the two users of Table 2, with each transition between the degrees τ0, ..., τ4 labelled by its threshold.]

Fig. 2. Graphical representation of our two users, described in Table 2

  Information   Source trust.   User 1: τ(i) κ+(i) κ−(i)   User 2: τ(i) κ+(i) κ−(i)
  i             τ1              τ2    τ1    –               τ2    τ1    –
  ¬i            τ2              τ2    τ1    τ2              τ1    –     –
  i             τ2              τ3    –     –               τ1    τ2    –
  ¬i            τ2              τ3    –     τ2              τ0    –     –
  i             τ3              τ4    –     –               τ0    τ3    –
  ¬i            τ2              τ4    –     τ2              τ0    τ3    τ2

Table 3. A conflicting flow of information, the first line denoting the initial entry of the knowledge, the others either confirming it (i) or contradicting it (¬i)

in general, not be received well. This simplified example was, obviously, constructed to support our claim that different settings reflect different attitudes towards trust; yet we are convinced that the same differences would appear in a more general context. So, where Mendel and John [7] see the fuzzification of membership functions as an opportunity to allow for noise in the model, we think that our sort of higher-type multi-valued formalism may allow for different perceptions of the evolution of the truth degrees.

5 Conclusion

In this paper we have used a multi-valued formalism to qualify both our belief in a piece of information and our belief in its source, and we have used the latter to moderate the evolution of the former. In so doing, we have constructed a qualitative estimation of the truth value, and hence added some latitude to model uncertain processes. We have shown that different cognitive stances may be represented using this added degree of freedom. In future work, we would like to investigate matters of comparison further: our model supposes that pieces of information are either unrelated or fully comparable, and we think it would benefit from the inclusion of a degree of similarity between compared objects. We would also like to work further on multi-valued scales; we think that, for all their convenient properties, they might benefit from being relaxed somewhat.

References

1. L. A. Zadeh. The concept of a linguistic variable and its application to approximate reasoning. Information Sciences (I, II, III), 8(9), 1975.
2. M. De Glas. Representation of Łukasiewicz' many-valued algebras; the atomic case. Fuzzy Sets and Systems, 14, 1987.
3. H. Akdag, M. De Glas, and D. Pacholczyk. A qualitative theory of uncertainty. Fundamenta Informaticae, 17(4):333–362, 1992.
4. A. Darwiche and M. Ginsberg. A symbolic generalization of probability theory. In Proceedings of the American Association for Artificial Intelligence, San Jose, California, 1992.
5. H. Seridi and H. Akdag. Approximate reasoning for processing uncertainty. Journal of Advanced Computational Intelligence, 5(2):108–116, 2001. Fuji Technology Press.
6. H. Akdag, I. Truck, A. Borgi, and N. Mellouli. Linguistic modifiers in a symbolic framework. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 9(Supplement):49–61, 2001.
7. J. Mendel and R. John. Type-2 fuzzy sets made simple. IEEE Transactions on Fuzzy Systems, 10(2):117–127, April 2002.
8. I. Truck and H. Akdag. Fuzzy Systems Engineering: Theory and Practice, Series: Studies in Fuzziness and Soft Computing, volume 181, chapter 2: A Qualitative Approach for Symbolic Data Manipulation under Uncertainty, pages 23–51. Springer, 2005.
9. I. Truck and H. Akdag. Manipulation of qualitative degrees to handle uncertainty: formal models and applications. Knowledge and Information Systems, 9(4):385–411, 2006.

Very Primitive Recursive Functions

Sandra Alves (1), Maribel Fernández (2), Mário Florido (1), and Ian Mackie (2,3)

(1) University of Porto, Department of Computer Science & LIACC, R. do Campo Alegre 823, 4150-180, Porto, Portugal
(2) King's College London, Department of Computer Science, Strand, London WC2R 2LS, U.K.
(3) LIX, CNRS UMR 7161, École Polytechnique, 91128 Palaiseau Cedex, France

Abstract. With the recent trend of analysing the process of computation through the linear logic looking glass, it is well understood that copying and erasing are essential in order to obtain a Turing-complete computation model. But do we need explicit copying and erasing machinery? In this paper we show that the class of partial recursive functions that are syntactically linear is Turing-complete.

1 Introduction

Primitive recursive functions, which we shall call PR, are a class of functions which form an important building block on the way to a full formalisation of computability. Intuitively speaking, (partial) recursive functions are those that can be computed by some Turing machine. Primitive recursive functions can be computed by a specific class of Turing machines that always halt. PR are the least set including the zero, successor and projection functions, and closed under the operations of composition and primitive recursion. In the PR definition zero and successor give us access to the natural numbers and the projection functions are useful for erasing, adding and reordering arguments. Copying and erasing (i.e., the ability for functions to duplicate and to discard their arguments) are key operations in the definition of all interesting functions by primitive recursion and also in the definition of the two operations used to define PR itself: composition and primitive recursion. In this paper we focus on this aspect of computation, which has attracted a great deal of attention in recent years. We say that a function is linear if it uses its argument exactly once. In this context the following question arises: can we define the class of primitive recursive functions without explicitly relying on copying and erasing? In this paper we show that the answer is yes; more precisely, we show that any primitive recursive function can be defined using a syntactically linear system. Furthermore, we show that any computable function can be defined using a single minimisation operator and linear functions. This yields an alternative formulation of the theory of recursive functions, where each function is linear; we call this class of functions linear recursive functions. 
Summarising, our main contributions are:
– Definition of the class of linear primitive recursive functions (LPR): a class of functions defined by the initial functions zero, successor, and permutations (generated by swappings and the identity; all of these are linear functions), together with linear composition and pure iteration. Note that we do not have projections (since they are not linear functions).

?? ???

Research partially supported by the Treaty of Windsor Grant: “Linearity: Programming Languages and Implementations”, and by funds granted to LIACC through the Programa de Financiamento Plurianual, Fundação para a Ciência e Tecnologia and FEDER/POSI. Programa Gulbenkian de Estímulo à Investigação. Projet Logical, Pôle Commun de Recherche en Informatique du plateau de Saclay, CNRS, École Polytechnique, INRIA, Université Paris-Sud.


– Simulation of erasing and copying in LPR; in particular, projections can be simulated by permutation followed by a linear erasing.
– Linear primitive recursive functions define exactly the same set of functions as primitive recursive functions. This shows that primitive recursive functions can be defined linearly.
– Any general recursive function (i.e., any computable function) can be obtained if we add a minimisation operator working on linear primitive recursive functions.

Related work. There are several alternative definitions of the primitive recursion scheme [15, 10, 11, 14, 6]. In some of these works, for instance [11, 6], primitive recursion was replaced by pure iteration. Pure iteration is a linear scheme, in the sense that each function uses its arguments exactly once. Here we extend this previous work by also replacing the initial functions (by simpler linear initial functions) and the composition scheme (by a linear composition scheme). We then show that this defines a set of linear functions that corresponds exactly to the primitive recursive functions. There are several formalisms based on the notion of linearity that limit the use of copying and erasing. This includes languages based on a version of the λ-calculus with a type system corresponding to intuitionistic linear logic [8]. One of the main features of this calculus (which can be seen as a minimal functional programming language) is that it provides explicit syntactic constructs for copying and erasing terms (corresponding to the exponentials in linear logic) [1]. From another perspective, there have been a number of calculi, again many based on linear logic, for capturing specific complexity classes [3, 7, 9, 4, 13, 18, 5]. One of the main examples is bounded linear logic [9], one of whose main aims is to find a calculus in between the linear λ-calculus and the calculus with exponentials (specifically capturing the polynomial-time computable functions).
In previous work [2] we showed that a simple extension of a typed, linear λ-calculus without the exponentials, i.e., a calculus that is syntactically linear, has enormous computational power. More precisely, we demonstrated that, without minimisation but with an iterator and higher-order constructs, any function definable in Gödel's System T can be defined linearly. These previous results inspired our work on linear primitive recursion. This work is part of a research programme which aims at studying the notion of linearity in computation, and at analysing the computational power of functions that are syntactically linear. In this paper we show that all primitive recursive functions can be defined in a linear syntax, and the same is true of general (partial) recursive functions. In the next section we recall the background material. In Section 3 we define the linear primitive recursive functions, and in Section 4 we show how any PR function can be encoded as an LPR function and vice versa. In Section 5 we define linear recursive functions and show that any computable function can be written as a linear recursive function. Finally we conclude the paper in Section 6.

2 Background

We assume familiarity with recursion theory, and recall some basic notions along the lines presented in [16]; we refer the reader to [16] for more details.

Notation. We use x1, ..., y1, ... to represent natural numbers, f, g, h to represent functions, and X1, ... to represent sequences of the form x1, ..., xn. We only have tuples of natural numbers, so for simplicity we work modulo associativity: (X1, (x1, x2), X2) = (X1, x1, x2, X2).

Definition 1 (Primitive recursive functions). A function f : Nat^k → Nat is primitive recursive if it can be defined from a set of initial functions using composition and the primitive recursive scheme, defined as:


– Initial functions:
  1. The natural numbers, built from 0 and the successor function S. (We write n, or S^n 0, for S(...(S 0)...) with n occurrences of S.)

  2. Projection functions: π_i^n(x1, ..., xn) = xi (1 ≤ i ≤ n); we omit the superindex when there is no ambiguity.
– Composition, which allows us to define a primitive recursive function h using auxiliary functions f, g1, ..., gn, where n ≥ 0:
    h(X) = f(g1(X), ..., gn(X)).
– The primitive recursive scheme, which allows us to define a recursive function h using two auxiliary primitive recursive functions f, g:
    h(X, 0) = f(X)
    h(X, S n) = g(X, h(X, n), n).

In [11] it was shown that primitive recursion can be replaced by a more restricted recursion scheme, called pure iteration:
    h_g(X, 0) = X
    h_g(X, S n) = g(h_g(X, n)).
The function h_g(X, n), obtained by the last scheme, is the result of applying the function g n times to X; hence we may write h_g(X, n) = g^n X.

We do not have constant functions of the form C(X) = x as initial functions. However, we can see 0 as a constant function with no arguments, and every other constant function can be built by composition of 0, S, and projections. For instance, the constant function zero(x, y) = 0 is defined as an instance of composition (using the initial, 0-ary function 0), and one(x, y) = S(zero(x, y)), again as an instance of the composition scheme. Note also that functions obtained from primitive recursive functions by introducing "dummy" variables, permuting variables, or repeating variables are also primitive recursive. To keep our definitions simple, we will sometimes omit the definitions of such functions. In the sequel, we consider primitive recursive functions from Nat^k to Nat^l, since every primitive recursive function from Nat^k to Nat^l can be transformed into a primitive recursive function from Nat^k to Nat, and vice versa (see [17] for details).

Definition 2 (Minimisation). Let f be a total function from Nat^{n+1} to Nat. The function g from Nat^n to Nat is called the minimisation of f and is defined as: g(X) = min{y | f(X, y) = 0}. We denote g by μ_y(f).
Definition 3 (Recursive functions). The set of (partial) recursive functions is defined as follows:
– Every primitive recursive function is recursive.
– For every n ≥ 0 and every total recursive function f : Nat^{n+1} → Nat, the function M_f : Nat^n → Nat defined by M_f = μ_y(f) is a recursive function.

We recall the following result from Kleene [12].

Theorem 1 (The Kleene normal form). Let h be a partial recursive function on Nat^k. Then a number n can be found such that
    h(x1, ..., xk) = f(μ_y(g(n, x1, ..., xk, y)))
where f and g are primitive recursive functions.
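The three schemes recalled above (primitive recursion, pure iteration, minimisation) each have a direct computational reading. The following Python sketch is ours, not part of the paper; the combinator names are hypothetical:

```python
def prim_rec(f, g):
    """Definition 1: h(X, 0) = f(X); h(X, S n) = g(X, h(X, n), n)."""
    def h(X, n):
        acc = f(X)
        for k in range(n):
            acc = g(X, acc, k)
        return acc
    return h

def iterate(g):
    """Pure iteration [11]: h_g(X, 0) = X; h_g(X, S n) = g(h_g(X, n)),
    i.e. h_g(X, n) = g^n(X) -- note g never sees X and n separately."""
    def h(X, n):
        for _ in range(n):
            X = g(X)
        return X
    return h

def minimise(f):
    """Definition 2: mu_y(f) = least y with f(X, y) = 0. The search is
    unbounded, so the result is partial: it diverges when no such y exists."""
    def g(*X):
        y = 0
        while f(*X, y) != 0:
            y += 1
        return y
    return g

add = iterate(lambda x: x + 1)                          # add(x, n) = x + n
mul = prim_rec(lambda x: 0, lambda x, acc, k: x + acc)  # mul(x, n) = x * n
isqrt = minimise(lambda x, y: 0 if (y + 1) ** 2 > x else 1)  # floor(sqrt(x))
```

The `isqrt` example shows why minimisation adds power: it is the least y with (y+1)^2 > x, found by unbounded search.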

3 Linear primitive recursive functions

Definition 4. A function f : Nat^k → Nat^j is linear primitive recursive if it can be defined from a set of linear initial functions using linear composition and the linear primitive recursive scheme, defined as follows:
– Initial functions:
  1. The natural numbers, built from 0 and the successor function S. We write n, or S^n 0, for S(...(S 0)...) with n occurrences of S.
  2. Swappings, written (i j), and Id (identity), which we use to generate permutations. More precisely, permutations are generated by composition of swappings and Id; they are generated by the grammar: π ::= Id | (i j) · π. Permutations act on tuples, written π(x1, ..., xn). Id is the identity, and the result of applying a swapping (i j) to (t1 ... ti ... tj ... tn) is the tuple (t1 ... tj ... ti ... tn).
– Linear composition, which allows us to define a function h using auxiliary linear primitive recursive functions f, g1, ..., gk, k ≤ n:
    h(x1, ..., xn) = f(g1(X1), ..., gk(Xk)), where (X1, ..., Xk) = π(x1, ..., xn).
– Pure iteration, which allows us to define a recursive function h_g using an auxiliary linear primitive recursive function g:
    h_g(X, 0) = X
    h_g(X, S n) = g(h_g(X, n)).

Example 1. A simple example of a linear primitive recursive function is addition. It can be defined as follows, where g(x) = S(x):
    add_g(x, 0) = x
    add_g(x, S n) = g(add_g(x, n)).

3.1 Some useful linear primitive recursive functions

Erasing the last element of a tuple. The function Elast, which erases the last element of a tuple of length greater than 1, is defined as:
    Elast(X, 0) = X
    Elast(X, S n) = Id(Elast(X, n))
where Id is the identity permutation: π(x1, ..., xn) = (x1, ..., xn).

Lemma 1. For any X and number n, Elast(X, n) = X.

Proof. By induction: Elast(X, 0) = X, and Elast(X, S n) = Id(Elast(X, n)) = Id(X) = X.

Using the function that erases the last element, one can define a function zero such that zero(X) = 0.

Example 2. The function zero is defined as:
    zero(x1, ..., xn) = Elast(... Elast(0, x1, ..., xn) ...)
with n applications of Elast.

Lemma 2. For any X, zero(X) = 0.
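The trick behind this "linear erasing" is that the erased value is never discarded by a projection: it is consumed as the counter of a pure iteration of Id. A Python sketch (our naming, not the paper's):

```python
def e_last(X, n):
    """Elast(X, n) = X: the last element n is consumed as an iteration
    counter while the identity is applied to the rest of the tuple."""
    for _ in range(n):
        X = tuple(X)  # Id, applied n times
    return X

def zero(*xs):
    """zero(x1,...,xn) = Elast(... Elast(0, x1,...,xn) ...) = 0 (Lemma 2):
    each argument in turn is erased as an iteration counter, leaving 0."""
    t = (0,) + xs
    while len(t) > 1:
        t = e_last(t[:-1], t[-1])
    return t[0]
```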


Linear copying. We now define copying using pure iteration:
    f(x1, ..., xk) = Id(S x1, ..., S xk)
    Ck(n) = h_f((0, ..., 0), n)   (k zeros).
We use C to denote C2.

Lemma 3. For any number n and any k > 0, Ck(n) = (n, ..., n) (k copies of n).

We can generalise this to any tuple:
    Ck(x1, ..., xn) = π(Ck(x1), ..., Ck(xn))
where (x1, ..., xn, ..., x1, ..., xn) = π(x1, ..., x1, ..., xn, ..., xn), i.e. π rearranges k copies of each xi into k copies of the tuple.

Example 3. Multiplication is a linear primitive recursive function:
    mul(x, y) = Elast(mul'_g(0, x, y))
    mul'_g(x1, x2, 0) = (x1, x2)
    mul'_g(x1, x2, S n) = g(mul'_g(x1, x2, n))
where g(x1, x2) = f(x1, C(x2)) and f(x1, x2, x3) = (add(x1, x2), x3).

Using these ideas we will define a systematic translation of primitive recursive definitions into linear primitive recursive functions.

4 From linear primitive to primitive and back

In this section we show that every primitive recursive function is linear primitive recursive. We also show that linear primitive recursive functions do not add any power to primitive recursive functions, i.e., the two classes coincide.

4.1 Primitive recursive functions are linear primitive recursive

A summary of the encoding of primitive recursive functions using linear primitive recursive functions:

  Primitive recursive   Linear primitive recursive
  0 and S               0 and S
  projections           permutations + linear erasing
  composition           linear composition + linear copying + linear erasing
  recursive scheme      pure iteration + linear copying + linear erasing

Projections. We can define projections using linear primitive recursive functions:
    π_i(x1, ..., xn) = Elast(... Elast(xi, x1, ..., x_{i−1}, x_{i+1}, ..., xn) ...)
with n−1 applications of Elast, where (xi, x1, ..., x_{i−1}, x_{i+1}, ..., xn) = π(x1, ..., xn).

Lemma 4. For any (x1, ..., xn): π_i(x1, ..., xn) = xi.


Multiple projection can be defined as follows:
    π_I(X) = Elast(... Elast(X1, X2) ...)
with n−k applications of Elast, where I = {i1, ..., ik} ⊆ {1, ..., n}, (x_{i1}, ..., x_{ik}, X2) = π(X), and X1 = x_{i1}, ..., x_{ik}.

Lemma 5. π_{i1,...,ik}(x1, ..., xn) = (x_{i1}, ..., x_{ik}).

Proof. By induction on n − k.

Composition. We now define composition (see Definition 1) using linear primitive recursive functions. Let h(X) = f(g1(X), ..., gk(X)), where X = x1, ..., xn, and assume there are linear primitive recursive functions f^L and g1^L, ..., gk^L such that
    f^L(Y) = f(Y)
    gi^L(Z) = gi(Z), (1 ≤ i ≤ k).
Then we define h using the linear composition scheme as follows:
    h(X) = f'_L(π(Ck(x1), ..., Ck(xn))),
where f'_L(X') = f^L(g1^L(X1), ..., gk^L(Xk)) and X' = (X1, ..., Xk) = (x1, ..., xn, ..., x1, ..., xn) (k copies of x1, ..., xn) = π(x1, ..., x1, ..., xn, ..., xn) (k copies of each xi).

Primitive recursive scheme. We now define the primitive recursive scheme of Definition 1 using linear primitive recursive functions. Let f^L and g^L be such that, for the auxiliary functions f and g in the primitive recursive scheme, we have:
    f^L(X) = f(X)
    g^L(X, x, n) = g(X, x, n).
We define h^L in the following way:
    h^L(X, n) = π_1(h_{g1}(f1(C(X)), 0, n)),
where
    f1(X1, X2) = (f^L(X1), X2)
    f2(X, x, n) = (X, x, n, X, n) = π(C(X), x, C(n))
    g2(X, x, n, X, n) = (g^L(X, x, n), X, S n)
    g1(x, X, n) = g2(f2(X, x, n)), where (x, X, n) = π(X, x, n).

Lemma 6. For any X and number n, h_{g1}(f1(C(X)), 0, n) = (h(X, n), X, n).

Proof. By induction: h_{g1}(f1(C(X)), 0, 0) = (f^L(X), X, 0) = (h(X, 0), X, 0), and h_{g1}(f1(C(X)), 0, S n) = g1(h_{g1}(f1(C(X)), 0, n)) = g1(h(X, n), X, n) = g2(f2(X, h(X, n), n)) = (g^L(X, h(X, n), n), X, S n) = (h(X, S n), X, S n).

Lemma 7. For any X and number n: h^L(X, n) = h(X, n).

Proof. h^L(X, n) = π_1(h_{g1}(f1(C(X)), 0, n)) = π_1(h(X, n), X, n) = h(X, n).

4.2 Linear primitive recursive functions are primitive recursive

A summary of the encoding of linear primitive recursive functions using primitive recursive functions:

  Linear primitive recursive   Primitive recursive
  0 and S                      0 and S
  permutations                 projection + composition
  linear composition           composition + projection
  pure iteration               recursive scheme + projection

Permutations. We can define permutations using primitive recursive functions in the following way: π(x1, ..., xn) = (y1, ..., yn), where yi = π_j(x1, ..., xn), using a different projection on x1, ..., xn for each yi (1 ≤ i ≤ n).

Linear composition. We now define linear composition using primitive recursive functions. Let f^P and g1^P, ..., gk^P be such that
    f^P(X) = f(X)
    gi^P(Xi) = gi(Xi), (1 ≤ i ≤ k),
and (X1, ..., Xk) = π(X). Then we define linear composition as
    h(X) = f^P(g1'(X), ..., gk'(X))
where gi'(X) = gi^P(π_{Ii}(X)), with Ii = {i1, ..., im} ⊆ {1, ..., n}, and if X = x1, ..., xn, then Xi = x_{ji1}, ..., x_{jim}.

Pure iteration. Let g^P be a primitive recursive function from Nat^k to Nat^l such that, for the auxiliary function g in the pure iteration scheme, we have g^P(X) = g(X). We define h^P in the following way:
    h^P(X, n) = h_{g1}(X, n),
where, if X = x1, ..., xn and Y = y1, ..., yl,
    f^P(X) = X
    g1(X, Y, n) = g^P(π_{n+1,...,n+l}(X, Y, n)) = g^P(Y).

Lemma 8. For any X = x1, ..., xk and number n, h^P(X, n) = h_g(X, n).

Proof. By induction on n.
– Basis: h^P(X, 0) = f(X) = X = h_g(X, 0).
– Induction: h^P(X, S n) = g1(X, h(X, n), n) = g^P(π_{n+1,...,n+l}(X, h_{g1}(X, n), n)) = g^P(π_{n+1,...,n+l}(X, h^P(X, n), n)) = g^P(h^P(X, n)) = g(h_g(X, n)) = h_g(X, S n).

We now introduce some notation which we will need in the next section.

Definition 5.
– Let f be a primitive recursive function. Then [[f]]_L denotes a linear primitive recursive function such that f(X) = [[f]]_L(X).
– Let f be a linear primitive recursive function.
Then [[f]]_P denotes a primitive recursive function such that f(X) = [[f]]_P(X).

Notice that the existence of the functions [[·]]_L and [[·]]_P is guaranteed by the encodings given in this section.

5 Minimisation of linear functions

5.1 Partial linear recursive functions

Definition 6 (Minimisation). Let f be a linear recursive function from Nat^{n+1} to Nat. The function g from Nat^n to Nat is called the minimisation of f and is defined as: g(X) = min{y | f(X, y) = 0}. We denote g by μ_y(f).

Definition 7 (Linear recursive functions). The set of linear recursive functions (LRF) is defined as follows:
– Every linear primitive recursive function is linear recursive.
– For every n ≥ 0 and every total linear recursive function f : Nat^{n+1} → Nat, the function M_f : Nat^n → Nat defined by M_f = μ_y(f) is a linear recursive function.

5.2 From recursive to linear recursive

Theorem 2. Let h be a (partial) recursive function on Nat^k. Then there exists a linear recursive function h^L on Nat^k such that h(x1, ..., xk) = h^L(x1, ..., xk).

Proof. Let h be a recursive function on Nat^k. Then, by Kleene's theorem, there exist primitive recursive functions f and g, and a number n, such that
    h(x1, ..., xk) = f(μ_y(g(n, x1, ..., xk, y))).
Consider then the function h^L(x1, ..., xk) = [[f]]_L(μ_y([[g]]_L(n, x1, ..., xk, y))). Notice that
    g(n, x1, ..., xk, y) = [[g]]_L(n, x1, ..., xk, y)
    ⇒ μ_y(g(n, x1, ..., xk, y)) = μ_y([[g]]_L(n, x1, ..., xk, y))
    ⇒ f(μ_y(g(n, x1, ..., xk, y))) = [[f]]_L(μ_y([[g]]_L(n, x1, ..., xk, y))).
Thus h(x1, ..., xk) = f(μ_y(g(n, x1, ..., xk, y))) = [[f]]_L(μ_y([[g]]_L(n, x1, ..., xk, y))) = h^L(x1, ..., xk). The function h^L is linear recursive.

An alternative proof could be given using the fact that closing isomorphic sets of functions under the same minimisation functor gives isomorphic sets.

Corollary 1. All computable functions are linear recursive.

6 Conclusion

The aim of this paper is to demonstrate that linear computations are powerful: linear recursive functions can express copying and erasing, thus all Turing computable functions are linear recursive. In addition, in a separate work we have demonstrated that without minimisation but with the addition of higher-order constructs, any function definable in Gödel’s System T is linear.

Acknowledgement We thank Yves Lafont for pointing out [6] to us.


References

1. S. Abramsky. Computational interpretations of linear logic. Theoretical Computer Science, 111:3–57, 1993.
2. S. Alves, M. Fernández, M. Florido, and I. Mackie. The power of linear functions. In Proceedings of Computer Science Logic, CSL 2006, volume 4207, pages 119–134. Springer Verlag, 2006.
3. A. Asperti. Light affine logic. In Proc. Logic in Computer Science (LICS'98), pages 300–308. IEEE Computer Society, 1998.
4. A. Asperti and L. Roversi. Intuitionistic light affine logic. ACM Transactions on Computational Logic, 3(1):137–175, 2002.
5. P. Baillot and V. Mogbil. Soft lambda-calculus: a language for polynomial time computation. In Proc. Foundations of Software Science and Computation Structures (FOSSACS'04), volume 2987 of LNCS, pages 27–41. Springer Verlag, 2004.
6. A. Burroni. Récursivité graphique, I: catégorie des fonctions récursives primitives formelles. Cahiers de topologie et géométrie différentielle catégoriques, XXVII:49, 1986.
7. J.-Y. Girard. Light linear logic. Information and Computation, 143(2):175–204, 1998.
8. J.-Y. Girard. Linear logic. Theoretical Computer Science, 50(1):1–102, 1987.
9. J.-Y. Girard, A. Scedrov, and P. J. Scott. Bounded linear logic: a modular approach to polynomial time computability. Theoretical Computer Science, 97:1–66, 1992.
10. M. Gladstone. A reduction of the recursion scheme. Journal of Symbolic Logic, 32, 1967.
11. M. Gladstone. Simplification of the recursion scheme. Journal of Symbolic Logic, 36, 1971.
12. S. C. Kleene. Introduction to Metamathematics. North-Holland, 1952.
13. Y. Lafont. Soft linear logic and polynomial time. Theoretical Computer Science, 318(1–2):163–180, 2004.
14. P. Odifreddi. Classical Recursion Theory. Elsevier Science, 1999.
15. R. Robinson. Primitive recursive functions. Bulletin of the American Mathematical Society, 53, 1947.
16. J. Shoenfield. Recursion Theory. Springer-Verlag, 1993.
17. J. Stern. Fondements mathématiques de l'informatique. Ediscience International, Paris, 1994.
18. K. Terui. Light affine calculus and polytime strong normalization. In Proc. Logic in Computer Science (LICS'01). IEEE Computer Society, 2001.

Mobile Ambients and Mobile Membranes

Bogdan Aman² and Gabriel Ciobanu¹,²

¹ "A.I.Cuza" University, Faculty of Computer Science, Blvd. Carol I no.11, 700506 Iaşi, Romania
² Romanian Academy, Institute of Computer Science, Blvd. Carol I no.8, 700505 Iaşi, Romania
[email protected], [email protected]

Abstract. Mobile ambients and membrane systems are two new computation models, and both of them are computationally universal. We formally relate mobile ambients and mobile membranes by using a translation function. Several results describe the relationship between the concepts of these models, and between the behaviours of ambients and their corresponding membranes.

1 Introduction

The ambient calculus [6] and membrane systems [14] represent two new computation models which are evolving quickly. In recent years, both have become emergent research fields in computer science. Mobile ambients and membrane systems have similar structures and common concepts. We consider these new computing models, studying their computability aspects and the possible connections between them. Mobile ambients are designed to model both mobile computing (provided by mobile devices such as laptops or PDAs) and mobile computation (mobile code such as applets or agents moving between computers and devices). An ambient is a location (a bounded place) where computation happens. Ambients can be nested, yielding a hierarchical structure. An ambient has a name, and can move together with all the computations and subambients it contains. The ambient calculus is designed to describe distributed and mobile computation. In contrast with other formalisms for mobile processes, such as the π-calculus [12], whose computational model is based on the notion of communication, the ambient calculus is based on the notion of movement. An ambient is the unit of movement. Ambient mobility is controlled by the capabilities in, out, and open. Inside an ambient we have processes which may exchange messages. Membrane systems were introduced by Păun in [13] as a class of parallel computing devices inspired by biology. This computing model is inspired by biological systems, which are complex hierarchical structures with a flow of materials and information underlying their functioning. Essentially, membrane systems (also called P systems) are composed of various compartments with different tasks, all of them working simultaneously to accomplish a more general task. There are several variants of membrane systems. Mobile membranes were introduced in [4], where mobility is expressed through the operations of gemmation and fusion of mobile membranes.
In this paper we use mobile membrane systems (also called P systems with mobile membranes), a variant introduced in [11] which expresses mobility using two biological operations: exocytosis and endocytosis. The structure of the paper is as follows. Section 2 gives a short description of pure mobile ambients, whereas Section 3 presents mobile membrane systems with local evolution rules, together with formal definitions of membrane configurations and structural congruence. Section 4 presents some results connecting mobile ambients to mobile membranes through a translation function. Conclusions and references end the paper.

2 Pure Mobile Ambients Without Replication

In this section we provide a short description of pure mobile ambients; more information can be found in [6]. The following table describes the syntax of pure mobile ambients.

Table 1: Pure Mobile Ambients Syntax

  n, m, p    names

  P, Q ::=              processes
      0                   inactivity
      M.P                 movement
      n[ P ]              ambient
      P | Q               composition
      (νn)P               restriction

  M ::=                 capabilities
      in n                can enter n
      out n               can exit n
      open n              can open n
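The grammar of Table 1 can be mirrored directly by a small abstract syntax tree. The following Python sketch is a hypothetical encoding (not part of the paper): one immutable constructor per production, with names as plain strings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Zero:            # inactivity 0
    pass

@dataclass(frozen=True)
class Cap:             # capability M: kind is "in", "out" or "open"
    kind: str
    name: str

@dataclass(frozen=True)
class Action:          # movement M.P
    cap: Cap
    cont: "object"

@dataclass(frozen=True)
class Amb:             # ambient n[P]
    name: str
    body: "object"

@dataclass(frozen=True)
class Par:             # parallel composition P | Q
    left: "object"
    right: "object"

@dataclass(frozen=True)
class Res:             # restriction (νn)P
    name: str
    body: "object"

# n[ in m. 0 ] | m[ 0 ]
example = Par(Amb("n", Action(Cap("in", "m"), Zero())), Amb("m", Zero()))
```

With frozen dataclasses, terms compare structurally, which is convenient when checking reduction results later.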

Process 0 is an inactive process (it does nothing). A movement M.P is provided by the capability M, followed by the execution of P. An ambient n[ P ] represents a bounded place labelled by n in which a process P is executed. P | Q is the parallel composition of processes P and Q. (νn)P creates a new unique name n within the scope of P. The semantics of the ambient calculus is provided by two relations: structural congruence and reduction. The structural congruence P ≡amb Q relates different syntactic representations of the same process; it is used to define the reduction relation. The reduction relation P ⇒amb Q describes process evolution. We denote by ⇒∗amb the reflexive and transitive closure of ⇒amb. The structural congruence is defined as the least relation over processes satisfying the axioms from the table below:

Table 2: Structural congruence

  P | Q ≡amb Q | P
  (P | Q) | R ≡amb P | (Q | R)
  (νn)(νm)P ≡amb (νm)(νn)P if n ≠ m
  (νn)(P | Q) ≡amb P | (νn)Q if n ∉ fn(P)
  (νn)m[ P ] ≡amb m[ (νn)P ] if n ≠ m
  P | 0 ≡amb P
  (νn)0 ≡amb 0

  P ≡amb P
  P ≡amb Q implies Q ≡amb P
  P ≡amb Q, Q ≡amb R implies P ≡amb R
  P ≡amb Q implies (νn)P ≡amb (νn)Q
  P ≡amb Q implies P | R ≡amb Q | R
  P ≡amb Q implies n[ P ] ≡amb n[ Q ]
  P ≡amb Q implies M.P ≡amb M.Q

The first group of rules in the table describes the commutativity and associativity of parallel composition, the reordering and scope manipulation of restriction, and the unit laws for the inactive process. The second group describes how structural congruence is propagated across processes. The set of free names of a process is defined as follows:

  fn(P) = ∅               if P = 0
  fn(P) = fn(R) ∪ {n}     if P = in n.R or P = out n.R or P = open n.R
  fn(P) = fn(R) ∪ {n}     if P = n[ R ]
  fn(P) = fn(R) ∪ fn(Q)   if P = R | Q
  fn(P) = fn(R) − {n}     if P = (νn)R
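The free-name definition above is directly recursive and can be sketched in a few lines of Python. The tuple encoding of processes used here is hypothetical, chosen only for the example: ("zero",), ("cap", kind, n, P), ("amb", n, P), ("par", P, Q), ("res", n, P).

```python
def fn(p):
    """Free names of a process, following the definition case by case."""
    tag = p[0]
    if tag == "zero":                  # fn(0) = ∅
        return set()
    if tag == "cap":                   # fn(in n.R) = fn(R) ∪ {n}, same for out/open
        _, _kind, n, r = p
        return fn(r) | {n}
    if tag == "amb":                   # fn(n[R]) = fn(R) ∪ {n}
        _, n, r = p
        return fn(r) | {n}
    if tag == "par":                   # fn(R | Q) = fn(R) ∪ fn(Q)
        _, r, q = p
        return fn(r) | fn(q)
    if tag == "res":                   # fn((νn)R) = fn(R) − {n}
        _, n, r = p
        return fn(r) - {n}
    raise ValueError(f"unknown process: {p!r}")

# (νn)(n[0] | open m.0) has free names {m}: n is bound by the restriction
term = ("res", "n",
        ("par", ("amb", "n", ("zero",)), ("cap", "open", "m", ("zero",))))
assert fn(term) == {"m"}
```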

The reduction relation is defined as the least relation over processes satisfying the following set of axioms:

Table 3: Reduction rules

  (In)     n[ in m.P | Q ] | m[ R ] ⇒amb m[ n[ P | Q ] | R ]
  (Out)    m[ n[ out m.P | Q ] | R ] ⇒amb n[ P | Q ] | m[ R ]
  (Open)   open n.P | n[ Q ] ⇒amb P | Q
  (Res)    P ⇒amb Q implies (νn)P ⇒amb (νn)Q
  (Amb)    P ⇒amb Q implies n[ P ] ⇒amb n[ Q ]
  (Par)    P ⇒amb Q implies P | R ⇒amb Q | R
  (Struct) P′ ≡amb P, P ⇒amb Q, Q ≡amb Q′ implies P′ ⇒amb Q′
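The one-step (In) rule can be prototyped on the same hypothetical tuple encoding of processes used above. This sketch only matches the redex at top level, exactly as the axiom is written, and returns None when the rule does not apply; the (Struct), (Res), (Amb) and (Par) closure rules are omitted.

```python
def step_in(p):
    """Apply (In) once: n[in m.P | Q] | m[R]  ⇒  m[n[P | Q] | R]."""
    if p[0] != "par":
        return None
    left, right = p[1], p[2]
    # match left = n[ in m.P | Q ] and right = m[ R ]
    if left[0] == "amb" and right[0] == "amb":
        n, body = left[1], left[2]
        m, r = right[1], right[2]
        if body[0] == "par" and body[1][0] == "cap" \
                and body[1][1] == "in" and body[1][2] == m:
            cont, q = body[1][3], body[2]
            # result: m[ n[P | Q] | R ]
            return ("amb", m, ("par", ("amb", n, ("par", cont, q)), r))
    return None

before = ("par",
          ("amb", "n", ("par", ("cap", "in", "m", ("zero",)), ("zero",))),
          ("amb", "m", ("zero",)))
after = step_in(before)
assert after == ("amb", "m",
                 ("par", ("amb", "n", ("par", ("zero",), ("zero",))), ("zero",)))
```

The (Out) and (Open) rules would follow the same pattern-match-and-rebuild shape.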


The first three rules are the one-step reductions for in, out, open. The next three rules propagate reductions across scopes, ambient nesting and parallel composition. The final rule allows the use of structural congruence during reduction. Pure mobile ambients are powerful enough to encode Turing machines, meaning that they are Turing-complete. One way to demonstrate this is to encode into mobile ambients another process calculus which is known to be Turing-complete. Such an encoding is presented in [6], where the formalism encoded into mobile ambients is the asynchronous π-calculus. The encoding respects the semantics of the asynchronous π-calculus, in the sense that structural congruence over π-calculus processes is preserved by a contextual equivalence over mobile ambients, and a reduction step in the π-calculus is simulated by a number of reduction steps (and equivalences) in mobile ambients. We denote by ⟨⟨P⟩⟩S the ambient encoding a π-calculus process P in the context of a set of names S which corresponds to the set of free names of the process P, and by ≃ the contextual equivalence over ambients.

Proposition 1 ([6]).
(1) If P ≡ P′ holds in the π-calculus and S is a set of names, then ⟨⟨P⟩⟩S ≃ ⟨⟨P′⟩⟩S.
(2) If P → P′ holds in the π-calculus and S ⊇ fn(P), then ⟨⟨P⟩⟩S ≃⇒∗amb≃ ⟨⟨P′⟩⟩S.

Another possibility to demonstrate the computational power of mobile ambients is to encode Turing machines directly into mobile ambients. This is done in [6] by encoding Turing machines into pure mobile ambients (where "pure" means the lack of communication primitives), and into pure public mobile ambients [8] (where "public" means the lack of the restriction operator). Turing machines are replaced by counter machines in [5], which presents an encoding of counter machines into pure public mobile ambients. The correctness of such an encoding is stated in the following proposition:

Proposition 2 ([5]).
Let R be a random access machine with an arbitrary program (1 : I1), ..., (m : Im), and state (i, c1, ..., cn). If (i, c1, ..., cn) →R (i′, c′1, ..., c′n), then for any P = [[(i, c1, ..., cn)]]R with parameters i1, ..., in, d1, ..., dn, z1, ..., zn, there exist Q and i′1, ..., i′n, d′1, ..., d′n, z′1, ..., z′n such that P ⇒+amb Q and Q ≡amb [[(i′, c′1, ..., c′n)]]R with parameters i′1, ..., i′n, d′1, ..., d′n, z′1, ..., z′n.
The concepts used in these results and the proof details can be found in [5] and [6].

3 Mobile Membrane Systems

A membrane system consists of a hierarchy of nested membranes, placed inside a distinguished membrane called the skin. The space outside the skin membrane is called the environment. A membrane contains multisets of objects, evolution rules, and possibly other membranes. The multisets of objects in a membrane correspond to the "chemicals swimming in the solution in the cell compartment", while the rules correspond to the "chemical reactions possible in the same compartment". The rules contain target indications specifying the membranes to which the newly obtained objects are sent. The new objects either remain in the same membrane when they carry a here target, or they pass through membranes in two directions: they can be sent out of the membrane, or into one of the nested membranes, which is precisely identified by its label. In one step, an object can pass through only one membrane. A membrane without any other membranes inside is called elementary, while a non-elementary membrane is a composite one. Membrane systems are synchronous: at each time unit of a global clock, a transformation of the system takes place by applying the rules in a nondeterministic and maximally parallel manner. This means that the objects, the membranes and the rules involved in such a transformation are chosen in a nondeterministic way, and the application of rules is maximal.


After such a choice is made, no further rule can be applied in the same evolution step: there are not enough objects and membranes available for any additional rule to be applied. Many variants of this basic model are discussed in the literature [7, 14]. In this paper we use mobile membrane systems, a variant of membrane systems with active membranes, but having none of the features like polarizations, label change and division of non-elementary membranes.

Definition 1 ([11]). A mobile membrane system having local evolution rules is a construct Y = (V, H, µ, w1, ..., wn, R), where:

1. n ≥ 1 (the initial degree of the system);
2. V is an alphabet (its elements are called objects);
3. H is a finite set of labels for membranes;
4. µ is a membrane structure, consisting of n membranes, labelled (not necessarily in a one-to-one manner) with elements of H;
5. w1, w2, ..., wn are multisets of objects placed in the n membranes;
6. R is a finite set of developmental rules, of the following forms:
   (a) [ [ a → v ]m ]h, for h, m ∈ H, a ∈ V, v ∈ V∗ (local evolution rules). These rules are called local because the evolution of an object a of membrane m is possible only when membrane m is placed inside membrane h; if this restriction is not imposed, that is, the evolution of object a in membrane m is allowed irrespective of where membrane m is placed, then we say that we have a global evolution rule, and write it simply as [ a → v ]m.
   (b) [ a ]h [ ]m → [ [ b ]h ]m, for h, m ∈ H, a, b ∈ V (endocytosis). An elementary membrane labelled by h enters the adjacent membrane labelled by m, under the control of object a. The labels h and m remain unchanged during this process; however, the object a may be modified to object b during the operation. Membrane m is not necessarily elementary.
   (c) [ [ a ]h ]m → [ b ]h [ ]m, for h, m ∈ H, a, b ∈ V (exocytosis). An elementary membrane labelled by h is sent out of a membrane labelled by m, under the control of object a. The labels of the two membranes remain unchanged; the object a of membrane h may be modified to object b during this operation. Membrane m is not necessarily elementary.
   (d) [ a ]h → [ b ]h [ c ]h, for h ∈ H, a, b, c ∈ V (elementary division rules). In reaction with an object a, the membrane labelled by h is divided into two membranes labelled by h, with the object a replaced in the two new membranes by possibly new objects b and c.

The rules are applied according to the following principles:

1. The rules, the membranes, and the objects are chosen in a nondeterministic manner, and the rules are applied in a maximally parallel way, meaning that in each step we apply a set of rules such that no further rule can be added to the set, and no further membranes and objects can evolve at the same time.
2. Membrane m from the rules above is passive, while membrane h is active. The difference between passive and active membranes is that passive membranes can be used by several rules at the same time, while active membranes can be used in at most one rule. Moreover, each object of the system can be used in at most one rule.
3. The evolution of objects and membranes takes place in a bottom-up manner. After choosing a maximal set of rules, the rules are applied starting from the innermost membranes, level by level, up to the skin membrane.


4. When an elementary membrane is moved across another membrane by endocytosis or exocytosis, its objects are also moved. All the inner objects evolve before the movement, and the membrane is moved with the resulting content.
5. If a membrane is divided by a rule (d), then its content is replicated in the two new copies, with the exception of object a, which may be replaced by objects b and c in the new copies. All the inner objects evolve before the replication, and both copies contain the resulting objects. Similarly, if a membrane l enters by endocytosis a membrane h which is divided, then each of the two copies of h will contain a copy of membrane l. When dividing, membrane h is elementary, but the new membranes h are no longer so.
6. The skin membrane can never be divided, and no other membrane can pass through it (by exocytosis).

Several classes of membrane systems (e.g. membrane systems with symbol objects with catalytic rules using only two catalysts, or P systems with symport/antiport rules of a rather restricted size) are Turing-complete. The number of membranes sufficient to characterize the power of Turing machines is rather small, depending on the class of P systems used. In most cases three or four membranes suffice, but as can be seen in [1] the number of membranes can vary from one to nine. Mobile membrane systems are computationally universal using only the simple operations of endocytosis and exocytosis; moreover, if elementary membrane division is allowed, they are capable of solving NP-complete problems. In [11] it is shown that the family of all Turing computable sets of vectors of natural numbers can be computed by mobile membrane systems having at least nine membranes, without using the division rules. It is then proved in [9] that four membranes are sufficient for universality using only endo/exo operations. In [10] the computational power of these systems is related not only to the number of membranes, but also to the kind of rules used.
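The endocytosis and exocytosis rules (b) and (c) can be prototyped on a simple tree representation of a membrane structure. This is a hypothetical toy encoding, not from the paper: a membrane is a tuple (label, list-of-objects, list-of-children), and both operations act on sibling (resp. nested) membranes exactly as in Definition 1.

```python
def endo(parent, h, m, a, b):
    """[a]h [ ]m -> [[b]h]m : sibling h (carrying a) enters sibling m."""
    label, objs, kids = parent
    hm = next(k for k in kids if k[0] == h and a in k[1])
    mm = next(k for k in kids if k[0] == m and k is not hm)
    hm2 = (h, [b if o == a else o for o in hm[1]], hm[2])  # a may become b
    kids2 = [k for k in kids if k not in (hm, mm)]
    kids2.append((m, mm[1], mm[2] + [hm2]))                # h now inside m
    return (label, objs, kids2)

def exo(parent, h, m, a, b):
    """[[a]h]m -> [b]h [ ]m : h (carrying a) leaves its enclosing m."""
    label, objs, kids = parent
    mm = next(k for k in kids if k[0] == m)
    hm = next(k for k in mm[2] if k[0] == h and a in k[1])
    hm2 = (h, [b if o == a else o for o in hm[1]], hm[2])
    mm2 = (m, mm[1], [k for k in mm[2] if k is not hm])
    kids2 = [k for k in kids if k is not mm] + [mm2, hm2]
    return (label, objs, kids2)

skin = ("skin", [], [("h", ["a"], []), ("m", [], [])])
s1 = endo(skin, "h", "m", "a", "b")   # h enters m, a rewritten to b
assert s1 == ("skin", [], [("m", [], [("h", ["b"], [])])])
s2 = exo(s1, "h", "m", "b", "a")      # h exits m, b rewritten back to a
```

The sketch ignores maximal parallelism and the active/passive distinction; it only shows how a single rule rewrites the membrane tree.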
It is proved that three membranes are sufficient for computational universality, whereas two membranes are not (if lambda-free rules are used). Such a result is the following:

Theorem 1 ([9]). PsMP4(endo, exo, gevol) ⊆ PsMP4(endo, exo, levol) = PsRE.

We explain now the classes of P systems used in the previous theorem. In mobile membranes, the result of a halting computation (one in which no rule can be applied) consists of all vectors describing the multiplicities of objects from all membranes sent out of the system during the computation; a non-halting computation provides no output. For a system Π, the set of all vectors of natural numbers produced in this way is denoted by Ps(Π). We denote by PsMP(div; levol; endo; exo) the family of all sets Ps(Π) generated by systems using division rules, local evolution rules, endocytosis and exocytosis rules; when global evolution rules are used instead of local ones, levol is replaced by gevol. If a class of rules is not used, then we omit its name from the list. The endocytosis, exocytosis, division and local (global) evolution rules are denoted by endo, exo, div, and levol (gevol), respectively. If RE denotes the family of recursively enumerable languages, then PsRE denotes the family of all Turing computable sets of vectors of natural numbers. We formally define in mobile membranes the notions of membrane configuration, structural congruence and level. A detailed approach is presented in [2]. We denote membranes by M, N, M′, Mi, and the labels of membranes by n, m, .... With O a finite multiset of objects, the set M of membrane configurations M is defined by

  M ::= O | [ M ]n | (νn)M | M1, M2

Definition 2. The structural congruence ≡mem over M is the smallest congruence relation satisfying:


  M, N ≡mem N, M;
  M, (N, M′) ≡mem (M, N), M′;
  (νn)(νm)M ≡mem (νm)(νn)M;
  (νm)M ≡mem (νn)M{n/m}, where n is not a membrane label in M;
  (νn)(N, M) ≡mem M, (νn)N, where n is not a membrane label in M;
  n ≠ m implies (νn)[ M ]m ≡mem [ (νn)M ]m.

Proposition 3. The structural congruence has the following properties:
  M ≡mem M;
  M ≡mem N implies N ≡mem M;
  M ≡mem N and N ≡mem M′ implies M ≡mem M′;
  M ≡mem N implies M, M′ ≡mem N, M′;
  M ≡mem N implies M′, M ≡mem M′, N;
  M ≡mem N implies [ M ]n ≡mem [ N ]n;
  M ≡mem N implies (νn)M ≡mem (νn)N.

We define the depth of each membrane from a membrane structure as follows: if n = skin, then depth(n) = 0; if [ [ ]m ]n, then depth(m) = depth(n) + 1.
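The depth definition can be sketched as a one-pass recursion over a membrane tree. The encoding is hypothetical (a membrane is a pair of label and list of children); note that since labels need not be one-to-one, a real implementation would key on membrane identities rather than labels, which this sketch glosses over.

```python
def depths(mem, d=0, out=None):
    """Assign depth(skin) = 0 and depth(child) = depth(parent) + 1."""
    if out is None:
        out = {}
    label, kids = mem
    out[label] = d
    for k in kids:
        depths(k, d + 1, out)
    return out

tree = ("skin", [("n", [("m", [])]), ("p", [])])
assert depths(tree) == {"skin": 0, "n": 1, "m": 2, "p": 1}
```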

4 Connecting Mobile Ambients to Mobile Membranes

In order to establish a connection between the static semantics of mobile ambients and membrane systems, we use the formal definition of P systems presented in the previous section. An encoding of pure mobile ambients into mobile membranes is given by the following translation function:

Definition 3. A translation T : A → M is given by T(A) = dlock T1(A), where T1 : A → M is

  T1(A) = cap n[ ]cap n           if A = cap n
  T1(A) = cap n[ T1(A1) ]cap n    if A = cap n.A1
  T1(A) = [ T1(A1) ]n             if A = n[ A1 ]
  T1(A) = [ ]n                    if A = n[ ]
  T1(A) = (νn)T1(A1)              if A = (νn)A1
  T1(A) = T1(A1), T1(A2)          if A = A1 | A2

and cap stands for in, out or open. An object dlock is placed near the membrane structure after the translation is done; this additional object prevents the consumption of capability objects in a membrane system corresponding to a mobile ambient deadlock structure. Details of the translation and deadlock are described in [2]. We can use such a translation to relate these new computation models:

Proposition 4. Structurally congruent ambients are translated into structurally congruent membrane systems; moreover, structurally congruent translated membrane systems correspond to structurally congruent ambients: A ≡amb B iff T(A) ≡mem T(B).

Proof. The proof proceeds by structural induction. We prove only one case, due to lack of space. If A = A1 | A2, where A1 and A2 are two subambients which do not contain any composition operation, then from the definition of ≡amb we have that B = A2 | A1. Using the definition of T and A = A1 | A2, we have that T(A) = dlock T1(A1), T1(A2). From B = A2 | A1 and the definition of T we get T(B) = dlock T1(A2), T1(A1). Following the definition of ≡mem we get T(A) ≡mem T(B). The other cases are proved in a similar way.
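The translation T of Definition 3 can be sketched case by case. Both encodings below are hypothetical tuple representations chosen for the example: ambients as in Section 2, and membrane configurations as ("obj", name), ("mem", label, contents), ("res", n, contents), with plain lists for composition.

```python
def T1(a):
    """Translate an ambient term into a list of membrane-configuration items."""
    tag = a[0]
    if tag == "zero":                    # empty contents
        return []
    if tag == "cap":                     # cap n.A1 -> cap n [ T1(A1) ]cap n
        _, kind, n, cont = a
        lab = f"{kind} {n}"
        return [("obj", lab), ("mem", lab, T1(cont))]
    if tag == "amb":                     # n[A1] -> [ T1(A1) ]n ;  n[] -> [ ]n
        _, n, body = a
        return [("mem", n, T1(body))]
    if tag == "res":                     # (νn)A1 -> (νn)T1(A1)
        _, n, body = a
        return [("res", n, T1(body))]
    if tag == "par":                     # A1 | A2 -> T1(A1), T1(A2)
        _, a1, a2 = a
        return T1(a1) + T1(a2)
    raise ValueError(f"untranslatable: {a!r}")

def T(a):
    """T(A) = dlock T1(A): the dlock object is placed near the structure."""
    return [("obj", "dlock")] + T1(a)

amb = ("amb", "n", ("cap", "in", "m", ("zero",)))   # n[ in m ]
assert T(amb) == [("obj", "dlock"),
                  ("mem", "n", [("obj", "in m"), ("mem", "in m", [])])]
```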


From now on, we work with a subclass of M, namely the membrane systems obtained from the translation of mobile ambients. In [2] the evolution of the membrane systems from this particular subclass of M is given by a particular set of developmental rules, namely:

a) [ in m dlock one ]n [ ]m → [ in∗ m in∗ m dlock ]n [ ]m
If a membrane n (containing the objects in m, dlock, one) and a membrane m are sibling membranes, then the objects in m and one are replaced by the objects in∗ m and in∗ m. The object in∗ m is used to control the process of introducing membrane n into membrane m, and the object in∗ m is used to dissolve the membrane in m.

b) cap∗ m [ ]cap m → [ δ ]cap m
If an object cap∗ m is a sibling of the membrane labelled by cap m, then the object cap∗ m is consumed and the membrane labelled by cap m is dissolved (this is denoted by the symbol δ). This rule simulates the consumption of a capability cap m in ambients.

c) [ in∗ m ]n [ ]m → [ [ ]n ]m |[ ¬cap∗ ]m
If an elementary membrane n (containing an object in∗ m) and a membrane labelled by m (which does not contain star objects – this is denoted by |[ ¬cap∗ ]m) are sibling membranes, then the membrane n enters the membrane labelled by m under the control of the object in∗ m, which is consumed in this process.

d) [ [ out m dlock one ]n ]m → [ [ out∗ m out∗ m dlock ]n ]m
If a membrane m contains a membrane n (having the objects out m, dlock, one), then the objects out m and one are replaced by out∗ m and out∗ m. The object out∗ m is used to control the process of extracting membrane n from membrane m, and the object out∗ m is used to dissolve the membrane out m.

e) [ [ out∗ m ]n ]m → [ ]n [ ]m
If a membrane m contains an elementary membrane n which has an object out∗ m, then membrane n is extracted from the membrane labelled by m, and the object out∗ m is consumed in this process.

f) [ ]m open m dlock one → [ δ ]m open∗ m dlock
If a membrane m and the objects open m, dlock, one are siblings, then membrane m is dissolved, and the objects open m and one are replaced by the object open∗ m.

g) [ U∗ [ ]t ]n → [ U∗ [ out∗ n in∗ n U∗ ]t ]n |[ ¬cap∗ ]t
U∗ is used to denote the set of star objects placed in membrane n. If a membrane n contains a set of star objects U∗ and a membrane t which does not contain star objects (this is denoted by |[ ¬cap∗ ]t), then a copy of the set U∗ and two new objects in∗ n and out∗ n are created inside membrane t. The existence of a set U∗ of star objects indicates that membrane n can be used by rules c) and e) to enter or exit another membrane.
In order to move, membrane n must be elementary; to accomplish this, the objects out∗ n, in∗ n and a copy of the set U∗ are created inside membrane t so that membrane t can be extracted. After membrane n completes its movement (denoted by the fact that the membrane labelled by n no longer contains star objects), membrane t is introduced back into membrane n.

h) dlock [ ]n → dlock [ dlock ]n | ¬n [ ¬dlock ]n
If an object dlock and a membrane n (which does not already contain an object dlock, i.e., [ ¬dlock ]n) are siblings, and there is no sibling object n (¬n), then a new object dlock is placed inside membrane n. This rule specifies that the object dlock can only pass through membranes corresponding to translated ambients; this makes impossible the consumption of capability objects from the translated structures from Damb.


i) [ dlock ]n → [ ]n
The object dlock created by a rule h) and located inside membrane n is removed.

j) [ dlock ]n → [ dlock one ]n
If a membrane n contains an object dlock, then an additional object one is created in membrane n.

k) one → [ δ ]
An object one is consumed; the last two rules ensure that at most one object one exists in the membrane system at any moment.

Representing by r one of the rules a), ..., k) from our particular set of developmental rules, we use M →r N to denote the transformation of a membrane system M into a membrane system N by applying a rule r. We can define a relation ⇒mem using the same steps presented in [3], where a structural operational semantics for a particular class of P systems is defined. Considering two membrane systems M and N with only one object dlock, we say that M ⇒mem N if there is a sequence of rules r1, ..., ri such that M →r1 ... →ri N. The operational semantics of the membrane systems is defined in terms of the transformation relation →r by the following rules:

  (DRule) M →r N, for each developmental rule a), ..., k)
  (Res)   M →r M′ implies (νn)M →r (νn)M′
  (Comp)  M →r M′ implies M, N →r M′, N
  (Amb)   M →r M′ implies [ M ]n →r [ M′ ]n
  (Struc) M ≡mem M′, M′ →r N′, N′ ≡mem N implies M →r N

The behaviour of a system can be described with the help of barbed bisimulations. The key ingredient of a barbed bisimulation is the notion of barb. A barb is a predicate which describes the observed elements of a certain structure; the observations of a system define its behaviour. In membrane systems, an observer has the possibility of watching only the top-level membranes at any step of the computation, where the set TL of top-level membranes is defined as follows:

  if M = O, then TL(M) = ∅;
  if M = [ N ]n, then TL(M) = {n};
  if M = (νn)N, then TL(M) = TL(N)\{n};
  if M = M1, M2, then TL(M) = TL(M1) ∪ TL(M2).

For the case M = (νn)N we have TL(M) = TL(N)\{n}, because an observer does not have the power to observe the membranes with restricted names.

Definition 4. A barb ↓mem is defined inductively by the following rules:
  M ↓mem n if n ∈ TL(M);
  M1 · · · Mk ↓mem n1 · · · nk if ni ∈ TL(Mi) for each i;
  (νk)M ↓mem n if k ≠ n.
We write M ⇓mem n if either M ↓mem n, or M ⇒+mem M′ and M′ ↓mem n. Formally, we have:

  M ↓mem n  =def  M ≡mem (νm1) ... (νmi)[ M1 ]n, M2, and n ∉ {m1, ..., mi};
  M ⇓mem n  =def  either M ↓mem n, or M ⇒+mem M′ and M′ ↓mem n.

The following result reflects a relationship between structural congruence and barb predicates in membrane systems.
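The set TL and the barb M ↓mem n follow the inductive definition directly. The sketch below uses a hypothetical tuple encoding of membrane configurations: ("objs",), ("mem", n, M), ("res", n, M), ("comp", M1, M2).

```python
def TL(m):
    """Top-level membranes of a configuration, case by case."""
    tag = m[0]
    if tag == "objs":                 # TL(O) = ∅
        return set()
    if tag == "mem":                  # TL([N]n) = {n}
        return {m[1]}
    if tag == "res":                  # TL((νn)N) = TL(N) \ {n}
        return TL(m[2]) - {m[1]}
    if tag == "comp":                 # TL(M1, M2) = TL(M1) ∪ TL(M2)
        return TL(m[1]) | TL(m[2])
    raise ValueError(f"unknown configuration: {m!r}")

def barb(m, n):
    """M ↓mem n iff n ∈ TL(M): the observer sees membrane n at top level."""
    return n in TL(m)

# (νk)( [O]k, [O]n ): the restricted membrane k is not observable
cfg = ("res", "k",
       ("comp", ("mem", "k", ("objs",)), ("mem", "n", ("objs",))))
assert TL(cfg) == {"n"}
assert barb(cfg, "n") and not barb(cfg, "k")
```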

Proposition 5. Structurally congruent membrane systems have the same top-level membranes: if M ≡mem N, then M ↓mem n iff N ↓mem n, for all n ∈ TL(M, N).


Proof. We prove only the first implication, the other being treated similarly. M ↓mem n means that M ≡mem (νm1) ... (νmi)[ M1 ]n, M2, where n is a label different from m1, ..., mi. From M ≡mem (νm1) ... (νmi)[ M1 ]n, M2 and M ≡mem N, we get N ≡mem (νm1) ... (νmi)[ M1 ]n, M2, which means that N ↓mem n.

The set ML of membrane labels is defined as follows:

  if M = O, then ML(M) = ∅;
  if M = [ N ]n, then ML(M) = ML(N) ∪ {n};
  if M = (νn)N, then ML(M) = ML(N);
  if M = M1, M2, then ML(M) = ML(M1) ∪ ML(M2).

If a system contains a top-level membrane after applying a number of computation steps, then a structurally congruent membrane system contains the same top-level membrane after applying the same number of computation steps.

Proposition 6. M ≡mem N implies M ⇓mem n iff N ⇓mem n, for all n ∈ ML(M, N).

Proof. We prove only the first implication, the other being treated similarly. If M ⇓mem n, then either M ↓mem n, or M ⇒+mem M′ and M′ ↓mem n. The first case was studied in the previous proposition, so only the second case is presented. If M ⇒+mem M′ and M ≡mem N, then there exists N′ such that N ⇒+mem N′ and M′ ≡mem N′. From M′ ≡mem N′ and M′ ↓mem n we have that N′ ↓mem n, which together with N ⇒+mem N′ implies that N ⇓mem n.

Using the predicates ↓amb n (exhibiting an ambient) and ⇓amb n (eventually exhibiting an ambient) defined in [6], we obtain the following results:

Proposition 7. An ambient contains a top ambient labelled by n if and only if the translated membrane system contains a top-level membrane labelled by n. Formally, A ↓amb n iff T(A) ↓mem n, for all n ∈ TL(T(A)).

Proof (Sketch). We prove only the first implication, the other being treated similarly. If A ↓amb n, then there exist A1 and A2 such that A = (νm1) ... (νmi)n[ A1 ] | A2, where n ∉ {m1, ..., mi}. From A = (νm1) ... (νmi)n[ A1 ] | A2 and the definition of the translation function, we have that T(A) = (νm1) ... (νmi) dlock [ T1(A1) ]n, T1(A2), which means that T(A) ↓mem n.

Proposition 8. An ambient eventually contains a top ambient n if and only if the translated membrane system, after applying the same number of steps, eventually contains a top-level membrane n. Formally, A ⇓amb n iff T(A) ⇓mem n, for all n ∈ ML(T(A)).

Proof. We prove only the first implication, the other being treated similarly. If A ⇓amb n, then either A ↓amb n, or A ⇒+amb B and B ↓amb n. The first case was studied in the previous proposition, so only the second case is presented.
If A ⇒+amb B, then T(A) ⇒+mem T(B). From B ↓amb n, according to the previous proposition, we have that T(B) ↓mem n. From T(B) ↓mem n and T(A) ⇒+mem T(B), we get that T(A) ⇓mem n.

Proposition 9. The level of an ambient is preserved in its translated membrane.


The proof is given by considering all the possible positions of an ambient or capability in the ambient structure.

Proposition 10. If A and B are two ambients and M is a membrane system such that A ⇒amb B and M = T(A), then there exists a chain of transitions M →r1 ... →rk N such that r1, ..., rk are developmental rules, and N = T(B).

Proof (Sketch). Since A ⇒amb B, one of the rules (In), (Out) or (Open) is applied to subambients A′ and B′ included in A and B, respectively. We treat only one case, the others being treated similarly:

1. A′ = n[ in m ] | m[ ] and B′ = m[ n[ ] ], where n is an ambient which contains only the capability in m. Then, according to the definition of the translation function T, M contains the membrane system [ in m [ ]in m ]n [ ]m, and by applying some rules of the form h) we obtain the structure [ in m dlock [ ]in m ]n [ ]m. Using the rules
  r1: [ dlock ]n → [ dlock one ]n
  r2: [ in m dlock one ]n [ ]m → [ in∗ m in∗ m dlock ]n [ ]m
  r3: in∗ m [ ]in m → [ δ ]in m
  r4: [ in∗ m ]n [ ]m → [ [ ]n ]m,
together with some rules of the forms k) and i), there exists a sequence of transitions M →∗ M1 →r1 ... M4 →r4 M5 →∗ N, where M1, ..., M5 are intermediary configurations, and the membrane system N contains the membrane system [ [ ]n ]m. Once the objects dlock and one are created near the object in m, these transitions are the only deterministic steps which can be performed. We can notice that T1(B′) = [ [ ]n ]m. Hence, according to the definition of the translation function T and the transition relation →r, we reach the conclusion that the membrane structure M admits the required sequence of transitions leading to the membrane structure N, and T(B) = N.

Proposition 11. Let M and N be two membrane systems with only one dlock object, and let A be an ambient such that M = T(A). If there is a sequence of transitions M →r1 ... →rk N, then there exists an ambient B with A ⇒∗amb B and N = T(B). The number of non-star objects consumed in the membrane systems is equal to the number of capabilities consumed in the ambients.

Proof (Sketch). We proceed by structural induction. Since M does not contain any star object, the first rule which consumes a translated capability object has one of the following forms:
  [ in m dlock one ]n [ ]m → [ in∗ m in∗ m dlock ]n [ ]m,
  [ [ out m dlock one ]n ]m → [ [ out∗ m out∗ m dlock ]n ]m,
  [ ]m open m dlock one → [ δ ]m dlock open∗ m.
We treat only one case, the others being treated similarly:

1. If the first rule applied is [ in m dlock one ]n [ ]m → [ in∗ m in∗ m dlock ]n [ ]m, where the membrane n contains only the capability object in m and the corresponding membrane labelled in m, then M contains the membrane structure [ in m [ ]in m ]n [ ]m. According to the definition of T, M can be written as M1, M′ or M2[ M′ ], where M′ = [ in m [ ]in m ]n [ ]m and M2 represents a membrane structure in which M′ is placed inside a nested structure of translated ambients. If A is a mobile ambient encoded by M = M1, M′, then according to the definition of T it contains two subambients A′ = n[ in m ] | m[ ] and A1 such that A = A1 | A′, T1(A′) = M′, and T1(A1) = M1. If A is a mobile ambient


Bogdan Aman and Gabriel Ciobanu

encoded by M = M2 [ M′ ], then according to the definition of T it contains two subambients A′ = n[ in m ] | m[ ] and A2 such that A = A2 [ A′ ], T1 (A′) = M′ , and T1 (A2 ) = M2 . The application of the rule defined above to the membrane system M changes only the membrane system M′ . The newly created objects in∗ m and in∗ m control the moving of membrane n into membrane m, and are consumed by the following rules:

    in∗ m [ ]in m → [ δ ]in m
    [ in∗ m ]n [ ]m → [ [ ]n ]m .

After the application of these rules, M′ evolves to N′ = [ [ ]n ]m . The inductive hypothesis expresses that N′ encodes an ambient B′ . After obtaining N′ , N has the structure N = M1 , N′ if M = M1 , M′ , and it encodes the mobile ambient B = A1 | B′ ; or N has the structure N = M2 [ N′ ] if M = M2 [ M′ ], and it encodes the mobile ambient B = A2 [ B′ ]. The transition from M′ to N′ also represents the transition from M to N . It should be noticed that by consuming the capability in m we have A′ ⇒amb B′ . So the transition from M to N with the consumption of only one non-star object is simulated by the transition of A to B.
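As an informal illustration of the in m step simulated above, here is a toy Python sketch. The tree encoding of ambients is an ad-hoc representation invented for this example, not the paper's translation T; it only shows the one-step ambient reduction that the membrane rules simulate in three steps.

```python
# Ambients as (name, capabilities, children): a toy tree rewriting sketch.
# An ambient n[ in m . P ] standing next to a sibling m[ ... ] moves itself
# inside m, consuming the "in m" capability.

def step_in(children):
    """Apply one 'in m' reduction among sibling ambients, if possible."""
    for i, (name, caps, subs) in enumerate(children):
        if caps and caps[0][0] == "in":
            target = caps[0][1]
            for j, sibling in enumerate(children):
                if j != i and sibling[0] == target:
                    mover = (name, caps[1:], subs)   # consume the capability
                    tname, tcaps, tsubs = sibling
                    rest = [c for k, c in enumerate(children) if k not in (i, j)]
                    return rest + [(tname, tcaps, tsubs + [mover])]
    return children  # no redex: the system is unchanged

# n[ in m ] | m[ ]  ->  m[ n[ ] ]
system = [("n", [("in", "m")], []), ("m", [], [])]
print(step_in(system))  # -> [('m', [], [('n', [], [])])]
```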


Remark 1. If M →^{r1} . . . →^{rk} N , and both M and N contain only one dlock object, then the number of steps which transform the ambient A into the ambient B is the number of non-star objects consumed during the computation in the membrane evolution. The order in which the reductions take place in the ambients is the order in which the non-star objects are consumed in the membrane systems.

Considering the previous two propositions together, we obtain an operational correspondence result.

Theorem 2 (Operational Correspondence).
1. If A ⇒amb B, then T (A) ⇒mem T (B).
2. If T (A) ⇒mem M , then there exists B such that A ⇒amb B and M = T (B).

5 Conclusion

Ambient calculus is a process algebra using interleaving semantics, compositionality and bisimulation. Membrane computing is a branch of natural computing. Membrane systems represent a computation model in the Turing sense, making use of automata, formal languages and complexity tools. This paper presents some existing computability results of the mobile ambient calculus and mobile membrane systems, and emphasizes a formal relationship between mobile ambients and mobile membranes. We consider some notions from mobile ambients: the exhibit of an ambient, its level, and the structural congruence, and show that they can be connected with some newly introduced notions in membrane systems: the observation barbs, the depth, and the structural congruence of a membrane system. Some results relate these notions through a translation function. The soundness of this approach is provided by an operational correspondence result.

References
1. A. Alhazov, R. Freund, Y. Rogozhin. Computational Power of Symport/Antiport: History, Advances and Open Problems. Workshop on Membrane Computing, Lecture Notes in Computer Science vol. 3850, Springer, 1-30, 2006.
2. B. Aman, G. Ciobanu. Translating Mobile Ambients into P Systems. Workshop on Membrane Computing and Biologically Inspired Process Calculi, 13-29, 2006.

Mobile Ambients and Mobile Membranes


3. O. Andrei, G. Ciobanu, D. Lucanu. Structural Operational Semantics of P Systems. Workshop on Membrane Computing, Lecture Notes in Computer Science vol. 3850, Springer, 32-49, 2006.
4. D. Besozzi, C. Zandron, G. Mauri, N. Sabadini. P Systems with Gemmation of Mobile Membranes. Italian Conference on Theoretical Computer Science, Lecture Notes in Computer Science vol. 2202, Springer, 136-153, 2001.
5. N. Busi, G. Zavattaro. On the expressive power of movement and restriction in pure mobile ambients. Foundations of Wide Area Network Computing, Theoretical Computer Science vol. 322(3), Elsevier, 477-515, 2004.
6. L. Cardelli, A. Gordon. Mobile Ambients. Theoretical Aspects of Computer Software, Lecture Notes in Computer Science vol. 1378, Springer, 140-155, 1998.
7. G. Ciobanu, Gh. Păun, M.J. Pérez-Jiménez. Applications of Membrane Computing, Springer, 2006.
8. D. Hirschkoff, E. Lozes, D. Sangiorgi. Separability, expressiveness, and decidability in the ambient logic. IEEE Symposium on Logic in Computer Science, IEEE Computer Society Press, 423-432, 2002.
9. S.N. Krishna. The Power of Mobility: Four Membranes Suffice. Computability in Europe: New Computational Paradigms, Lecture Notes in Computer Science vol. 3526, Springer, 242-251, 2005.
10. S.N. Krishna. Upper and Lower Bounds for the Computational Power of P Systems with Mobile Membranes. Computability in Europe: Logical Approaches to Computational Barriers, Lecture Notes in Computer Science vol. 3988, Springer, 526-535, 2006.
11. S.N. Krishna, Gh. Păun. P Systems with Mobile Membranes. Natural Computing, vol. 4(3), Springer, 255-274, 2005.
12. R. Milner. Communicating and Mobile Systems: the π-calculus, Cambridge University Press, 1999.
13. Gh. Păun. Computing with membranes. Journal of Computer and System Sciences, vol. 61(1), 108-143, 2000.
14. Gh. Păun. Membrane Computing. An Introduction, Springer, 2002.

Natural Deduction and Normalisation for Partially Commutative Linear Logic and Lambek Calculus with Product Maxime Amblard and Christian Retoré LaBRI & INRIA-futurs, Université de Bordeaux, France {amblard,retore}@labri.fr

Abstract. This paper provides a natural deduction system for Partially Commutative Intuitionistic Multiplicative Linear Logic (PCIMLL) and establishes its normalisation and subformula properties. Such a system involves both commutative and non commutative connectives and deals with contexts that are series-parallel multisets of formulæ. This calculus is the extension, presented by the second author for modelling Petri net execution, of the one introduced by de Groote, with a full entropy rule which allows the order to be relaxed into any suborder, as opposed to the Non Commutative Logic of Abrusci and Ruet. Our result also includes, as a special case, the normalisation of natural deduction within the Lambek calculus with product, which is unsurprising but heretofore unproven. Up to now, PCIMLL with full entropy had no natural deduction system. For linguistic applications in particular, such a syntax seems well-suited for constructing semantic representations from syntactic analyses.

1 Presentation

Non commutative logics arise naturally both in mathematics and in the modelling of some real world phenomena. Mathematically, non commutativity is natural both from the truth-valued semantics viewpoint (phase semantics, based on monoids which can be non commutative) and from a syntactic one (sequent calculus with sequences rather than sets of formulæ, proof nets which can have well-bracketed axiom links). Non commutativity also appears in real world applications such as concurrency theory, like the concurrent execution of Petri nets, and computational linguistics; this goes back to the fifties and the appearance of the Lambek calculus. We first give a brief presentation of non commutative logics and then stress their interest for concurrency and computational linguistics.

Non commutative linear logics. Linear logic [6] offered a logical view of the Lambek calculus [9] and non commutative calculi. For many years, the difficulty was to integrate commutative connectives and non commutative connectives. A first solution, without a term calculus, was Pomset Logic, now studied within an extended sequent calculus called the Calculus of Structures [7]. Another kind of calculus was introduced as a sequent calculus by de Groote in [5], which has to be intuitionistic to work neatly. It consists of a superposition of the Lambek calculus (non commutative) and of Intuitionistic Linear Logic (commutative). For making a distinction between the two connectives it is necessary that the context includes two different commas mimicking the conjunctions, one being commutative and the other being non commutative. Hence we deal with series-parallel partial orders over multisets of formulæ as sequents on the right hand side. Let us write (. . . , . . .) for the parallel composition and ⟨. . . ; . . .⟩ for the non commutative one: hence ⟨{a, b}; {c, d}⟩ stands for the finite partial order in which a and b both precede c and d.
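The two compositions just described can be made concrete with a small sketch. The encoding of a series-parallel order as a pair (elements, strict-order pairs) is a hypothetical illustration, not part of the calculus itself.

```python
# Series-parallel partial orders over formulae, encoded as
# (elements, strict-order pairs).  Parallel composition (Γ, Δ) merges the
# orders; series composition ⟨Γ; Δ⟩ additionally puts every element of Γ
# before every element of Δ.

def par(g, d):
    return (g[0] | d[0], g[1] | d[1])

def ser(g, d):
    elems = g[0] | d[0]
    pairs = g[1] | d[1] | {(x, y) for x in g[0] for y in d[0]}
    return (elems, pairs)

def atom(x):
    return ({x}, set())

# ⟨{a, b}; {c, d}⟩: a, b both below c, d; a and b (and c and d) incomparable.
sp = ser(par(atom("a"), atom("b")), par(atom("c"), atom("d")))
print(sorted(sp[1]))  # -> [('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')]
```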
Now if for some time bound g we had depth^g_dim(α) = 0, then there would exist a bound S, with S = o(n), such that infinitely often depth^{t(n)}(α_n) < S(n). This is absurd, and therefore depth^t_dim(α) > 0 for every recursive time bound t.

Computational Depth of Infinite Strings Revisited


Conversely, if depth^t_dim(α) > 0 then there is some ε > 0 such that for almost all n, depth^{t(n)}_dim(α_n) > εn. This implies that ldepth^{s(n)}(α_n) > ldepth^{εn}(α_n) > t(n) for every significance function s = o(n) and almost all n. So α is super deep.

In [JLL94] several characterizations of strong computational depth are obtained. Following the ideas in [JLL94], we can prove analogous characterizations for super deepness.

Theorem 3. For every sequence α the following conditions are equivalent.
1. α is super deep.
2. For every recursive time bound t : N → N and every significance function g = o(n), depth^t(α_n) > g(n) a.e.
3. For every recursive time bound t : N → N and every significance function g = o(n), K^t(α_n) − K(α_n) > g(n) a.e.
4. For every recursive time bound t : N → N and every significance function g = o(n), Q(α_n) ≥ 2^{g(n)} Q^t(α_n) a.e.

In [JLL94] the authors proved that every weakly useful sequence is strongly deep. Following the ideas in [JLL94], we can also prove that every weakly useful sequence is super deep.

Theorem 4. Every weakly useful sequence is super deep.

Corollary 1. The characteristic sequences of the halting problem and the diagonal halting problem are super deep.

References
[ACMV07] L. Antunes, A. Costa, A. Matos and P. Vitányi. Computational Depth: A Unifying Approach. Submitted, 2007.
[AFPS06] L. Antunes, L. Fortnow, A. Pinto, and A. Souto. Low-depth witnesses are easy to find. Technical Report TR06-125, ECCC, 2006.
[AF05] L. Antunes and L. Fortnow. Time-bounded universal distributions. Technical Report TR05-144, ECCC, 2005.
[AFMV06] L. Antunes, L. Fortnow, D. van Melkebeek and N. V. Vinodchandran. Computational depth: concept and applications. Theor. Comput. Sci., 354(3):391-404, 2006.
[Ben88] C. H. Bennett. Logical depth and physical complexity. In R. Herken, editor, The Universal Turing Machine: A Half-Century Survey, pages 227-257. Oxford University Press, 1988.
[FLMR05] S. A. Fenner, J. H. Lutz, E. Mayordomo and P. Reardon. Weakly useful sequences. Information and Computation, 197:41-54, 2005.
[Gac74] P. Gács. On the symmetry of algorithmic information. Soviet Math. Dokl., 15:1477-1480, 1974.
[JLL94] D. W. Juedes, J. I. Lathrop and J. H. Lutz. Computational depth and reducibility. Theoret. Comput. Sci., 132:37-70, 1994.
[LL99] J. I. Lathrop and J. H. Lutz. Recursive computational depth. Information and Computation, 153:139-172, 1999.
[Lev73] L. A. Levin. Universal search problems. Problems Inform. Transmission, 9:265-266, 1973.
[Lev74] L. A. Levin. Laws of information conservation (nongrowth) and aspects of the foundation of probability theory. Probl. Inform. Transm., 10:206-210, 1974.
[Lev80] L. A. Levin. A concept of independence with applications in various fields of mathematics. MIT, Laboratory for Computer Science, 1980.
[Lev84] L. A. Levin. Randomness conservation inequalities: information and independence in mathematical theories. Information and Control, 61:15-37, 1984.


Luís Antunes, Armindo Costa, Armando Matos, and Paul Vitányi

[Li03] M. Li, X. Chen, X. Li, B. Ma and P. M. B. Vitányi. The similarity metric. In Proceedings of the 14th ACM-SIAM Symposium on Discrete Algorithms, 2003.
[LV97] M. Li and P. M. B. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications. Springer, 2nd edition, 1997.
[Lut00] J. H. Lutz. Dimension in complexity classes. In Proceedings of the 15th IEEE Conference on Computational Complexity, IEEE Computer Society Press, 2000.
[Lut02] J. H. Lutz. The dimensions of individual strings and sequences. Technical Report cs.CC/0203017, ACM Computing Research Repository, 2002.
[May02] E. Mayordomo. A Kolmogorov complexity characterization of constructive Hausdorff dimension. Information Processing Letters, 84:1-3, 2002.

A New Approach to the Uncounting Problem for Regular Languages
Kostyantyn Archangelsky
SRSC ‘Algorithm-center’, P.O. Box 129, 04070 Kyiv, Ukraine
[email protected]

Abstract. This work presents an algebraic method for studying the counting problem for regular languages, based on embedding the rational formal power series over a free noncommutative semigroup (FPS) into the Malcev-Neumann skew field of ordered series over the free noncommutative group (MNOS). This approach allows one to obtain old and new results on counting and uncounting, as well as on related problems concerning, for example, Fatou extensions. Let L be a regular language over an alphabet Σ = {σ_1, σ_2, . . . , σ_t}, t ≥ 2, and let l_n be the number of words from L of length n, i.e., its counting function. Such a function must satisfy a linear recurrence relation: l_n = l_{n−1} α_1 + . . . + l_{n−m} α_m, n ≥ m. By the uncounting problem we mean the problem of computing the inverse of the counting function, i.e., finding all regular languages with a given counting function. This problem is still open (Goldwurm [10]; Choffrut, Goldwurm [6]; Ravikumar, Eisman [15]; Bassino, Beal, Perrin [3]; Shallit [17]). We describe a family of relations between regular languages in the MNOS skew field which are the preimage of the linear recurrence relation of the counting function under the unary morphism u(σ_i) = z, i = 1, . . . , t. This approach gives fresh tools for investigating well-known open counting problems (Beal, Perrin [4]; Ravikumar, Eisman [15]) as well as generating new ones.

1 Introduction

We use standard notations from the books of Kuich, Salomaa [13], Berstel, Reutenauer [5] and Cohn [7]. In particular, it will be assumed that Σ = {σ_1, σ_2, . . . , σ_t} is a finite alphabet and Σ⁻¹ = {σ_1⁻¹, σ_2⁻¹, . . . , σ_t⁻¹}; ε is the empty word and the unity in the semigroup Σ* and in the group G_Σ generated by Σ; ∅ is the empty set and the zero in the semirings and fields generated by Σ; ε, σ_i, σ_i⁻¹ are the corresponding characteristic FPS; k is a commutative zero-divisor-free semiring embeddable in a commutative field K (this includes the semirings N, Z, Q, R, C); k⟪Σ*⟫ is the semiring of FPS over the monoid Σ*; k⟨Σ*⟩ is the semiring of polynomials over the monoid Σ*; c(Σ*) is the free commutative semigroup generated by the alphabet Σ. Each element w ∈ c(Σ*) can be represented in the form σ_1^{|w|_{σ_1}} · · · σ_t^{|w|_{σ_t}}. Let us observe that c can be viewed as a morphism mapping Σ* into the free commutative semigroup; it is therefore reasonable to call c(Σ*) the commutative image of Σ*. We denote by u(Σ*) the cyclic semigroup z* generated by the letter z, and correspondingly by u : Σ* → z* the unary morphism defined as u(σ_1) = . . . = u(σ_t) = z. Surely, the unary morphism is defined on the commutative semigroup too. We extend c and u in the usual fashion to semiring morphisms mapping the semiring k⟪Σ*⟫ into k⟪c(Σ*)⟫ and k⟪u(Σ*)⟫, respectively. Let us note that an FPS in commutative variables is a mapping of c(Σ*) into k.

Let A = ⟨Q, Σ, q_0, δ, F⟩ be a deterministic finite state automaton recognizing a language L and, for any state q ∈ Q and every n ∈ N, let q_n denote the number of words w ∈ Σ* such that δ(q, w) ∈ F and |w| = n. It is known that each sequence {q_n}_{n≥0} satisfies a linear recurrence relation of the form

    q_{n+k} = q_{n+k−1} α_1 + . . . + q_n α_k ,    n ≥ 0    (1)


where k depends only on A and α_i ∈ Z, i = 1, . . . , k (see Huynh [11]; Katayama, Okamoto, Enomoto [12]; Bassino, Beal, Perrin [3]). We call such a sequence the counting function of the finite automaton A.

Example 1. Let A be the following finite automaton [figure not reproduced; A is the two-state automaton over {a, b, c} whose transition matrix is given in Example 2]:

Its counting functions must satisfy the following relations:

    p_0 = 1,    p_{n+1} = p_n + q_n ,    n ≥ 0    (2)
    q_0 = 1,    q_{n+1} = q_n + p_n ,    n ≥ 0    (3)

By substituting (2) into (3) we obtain

    p_{n+2} − p_{n+1} = p_{n+1} − p_n + p_n ,    n ≥ 0,

and finally

    p_0 = 1,  p_1 = 2,    p_{n+2} = p_{n+1} · 2 + p_n · 0,    n ≥ 0.
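The derivation above can be checked mechanically. A quick sketch (using the transition matrix of A as given in Example 2, and assuming both states accepting, which is consistent with p_0 = q_0 = 1):

```python
# Count accepted words of each length via the integer transition-count
# matrix of the automaton of Examples 1-2: state 1 -a-> 1, 1 -b-> 2,
# 2 -b-> 1, 2 -c-> 2, with both states final.
M = [[1, 1], [1, 1]]          # M[i][j] = number of letters from state i to j
final = [1, 1]                # both states accepting: F = (eps, eps)

def counts(n_max):
    p = []                    # p[n] = words of length n accepted from state 1
    v = final[:]              # v = M^n * F, built up iteratively
    for _ in range(n_max + 1):
        p.append(v[0])
        v = [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]
    return p

p = counts(7)
print(p)                      # -> [1, 2, 4, 8, 16, 32, 64, 128]
assert all(p[n + 2] == 2 * p[n + 1] + 0 * p[n] for n in range(6))
```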

¤

The foundations of a new way of representing FPS and regular languages, different from the “recognizable” and the “rational” ones, have been laid out in recent papers by the author [2]. We claim that this representation is a preimage of (1) under the unary morphism. We will need the following new definitions and results (Archangelsky [2], Cohn [7]).

We consider a total order ≤ on G_Σ compatible with its group structure (Cohn [7]). Let K((G_Σ)) denote the ring of Malcev-Neumann ordered series on G_Σ over K relative to this order: an element of K((G_Σ)) is an infinite series S̃ = Σ_{g∈G_Σ} k_g g, with k_g in K, such that the support of S̃, that is, the set {g ∈ G_Σ | k_g ≠ 0}, is a well-ordered subset of G_Σ. Recall that a well-ordered set is an ordered set in which each nonempty subset has a minimum. The element k_g g with g = min(supp(S̃)) is called the leading term of S̃.

Such a definition makes it possible to work with arbitrary expressions from the set K⟪G_Σ⟫. Note that K⟪G_Σ⟫ is not a semiring because, for example, (a + a⁻¹)* ∈ K⟪G_Σ⟫ but does not define an FPS. Indeed, the coefficient of each monomial of this FPS would have to be infinite, because every monomial has infinitely many factorizations of a given degree (e.g., the coefficient of ε must be a sum of the coefficients of a⁻¹a, aa⁻¹, a⁻¹a²a⁻¹, a²a⁻², . . .). On the other hand, using the total order on G_Σ we can rewrite it as follows:

    (a + a⁻¹)* = (ε − a − a⁻¹)⁻¹ = (−a⁻¹ + ε − a)⁻¹

Now a⁻¹ < ε < a, the support is well-ordered, and the inverse of (−a⁻¹ + ε − a) can be found by the usual rule (Cohn [7]):

    (−a⁻¹ + ε − a)⁻¹ = (−a⁻¹(ε − a + a²))⁻¹ = (ε − a + a²)⁻¹(−a) = −(a − a²)* a
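The inversion above can be sanity-checked with truncated Laurent series in a single variable. This is a rough numeric sketch: with one generator the computation is effectively commutative, while the real construction lives in the noncommutative skew field K((G_Σ)).

```python
# Truncated Laurent series in one variable a, as dicts {exponent: coefficient}.
# We check that -(a - a^2)* a really is the inverse of (-a^{-1} + eps - a),
# up to the truncation degree N.
N = 12

def mul(s, t):
    out = {}
    for e1, c1 in s.items():
        for e2, c2 in t.items():
            if e1 + e2 <= N:
                out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def add(s, t):
    out = dict(s)
    for e, c in t.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def star(x):            # x* = eps + x + x^2 + ...; x must have positive order
    acc = {0: 1}
    term = {0: 1}
    for _ in range(N + 1):
        term = mul(term, x)
        acc = add(acc, term)
    return acc

inv = mul({0: -1}, mul(star({1: 1, 2: -1}), {1: 1}))   # -(a - a^2)* a
prod = mul({-1: -1, 0: 1, 1: -1}, inv)                 # (-a^{-1} + eps - a) * inv
# Below the truncation boundary, the product is exactly eps:
assert {e: c for e, c in prod.items() if e < N} == {0: 1}
```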


2 Automata and MNOS

According to Kuich, Salomaa [13], every FPS r ∈ k^rat⟪Σ*⟫ can be represented as the behaviour of a k-Σ*-automaton A = ⟨{q_1, q_2, . . . , q_n}, A, q_1, F⟩, where A ∈ k^{n×n}⟨Σ⟩ is the transition matrix, q_1 the initial state, and F ∈ k^{n×1}⟨{ε, ∅}⟩ the vector of final states:

    r_A = Σ_{i=0}^{∞} (A^i F)_1 .    (4)

We denote q_i^{(j)} = (A^j F)_i ; then r_A = Σ_{i=0}^{∞} q_1^{(i)} and

    q_i^{(j)} = Σ_{s=1}^{n} A_{is} q_s^{(j−1)} .    (5)

Let us consider a system of n equations with n + 1 unknowns X_0, X_1, . . . , X_n:

    q_i^{(n)} X_n = Σ_{j=1}^{n} q_i^{(n−j)} X_{n−j} ,    i = 1, . . . , n.    (6)

Theorem 1. (Archangelsky [2]) A solution of the system (6) always exists in K((G_Σ)), of the form

    (X̃_0, X̃_1, . . . , X̃_p, ε, ∅, . . . , ∅),    1 ≤ p ≤ n − 1,    (7)

where some of the X̃_i may also be ∅, and deg(X̃_{n−i}) = i. ¤

Example 2. We can view the finite automaton A from Example 1 as a Z-{a, b, c}*-automaton with coefficient 1 on every letter and transition matrix

    A = ( a  b )
        ( b  c )

Let us construct a system of kind (6) for this automaton:

    P^{(0)} = ε,  P^{(1)} = a + b,  P^{(2)} = a² + ab + bc + b²
    Q^{(0)} = ε,  Q^{(1)} = c + b,  Q^{(2)} = c² + cb + ba + b²

    (a² + ab + bc + b²) X_2 = (a + b) X_1 + X_0
    (c² + cb + ba + b²) X_2 = (c + b) X_1 + X_0

To work in Q((G_{a,b,c})) we must fix an order: ε < a < b < c. Let us subtract the 2nd equation from the 1st:

    (a² + ab + bc − c² − cb − ba) X_2 = (a − c) X_1

and multiply both sides of the equation above on the left by a⁻¹:

    (a + b + a⁻¹bc − a⁻¹c² − a⁻¹cb − a⁻¹ba) X_2 = (ε − a⁻¹c) X_1 .

Since ε < a⁻¹c, the polynomial (ε − a⁻¹c) ∈ Q((G_{a,b,c})) has an inverse, (a⁻¹c)*:

    (a⁻¹c)* (a + b + a⁻¹bc − a⁻¹c² − a⁻¹cb − a⁻¹ba) X_2 = X_1    (8)


Substituting (8) into the 1st equation of the system:

    X_0 = ( a² + ab + bc + b² − (a + b)(a⁻¹c)*(a + b + a⁻¹bc − a⁻¹c² − a⁻¹cb − a⁻¹ba) ) X_2    (9)

Since the coefficient of X_1 is not zero, we may set (following Archangelsky [2]) X_2 = ε and find X_1, X_0:

    X_1 = (a⁻¹c)* (a + b + a⁻¹bc − a⁻¹c² − a⁻¹cb − a⁻¹ba)
    X_0 = a² + ab + bc + b² − (a + b)(a⁻¹c)* (a + b + a⁻¹bc − a⁻¹c² − a⁻¹cb − a⁻¹ba)  ¤

Theorem 2. Let B = ⟨{q_1, . . . , q_n}, B, q_1, F⟩ be a K-Σ*-automaton, and let X_0, X_1, . . ., X_{n−1} be a solution of the system of equations of kind (6) for the automaton B. Then for all m ∈ N,

    q_i^{(n+m)} = Σ_{j=1}^{n} q_i^{(n+m−j)} X_{n−j} ,    i = 1, . . . , n.    (10)

Proof. For m = 0 the statement is already proved; suppose it is true for m. Then for m + 1,

    q_i^{(n+m+1)} = Σ_{j=1}^{n} B_{ij} q_j^{(n+m)}    (by (5))
                  = Σ_{j=1}^{n} B_{ij} ( Σ_{l=1}^{n} q_j^{(n+m−l)} X_{n−l} )    (by the induction hypothesis)
                  = Σ_{l=1}^{n} ( Σ_{j=1}^{n} B_{ij} q_j^{(n+m−l)} ) X_{n−l}
                  = Σ_{l=1}^{n} q_i^{(n+m+1−l)} X_{n−l}    (by (5)).  ¤

We call a representation of an FPS q_i in the form (10) a quasilinear recurrence representation (QLR-representation).

Example 3. We continue with the automaton A from Examples 1 and 2. One can check that A accepts the following words of length 3:

    P^{(3)} = a³ + a²b + abc + ab² + bc² + bcb + b²a + b³.

Let us try to compute P^{(3)} by the QLR-representation, writing W = a + b + a⁻¹bc − a⁻¹c² − a⁻¹cb − a⁻¹ba for short:

    P^{(3)} = P^{(2)} X_1 + P^{(1)} X_0
            = (a² + ab + bc + b²)(a⁻¹c)* W + (a + b)( a² + ab + bc + b² − (a + b)(a⁻¹c)* W )
            = ( a² + ab + bc + b² − (a + b)² )(a⁻¹c)* W + (a + b)(a² + ab + bc + b²)
            = −ba (ε − a⁻¹c)(a⁻¹c)* W + (a + b)(a² + ab + bc + b²)
            = −ba W + (a + b)(a² + ab + bc + b²)
            = −ba² − bab − b²c + bc² + bcb + b²a + a³ + a²b + abc + ab² + ba² + bab + b²c + b³
            = a³ + a²b + abc + ab² + bc² + bcb + b²a + b³ = P^{(3)} .


As one can see, P^{(3)} contains no terms with σ⁻¹ and no negative coefficients, as claimed. ¤

On the other hand, the MNOS X_0 and X_1 have infinite supports, i.e., their expressions do not cancel into polynomials as above. To illustrate this, let us write down explicitly the first few members of the expansion (a⁻¹c)* = ε + a⁻¹c + a⁻¹ca⁻¹c + . . . in, say, X_1 (the terms −a⁻¹cb and +a⁻¹cb cancel):

    X_1 = a + b + a⁻¹bc − a⁻¹c² − a⁻¹ba + a⁻¹ca + a⁻¹ca⁻¹bc − a⁻¹ca⁻¹c² − . . .

Let us compare the counting function {p_n} and the QLR-representation {P^{(n)}} of the automaton A:

    p_{n+2} = p_{n+1} · 2 + p_n · 0
    P^{(n+2)} = P^{(n+1)} · X_1 + P^{(n)} · X_0

Since u(P^{(n)}) = p_n zⁿ, n ≥ 0, it looks like we should have u(X_1) = 2 · z and u(X_0) = 0 · z². But when we apply the unary morphism to, for example, X_1, we obtain

    u(X_1) = (z⁻¹z)* (2z + z⁻¹z² − 3z⁻¹z²) = ε* · ∅ · z = (ε − ε)⁻¹ · ∅ · z = ∞ · ∅ · z.    (11)

The author has no explanation for this paradox. (The theoretical possibility of such a situation was predicted in Amitsur [1], Reutenauer [16], Gelfand, Retakh [9].) We propose, instead of applying the unary morphism directly to X_i ∈ K((G_Σ)), to first apply the commutative morphism and then apply the unary morphism to the result. Admittedly, this method has no solid rationale behind it, and in the case of a negative result like (11) it would not bring us closer to resolving the problem. Still,

    c(X_0) = b² − ac,    c(X_1) = a + c

and

    u(c(X_0)) = 0 · z²,    u(c(X_1)) = 2 · z,

as expected. After the unsuccessful attempt at a direct calculation of u(X_0) and u(X_1), it turned out quite unexpectedly that the commutative images of such complicated objects as MNOS are ordinary polynomials. Moreover, these polynomials happen to be precisely the coefficients of the characteristic equation of the commutative image of the matrix A of the automaton A:

    det ( λ − a    −b  ) = λ² − λ(a + c) − (b² − ac) = 0.
        (  −b    λ − c )

We will show that this fact holds true in general.

3 The principal result

Theorem 3. If a sequence {v_k}_{k≥0} of natural numbers satisfying a linear recurrence relation

    v_{k+n} = Σ_{i=1}^{n} v_{k+n−i} β_i ,    k ≥ 0,    (12)

is the counting function of a regular language L over the alphabet Σ, and L^{(i)} = Σ_{w∈L, |w|=i} w, then there exist X_i ∈ Z((G_Σ)), i = 1, . . . , n, such that

    L^{(k+n)} = Σ_{i=1}^{n} L^{(k+n−i)} X_i ,    k ≥ 0,
    u(c(L^{(i)})) = v_i z^i ,    i ≥ 0,
    u(c(X_i)) = β_i z^i ,    i = 1, . . . , n.

Proof. We continue with the notation of Theorems 1 and 2. Let the matrix A = {a_ij}_{i,j=1,n} ∈ Z^{n×n}⟨Σ⟩ and the vector F = (f_i)^T_{i=1,n} ∈ Z^n⟨Σ⟩. Denote {a_ij^{(k)}}_{i,j=1,n} = A^k for k ≥ 1, and {a_ij^{(0)}}_{i,j=1,n} = E (the unit matrix: a_ij^{(0)} = ε if i = j, ∅ otherwise). Let

    det( c(Eλ − A) ) = λⁿ − Σ_{i=1}^{n} λ^{n−i} α_i .    (13)

Note that α_i ∈ Z⟨c(Σ*)⟩, i = 1, . . . , n. According to the Cayley-Hamilton theorem, c(A) must be a root of equation (13):

    c(A)ⁿ = Σ_{i=1}^{n−1} c(A)^{n−i} α_i + E α_n ,

or, in other words,

    c(a_ij^{(n)}) = Σ_{s=1}^{n−1} c(a_ij^{(n−s)}) α_s + c(a_ij^{(0)}) α_n ,    i, j = 1, . . . , n.    (14)

On the other hand, the following relations of the system from Theorem 2 hold in K((G_Σ)):

    q_i^{(n)} = Σ_{j=1}^{n} q_i^{(n−j)} X_{n−j} ,

or, substituting the iterated (5),

    Σ_{j=1}^{n} a_ij^{(n)} f_j = Σ_{s=1}^{n} Σ_{j=1}^{n} a_ij^{(n−s)} f_j X_s ,    i = 1, . . . , n.

Let us find the value of, say, c(X_1). Since we work in the commutative image, c(X_1) is a quotient of two determinants (we denote the columns a^{(k)} = ( Σ_{i=1}^{n} a_{1i}^{(k)} f_i , . . . , Σ_{i=1}^{n} a_{ni}^{(k)} f_i )^T, k = 0, . . . , n):

    c(X_1) = det( c(a^{(n)}), c(a^{(n−2)}), . . . , c(a^{(0)}) ) / det( c(a^{(n−1)}), c(a^{(n−2)}), . . . , c(a^{(0)}) )
           = det( Σ_{s=1}^{n} α_s c(a^{(n−s)}), c(a^{(n−2)}), . . . , c(a^{(0)}) ) / det( c(a^{(n−1)}), c(a^{(n−2)}), . . . , c(a^{(0)}) )    (by (14))
           = Σ_{s=1}^{n} α_s det( c(a^{(n−s)}), c(a^{(n−2)}), . . . , c(a^{(0)}) ) / det( c(a^{(n−1)}), c(a^{(n−2)}), . . . , c(a^{(0)}) ) = α_1 ,

since every determinant with a repeated column vanishes, leaving only the summand with s = 1. The other equalities c(X_i) = α_i can be proved in the same way. Since the unary morphism applied to α_i ∈ Z⟨c(Σ*)⟩ is well defined and its image is an integer, this finishes the proof of the principal result of the paper. ¤

Corollary 1. If an integer sequence {v_k}_{k≥0} satisfies a linear recurrence relation (12) in which all β_i are rational numbers, then {v_k}_{k≥0} satisfies a (possibly different) linear recurrence relation in which all the coefficients are actually integers. (This is a well-known lemma of Fatou; see Fatou [8], Polya, Szego [14], Shallit [17].)
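Under u∘c every letter becomes z, so for the running example the α_i reduce to the characteristic-polynomial coefficients of the integer letter-count matrix. A sketch of computing them (Faddeev-LeVerrier algorithm in plain Python; the matrix below is u(c(A)) for the automaton of Examples 1-3):

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Coefficients of det(lambda*I - A) = lambda^n + c1*lambda^(n-1) + ... + cn,
# computed by the Faddeev-LeVerrier recursion over exact rationals.
def charpoly(A):
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]
    Mk = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        Mk = matmul(M, Mk)
        c = -sum(Mk[i][i] for i in range(n)) / k
        coeffs.append(c)
        for i in range(n):
            Mk[i][i] += c
    return coeffs

# u(c(A)) for the two-state automaton: every letter maps to z, so the
# letter-count matrix is [[1, 1], [1, 1]].
print(charpoly([[1, 1], [1, 1]]))  # -> [Fraction(1, 1), Fraction(-2, 1), Fraction(0, 1)]
# lambda^2 - 2*lambda + 0: the recurrence p_{n+2} = 2 p_{n+1} + 0 p_n of Example 1.
```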

4 Implementation, application and open problems

Size limitations of this presentation do not allow us to provide all the results obtained in our research on the uncounting problem. However, the extensive material collected on the QLR-representation of regular languages makes it possible to formulate

Conjecture 1. There exists an effective algorithm for describing all the regular languages with a given counting function. ¤

The use of QLR-representations is considered even more promising for solving the following elegant open problem (Beal, Perrin [4]):

Open problem 1. Suppose that we are given a regular language L and two natural sequences x_n, y_n, each satisfying a (possibly different) linear recurrence relation, such that x_n + y_n is the counting function of L. Is it true that there exists a partition L = X ∪ Y such that x_n is the counting function of X and y_n is the counting function of Y?

Just as interesting would be the application of the QLR-representation to research on the weak equivalence property of finite automata (Ravikumar, Eisman [15]): two deterministic finite automata are weakly equivalent if they both accept the same number of words of length k for every k.

Investigation of Conjecture 1 leads to a new, simply formulated open problem:

Open problem 2. Let A be a minimal deterministic automaton with m states and let a_n be its counting function. Methods of linear algebra make it possible to build for a_n a recurrence relation of minimal length, say m_0. Surely m ≥ m_0. Can (m − m_0) be arbitrarily large?

And, finally, the definition of the QLR-representation itself generates the following open problems:

Open problem 3. Suppose that in relation (12) we have v_{i−1} ∈ k⟨Σ*⟩ and β_i ∈ K⟪Σ*⟫, i = 1, . . . , n. Find an algorithm to check whether: (1) v_i ∈ k⟨Σ*⟩, i ≥ 0; (2) Σ_{i=0}^{∞} v_i ∈ K^rat⟪Σ*⟫.

Open problem 4. Suppose that in relation (12) we have v_{i−1} ∈ k⟨Σ*⟩ and β_i ∈ K((G_Σ)), i = 1, . . . , n. Find an algorithm to check whether: (1) v_i ∈ k⟪Σ*⟫, i ≥ 0; (2) v_i ∈ k⟨Σ*⟩, i ≥ 0; (3) Σ_{i=0}^{∞} v_i ∈ K^rat⟪Σ*⟫.
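To make Open problem 2 concrete: the counting sequence 1, 2, 4, 8, . . . of Example 1 already satisfies a length-1 recurrence (p_{n+1} = 2 p_n), although it was produced by a two-state automaton. A naive fit-and-verify sketch for finding the minimal recurrence length by exact linear algebra (assumed to be given enough terms of the sequence):

```python
from fractions import Fraction

def solve(A, b):
    """Gauss-Jordan elimination over Fractions; returns x with A x = b, or None."""
    n = len(A)
    M = [[Fraction(A[i][j]) for j in range(n)] + [Fraction(b[i])] for i in range(n)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col] != 0), None)
        if piv is None:
            return None                      # singular system: no unique fit
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def min_recurrence_length(v):
    """Smallest m0 with v[k+m0] = sum_i c_i v[k+m0-i] holding on all of v."""
    for m0 in range(1, len(v) // 2):
        A = [[v[k + m0 - i] for i in range(1, m0 + 1)] for k in range(m0)]
        b = [v[k + m0] for k in range(m0)]
        c = solve(A, b)
        if c is not None and all(
            v[k + m0] == sum(c[i - 1] * v[k + m0 - i] for i in range(1, m0 + 1))
            for k in range(len(v) - m0)):
            return m0
    return None

print(min_recurrence_length([1, 2, 4, 8, 16, 32, 64, 128]))  # -> 1
```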

References
1. Amitsur A., Rational identities and applications to algebra and geometry, J. Algebra, 3, 1966, p. 304-359.
2. Archangelsky K.V., A New Representation of Formal Power Series, Proc. FPSAC 04, Vancouver, Canada, 2004.
3. Bassino F., Beal M.-P., Perrin D., Length distributions and regular sequences, in "Codes, Systems and Graphical Models", IMA Volumes in Mathematics and its Applications, Springer-Verlag, 2001, p. 415-437.
4. Beal M.-P., Perrin D., On the generating sequences of regular languages on k symbols, J. ACM, vol. 50, iss. 6, 2003, p. 955-980.
5. Berstel J., Reutenauer C., Rational Series and their Languages, Springer-Verlag, 1988.
6. Choffrut Ch., Goldwurm M., Rational transductions and complexity of counting problems, Mathematical System Theory, 28(5), 1995, p. 437-450.
7. Cohn P. M., Free Rings and Their Relations, 2nd edn., Acad. Press, 1985.
8. Fatou P., Series trigonometriques et series de Taylor, Acta Math., 30 (1906), 335-400.
9. Gelfand I., Retakh V., Quasideterminants, I, Selecta Mathematica, New Series, vol. 3, N 4, 1997, p. 517-546.
10. Goldwurm M., private e-mail, 09.06.05.
11. Huynh D. T., Effective entropies and data compression, Information and Computation, 90, 1991, p. 67-85.
12. Katayama T., Okamoto M., Enomoto H., Characterization of the Structure-Generating Functions of Regular Sets and the DOL Growth Functions, Information and Control, 36, 1978, p. 85-101.
13. Kuich W., Salomaa A., Semirings, Automata, Languages, 1985.
14. Polya G., Szego G., Problems and Theorems in Analysis, vol. II, Springer-Verlag, Berlin, 1976.
15. Ravikumar B., Eisman G., Weak minimization of DFA - an algorithm and applications, TCS, v. 328, iss. 1-2, 2004, p. 113-133.
16. Reutenauer C., Malcev-Neumann series and the free field, Expositiones Mathematicae, 17, 1999, p. 469-478.
17. Shallit J., Numeration Systems, Linear Recurrences, and Regular Sets, Information and Computation, 113(2), 1994, p. 331-347.

Paraconsistent Reasoning and Distance Minimization Ofer Arieli Department of Computer Science, The Academic College of Tel-Aviv, Israel. [email protected]

Abstract. We introduce a general framework that is based on distance semantics and investigate the main properties of the entailment relations that it induces. It is shown that such entailments are particularly useful for non-monotonic reasoning and for drawing rational conclusions from incomplete and inconsistent information. Some applications are considered in the context of belief revision, information integration systems, and consistent query answering for possibly inconsistent databases.

1 Introduction

Common-sense reasoning is frequently based on the ability to make plausible decisions among different options. This is particularly notable in the presence of inconsistency or incompleteness, where the reasoner's epistemic state may vary among different alternatives. Distance semantics is a subtle way of handling such situations, as it provides quantitative means for evaluating those epistemic states and for drawing rational conclusions from a given theory. It is no wonder, therefore, that distance semantics has played a prominent role in different paradigms for non-monotonic information processing and consistency maintenance, such as formalisms for modelling belief revision (e.g., [9, 15, 18, 25]), preference representation [24], database integration systems [2, 3, 11, 26], and operators for merging constraint data-sources [21, 22]. The goal of this paper is to introduce similar distance considerations in the context of common-sense reasoning in general, and paraconsistent logics in particular, that is, formalisms that tolerate inconsistency and do not become trivial in the presence of contradictions.¹ Classical logic, the most advocated formalism for reasoning with mathematical theories, is not useful for this task as, for instance, any conclusion classically follows from an inconsistent set of assumptions. Additionally, by its definition, classical logic is monotonic, while human thinking is non-monotonic in nature (that is, the set of conclusions need not grow as the set of premises grows). The underlying theme here is that human knowledge and thinking inevitably involve inconsistency: conflicting data is unavoidable in practice, but it corresponds to inadequate information about the real world, and therefore it should be minimized. As we show below, this intuition is nicely and easily expressed in terms of distance semantics.
The rest of this paper is organized as follows: in the next section we introduce the framework and the family of distance-based entailments that it induces. Then, in Section 3 we consider some basic properties of these entailments and in Section 4 we discuss their applications in relevant areas, such as operators for belief revision and consistent query answering in database systems. Finally, in Section 5 we briefly discuss some extensions to multiple-valued structures. In Section 6 we conclude.

2

Distance-based semantics and entailments

The intuition behind our approach is very simple. Suppose, for instance, that a certain set of assumptions Γ consists only of two facts p and q. In this case it seems reasonable to use

¹ See [12] and [30]. Some collections of papers on this topic appear, e.g., in [7, 10]. Distance-based semantics for paraconsistent reasoning is also considered in [4].


Ofer Arieli

the classical entailment for inferring the formulas in the transitive closure of Γ. If we learn now that ¬p also holds, classical logic becomes useless, as everything classically follows from Γ′ = Γ ∪ {¬p}. The decision how to maintain the inconsistent fragment of Γ′ depends on the underlying formalism. For example, most belief revision operators prefer more recent information, and thus conclude ¬p and exclude p in this case. Alternatively, many merging operators that view Γ and {¬p} as belief bases of two different sources will retract both p and ¬p, and so forth. It is evident, however, that ¬q should not follow from Γ′. In our context this is captured by the fact that valuations in which q holds are 'closer' to Γ′ (and thus more plausible) than valuations in which q is falsified. In what follows we formalize this idea.

In the sequel, unless otherwise stated, we shall consider finite theories (i.e., sets of premises, denoted by Γ) in a propositional language L with a finite set Atoms of atomic formulas. The space of the two-valued interpretations on Atoms is denoted by Λ. The set of atomic formulas that occur in the formulas of Γ is denoted Atoms(Γ), and the set of models of Γ (that is, the two-valued interpretations ν ∈ Λ such that ν(ψ) = t for every ψ ∈ Γ) is denoted mod(Γ).

Definition 1. A total function d : U × U → R⁺ is called a pseudo distance on U if it is symmetric (∀u, v ∈ U, d(u, v) = d(v, u)) and preserves identity (∀u, v ∈ U, d(u, v) = 0 iff u = v). A distance function on U is a pseudo distance on U that satisfies the triangular inequality (∀u, v, w ∈ U, d(u, v) ≤ d(u, w) + d(w, v)).

Example 1. The following two functions are distances on Λ:
– The Hamming distance: dH(ν, µ) = |{p ∈ Atoms | ν(p) ≠ µ(p)}|.²
– The drastic distance: dU(ν, µ) = 0 if ν = µ and dU(ν, µ) = 1 otherwise.

Definition 2. A numeric aggregation function f is a total function that accepts a multiset of real numbers and returns a real number. In addition, f is non-decreasing in the values of its argument,³ f({x1, ..., xn}) = 0 iff x1 = ... = xn = 0, and ∀x ∈ R, f({x}) = x.

The aggregation functions in Definition 2 may be, e.g., the summation or the average of the distances, the maximum value among those distances (which yields a worst-case analysis), a median value (for mean-case analysis), and so forth. Such functions are common, for instance, in data integration systems (see, e.g., Example 5 in Section 4.3).

Definition 3. Given a theory Γ = {ψ1, ..., ψn}, a two-valued interpretation ν, a pseudo distance d and an aggregation function f, define:
– d(ν, ψi) = min{d(ν, µ) | µ ∈ mod(ψi)},
– δd,f(ν, Γ) = f({d(ν, ψ1), ..., d(ν, ψn)}).

The next definition captures the intuition behind distance semantics that the relevant interpretations of a theory Γ are those that are δd,f-closest to Γ.

Definition 4. The most plausible valuations of Γ (with respect to a pseudo distance d and an aggregation function f) are the valuations ν that belong to the following set:

∆d,f(Γ) = {ν ∈ Λ | ∀µ ∈ Λ, δd,f(ν, Γ) ≤ δd,f(µ, Γ)}.

Corresponding consequence relations are now defined as follows.

² I.e., dH(ν, µ) is the number of atoms p s.t. ν(p) ≠ µ(p). This function is also known as the Dalal distance [13].
³ That is, the function value is non-decreasing when an element in the multiset is replaced by a larger element.
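Definitions 1–3 are straightforward to realize computationally. The following Python fragment is a minimal illustrative sketch (not part of the original paper): valuations are represented as dicts from atom names to booleans, and a formula is represented extensionally by its set of models.

```python
from itertools import product

def valuations(atoms):
    """All two-valued interpretations over the given atoms (the space Λ)."""
    return [dict(zip(atoms, bits)) for bits in product([True, False], repeat=len(atoms))]

def hamming(nu, mu):
    """dH: the number of atoms on which the two valuations differ."""
    return sum(nu[p] != mu[p] for p in nu)

def drastic(nu, mu):
    """dU: 0 if the valuations coincide, 1 otherwise."""
    return 0 if nu == mu else 1

def dist_to_formula(nu, models, d):
    """d(nu, psi) = min over the models of psi of d(nu, mu) (Definition 3)."""
    return min(d(nu, mu) for mu in models)

def delta(nu, gamma_models, d, f):
    """delta_{d,f}(nu, Gamma) = f of the distances to each premise."""
    return f([dist_to_formula(nu, ms, d) for ms in gamma_models])
```

Here `f` may be `sum` or `max`, matching the aggregation functions of Definition 2; the function names are of course ours, chosen for the illustration.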

Paraconsistent Reasoning and Distance Minimization


Definition 5. For a pseudo distance d and an aggregation function f, define Γ |=d,f ψ if ∆d,f(Γ) ⊆ mod(ψ). That is, conclusions should follow from all of the most plausible valuations of the premises.

Example 2. Consider Γ = {p, q, r, ¬p ∨ ¬q, r ∧ s}. This theory is not consistent, and so everything classically follows from it, including, e.g., ¬r, which seems to be a very strange conclusion in this case.⁴ Using distance-based semantics, this anomaly can be lifted. The following table lists the distances between the relevant valuations and Γ according to several common metrics:

        p  q  r  s   δdU,Σ  δdH,Σ  δdH,max
  ν1    t  t  t  t     1      1      1
  ν2    t  t  t  f     2      2      1
  ν3    t  t  f  t     3      3      1
  ν4    t  t  f  f     3      4      2
  ν5    t  f  t  t     1      1      1
  ν6    t  f  t  f     2      2      1
  ν7    t  f  f  t     3      3      1
  ν8    t  f  f  f     3      4      2
  ν9    f  t  t  t     1      1      1
  ν10   f  t  t  f     2      2      1
  ν11   f  t  f  t     3      3      1
  ν12   f  t  f  f     3      4      2
  ν13   f  f  t  t     2      2      1
  ν14   f  f  t  f     3      3      1
  ν15   f  f  f  t     4      4      1
  ν16   f  f  f  f     4      5      2
Here, ∆dU,Σ(Γ) = ∆dH,Σ(Γ) = {ν1, ν5, ν9}, thus Γ |=dU,Σ r and Γ |=dH,Σ r, while Γ ⊭dU,Σ ¬r and Γ ⊭dH,Σ ¬r. The same holds for s, as intuitively expected. Note also that neither the atoms p, q that are involved in the inconsistency nor their complements are deducible from Γ. The entailment |=dH,max is more cautious; it allows inferring neither ¬r (as expected) nor r, but the weaker conclusion r ∨ s is deducible.
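The claims of Example 2 can be checked by brute force over all sixteen valuations. The sketch below is illustrative code of ours (not the authors'): each formula of Γ is encoded as a Python predicate, and `most_plausible` minimizes δd,f with d = dH.

```python
from itertools import product

ATOMS = ['p', 'q', 'r', 's']

# Gamma = {p, q, r, ¬p ∨ ¬q, r ∧ s}, each formula as a predicate on valuations
GAMMA = [
    lambda v: v['p'],
    lambda v: v['q'],
    lambda v: v['r'],
    lambda v: not v['p'] or not v['q'],
    lambda v: v['r'] and v['s'],
]

def valuations():
    return [dict(zip(ATOMS, bits)) for bits in product([True, False], repeat=4)]

def hamming(nu, mu):
    return sum(nu[a] != mu[a] for a in ATOMS)

def dist(nu, psi):
    # d(nu, psi): distance to the closest model of psi
    return min(hamming(nu, mu) for mu in valuations() if psi(mu))

def delta(nu, f=sum):
    return f([dist(nu, psi) for psi in GAMMA])

def most_plausible(f=sum):
    vs = valuations()
    best = min(delta(v, f) for v in vs)
    return [v for v in vs if delta(v, f) == best]

def entails(psi, f=sum):
    # Gamma |=_{dH,f} psi iff every most plausible valuation satisfies psi
    return all(psi(v) for v in most_plausible(f))
```

With f = sum this recovers {ν1, ν5, ν9} as the most plausible valuations, so `entails(lambda v: v['r'])` holds while `entails(lambda v: not v['r'])` does not; with f = max, neither r nor ¬r is entailed, but r ∨ s is.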

3

Reasoning with |=d,f

The principle of uncertainty minimization by distance semantics, depicted in Definition 5, is in fact a preference criterion among different interpretations of the premises. In this respect, the formalisms defined here may be considered a certain kind of preferential logics [27, 28, 32], as only 'preferred' valuations (those that are 'as close as possible' to the set of premises) are taken into consideration when drawing conclusions from the premises. When the set of premises is classically consistent, its set of models is not empty, so it is natural to choose these valuations as the preferred (i.e., most plausible) ones. The following proposition shows that the models of a theory Γ are indeed closest to Γ.⁵

Proposition 1. Let Γ be a consistent theory. For every pseudo distance d and aggregation function f, ∆d,f(Γ) = mod(Γ).

Corollary 1. Let |= be the standard entailment of classical logic. For every classically consistent set of formulas Γ and formula ψ, Γ |= ψ iff Γ |=d,f ψ.

A characteristic property of distance-based entailments is that contradictions do not have an explosive character:

Proposition 2. For every pseudo distance d and aggregation function f, |=d,f is paraconsistent.

Corollary 1 and Proposition 2 imply the following desirable property of |=d,f:

⁴ Indeed, r is not part of the inconsistent fragment of Γ, so it is not sensible in this case to conclude its complement.
⁵ Due to space limitations, proofs will appear in an extended version of this paper.


Corollary 2. For every pseudo distance d and aggregation function f, |=d,f coincides with the classical entailment with respect to consistent premises, and is non-trivial otherwise.

For the next propositions we concentrate on unbiased distances:

Definition 6. A (pseudo) distance d is unbiased if for every formula ψ and every two two-valued interpretations ν1, ν2, if ν1(p) = ν2(p) for every p ∈ Atoms(ψ), then d(ν1, ψ) = d(ν2, ψ).

The last property ensures that the distance between an interpretation and a formula depends only on the relevant atoms (i.e., those that appear in the formula), so it is not 'biased' by irrelevant atoms. Note, e.g., that the distances in Example 1 are unbiased. Next we show that entailments defined by unbiased distances are non-monotonic, and so conclusions may be retracted in light of new information.

Proposition 3. For every unbiased pseudo distance d and aggregation function f, |=d,f is non-monotonic.

It is important to note that, often, non-monotonicity goes along with rationality, that is: previously drawn conclusions do not have to be revised in light of new information that has no influence on the existing set of premises. This is shown in Proposition 4 below:

Definition 7. An aggregation function f is hereditary if f({x1, ..., xn}) < f({y1, ..., yn}) implies f({x1, ..., xn, z1, ..., zm}) < f({y1, ..., yn, z1, ..., zm}).⁶

Proposition 4. Let d be an unbiased pseudo distance and f a hereditary aggregation function. If Γ |=d,f ψ then Γ, φ |=d,f ψ for every formula φ such that Atoms(Γ ∪ {ψ}) ∩ Atoms({φ}) = ∅.

Intuitively, the condition on φ in Proposition 4 guarantees that φ is 'irrelevant' to Γ and ψ. The intuitive meaning of Proposition 4 is, therefore, that the reasoner does not have to retract ψ upon learning that φ holds.⁷

We conclude this section by observing that, in general, a distance-based entailment of the form |=d,f does not satisfy any of the three properties that a Tarskian consequence relation [33] should have. Indeed, let d be an unbiased pseudo distance and f a hereditary aggregation function. Then:
1. Example 2 shows that |=d,f is not reflexive,
2. Proposition 3 shows that |=d,f is not monotonic, and
3. the cut rule is violated as well: by Corollary 1, p |=d,f ¬p → q and ¬p, ¬p → q |=d,f q, but, as is easily verified, p, ¬p ⊭d,f q.

Yet, for unbiased pseudo distances and hereditary functions, |=d,f does satisfy the weaker conditions stated in Definition 9 below, guaranteeing a 'proper behaviour' of non-monotonic entailments in the context of inconsistent information.

Definition 8. Denote by Γ = Γ′ ⊕ Γ″ that Γ can be partitioned into two subtheories Γ′ and Γ″ (i.e., Γ = Γ′ ∪ Γ″ and Atoms(Γ′) ∩ Atoms(Γ″) = ∅).

Definition 9. A cautious consequence relation is a relation |∼ between sets of formulae and formulae that satisfies the following conditions:
– cautious reflexivity: if Γ = Γ′ ⊕ Γ″ and Γ′ is consistent, then Γ |∼ ψ for every ψ ∈ Γ′.
– cautious monotonicity [16]: if Γ |∼ ψ and Γ |∼ φ, then Γ, ψ |∼ φ.
– cautious cut [23]: if Γ |∼ ψ and Γ, ψ |∼ φ, then Γ |∼ φ.

Proposition 5. For an unbiased pseudo distance d and a monotonic hereditary aggregation function f, |=d,f is a cautious consequence relation.

⁶ Note that heredity, unlike monotonicity, is defined by strict inequalities. Thus, for instance, summation is hereditary (as distances are non-negative), while the maximum function is not hereditary.
⁷ To see that the condition on f in Proposition 4 is indeed necessary, consider again the theory Γ in Example 2 and let Γ′ = {r, r ∧ s}. Then Γ′ |=dH,max r but Γ ⊭dH,max r.
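The failure of cut in item 3 above can be verified mechanically. The following sketch (our illustrative code, instantiating d = dH and f = Σ, with formulas as predicates) checks the three entailment claims; recall that ¬p → q is classically equivalent to p ∨ q.

```python
from itertools import product

def entails(gamma, psi, atoms):
    """Gamma |=_{dH,Σ} psi, with formulas given as predicates over valuations."""
    vs = [dict(zip(atoms, bits)) for bits in product([True, False], repeat=len(atoms))]
    hamming = lambda nu, mu: sum(nu[a] != mu[a] for a in atoms)
    dist = lambda nu, phi: min(hamming(nu, mu) for mu in vs if phi(mu))
    delta = lambda nu: sum(dist(nu, phi) for phi in gamma)
    best = min(delta(v) for v in vs)
    return all(psi(v) for v in vs if delta(v) == best)

p = lambda v: v['p']
not_p = lambda v: not v['p']
q = lambda v: v['q']
not_p_implies_q = lambda v: v['p'] or v['q']   # ¬p → q

atoms = ['p', 'q']
assert entails([p], not_p_implies_q, atoms)            # p |= ¬p → q
assert entails([not_p, not_p_implies_q], q, atoms)     # ¬p, ¬p → q |= q
assert not entails([p, not_p], q, atoms)               # ... but p, ¬p does not entail q
```

In the last case every valuation is at δ-distance 1 from {p, ¬p}, so all four valuations are most plausible and q is not entailed, exactly as the text argues.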


4


Some Applications

The general form of distance-based reasoning allows us to apply it in several areas. Below we show this in the context of three basic operations in information systems: repair (Section 4.1), revision (Section 4.2) and merging (Section 4.3). 4.1

Database Repair

Definition 10. A database DB is a pair (D, IC), where D (the database instance) is a finite set of atoms, and IC (the integrity constraints) is a finite and consistent set of formulae.

The meaning of D is determined by the conjunction of its facts, augmented with Reiter's closed world assumption [31], stating that each atomic formula that does not appear in D is false: CWA(D) = {¬p | p ∉ D}. A database DB = (D, IC) is thus associated with the following theory: ΓDB = D ∪ CWA(D) ∪ IC.

A database (D, IC) is consistent if all the integrity constraints are satisfied by the database instance, that is: D ∪ CWA(D) |= ψ for every ψ ∈ IC. When a database is not consistent, at least one integrity constraint is violated, and so it is usually required to 'repair' the database, i.e., to restore its consistency. Clearly, the repaired database instance should be consistent and at the same time as close as possible to D. This can be described in our framework as follows: given a pseudo distance d and an aggregation function f, we consider for every database DB the following set of (most plausible) interpretations (cf. Definition 4, where Λ is replaced by mod(IC)):

∆d,f(ΓDB) = {ν ∈ mod(IC) | ∀µ ∈ mod(IC), δd,f(ν, D ∪ CWA(D)) ≤ δd,f(µ, D ∪ CWA(D))}.

Again, we denote DB |=d,f ψ if ∆d,f(ΓDB) ⊆ mod(ψ). Now we can represent consistent query answering [2, 3] in our framework:

Definition 11. Let DB be a (possibly inconsistent) database, and let ψ be a formula in L.
– If ∆d,f(ΓDB) ∩ mod(ψ) ≠ ∅ (i.e., ψ is satisfied by some most plausible interpretation of ΓDB), we say that ψ credulously follows from DB.
– If DB |=d,f ψ (i.e., ψ is satisfied by every most plausible interpretation of ΓDB), then ψ conservatively follows from DB.

Example 3. Let D = {p, r} and IC = {p → q}. Here, ΓDB = {p, r, ¬q, p → q}. When d is the drastic distance and f is the summation function, ∆d,f(ΓDB) has two elements: ν1(p) = t, ν1(q) = t, ν1(r) = t, and ν2(p) = f, ν2(q) = f, ν2(r) = t.
In terms of distance entailments, then, ΓDB |=dU,Σ r, while ΓDB ⊭dU,Σ p and ΓDB ⊭dU,Σ q. This can be justified by the fact that there are two ways to restore the consistency of DB by minimal changes in the database instance: either q is inserted into the database instance or p is deleted from it. This leaves r as the only element that is always in the 'repaired' database instance. Indeed, there is no reason to remove r from D, as this would not contribute to the consistency restoration of DB. It follows, then, that r conservatively (and so credulously) follows from DB, while p and q credulously (but not conservatively) follow from DB. The same results are obtained by the query answering formalisms considered, e.g., in [2, 3, 6, 17]. 4.2
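Example 3 can be replayed computationally: since each fact of D ∪ CWA(D) is a literal, δdU,Σ(ν, D ∪ CWA(D)) is simply the number of literals ν violates, and the most plausible interpretations are the IC-models minimizing that count. The sketch below is our illustrative code, not the authors' implementation.

```python
from itertools import product

ATOMS = ['p', 'q', 'r']
D = {'p', 'r'}                                   # database instance
IC = [lambda v: (not v['p']) or v['q']]          # p → q

def valuations():
    return [dict(zip(ATOMS, bits)) for bits in product([True, False], repeat=3)]

def drastic(nu, mu):
    return 0 if nu == mu else 1

def dist(nu, psi_models):
    return min(drastic(nu, mu) for mu in psi_models)

def repairs():
    vs = valuations()
    # D ∪ CWA(D) as literals: p and r must hold, q must not (CWA)
    literals = [lambda v, a=a: v[a] == (a in D) for a in ATOMS]
    delta = lambda nu: sum(dist(nu, [m for m in vs if lit(m)]) for lit in literals)
    ic_models = [v for v in vs if all(c(v) for c in IC)]
    best = min(delta(v) for v in ic_models)
    return [v for v in ic_models if delta(v) == best]
```

Running `repairs()` yields exactly ν1 and ν2 of Example 3: r holds in both (a conservative answer), while p and q each hold in only one (credulous answers).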

Belief Revision

A belief revision theory describes how a belief state is obtained by the revision of a belief state B by some new information, ψ. The new belief state, denoted B ◦ ψ, is usually characterized by


the 'closest' worlds to B in which ψ holds. Clearly, this principle of minimal change is derived from distance considerations, so it is not surprising that it can be expressed in our framework. Indeed, given a pseudo distance d and an aggregation function f, the most plausible representations of the new belief state may be defined as follows:

∆d,f(B ◦ ψ) = {ν ∈ mod(ψ) | ∀µ ∈ mod(ψ), δd,f(ν, B) ≤ δd,f(µ, B)}.

The revised conclusions of the reasoner may now be represented, again, by a distance-based entailment: B ◦ ψ |=d,f φ iff ∆d,f(B ◦ ψ) ⊆ mod(φ).

Example 4. The revision operator ∆dH,Σ is the same as the one considered in [13]. It is well-known that this operator satisfies the AGM postulates [1].

4.3

Information Integration

Integration of autonomous data-sources under global integrity constraints (see [22]) is also expressible in our framework. Given n independent data-sources Γ1, ..., Γn and a consistent set of global integrity constraints IC, the sources should be merged into a theory Γ that reflects the collective information of the local sources in a coherent way (that is, Γ |= ψ for every ψ ∈ IC). Clearly, the union of the distributed information might not preserve IC, and in such cases the intuitive idea is to minimize the overall distance between Γ and the Γi (1 ≤ i ≤ n). This can be done by the following straightforward extension of Definition 4:

Definition 12. Let Γ = {Γ1, ..., Γn} be a set of n finite theories in L, d a pseudo-distance function, and f, g two aggregation functions. For an interpretation ν ∈ Λ and a theory Γi, let δd,f(ν, Γi) be the same as in Definition 3. Now, define:

δd,f,g(ν, Γ) = g({δd,f(ν, Γ1), ..., δd,f(ν, Γn)}).

The most plausible valuations for the integration of the elements in Γ (with respect to d, f and g) are the valuations ν that belong to the following set:

∆d,f,g(Γ, IC) = {ν ∈ mod(IC) | ∀µ ∈ mod(IC), δd,f,g(ν, Γ) ≤ δd,f,g(µ, Γ)}.

Information integration is now definable as a direct extension of Definition 5:

Definition 13. Γ, IC |=d,f,g ψ iff ∆d,f,g(Γ, IC) ⊆ mod(ψ).

Example 5. [22] Four flat co-owners discuss the construction of a swimming pool (s), a tennis-court (t) and a private car-park (p). It is also known that an investment in two or more items will increase the rent (r); otherwise the rent will not be changed. The opinions of the owners are represented by the following four data-sources: Γ1 = Γ2 = {s, t, p}, Γ3 = {¬s, ¬t, ¬p, ¬r}, and Γ4 = {t, p, ¬r}.⁸ The impact on the rent may be represented by the integrity constraint IC = {r ↔ ((s ∧ t) ∨ (s ∧ p) ∨ (t ∧ p))}.
Note that although the opinion of owner 4 violates the integrity constraint (while the solution must preserve the constraint), it is still taken into account. Consider now two merging contexts in which d is the drastic distance and f is the summation function. The difference is that in one merging context the summation of the distances to the sources is minimized (i.e., g = Σ), while in the other context minimization of the maximal distance is used for choosing optimal solutions (that is, g = max). The models of IC and their distances to Γ = {Γ1, ..., Γ4} are listed below.

⁸ Here, q ∈ Γi denotes that owner i supports q, and ¬q ∈ Γi denotes that i is against q.


        s  t  p  r   δdU,Σ,Σ  δdU,Σ,max
  ν1    t  t  t  t      5         4
  ν2    t  t  f  t      7         3
  ν3    t  f  t  t      7         3
  ν4    t  f  f  f      7         2
  ν5    f  t  t  t      7         3
  ν6    f  t  f  f      6         2
  ν7    f  f  t  f      6         2
  ν8    f  f  f  f      8         3

The most plausible interpretations in each merging context are determined by the minimal values in the two right-most columns. It follows that in the first context ν1 is the (unique) most plausible interpretation for the merging, thus Γ, IC |=dU,Σ,Σ s ∧ t ∧ p, and so the owners decide to build all three facilities (and the rent increases). In the other context there are three optimal interpretations, as ∆dU,Σ,max(Γ, IC) = {ν4, ν6, ν7}. This implies that exactly one of the three facilities will be built, and so the rent will remain the same. See [21, 22] for detailed discussions of distance operators for merging constraint belief-bases and some corresponding complexity results.
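Both merging contexts of Example 5 amount to a small enumeration over the eight models of IC. The fragment below is an illustrative sketch of ours: each source is a set of literals, so with d = dU and f = Σ the per-source distance is just the number of literals a valuation violates, and g picks how the four per-source distances are aggregated.

```python
from itertools import product

ATOMS = ['s', 't', 'p', 'r']
SOURCES = [
    {'s': True, 't': True, 'p': True},                  # Γ1
    {'s': True, 't': True, 'p': True},                  # Γ2
    {'s': False, 't': False, 'p': False, 'r': False},   # Γ3
    {'t': True, 'p': True, 'r': False},                 # Γ4
]

def ic(v):
    # r ↔ at least two of s, t, p are built
    return v['r'] == (v['s'] + v['t'] + v['p'] >= 2)

def delta(v, g):
    # per source: f = Σ of drastic distances per literal = number of disagreements
    per_source = [sum(v[a] != val for a, val in src.items()) for src in SOURCES]
    return g(per_source)

def merge(g):
    models = [dict(zip(ATOMS, bits))
              for bits in product([True, False], repeat=4)
              if ic(dict(zip(ATOMS, bits)))]
    best = min(delta(v, g) for v in models)
    return [v for v in models if delta(v, g) == best]
```

`merge(sum)` returns the single valuation with all facilities built (ν1), while `merge(max)` returns the three valuations in which exactly one facility is built and the rent is unchanged ({ν4, ν6, ν7}).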

5

Multiple-valued semantics

Our framework can easily be extended to multiple-valued semantics. In this case, the underlying semantics is given by multiple-valued structures, which are triples of the form S = ⟨V, O, D⟩, where V is the set of truth values, O is a set of operations on V that correspond to the connectives in the language L, and D is a nonempty proper subset of V, representing the designated values of V, i.e., those that correspond to true assertions. In this setting, V-interpretations are functions from the atomic formulas to V, and their extensions to complex formulas are defined as usual. A V-valuation ν is an S-model of Γ if ν(ψ) ∈ D for every ψ ∈ Γ. The set of S-models of Γ is denoted by modS(Γ).

The notions of basic S-entailments and distance-based entailments are the obvious generalizations to the multiple-valued case of the corresponding definitions for the two-valued case: Γ |=S ψ iff every S-model of Γ is an S-model of ψ; and, for a pseudo distance function d and an aggregation function f, Γ |=S,d,f ψ iff ∆d,f(Γ) ⊆ modS(ψ). The only difference from the two-valued case is that now ∆d,f(Γ) is defined with respect to V-valued interpretations rather than two-valued ones.

Multiple-valued settings, such as the three-valued frameworks of Kleene [20] and Priest [29], Belnap's four-valued structure [8], (bi)lattice-valued logics [5], fuzzy logics [19], and so forth, open the door to the introduction of many new distance functions. For instance, in the three-valued case, where a middle element m is added to the classical values t and f, a natural generalization of the Hamming distance dH (Definition 1) may be defined by associating the values 1, ½, and 0 with t, m, and f (respectively), and letting

dH3(ν, µ) = Σ_{p ∈ Atoms} |ν(p) − µ(p)|.

This function is used, e.g., in [14] for defining (three-valued) integration systems.
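The three-valued generalization dH3 is immediate to sketch. In the illustrative fragment below (our code; the names are ours), the truth values t, m, f are encoded numerically as 1, ½, and 0, using exact rationals to avoid floating-point noise.

```python
from fractions import Fraction

# numeric encoding of the three truth values t, m, f
T, M, F = Fraction(1), Fraction(1, 2), Fraction(0)

def d_h3(nu, mu):
    """dH3(nu, mu) = sum over atoms p of |nu(p) - mu(p)|."""
    return sum(abs(nu[p] - mu[p]) for p in nu)
```

For example, the valuations {p ↦ t, q ↦ m} and {p ↦ f, q ↦ t} are at distance 1 + ½ = 3/2; on purely classical valuations (no m), d_h3 coincides with the two-valued Hamming distance dH.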

6

Conclusion

The principle of minimal change is a primary motif in commonsense reasoning, and it is often implicitly derived from distance considerations. In this paper we introduced a simple and natural framework for representing this principle in an explicit way, and explored the main logical properties of the corresponding consequence relations. It is shown that such entailments support different aspects of human reasoning, such as non-monotonicity, paraconsistency, and rationality. A number of applications are also considered.


References

1. C. E. Alchourrón, P. Gärdenfors, and D. Makinson. On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic, 50:510–530, 1985.
2. M. Arenas, L. Bertossi, and J. Chomicki. Consistent query answers in inconsistent databases. In Proc. PODS'99, pages 68–79, 1999.
3. M. Arenas, L. Bertossi, and J. Chomicki. Answer sets for consistent query answering in inconsistent databases. Theory and Practice of Logic Programming, 3(4–5):393–424, 2003.
4. O. Arieli. Distance-based semantics for multiple-valued logics. In Proc. 11th Int. Workshop on Non-Monotonic Reasoning, pages 153–161, 2006.
5. O. Arieli and A. Avron. Bilattices and paraconsistency. In Batens et al. [7], pages 11–27.
6. O. Arieli, M. Denecker, and M. Bruynooghe. Distance-based repairs of databases. In Proc. JELIA'06, LNAI 4160, pages 43–55. Springer, 2006.
7. D. Batens, C. Mortensen, G. Priest, and J. Van Bendegem, editors. Frontiers of Paraconsistent Logic. Research Studies Press, 2000.
8. N. D. Belnap. How a computer should think. In G. Ryle, editor, Contemporary Aspects of Philosophy, pages 30–56. Oriel Press, 1977.
9. J. Ben Naim. Lack of finite characterizations for the distance-based revision. In Proc. KR'06, pages 239–248, 2006.
10. W. Carnielli, M. Coniglio, and I. D'Ottaviano, editors. Paraconsistency: The Logical Way to the Inconsistent, volume 228 of Lecture Notes in Pure and Applied Mathematics. Marcel Dekker, 2002.
11. J. Chomicki and J. Marcinkowski. Minimal-change integrity maintenance using tuple deletion. Information and Computation, 197(1–2):90–121, 2005.
12. N. C. A. da Costa. On the theory of inconsistent formal systems. Notre Dame Journal of Formal Logic, 15:497–510, 1974.
13. M. Dalal. Investigations into a theory of knowledge base revision. In Proc. AAAI'88, pages 475–479. AAAI Press, 1988.
14. S. de Amo, W. A. Carnielli, and J. Marcos. A logical framework for integrating inconsistent information in multiple databases. In Proc. FoIKS'02, LNCS 2284, pages 67–84. Springer, 2002.
15. D. Dubois and H. Prade. Belief change and possibility theory. In P. Gärdenfors, editor, Belief Revision, pages 142–182. Cambridge University Press, 1992.
16. D. Gabbay. Theoretical foundations for non-monotonic reasoning, Part II: Structured non-monotonic theories. In Proc. SCAI'91. IOS Press, 1991.
17. S. Greco and E. Zumpano. Querying inconsistent databases. In Proc. LPAR'2000, LNAI 1955, pages 308–325. Springer, 2000.
18. A. Grove. Two modellings for theory change. Journal of Philosophical Logic, 17:157–180, 1988.
19. P. Hájek. Metamathematics of Fuzzy Logic. Kluwer, 1998.
20. S. C. Kleene. Introduction to Metamathematics. Van Nostrand, 1950.
21. S. Konieczny, J. Lang, and P. Marquis. Distance-based merging: A general framework and some complexity results. In Proc. KR'02, pages 97–108, 2002.
22. S. Konieczny and R. Pino Pérez. Merging information under constraints: A logical framework. Journal of Logic and Computation, 12(5):773–808, 2002.
23. S. Kraus, D. Lehmann, and M. Magidor. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence, 44(1–2):167–207, 1990.
24. C. Lafage and J. Lang. Propositional distances and preference representation. In Proc. ECSQARU'2001, LNAI 2143, pages 48–59. Springer, 2001.
25. D. Lehmann, M. Magidor, and K. Schlechta. Distance semantics for belief revision. Journal of Symbolic Logic, 66(1):295–317, 2001.
26. J. Lin and A. O. Mendelzon. Knowledge base merging by majority. In Dynamic Worlds: From the Frame Problem to Knowledge Management. Kluwer, 1999.
27. D. Makinson. General patterns in nonmonotonic reasoning. In D. Gabbay, C. Hogger, and J. Robinson, editors, Handbook of Logic in Artificial Intelligence and Logic Programming, volume 3, pages 35–110, 1994.
28. J. McCarthy. Circumscription – a form of non-monotonic reasoning. Artificial Intelligence, 13(1–2):27–39, 1980.
29. G. Priest. Reasoning about truth. Artificial Intelligence, 39:231–244, 1989.


30. G. Priest. Paraconsistent logic. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, volume 6, pages 287–393. Kluwer, 2002.
31. R. Reiter. On closed world databases. In Logic and Databases, pages 55–76, 1978.
32. Y. Shoham. Reasoning About Change. MIT Press, 1988.
33. A. Tarski. Introduction to Logic. Oxford University Press, 1941.

Syntactic Approximations to Computational Complexity Classes

Argimiro Arratia¹⋆ and Carlos E. Ortiz²⋆⋆

¹ Dpto. de Matemática Aplicada, Facultad de Ciencias, Universidad de Valladolid, Valladolid 47005, Spain. [email protected]
² Department of Mathematics and Computer Science, Arcadia University, 450 S. Easton Road, Glenside, PA 19038-3295, U.S.A. [email protected]

Abstract. We present a formal syntax of approximate formulas suited for the logic with counting quantifiers SOLP. This logic was studied by us in [1] where, among other properties, we showed: (i) in the presence of a built-in (linear) order, SOLP can describe NP-complete problems, and fragments of it capture classes like P and NL; (ii) by weakening the ordering relation to an almost order, we can separate meaningful fragments, using a combinatorial tool adapted to these languages. The purpose of the approximate formulas presented here is to provide a syntactic approximation to logics contained in SOLP with built-in order, complementary to the semantic approximation based on almost orders, by producing approximating logics in which problems are described within a small counting error. We state and prove a Bridge Theorem that links expressibility in fragments of SOLP over almost-ordered structures to expressibility with respect to approximate formulas for the corresponding fragments over ordered structures. A consequence of these results is that proving inexpressibility results over fragments of SOLP with built-in order could be done by proving inexpressibility over the corresponding fragments with built-in almost order, where separation proofs are arguably easier.

1

Introduction

Descriptive Complexity deals mainly with producing logics that define all problems of a particular computational complexity class, and with adapting the classical tools for showing inexpressibility of queries in logics to the context of finite models, in the hope of obtaining worthwhile lower bounds for computational classes such as P or NP. The limitations of this logical approach to showing computational complexity bounds for classes like, say, P, NL (nondeterministic logspace), and others within NP, boil down to the fact that, as of today, all known logics that define problems in these classes need a relation of linear order built in to their semantics. In the presence of this built-in linear order, logical inexpressibility tools such as Ehrenfeucht-Fraïssé games have little power for telling structures apart (e.g., see [4, § 6.6]). But, on the other hand, in the absence of this built-in linear order, logics lose significant expressive power: for example, first order logic (FO) extended with a least fixed point operator (LFP(FO)) with order captures all of P (in the sense that it is capable of defining all polynomial time computable properties), but without order it cannot express the parity of the size of a set. To overcome this difficulty, a natural idea is to study approximations to logics with built-in order, where techniques like Ehrenfeucht-Fraïssé games become effective in showing separability results, in the hope that these separations in the approximate setting will give a clue on how to go about separating the associated logics with order.

⋆ Supported by grants MOISES (TIN2005-08832-C03-02) and SINGACOM (MTM2004-00958), MEC-Spain.
⋆⋆ Supported by a Faculty Award Grant from the Christian R. & Mary F. Lindback Foundation, and a Visiting Research Fellowship from Universidad de Valladolid.


There are two main approaches to defining approximate logics in model theory. One is to play with the semantics, where constructs such as the built-in order are weakened to an "almost-order", and, frequently, some counting operator is added to compensate for the loss of expressive power. This has been the typical approach within the Descriptive Complexity community (e.g., [2] and [6], among others), and it has some severe limitations: for example, the paper by Libkin and Wong [6] shows that a very powerful extension of first order logic with additional counting quantifiers, known as L*∞ω(C), which subsumes various counting extensions of FO, has the bounded number of degrees property (or BNDP) in the presence of almost-orders, and thus cannot express the transitive closure of a binary relation.

The other approach is syntactic and is found in classical model theory, as in, for example, Keisler's logic of probability quantifiers (see [5]), which he conceived as a logic appropriate for his investigations of hyperfinite probability spaces, that is, infinite structures suitable for approximating large finite phenomena of applied mathematics. Under this approach, for each formula ϕ of a logic and every real number ε one constructs an approximate formula ϕε with the property that, in every model A, if ε1 < 0 < ε2 then ϕε1 → ϕ → ϕε2, and as ε tends to 0, the interpretation of ϕε should get closer to that of ϕ. This approach has been developed with success in the theory of classical metric spaces but not, to our knowledge, in Computational Complexity theory.

In this paper we develop a syntactic approach to the task of approximating logics with built-in order, based on a notion of approximate formulas à la Keisler, and show how it relates to the semantic approach based on almost orders.
This approach is potentially relevant to the problem of separating logics with built-in order, since we obtain a result implying that separations of logics with built-in almost-order can be translated into separations of the corresponding logics with built-in order. The framework for our results is the second order logic of proportional quantifiers, SOLP, defined in [1]. The quantifiers of this logic are counting quantifiers acting upon second order terms. When restricted to built-in almost orders, this logic avoids the BNDP, has non-trivial expressive power, and general separation results of a combinatorial nature can be obtained.

We review the definition of SOLP and summarise facts found in [1] about its expressive power in the presence of almost orders in Section 2. In Section 3 we introduce the new syntax of approximate formulas suited for SOLP, and prove a Bridge Theorem which establishes a correspondence between satisfaction of formulas of SOLP in almost ordered structures and satisfaction of the corresponding approximate formulas in ordered structures. In Section 4 we introduce the notion of the ε-approximate logic Lε, for every fragment L of SOLP; a logic that should have an expressive power "almost" similar to that of L. This notion in turn generates the notions of strong expressibility and ε-relaxed fragments. An ε-relaxed fragment is one for which Lδ = L (in terms of expressive power) for every δ ∈ (−ε, ε). Surprisingly, the fragments of SOLP with built-in order that capture P and NL are ε-relaxed. A nice property of ε-relaxed logics is that for them strong expressibility and expressibility are "almost" equivalent (an idea that we will formalise). A consequence of this is Theorem 4, which shows that to prove inexpressibility of problems in ε-relaxed logics with built-in order it is enough to prove inexpressibility of the same problem in the δ-approximate logics (δ ∈ (−ε, ε)) with respect to almost ordered structures.
Since proving inexpressibility for logics over almost-orders is, in practice, easier than the usual checking of satisfaction in ordered structures, this last result has potential applicability for studying the separation of well-known logics with built-in order, such as the ones that capture NL and P. Due to CiE's strict page-limit policy, we present our results without proofs. The full version can be obtained from the first author's web page.


A. Arratia and C. Ortiz

2  The Second Order Logic of Proportional Quantifiers

Definition 1. The Second Order Logic of Proportional quantifiers, SOLP, is the set of formulas of the form

Q1 · · · Qu θ(x1, . . . , xs, X1, . . . , Xr)   (1)

where θ(x1, . . . , xs, X1, . . . , Xr) is a first order formula over some vocabulary τ with (free) first order variables x1, . . . , xs and second order variables X1, . . . , Xr; each Qj (j ≤ u) is either (P(Xi) ≥ ti) or (P(Xi) ≤ ti), where ti is a rational in (0, 1), for some i ≤ r. Whenever we want to make the underlying vocabulary τ explicit we will write SOLP(τ). We also define SOLP(τ)[r1, . . . , rk], for a given vocabulary τ and sequence r1, r2, . . . , rk of distinct natural numbers, as the sublogic of SOLP(τ) where the proportional quantifiers can only be of the form (P(X) ≤ q/ri) or (P(X) ≥ q/ri), for i = 1, . . . , k and q a natural number with 0 < q < ri. Another fragment of SOLP which will be of interest to us is the Second Order Monadic Logic of Proportional quantifiers, denoted SOMLP, which is SOLP with the arities of the second order variables in (1) all equal to 1.

The interpretation of the proportional quantifiers is very natural. Let X be a second order variable of arity k, Y a vector of second order variables, x = x1, . . . , xm first order variables and φ(x, Y, X) a formula in SOLP(τ) over some (finite) vocabulary τ (which does not contain X or any of the variables in Y as a relation symbol). Let r be a rational in (0, 1). Then (P(X) ≥ r)φ(x, Y, X) and (P(X) ≤ r)φ(x, Y, X) have the following semantics. For an appropriate finite τ-structure A, elements a = (a1, . . . , am) in A and an appropriate vector of relations B over A, we have

A |= (P(X) ≥ r)φ(a, B, X) ⟺ there exists S ⊆ A^k such that A |= φ(a, B, S) and |S| ≥ r · |A|^k.

Similarly for (P(X) ≤ r)φ(x, Y, X), replacing ≥ by ≤ above.

Example 1. Let τ = {R, s, t} where R is a ternary relation symbol, and s and t are constant symbols. Let r be a rational with 0 < r < 1.
We define NOT-IN-CLOS≤r := {A = ⟨A, R, s, t⟩ : A has a set containing s but not t, closed under R, and of size at most a fraction r of |A|}. Let βnclos(X) be the formula

∀x∀u∀v [X(s) ∧ ¬X(t) ∧ (X(u) ∧ X(v) ∧ R(u, v, x) → X(x))].

Then A ∈ NOT-IN-CLOS≤r ⟺ A |= (P(X) ≤ r)βnclos(X). In [1] it is shown that, for r = 1/n, this problem is complete for P under first order reductions.

Example 2. Let τ = {E, s} where E is a binary relation symbol and s is a constant symbol. We think of τ-structures as graphs or digraphs (directed graphs) with a specified vertex s. Let r be a rational with 0 < r < 1. We define NCON≥r := {A = ⟨A, E, s⟩ : ⟨A, E⟩ is a digraph and at least a fraction r of the vertices are not connected to s}. Let αncon(Y) be the formula

αncon(Y) := ¬Y(s) ∧ ∀x∀y(E(x, y) ∧ Y(x) → Y(y)).

Then A ∈ NCON≥r ⟺ A |= (P(Y) ≥ r)αncon(Y). We proved that NCON≥1/2 is complete for NL under first order reductions (see [1]).
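As an illustration (ours, not from the paper), the ⟺ clause of the semantics can be checked by brute force on tiny structures. The sketch below decides A |= (P(Y) ≥ r) αncon(Y) from Example 2 by enumerating every candidate set S; all names are hypothetical.

```python
from itertools import chain, combinations

def subsets(universe):
    """All subsets of a finite universe (exponential; fine for tiny structures)."""
    xs = list(universe)
    return chain.from_iterable(combinations(xs, n) for n in range(len(xs) + 1))

def satisfies_ncon(A, E, s, r):
    """A |= (P(Y) >= r) [ not Y(s) and forall x,y (E(x,y) and Y(x) -> Y(y)) ]:
    is there Y with |Y| >= r*|A|, s not in Y, and Y closed forward under E?"""
    for Y in map(set, subsets(A)):
        if s in Y:
            continue
        if all(y in Y for (x, y) in E if x in Y) and len(Y) >= r * len(A):
            return True
    return False

# 4 vertices; vertices 2 and 3 form a cycle that cannot reach s = 0.
A = {0, 1, 2, 3}
E = {(2, 3), (3, 2), (1, 0)}
print(satisfies_ncon(A, E, 0, 0.5))   # True: half the vertices avoid s
```

The enumeration is exponential in |A|, of course; it is only meant to make the satisfaction clause concrete.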

Syntactic Approximations

2.1  Summary of Facts about SOLP

In [1] we study the expressive power of SOLP in the presence of a built-in order, and when this external predicate is weakened to an almost-order (see [4] for the notion and use of built-in numerical predicates in Descriptive Complexity). We summarise below the facts from [1] that we need about what could be called "semantic approximations" to definability in SOLP. We have shown that: (1) In the presence of order (at least a built-in successor), P ⊆ SOLP[2] and, furthermore, P is captured by the fragment SOLPHorn[2], consisting of formulas of the form (P(X1) ≤ 1/2) · · · (P(Xr) ≤ 1/2)α, where α is a universal Horn formula over some vocabulary τ and second order variables X1, . . . , Xr. (2) In the presence of order, NL is captured by SOLPKrom[2], a fragment consisting of formulas of the form (P(X1) ≥ 1/2) · · · (P(Xr) ≥ 1/2)α, where α is a universal Krom formula. (This and the previous capturing of P by fragments of SOLP are inspired by Grädel's [3], but take into account the limitations on the cardinalities of second order variables imposed by our counting quantifiers.) (3) With respect to almost ordered structures we have an infinite hierarchy within the monadic fragment SOMLP, namely, SOMLP[2] ⊊ SOMLP[2, 3] ⊊ SOMLP[2, 3, 5] ⊊ . . . (4) With respect to almost ordered structures and unbounded arity we have that SOLPHorn[2] ⊊ SOLP[2, 3]. The separation results listed in (3) and (4) were obtained with appropriate Ehrenfeucht–Fraïssé games.

The concept of almost-order (taken from [6]) constitutes the core of our "semantic approximations", around which we build our syntactic approximations; hence, we review this concept below.

Definition 2. A function g : N → N is sublinear if, for all n ∈ N, g(n) < n. For a fixed positive integer k, a k-preorder over a set A is a binary, reflexive and transitive relation P in which every induced equivalence class of P ∩ P⁻¹ has size at most k.
An almost linear order over a set A of cardinality n, determined by a sublinear function g : N → N, is a binary relation ≤g over A together with a partition of the universe A into two sets B, C, such that B has cardinality n − g(n), ≤g restricted to B is a linear order, ≤g restricted to C is a 2-preorder, and for every x ∈ C and every y ∈ B, x ≤g y but y ≰g x. Note that for any function g : N → N, the almost linear order ≤g over a set A induces an equivalence relation ∼g on A defined by a ∼g b iff a ≤g b and b ≤g a. For a ∈ A, let [a]g denote its ∼g-equivalence class, and [A]g := {[a]g : a ∈ A}.

Definition 3. Fix a sublinear g : N → N and let R be a k-ary relation on a set A. Let ≤g be an almost order determined by g on A. We say that R is consistent with ≤g if for every pair of vectors (a1, . . . , ak) and (b1, . . . , bk) of elements of A with ai ∼g bi for every i ≤ k, we have that R(a1, . . . , ak) holds if and only if R(b1, . . . , bk) holds. Let A = ⟨A, R1^A, . . . , Rt^A, C1^A, . . . , Cs^A⟩ be a τ-structure. We say that A is consistent with ≤g if and only if for every i ≤ t, Ri^A is consistent with ≤g. For a τ-structure A consistent with ≤g, it makes sense to define the quotient structure A/∼g as the τ-structure with universe [A]g and, for a k-ary relation R ∈ τ, R^{A/∼g} := {([a1]g, . . . , [ak]g) : (a1, . . . , ak) ∈ R^A}. Furthermore, for B ⊆ A we define its ≤g-contraction as [B]g := {[b]g : b ∈ B}.
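To make the notions of ∼g-classes and consistency concrete, here is a small illustrative sketch (hypothetical code, not the authors'): it computes the equivalence classes of an almost order given as a ≤g-predicate, and tests whether a relation depends only on those classes.

```python
from itertools import product

def classes_from_preorder(A, leq):
    """The equivalence classes of the almost order: a ~ b iff leq(a,b) and leq(b,a)."""
    seen, classes = set(), []
    for a in A:
        if a not in seen:
            cls = frozenset(b for b in A if leq(a, b) and leq(b, a))
            classes.append(cls)
            seen |= cls
    return classes

def is_consistent(R, classes):
    """R (a set of tuples) is consistent with the almost order iff membership
    in R depends only on the classes of the coordinates (cf. Definition 3)."""
    cls_of = {a: c for c in classes for a in c}
    patterns = {tuple(cls_of[a] for a in t) for t in R}
    return all(t in R for p in patterns for t in product(*p))

# Universe {0,...,4}: 3 and 4 are glued into one 2-element class at the
# bottom; 0 < 1 < 2 is the linearly ordered part.
rank = {3: 0, 4: 0, 0: 1, 1: 2, 2: 3}
leq = lambda a, b: rank[a] <= rank[b]
A = [0, 1, 2, 3, 4]
print([set(c) for c in classes_from_preorder(A, leq)])   # [{0}, {1}, {2}, {3, 4}]
```

A relation such as {(3, 0), (4, 0)} is consistent (it treats 3 and 4 alike), while {(3, 0)} alone is not.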


By SOLP + ≤g, for an almost order ≤g, we understand the logic SOLP with the almost order ≤g as an additional built-in relation, and where we only consider models A that are consistent with ≤g. Furthermore, for the formulas of the form (P(X) ≥ r)φ(x, Y, X) and (P(X) ≤ r)φ(x, Y, X), we require the following modification of the semantics: for an appropriate finite model A consistent with ≤g, for elements a = (a1, . . . , am) in A and an appropriate vector of relations B consistent with ≤g, we should have

A |= (P(X) ≥ r)φ(a, B, X) ⟺ there exists S ⊆ A^k, consistent with ≤g, such that A |= φ(a, B, S) and |S| ≥ r · |A|^k.

Similarly for (P(X) ≤ r)φ(x, Y, X), replacing ≥ by ≤ above. The property of being consistent with ≤g extends to all the formulas of SOLP(τ) + ≤g.

Remark 1. Given a logic L ⊆ SOLP, we use L + ≤g to indicate that all possible (finite) models of L have an almost order ≤g, determined by a sublinear function g. Also, L + ≤ indicates that the models have an additional linear order.

3  A Syntax of Approximate Formulas

We now introduce the notion of approximate formulas for SOLP. The purpose of these formulas is to provide a link between satisfaction in almost ordered structures and satisfaction in their corresponding quotient structures. This we will make precise in Theorem 1 below.

Definition 4 (Approximate Formulas). For every ε ∈ [0, 1) and for every formula θ(x, X) ∈ SOLP(τ), we define the positive (resp. negative) ε-approximation of θ(x, X), denoted θ(x, X)ε (resp. θ(x, X)−ε), as follows:

First order formulas. If θ(x, X) is a first order formula with free second order variables among the X and free first order variables among the x, then θ(x, X)ε = θ(x, X)−ε := θ(x, X).

Proportional quantifiers. If θ(x, X) := (Q1 . . . Qu)φ(x, X), where φ(x, X) is a first-order formula and Q1, . . . , Qu are proportional quantifiers, its ε-approximation is the SOLP formula (θ(x, X))ε := (Q′1 . . . Q′u)φ(x, X), where, for each j, the proportional quantifier Q′j is chosen as follows:

(a) If Qj is of the form (P(Y) ≥ r), where Y is of arity k ≥ 1, then Q′j is
    (P(Y) ≥ (1 − ε)^{k−1}[r − kε])   if r − kε > 0,
    (P(Y) ≥ 0)                       otherwise.
(b) If Qj is of the form (P(Y) ≤ r), then Q′j is
    (P(Y) ≤ (1 + ε)^{k−1}[r + kε])   if (1 + ε)^{k−1}(r + kε) < 1,
    (P(Y) ≤ 1)                       otherwise.

And the negative approximation, θ(x, X)−ε := (Q′1 . . . Q′u)φ(x, X), is given by:

(a) If Qj is of the form (P(Y) ≥ r), where Y is of arity k ≥ 1, then Q′j is
    (P(Y) ≥ (1/(1 − ε)^{k−1})[r + kε(1 − ε)^{k−1}])   if r/(1 − ε)^{k−1} + kε < 1,
    (P(Y) ≥ 1)                                        otherwise.
(b) If Qj is of the form (P(Y) ≤ r), then Q′j is
    (P(Y) ≤ (1/(1 + ε)^{k−1})[r − kε(1 − ε)^{k−1}])   if r/(1 + ε)^{k−1} − kε > 0,
    (P(Y) ≤ 0)                                        otherwise.
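Numerically, the positive ε-approximation of Definition 4 simply relaxes each threshold and clamps it into [0, 1]. The sketch below (ours; function names are hypothetical) computes the relaxed thresholds and illustrates the monadic case of Remark 3.

```python
def pos_geq(r, k, eps):
    """Threshold of the positive eps-approximation of (P(Y) >= r), Y of arity k,
    clamped at 0 as in Definition 4(a)."""
    return (1 - eps) ** (k - 1) * (r - k * eps) if r - k * eps > 0 else 0.0

def pos_leq(r, k, eps):
    """Threshold of the positive eps-approximation of (P(Y) <= r),
    clamped at 1 as in Definition 4(b)."""
    t = (1 + eps) ** (k - 1) * (r + k * eps)
    return t if t < 1 else 1.0

# For monadic Y (k = 1) the thresholds are simply r - eps and r + eps:
print(pos_geq(0.5, 1, 0.1), pos_leq(0.5, 1, 0.1))   # 0.4 0.6
```

For higher arity the multiplicative factors (1 ∓ ε)^{k−1} relax the threshold further, and the clamps take over once r − kε ≤ 0 or the relaxed bound exceeds 1.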
Remark 2. We can always assume that ε is small enough so that the ε-approximation of formulas with proportional quantifiers is the first option in their definition, e.g., for (P(Y) ≤ r)φ(Y) it will always be (P(Y) ≤ (1 + ε)^{k−1}[r + kε])φ(Y)ε.

Remark 3. Observe that when Y is monadic (has arity 1) the ε-approximation of a formula preceded by the quantifier (P(Y) ≥ r) (resp. (P(Y) ≤ r)) is the ε-approximation of the formula preceded by the quantifier (P(Y) ≥ r − ε) (resp. (P(Y) ≤ r + ε)), which is what one would expect in this case. Our definition for Y of arbitrary arity may seem awkward, but it is the right one for establishing a correspondence between satisfaction of SOLP formulas in almost ordered structures and satisfaction of the corresponding approximate formulas in ordered structures, as we shall prove below.

The basic link between positive and negative approximate formulas and the formula that they approximate is given by the following lemma.

Lemma 1. For every formula θ(x, X) ∈ SOLP(τ), for every finite τ-structure A, for every interpretation A of the relation symbols X in A, for every tuple of elements a in A and for ε and δ such that 0 < δ < ε < 1, we have that:

A |= θ(a, A)−ε → θ(a, A)−δ → θ(a, A) → θ(a, A)δ → θ(a, A)ε.

Furthermore, for every formula θ(x, X) ∈ SOLP(τ) and every ε with 0 < ε < 1,

(θ(x, X)−ε)ε = θ(x, X) = (θ(x, X)ε)−ε.

□

We will now show that it is possible to jump from satisfaction in almost ordered (respectively, linearly ordered) structures to satisfaction of approximate formulas in linearly ordered (respectively, almost ordered) structures.

Theorem 1 (Bridge Theorem). Fix a sublinear function g and an almost order ≤g. For every formula θ(x1, . . . , xk, X) ∈ SOLP(τ), for every τ-structure A of size m and consistent with ≤g, for every a = (a1, . . . , ak) ∈ A^k, for every predicate S of arity t ≥ 1, the following holds:
(i) A |= θ(a, S) implies A/∼g |= θ([a]g, [S]g)γ(m), where γ(m) = g(m)/(2m − g(m));
(ii) A/∼g |= θ([a]g, [S]g) implies A |= θ(a, S)β(m), where β(m) = g(m)/(2m);
(iii) A |= θ(a, S)−γ(m) implies A/∼g |= θ([a]g, [S]g), γ as in (i);
(iv) A/∼g |= θ([a]g, [S]g)−β(m) implies A |= θ(a, S), β as in (ii).

□

The picture that we have relating satisfaction in the almost ordered world with satisfaction in the ordered world is the following (the horizontal arrows are given by Lemma 1 and the diagonal arrows by the Bridge Theorem): in the almost ordered structure A we have the chain of implications θ−γ → θ → θγ; in the ordered quotient A/∼g we have the chain θ−β → θ → θβ; and the diagonal implications of the Bridge Theorem pass between corresponding formulas of the two chains.


Now the ground is set. From experience we know that inexpressibility results are, in general, easier to obtain in the presence of almost order, but transferring these separations to the truly (linearly) ordered world is hard. Our picture shows that, in fact, passing from the almost ordered world to a corresponding ordered world (or vice versa) trades the exact syntactic description of a problem for an approximate description. Is an approximate description as good as an exact one for determining inexpressibility over a class of ordered structures? We feel that the answer to this last question is "yes, in almost all cases", and in the remainder of this paper we give formal support to this intuition.

4  Strong Expressibility

Definition 5. Fix two sentences φ, ψ ∈ SOLP. We say that φ is strongly equivalent to ψ (in symbols φ ⇔S ψ) iff there exists ε ∈ (0, 1) such that in every model A: A |= φε → ψ−ε and A |= ψε → φ−ε.

The intuition is that two sentences that are strongly equivalent can be syntactically approximated as much as we like. Formally this means that, if φ ⇔S ψ, then there exists an ε > 0 such that for every β, γ ∈ (−ε, ε), |= φ ↔ φβ ↔ ψγ ↔ ψ. Note that if φ is strongly equivalent to ψ then for every model A, A |= φ ↔ ψ (i.e. φ and ψ are equivalent). Note also that it is not clear at all that φ ⇔S φ. The next example shows that this does happen sometimes.

Example 3. The property of being 2-colorable can be expressed in SOMLP[2] as follows. Let X and Y be two unary second order variables, and let θ(X, Y) be a formula saying that: (X and Y are disjoint) ∧ ∀x∀y(((X(x) ∨ Y(x)) ∧ E(x, y) → ¬(X(y) ∨ Y(y))) ∧ (¬(X(x) ∨ Y(x)) ∧ E(x, y) → (X(y) ∨ Y(y)))). Then the SOMLP[2]({E}) sentence φ := (P(X) ≤ 1/2)(P(Y) ≤ 1/2)θ(X, Y) holds in a graph if and only if the graph is 2-colorable. (The idea is that X and Y constitute a partition of one of the two possible colour classes; in fact, the one applied to the fewest vertices.)

Now observe that for every ε > 0 such that (µ − ω, µ + ω) ⊆ (−ε, ε), for every formula θ ∈ (Lµ + ≤) there exists a sublinear function g and two models A, B in (L + ≤g) such that:
• A |= θ iff B |= θ; A/∼g |= φ and B/∼g ⊭ φ; and
• if |A| = m1 and |B| = m2 then g(mi)/(2mi − g(mi)) < ω, for i = 1, 2.
Then φ is not expressible in (L + ≤).

□
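The 2-colorability sentence of Example 3 can be evaluated directly on small graphs by brute force over the two monadic second order variables. The following sketch is ours, purely illustrative; it implements the semantics of (P(X) ≤ 1/2)(P(Y) ≤ 1/2)θ(X, Y).

```python
from itertools import chain, combinations

def small_subsets(V, maxsize):
    """All subsets of V of size at most maxsize."""
    return chain.from_iterable(combinations(sorted(V), n) for n in range(maxsize + 1))

def theta(X, Y, E):
    """theta(X, Y): X, Y disjoint and X ∪ Y contains exactly one endpoint
    of every edge (both implications of the formula combined)."""
    C = X | Y
    return X.isdisjoint(Y) and all((u in C) != (v in C) for (u, v) in E)

def solp_two_colorable(V, E):
    """A |= (P(X) <= 1/2)(P(Y) <= 1/2) theta(X, Y), by brute force."""
    half = len(V) // 2          # |X|, |Y| <= (1/2)·|V|
    return any(theta(set(X), set(Y), E)
               for X in small_subsets(V, half)
               for Y in small_subsets(V, half))

print(solp_two_colorable({0, 1, 2, 3}, {(0, 1), (1, 2), (2, 3), (3, 0)}))  # True
print(solp_two_colorable({0, 1, 2}, {(0, 1), (1, 2), (2, 0)}))             # False
```

The 4-cycle is accepted (take X = {0}, Y = {2}), while the triangle has no colour class at all, let alone one splittable into two halves of the required size.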


References

1. Arratia, A., and Ortiz, C. (2006), Expressive power and complexity of a logic with quantifiers that count proportions of sets, J. of Logic and Computation 16 (6): 817–840.
2. Etessami, K., and Immerman, N. (1995), Reachability and the power of local ordering, Theoretical Comp. Sci. 148 (2): 261–279.
3. Grädel, E. (1992), Capturing complexity classes by fragments of second order logic, Theoretical Comp. Sci. 101: 35–57.
4. Immerman, N. (1998), Descriptive Complexity, Springer.
5. Keisler, H.J. (1977), Hyperfinite model theory, in: R.O. Gandy and J.M.E. Hyland (eds.), Logic Colloquium 76, North-Holland.
6. Libkin, L., and Wong, L. (2002), Lower bounds for invariant queries in logics with counting, Theoretical Comp. Sci. 288: 153–180.

Nondeterminism without Turing Machines

Mathias Barra¹, Lars Kristiansen¹,², and Paul J. Voda³

¹ Department of Mathematics, University of Oslo
² Faculty of Engineering, Oslo University College
³ Institute of Informatics, Comenius University, Bratislava

Abstract. We investigate three hierarchies: S and L, which are naturally deterministic and arise from computational models which characterise linspace, and H̃, which is nondeterministic and characterises nlinspace. We give nondeterministic extensions S̃ and L̃ of the former two, and a deterministic restriction H of the latter. Then we show that the resulting computational models match up. The models are stratified into hierarchies of height ω.

1  Introduction and Preliminaries

The classical deterministic complexity classes – defined by imposing resource bounds on Turing machines – have proven themselves robust and natural mathematical entities. In particular, these classes have turned out to be invariant over any reasonable model of deterministic computation. Moreover, these classes admit numerous natural, and often very surprising, characterisations. Many of these are so-called implicit, that is, the characterisation does not refer to explicit resource bounds; see e.g. Bellantoni and Cook [2] and Kristiansen [4]. The Turing machine also provides a model for nondeterministic computations, whence a nondeterministic version of any deterministic complexity class is well-defined. Numerous natural problems are shown to be complete (under some reducibility relation) for such nondeterministic classes. The uncontroversial conclusion should be that complexity theory is a natural mathematical theory, and that the Turing machine is a versatile and apt tool for developing that theory. There are, however, deterministic complexity classes, or perhaps we should say complexity-like classes, that seem both robust and natural, but which have no known natural characterisation by Turing machines. Hence, no canonical nondeterministic counterpart is available. Attempting to introduce such counterparts by a contrived Turing machine characterisation seems unsatisfactory, and we will have to resort to other models of computation to find interesting and natural candidates for nondeterministic counterparts. In this paper we study three hierarchies which contain deterministic complexity classes. The hierarchies match up level by level, and each hierarchy adds up to the Turing machine defined complexity class linspace. The levels of the hierarchies are examples of classes without known Turing machine characterisations.
For each of the three deterministic hierarchies, our definitions suggest a particular way of introducing nondeterminism, and thus a particular nondeterministic variant of the hierarchy. We study how the three different notions of nondeterminism relate to each other, e.g. is the nondeterminism inherent in a rewriting system stronger, or weaker, than the nondeterminism inherent in a set of Horn clauses? Our approach raises many questions, and we end up with quite a few new open problems closely related to several of the notorious open problems of subrecursion theory and complexity theory. The research presented in this paper is closely related to the research in Kristiansen and Barra [5] and Kristiansen [6]; for further references see these papers. By a function we mean a function f : N^k → N for some k. Any symbol occurring under an arrow indicates a list of length k unless otherwise indicated. An inductively defined class (of functions)


is generated from a set X of initial or basic functions, as the least class containing X and closed under the schemata or operations of some set op of functionals. We write [X; op] for this class. Inductively defined classes are also known as function algebras in the literature, e.g. in Clote [3]. If F is a set of functions, F⋆ denotes the set of problems of F, that is, those subsets P of N^k such that for some f ∈ F, f vanishes exactly on P. F⋆ is also referred to as the (induced) relational class of F. The set of constant functions {0, 1, 2, . . .} is denoted N; the set of projections I^n_i from N^n onto the ith coordinate is denoted I; and S and P are the successor and predecessor functions. The notation f^(n)(x) is defined by f^(0)(x) = x and f^(n+1)(x) = f(f^(n)(x)), and will be used also for λ-terms etc. Familiarity with notions from rewriting theory in general, and with (typed) λ-calculus in particular, is assumed; e.g. open/closed terms, free variables, reductions, redexes, etcetera. Standard conventions apply where such issues are discussed, e.g. that a reduction rule may be applied to a subterm, that no free variables should be bound in a substitution M[x := N], and so on. Notation for types is also fairly standard (see e.g. [7]). For syntactical constructs (λ-terms, Horn clauses), ≡ means syntactically identical.

2  λ-calculus – The hierarchies L and L̃

We next define rewriting systems L and L̃, which may be viewed as fragments of an extended version of typed λ-calculus. They are modeled on the system L found in Kristiansen and Barra [5], but with several technical differences. The essential modification from [5] is a new clause for forming terms: the clause (chs) of Definition 1. This is where the seed of nondeterminism lies.

Definition 1. Types: ι is a type; ϑ0 ⊗ ϑ1 and ϑ0 → ϑ1 are types when the ϑi's are types. The level or degree of a type is defined in the standard way. The product rank of ϑ, written rk(ϑ), is defined by rk(ι) = 0; rk(ϑ0 ⊗ ϑ1) = 1 + rk(ϑ0) + rk(ϑ1); and rk(ϑ0 → ϑ1) = max(rk(ϑ0), rk(ϑ1)).

L̃ consists of terms and reduction rules. The terms are typed. We indicate by M^ϑ or M : ϑ that the term M is a term of type ϑ. The L̃-terms are defined inductively by:

(var) We have a countable supply of variables x^ϑ_0, x^ϑ_1, x^ϑ_2, . . ., for each type ϑ of level 0, and x^ϑ_i : ϑ.
(cst) The numerals n : ι (for each n ∈ N), and the recursors Rϑ : ι, (ι, ϑ → ϑ), ϑ → ϑ, for each type ϑ of level 0, are terms.
(prd) ⟨M, N⟩ is a term of type ϑ0 ⊗ ϑ1 if M : ϑ0 and N : ϑ1.
(prj) prj_i M is a term of type ϑi if M : ϑ1 ⊗ ϑ2, for i = 1, 2.
(abs) (λx.M) : ϑ0 → ϑ1 is a term if x : ϑ0 and M : ϑ1.
(app) (M N) : ϑ1 is a term if M : ϑ0 → ϑ1 and N : ϑ0 are terms.
(chs) (M | N) : ϑ is a term if M : ϑ and N : ϑ are terms.

(variables, constants, products, projections, abstraction, application and choice). The reduction rules of L̃ are standard α-reductions (renaming of bound variables) and the reductions:

(β): prj_i ⟨M1, M2⟩ ▷ M_i and (λx.M)N ▷ M[x := N]
(ρ): R 0 H G ▷ G and R(n + 1) H G ▷ H n (R n H G)
(ν): M | N ▷ M and M | N ▷ N

We overload ▷ to also denote its transitive-reflexive closure. Note that we only have variables of level 0. One may w.l.o.g. assume that terms of level 0 have the same type, by identifying e.g. ι ⊗ (ι ⊗ ι) and (ι ⊗ ι) ⊗ ι. Hence, we shall write ⟨M1, . . . , Mk⟩ and ι ⊗ · · · ⊗ ι for these terms and types. The calculus L is defined by omitting the clause (chs) for forming terms and the ν-reductions above (without choice terms there are no ν-redexes).
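To see how the (ν)-rule makes evaluation set-valued, here is an untyped Python sketch (ours, not the paper's calculus): it collects all numerals reachable from a closed first-order term built from numerals, choice and the recursor.

```python
def eval_all(t):
    """Set of possible numeral values of a closed term, under the rules
    (nu): M|N > M, M|N > N  and  (rho): R 0 H G > G, R(n+1) H G > H n (R n H G).
    Terms: ('num', n) | ('chs', M, N) | ('rec', N, H, G), where H is a
    Python function taking (step, ('num', a)) to a term."""
    tag = t[0]
    if tag == 'num':
        return {t[1]}
    if tag == 'chs':                       # both branches are possible outcomes
        return eval_all(t[1]) | eval_all(t[2])
    if tag == 'rec':                       # iterate H, threading every outcome
        _, N, H, G = t
        out = set()
        for n in eval_all(N):
            acc = eval_all(G)
            for i in range(n):
                acc = {v for a in acc for v in eval_all(H(i, ('num', a)))}
            out |= acc
        return out
    raise ValueError(f'unknown term tag {tag!r}')

# At each of two recursion steps, nondeterministically add 1 or 2:
H = lambda i, x: ('chs', ('num', x[1] + 1), ('num', x[1] + 2))
print(eval_all(('rec', ('num', 2), H, ('num', 0))))   # {2, 3, 4}
```

Without ('chs', …) every term has a single value, matching the remark that L (the choice-free calculus) has unique normal forms.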


Traditionally, the numerals are introduced via the constant zero 0 : ι and the successor S : ι → ι, and defined as n = S^(n) 0. We think of our numerals as so constructed, but do not allow S as an ordinary constant subject to forming terms SN with arbitrary terms N. It is easy to see that L is a subsystem of the simply typed λ-calculus with recursors, also known as Gödel's T. As such it is strongly normalising and has unique normal forms. For all closed terms N : ι, the normal form of N is a numeral. Hence a closed L-term M : ι → ι defines a function fM by fM(n) = m ⇔ M n ▷ m. For general L̃-terms, uniqueness of normal forms is lost. We retain strong normalisation and the property that closed normal terms of type ι are numerals. By defining the product rank of terms, we obtain a stratification of the L̃- (and L-) terms.

Definition 2. The product rank rk(M) of an L̃-term M is defined via the product rank of types: rk(M) = max{rk(ϑ) | N : ϑ is a subterm of M}. We define Lk = {f | f is definable by an L-term M with rk(M) ≤ k}; L is defined as the union ⋃_{k≥0} Lk.

Definition 6. We define the class S^k by S^k = [I, N; comp, spr_{k+1}]. The hierarchy S is defined as ⋃_{k≥0} S^k (x of arbitrary length). The schema

F(i, v, x, u) ⇔ G(v, x, u), if i = 0;  ∃z ≤ v (F(i − 1, v, x, z) ∧ H(i − 1, x, z, u)), if i > 0

defines a relation F in terms of the relations G and H. The schema is called k-fold graph recursion, and is denoted by grec_k. Let S̃⋆^k be defined as the closure of S⋆^k under disjunction, conjunction, bounded existential quantification and grec_k. Finally, define S̃⋆ as the union ⋃_k S̃⋆^k.

f(un). But this implies that f(w) ≥ f(un) + 2^{−n}. Therefore we obtain that f(w ∗ 0) ≥ f(w) ≥ f(un) + 2^{−n} > f(u), which is a contradiction, since u ∈ T. Thus T is an infinite tree. By WKL, there is an infinite branch α of T. Since B is a bar, there is N such that for all w we have v ∗ ᾱN ∗ w ∈ D.
We have v ∈ B ↔ ∀n ≤ N (v ∗ ᾱn ∈ D), and the statement on the right hand side is decidable. Therefore, we can decide whether v belongs to B or not. The next corollary is an immediate consequence of Proposition 1.

Corollary 1. Under WKL, the axioms FTc and FTΔ are equivalent.

Finally, we can show our main result.

Corollary 2. WKL implies FTc.

Proof. This follows from Corollary 1, combined with Ishihara's result in [3] that WKL implies FTΔ.

Acknowledgments. This result was conceived while holding a Postdoctoral Fellowship for Foreign Researchers provided by the Japan Society for the Promotion of Science.


Josef Berger

References

1. Josef Berger (2006), The logical strength of the uniform continuity theorem, in: A. Beckmann, U. Berger, B. Löwe, and J.V. Tucker (eds.), Logical Approaches to Computational Barriers, Lecture Notes in Computer Science 3988, Springer-Verlag, 35–39.
2. Errett Bishop (1967), Foundations of Constructive Analysis, McGraw–Hill, New York.
3. Hajime Ishihara (2006), Weak König's lemma implies Brouwer's fan theorem: a direct proof, Notre Dame Journal of Formal Logic 47 (2): 249–252.
4. Anne S. Troelstra and Dirk van Dalen (1988), Constructivism in Mathematics. An Introduction. Vol. I, Studies in Logic and the Foundations of Mathematics 121, North-Holland.

Semantics of Sub-Probabilistic Programs⋆

Yixiang Chen¹ and Hengyang Wu²

¹ Institute of Theoretical Computing, East China Normal University, Shanghai 200062, P.R. China
² School of Mathematics, Physics and Informatics, Shanghai Normal University, Shanghai 200234, P.R. China

Abstract. The aim of this paper is to extend the probabilistic choice in probabilistic programs to sub-probabilistic choice, i.e., of the form (p)P ⋈ (q)Q where p + q ≤ 1, which means that program P is executed with probability p and program Q is executed with probability q. Then, starting from an initial state, the execution of a sub-probabilistic program results in a sub-probability distribution. This paper presents two equivalent semantics for a high level sub-probabilistic while-programming language. One of these interprets programs as sub-probability distributions on state spaces via denotational semantics. The other interprets programs as bounded expectation transformers via wp-semantics. This paper also proposes an axiomatic system for total logic, and shows its consistency and completeness.

1

Introduction

The analysis and design of complex software and hardware systems often involve certain random phenomena. This motivates one to develop formal methods for modeling and reasoning about programs containing probability information. Early in the 1970s, Gill ([7]) and Paz ([17]) introduced probabilistic automata. Yao ([22]) and Rabin ([19]) grouped research in probabilistic algorithms into two areas, which Yao termed the distributional approach and the randomized approach. The equivalence of these two approaches was established by Yao ([22]), who connected them by defining a measure of complexity based on each. The formal semantics herein provides a common framework in which the two approaches are unified. Later, in 1981, Kozen ([12, 13]) investigated the semantics of probabilistic programs for a high level probabilistic programming language including the random assignment x := random; since then, the formalization of probabilistic programs has become an important topic of investigation in theoretical computer science, and the formal semantics of various probabilistic programming languages has been studied ([3], [6], [8], [11], [14], [15], [21], [23]). He [8] studied the probabilistic version of Dijkstra's guarded command language, including both probabilistic choice and non-deterministic choice. Jones [10, 11] discussed the probabilistic while language, which includes only probabilistic choice. Morgan ([14]) investigated the semantics of He's relational semantic model through probabilistic predicate transformers. McIver ([15]) studied partial correctness for probabilistic demonic programs. Recently, Ying ([23]) developed formal methods and mathematical tools for modelling and reasoning about programs containing probability information. Tix ([21]) studied semantic domains for combining probability and non-determinism. Chen ([3]) tried to provide a fuzzy semantics of probabilistic programs.
In all the cases discussed above, the probabilistic choice is of the form P p⊕ Q, which means that program P is executed with probability p and program Q is executed with probability 1 − p. This choice is often called a total probabilistic choice.

This presentation is supported by the NSF of China (No.60273052, 60673117), the Doctorial Project of the Educational Ministry of China (No.20050270004) and by STCSM (project No.06JC14022).


The aim of this paper is to extend the total probabilistic choice to sub-probabilistic choice, i.e., of the form (p)P ⋈ (q)Q where p + q ≤ 1. Its meaning is that program P is executed with probability p and program Q is executed with probability q. Then, starting from an initial state, the execution of a sub-probabilistic program results in a sub-probability distribution. Our sub-probabilistic choice differs from the total one in three aspects. First, the two parameters p and q are specified independently in our choice, whereas in the total one only the parameter p is specified and q is determined by it (in fact, q = 1 − p). Second, we adopt the sub-stochastic model in this paper and merely require p + q ≤ 1, instead of the stochastic condition p + q = 1. Third, the sub-stochastic condition motivates us to consider non-regular phenomena such as "no state at all", deadlock and others. This paper presents two equivalent semantics for a high level sub-probabilistic while-programming language. One of these interprets programs as sub-probability distributions on state spaces via denotational semantics. The other interprets programs as bounded expectation transformers via wp-semantics. This paper also proposes an axiomatic system for total logic, and shows its consistency and completeness.
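As a sketch of the intended denotational reading (our illustration; the paper's formal semantics follows in later sections), a program can be modelled as a map from states to sub-distributions, and the sub-probabilistic choice then takes the weighted sum of the two branches, the missing mass 1 − (p + q) contributing to "no state at all". All names below are ours.

```python
def sub_choice(p, P, q, Q):
    """Sketch of (p)P choice (q)Q as a transformer of sub-distributions:
    programs are maps state -> dict(state -> prob) with total mass <= 1.
    Requires p + q <= 1; the deficit feeds nontermination."""
    assert 0 <= p and 0 <= q and p + q <= 1
    def run(s):
        d = {}
        for t, pr in P(s).items():
            d[t] = d.get(t, 0.0) + p * pr
        for t, pr in Q(s).items():
            d[t] = d.get(t, 0.0) + q * pr
        return d
    return run

skip = lambda s: {s: 1.0}          # skip: point mass at the current state
inc  = lambda s: {s + 1: 1.0}      # a sample assignment s := s + 1

prog = sub_choice(0.5, skip, 0.3, inc)
d = prog(0)
print(d, round(1.0 - sum(d.values()), 10))   # {0: 0.5, 1: 0.3} 0.2
```

The leftover 0.2 is exactly the sub-stochastic deficit that the paper reads as the probability of reaching no state at all.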

2  Preliminaries

Following Morgan's paper [14] and Ying's paper [23], we consider the case of a countable state space S.

Definition 1. For a countable state space S, the set of sub-probability distributions over S is

D(S) := {µ : S → [0, 1] | Σ_{s∈S} µ(s) ≤ 1}.

According to Morgan, if Σ_{s∈S} µ(s) = 1 then µ(s)(s′) is the probability that µ takes s to s′; but if Σ_{s∈S} µ(s) < 1 then µ(s)(s′) is only a lower bound for that probability. This is more general than the usual definitions in that the restriction Σ_{s∈S} µ(s) ≤ 1 is not an equality. So, for µ in D(S), the difference

1 − Σ_{s∈S} µ(s)

may be regarded as the probability of "no state at all" – a convenient treatment of nontermination that allows one to forgo ⊥.

We can consider the point-wise order between sub-probability distributions, i.e., for any µ, µ′ ∈ D(S), µ ⊑ µ′ := (∀s ∈ S. µ(s) ≤ µ′(s)). Then (D(S), ⊑) is a poset. Furthermore, we have the following proposition.

Proposition 1. (1) If S is a single point set {1}, then D(1) is isomorphic to the interval [0, 1]. (2) For a countable state space S, (D(S), ⊑) is a complete partial order (see [15], Lemma 2.4, page 518); the least element is ⊥(s) = 0 for any s ∈ S. (3) D(S) is convex, that is, for any µ1, µ2 ∈ D(S) and p, q ∈ [0, 1] with p + q ≤ 1, p · µ1 + q · µ2 ∈ D(S). (4) S → D(S) is a dcpo under the pointwise order. □

The definition below is due to Kozen ([12], page 331), to Morgan ([14], page 329), or to He ([8], page 174).

Definition 2. For a state s ∈ S, the point distribution or point mass at s is defined by

s̄(s′) = 1 if s′ = s, and s̄(s′) = 0 otherwise.

Semantics of Sub-Probabilistic Programs


Based on point masses, one can define a map η_S : S −→ D(S) by setting η_S(s) = s̄, which is an embedding. Below, we introduce the notion of probabilistic predicates following Morgan ([14], page 332) and Ying ([23], page 325).

Definition 3. A probabilistic predicate on the state space S is defined to be a bounded expectation on S, namely, a function α of type S → R⁺ (the set of non-negative reals) such that there is M ∈ R⁺ with α(s) ≤ M for all s ∈ S. We denote the set of all probabilistic predicates on S by PS.

The order between probabilistic predicates is defined point-wise, i.e., for any α, β ∈ PS, α ⇛ β := (∀s ∈ S. α(s) ≤ β(s)). It is clear that sup_{s∈S} α(s) is a finite real, for any probabilistic predicate α. In the sense of Ying ([23], page 325), ⇛ intuitively means "everywhere no more than". Ying also pointed out ([23], page 326) that (PS, ⇛) is a ⊓-complete, atomless, distributive lattice, but not ⊔-complete, because the least upper bound of infinitely many bounded expectations may no longer be bounded. But, clearly, if αn ≤ M for every n ∈ ω, then ⊔_{n∈ω} αn is a probabilistic predicate over S. Ying ([23], page 326) defined arithmetic operations on PS. Let α, β ∈ PS and r ∈ [0, 1]. Then the sum α ⊕ β and the scalar product r ⊙ α are in PS, where for each s ∈ S, (α ⊕ β)(s) := α(s) + β(s) and (r ⊙ α)(s) := r × α(s). Clearly, the point distribution s̄ at s is also a probabilistic predicate. The next proposition gives a representation of probabilistic predicates in terms of point masses.

Proposition 2. For any α ∈ PS, we have α = Σ_{s∈S} (α(s) ⊙ s̄). ⋄

The following definition is important for probabilistic computation. It shows a connection between probabilistic distributions and probabilistic predicates, and at the same time it gives a kind of measure of probabilistic predicates with respect to probabilistic distributions. This measure gives the expected value of expectations, denoted using the integration notation ∫.

Definition 4. [14] For a probabilistic predicate α : S → R⁺ and a sub-probability distribution µ ∈ D(S), the expected value of α over µ is

∫ α dµ := Σ_{s∈S} (α(s) × µ(s)).

By the definition of sub-probability distributions, it follows easily that ∫ α dµ ≤ sup_{s∈S} α(s). For the integration, one gets the properties below.

Proposition 3. (1) ∫ a dµ ≤ a, where a ∈ R⁺,
(2) ∫ (α ⊕ β) dµ = ∫ α dµ + ∫ β dµ,
(3) ∫ r ⊙ α dµ = r × ∫ α dµ, where r ∈ R⁺,
(4) ∫ α d(Σ_{i=1}^n r_i · µ_i) = Σ_{i=1}^n (r_i · ∫ α dµ_i), where Σ_{i=1}^n r_i ≤ 1 and µ_i ∈ D(S),
(5) ∫ α d(⊔_{i∈I} µ_i) = sup_{i∈I} ∫ α dµ_i, for any directed subset {µ_i : i ∈ I} of D(S). ⋄
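As a small numerical illustration (ours), the expected value of Definition 4 and the linearity properties of Proposition 3 can be checked directly on a finite sub-distribution:

```python
# Illustrative sketch: the expected value of a predicate over a sub-distribution.

def expect(alpha, mu):
    """Definition 4: the integral of predicate alpha over sub-distribution mu."""
    return sum(alpha(s) * p for s, p in mu.items())

mu = {"s0": 0.5, "s1": 0.3}            # 0.2 mass missing: possible nontermination
alpha = lambda s: 2.0 if s == "s1" else 1.0
beta = lambda s: 0.5

# Proposition 3(1): a constant a integrates to at most a
assert expect(lambda s: 1.0, mu) <= 1.0
# Proposition 3(2): additivity
assert abs(expect(lambda s: alpha(s) + beta(s), mu)
           - (expect(alpha, mu) + expect(beta, mu))) < 1e-9
# Proposition 3(3): scalar homogeneity
assert abs(expect(lambda s: 0.4 * alpha(s), mu) - 0.4 * expect(alpha, mu)) < 1e-9
```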

Yixiang Chen and Hengyang Wu

3 Denotational Semantics

This paper mainly investigates the semantics of a simple sub-probabilistic while language, based on sub-probability distributions. This language is defined below.

P ::= skip | assign f | (p)P ⋈ (q)Q | P ; Q | if B then P else Q | while B do P

where p, q ∈ [0, 1] with p + q ≤ 1, and f : S −→ S is a function. Let us list all denotations and discuss their meaning afterwards. The denotation of a sub-probabilistic program P will be given as a function [[P]] : S → D(S). Let b = [[B]] for a Boolean expression B, whose meaning is b(s) = 1 if [[B]]s = true, and b(s) = 0 if [[B]]s = false. For any state s ∈ S, we have

[[skip]]s := s̄
[[assign f]]s := \overline{f(s)}, the point distribution at f(s), for a function f : S −→ S
[[(p)P ⋈ (q)Q]]s := p · [[P]]s + q · [[Q]]s
[[P ; Q]]s := [[Q]]†([[P]]s) (see below how [[Q]]† is lifted)
[[if B then P else Q]]s := b(s) · [[P]]s + (1 − b(s)) · [[Q]]s
[[while B do P]]s := ⊔_{n∈ω} f_n(s), where f_n : S → D(S) is defined by f_0 = λs.0 and f_{n+1}(s) = b(s) · f_n†([[P]]s) + (1 − b(s)) · s̄.

The meanings of the items above, except [[P ; Q]] and [[while B do P]], are clear. Now, we establish the meanings of these two items. We define † first. Given f : S → D(S), by the equation

f†(µ)(s) = ∫_{s′∈S} f(s′)(s) dµ   (µ ∈ D(S), s ∈ S),

we get the map f† : D(S) −→ D(S). Notice that λs′ : S · f(s′)(s) is a function from S into [0, 1]. It is a bounded expectation with bound 1. So the integration ∫_{s′∈S} f(s′)(s) dµ is well-defined for any µ ∈ D(S) and s ∈ S. By the way, the mapping † is monotone, i.e., f† ≤ g† whenever f ≤ g for f, g of type S → D(S). We have thus given the meaning of the sequential composition P ; Q. The meaning of the while program then follows from the proposition below.

Proposition 4. For any n ∈ ω and s ∈ S, f_n (defined above) has the following properties:
1. f_n(s) is a sub-probability distribution on S, i.e., f_n is well-defined,
2. f_n ≤ f_{n+1}, that is, {f_n}_{n∈ω} is an increasing chain,
3. ⊔_{n∈ω} f_n ∈ (S −→ D(S)), and it is a fixed point of F, where F : (S → D(S)) → (S → D(S)) is defined by F(f)(s) = b(s) · f†([[P]]s) + (1 − b(s)) · s̄ for any f : S → D(S) and s ∈ S. ⋄
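The denotational clauses above can be prototyped directly. The sketch below (illustrative Python, ours; the loop bound 60 approximates the supremum ⊔_{n∈ω} f_n of Proposition 4) represents [[P]] as a function from states to dictionaries; the example loop silently loses 0.1 of its mass each round, so the missing mass of the result is exactly the probability of "no state at all".

```python
# Illustrative sketch: programs as functions S -> D(S), D(S) as dictionaries.

def skip(s):
    return {s: 1.0}                       # [[skip]]s = point mass at s

def assign(f):
    return lambda s: {f(s): 1.0}          # [[assign f]]s = point mass at f(s)

def choice(p, P, q, Q):                   # sub-probabilistic choice, p + q <= 1
    def run(s):
        out = {}
        for w, R in ((p, P), (q, Q)):
            for t, pr in R(s).items():
                out[t] = out.get(t, 0.0) + w * pr
        return out
    return run

def lift(f):                              # the lifting f-dagger : D(S) -> D(S)
    def fd(mu):
        out = {}
        for s, pr in mu.items():
            for t, pr2 in f(s).items():
                out[t] = out.get(t, 0.0) + pr * pr2
        return out
    return fd

def seq(P, Q):
    return lambda s: lift(Q)(P(s))        # [[P;Q]]s = [[Q]]-dagger([[P]]s)

def while_(b, P, n=60):
    def run(s):
        fn = lambda s: {}                 # f_0 = the everywhere-empty sub-distribution
        for _ in range(n):                # f_{k+1}(s) = b(s)*f_k-dagger([[P]]s) + (1-b(s))*s-bar
            fn = (lambda g: lambda s: lift(g)(P(s)) if b(s) else {s: 1.0})(fn)
        return fn(s)
    return run

# A loop that exits with probability 0.5 per round, repeats with 0.4,
# and silently loses 0.1: termination probability 0.5/(1-0.4) = 5/6.
body = choice(0.5, assign(lambda s: "done"), 0.4, skip)
mu = while_(lambda s: s == "loop", body)("loop")
assert abs(mu["done"] - 5/6) < 1e-6
assert abs((1 - sum(mu.values())) - 1/6) < 1e-6   # mass of "no state at all"
```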

4 Axiomatic Semantics For Total Correctness

This section investigates the total logic of triples α{P}β, where α and β are probabilistic predicates and P is a sub-probabilistic program whose meaning is given through the denotational semantics of the previous section. We say that a state s satisfies a probabilistic predicate α with expected value ι ∈ R⁺ if α(s) ≥ ι. A sub-probability distribution µ satisfies a probabilistic predicate α with expected value ι if the integral ∫ α dµ ≥ ι. Total correctness for a probabilistic triple α{P}β with expected value ι means that, for any state s ∈ S, if s satisfies α with expected value ι, then program P will terminate from state s, and the output [[P]](s) satisfies the postcondition β with expected value ι too. That is, if α(s) ≥ ι then P terminates from s and ∫ β d[[P]](s) ≥ ι, for any state s ∈ S. We call the triple α{P}β valid if ∀s ∈ S · α(s) ≤ ∫ β d[[P]](s). Validity of a triple α{P}β means intuitively that, if the precondition α is satisfied by state s with expected value α(s), then P is guaranteed to terminate from s, and the output sub-distribution [[P]](s) will satisfy the postcondition β with expected value ∫ β d[[P]](s), where α(s) ≤ ∫ β d[[P]](s). We use the notation |= α{P}β to denote that the triple is valid. Now, we give an axiomatic system for total correctness.

[skip]  α{skip}α

[ass]  α{assign f}β, if α(s) = β(f(s)) for all s ∈ S

[probability]  from α1{P}β and α2{Q}β infer (p ⊙ α1 ⊕ q ⊙ α2){(p)P ⋈ (q)Q}β

[comp]  from α{P}β and β{Q}γ infer α{P ; Q}γ

[if]  from α{P}β and α′{Q}β infer (b ⊙ α ⊕ (1 − b) ⊙ α′){if B then P else Q}β

[while]  from α_{n+1}{P}(b ⊙ α_n ⊕ (1 − b) ⊙ β) for all n ∈ ω infer (b ⊙ (⊔_{n∈ω} α_n) ⊕ (1 − b) ⊙ β){while B do P}β

[cons]  from α{P}β infer α′{P}β′, if α′ ≤ α and β ≤ β′.

A proof of a triple is a sequence of triples in which each term is an instance of an axiom or is derived from previous terms by one of the rules above. The last triple, say α{P}β, is called a theorem, which we denote by ⊢ α{P}β. In general, although the αn are probabilistic predicates, ⊔_{n∈ω} αn need not be a probabilistic predicate, because the least upper bound of infinitely many bounded expectations may no longer be bounded. So we need to check that ⊔_{n∈ω} αn is a probabilistic predicate in the rule [while] of the axiomatic semantics. In fact, we have the following proposition.

Proposition 5. If ⊢ α_{n+1}{P}(b ⊙ α_n ⊕ (1 − b) ⊙ β), then α_n(s) ≤ sup_{t∈S} β(t) for any s ∈ S and n ∈ ω, where α_0 = λs.0. So ⊔_{n∈ω} α_n is a probabilistic predicate. ⋄

Theorem 1. (Consistency and Completeness) Given a sub-probabilistic triple α{P}β, we have ⊢ α{P}β if and only if |= α{P}β. ⋄
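As a small numerical illustration (ours, not the paper's) of why the [probability] rule is sound: the output of the choice is the convex combination p · [[P]]s + q · [[Q]]s, and integration is linear, so the combined precondition is valid whenever the premises are. The programs and predicates below are ad hoc examples.

```python
# Illustrative soundness check of the [probability] rule on a 3-state space.

def expect(beta, mu):
    return sum(beta(s) * pr for s, pr in mu.items())

S = [0, 1, 2]
P = lambda s: {(s + 1) % 3: 0.9}          # sub-probabilistic: 0.1 mass lost
Q = lambda s: {s: 1.0}
beta = lambda s: float(s)

# weakest valid preconditions for P and Q with respect to beta
a1 = lambda s: expect(beta, P(s))
a2 = lambda s: expect(beta, Q(s))

p, q = 0.6, 0.3
choice = lambda s: {t: p * P(s).get(t, 0) + q * Q(s).get(t, 0) for t in S}

for s in S:
    pre = p * a1(s) + q * a2(s)           # (p . a1 + q . a2)(s)
    assert pre <= expect(beta, choice(s)) + 1e-9   # the conclusion is valid
```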

5 Equivalence Between Semantics

This section discusses the equivalence of the semantics of sub-probabilistic programs through the wp-calculus. Given a sub-probabilistic program P (i.e., a function from S into D(S)) and a probabilistic predicate β, one can define the weakest precondition wp(P, β) to be the weakest probabilistic predicate α making α{P}β valid. Note that the weakest one in the total logic means the largest predicate. So, we define wp(P, β) by setting, for any state s ∈ S,

wp(P, β)(s) = ∫ β dP(s).

It follows that wp(P, β) is a probabilistic predicate over the state space S, for any sub-probabilistic program P and probabilistic predicate β, by

∫ β dP(s) = Σ_{t∈S} (β(t) × P(s)(t)) ≤ (sup_{t∈S} β(t)) Σ_{t∈S} P(s)(t) ≤ sup_{t∈S} β(t).

Theorem 2. Given a probabilistic predicate β, wp is computed through the following equations:
1. wp(skip, β) = β,
2. wp(assign f, β) = λs : S. β(f(s)),
3. wp((p)P ⋈ (q)Q, β) = p ⊙ wp(P, β) ⊕ q ⊙ wp(Q, β),
4. wp(P ; Q, β) = wp(P, wp(Q, β)),
5. wp(if B then P else Q, β) = b ⊙ wp(P, β) ⊕ (1 − b) ⊙ wp(Q, β),
6. wp(while B do P, β) = ⊔_{n∈ω} αn, where α0 = λs.0 and α_{n+1} = b ⊙ wp(P, αn) ⊕ (1 − b) ⊙ β. ⋄

Theorem 3. Given a sub-probabilistic program P, wp has the following properties.
Miracle: wp(P, 0) = 0.
Monotonicity: wp(P, β1) ≤ wp(P, β2), if β1 ≤ β2.
Homogeneity: wp(P, r ⊙ β) = r ⊙ wp(P, β), where r ∈ R⁺.
Affineness: wp(P, Σ_{i=1}^n r_i · β_i) = Σ_{i=1}^n r_i ⊙ wp(P, β_i), where r_i ∈ R⁺.
Continuity: If {β_i : i ∈ I} is a directed subset of probabilistic predicates, and ⊔_{i∈I} β_i exists, then wp(P, ⊔_{i∈I} β_i) = ⊔_{i∈I} wp(P, β_i).
Boundedness: For any s ∈ S, Σ_{y∈S} wp(P, ȳ)(s) ≤ 1.

Given a sub-probabilistic program P, wp(P, −) defines a function from PS into PS, which is indeed a probabilistic predicate transformer over the state space S. This transformer can be used to define a semantics of sub-probabilistic programs, wp(P) := wp(P, −), called the wp-semantics. The denotational semantics and the wp-semantics are connected as follows.
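Theorem 2's equations translate directly into code. The sketch below (illustrative, ours; finite state space, with the while-clause approximated by iterating the chain α_n) computes wp syntactically and recovers the termination probability 5/6 of a lossy coin-flip loop.

```python
# Illustrative sketch: wp over a finite state space; predicates as dictionaries.

S = ["loop", "done"]

def wp(P, beta):
    """Theorem 2's equations; the while clause iterates
    a_{n+1} = b*wp(body, a_n) + (1-b)*beta starting from a_0 = 0."""
    tag = P[0]
    if tag == "skip":
        return dict(beta)
    if tag == "assign":
        return {s: beta[P[1](s)] for s in S}
    if tag == "choice":
        _, p, A, q, B = P
        wa, wb = wp(A, beta), wp(B, beta)
        return {s: p * wa[s] + q * wb[s] for s in S}
    if tag == "seq":
        return wp(P[1], wp(P[2], beta))   # wp(P;Q, beta) = wp(P, wp(Q, beta))
    if tag == "if":
        _, b, A, B = P
        wa, wb = wp(A, beta), wp(B, beta)
        return {s: wa[s] if b(s) else wb[s] for s in S}
    if tag == "while":
        _, b, body = P
        a = {s: 0.0 for s in S}
        for _ in range(200):              # approximate the sup of the a_n chain
            a = {s: wp(body, a)[s] if b(s) else beta[s] for s in S}
        return a
    raise ValueError(tag)

# wp for "while s == 'loop' do (0.5)(s := 'done') choice (0.4) skip"
body = ("choice", 0.5, ("assign", lambda s: "done"), 0.4, ("skip",))
loop = ("while", lambda s: s == "loop", body)
beta = {"loop": 0.0, "done": 1.0}         # the expectation "being at done"
pre = wp(loop, beta)
assert abs(pre["loop"] - 5/6) < 1e-6      # probability of terminating at "done"
assert pre["done"] == 1.0
```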


Theorem 4. Given a sub-probabilistic program P, we have

∫ β d[[P]]†(µ) = ∫ wp(P, β) dµ

for any probabilistic predicate β over S and any sub-probability distribution µ on S.

¦
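A quick numerical check of Theorem 4 on one example (illustrative only; the program, predicate and distribution are ad hoc): integrating β over the lifted output [[P]]†(µ) agrees with integrating wp(P, β) over µ.

```python
# Illustrative check of Theorem 4 on a 3-state example.

def expect(alpha, mu):
    return sum(alpha[s] * pr for s, pr in mu.items())

def lift(f, mu):                          # [[P]]-dagger applied to mu
    out = {}
    for s, pr in mu.items():
        for t, pr2 in f(s).items():
            out[t] = out.get(t, 0.0) + pr * pr2
    return out

den = lambda s: {(s + 1) % 3: 0.7, s: 0.2}       # [[P]]s, total mass 0.9
beta = {0: 1.0, 1: 0.5, 2: 0.0}
wp_beta = {s: expect(beta, den(s)) for s in range(3)}  # wp(P,beta)(s)

mu = {0: 0.4, 2: 0.5}                     # an arbitrary sub-distribution
lhs = expect(beta, lift(den, mu))         # integral of beta over [[P]]-dagger(mu)
rhs = expect(wp_beta, mu)                 # integral of wp(P,beta) over mu
assert abs(lhs - rhs) < 1e-12
```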

Now, we try to answer the question: which transformers can be defined by a sub-probabilistic program?

Definition 5. (1) A probabilistic predicate transformer over the state space S is a mapping from PS to PS. (2) A probabilistic predicate transformer t is said to be healthy if it satisfies the following healthiness conditions:
1. For any s ∈ S, Σ_{y∈S} t(ȳ)(s) ≤ 1;
2. For any y ∈ S and r ∈ R⁺, t(r ⊙ ȳ) = r ⊙ t(ȳ);
3. t(Σ_{y∈S} r_y ⊙ ȳ) = Σ_{y∈S} r_y ⊙ t(ȳ), where r_y ∈ R⁺.

The notation (PS −→_H PS) will denote the set of all healthy probabilistic predicate transformers over the state space S, with the pointwise order.

Proposition 6. (PS −→_H PS) is closed under sub-convex linear combinations. That is, for any t_i ∈ (PS −→_H PS), Σ_{i=1}^n r_i t_i ∈ (PS −→_H PS), where Σ_{i=1}^n r_i ≤ 1. ⋄

Proposition 7. Let t1 and t2 be healthy probabilistic predicate transformers. If t1(s̄) = t2(s̄) for any s ∈ S, then t1 = t2. ⋄

Proposition 8. Given any α ∈ PS, s ∈ S and t ∈ (PS −→_H PS), we have t(α)(s) ≤ sup_{y∈S} α(y). ⋄

One can define a mapping rp from (PS −→_H PS) to (S −→ D(S)) by setting, for any t ∈ (PS −→_H PS), s ∈ S and y ∈ S,

rp(t)(s)(y) = t(ȳ)(s).

By the definition of healthy probabilistic predicate transformers, one gets

Σ_{y∈S} rp(t)(s)(y) = Σ_{y∈S} t(ȳ)(s) ≤ 1.

So rp(t)(s) ∈ D(S), and hence rp(t) ∈ (S −→ D(S)).

Theorem 5. (1) For any t ∈ (PS −→_H PS) and h ∈ (S −→ D(S)), we have wp(rp(t)) = t and rp(wp(h)) = h. (2) (PS −→_H PS) is isomorphic to (S −→ D(S)) under the pair of functions wp and rp. ⋄

This theorem tells us that if rp(t) can be defined by a sub-probabilistic program P, then the healthy probabilistic predicate transformer t is defined by the same program P, with wp(P) = t. The theorem also establishes the equivalence between the denotational semantics and the wp-semantics.

Acknowledgements. The authors would like to thank the referees for their invaluable comments and suggestions.


References

1. Yixiang Chen. Stable semantics of weakest pre-predicates. Journal of Software, 24 (Suppl.): 161–167, 2003.
2. Yixiang Chen, Achim Jung. An introduction to fuzzy predicate transformers. Invited talk at the Third International Symposium on Domain Theory, Shaanxi Normal University, Xi'an, China, 2004.
3. Yixiang Chen, Gordon Plotkin and Hengyang Wu. On healthy fuzzy predicate transformers. Invited talk at the Fourth International Symposium on Domain Theory, Hunan University, Changsha, China, 2006.
4. W.P. de Roever. Dijkstra's predicate transformer, non-determinism, recursion and termination. In Mathematical Foundations of Computer Science, volume 45 of Springer Lecture Notes in Computer Science, pages 472–481, 1976.
5. E.W. Dijkstra. A Discipline of Programming. Prentice Hall International, Englewood Cliffs, 1976.
6. G. Gierz, K.H. Hofmann, K. Keimel, J.D. Lawson, M. Mislove, and D.S. Scott. Continuous Lattices and Domains, volume 93 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2003.
7. J. Gill. Computational complexity of probabilistic Turing machines. In Proceedings of the 6th ACM Annual Symposium on Theory of Computing, pages 91–95, 1974.
8. J. He, K. Seidel, A.K. McIver. Probabilistic models for the guarded command language. Science of Computer Programming, 28: 171–192, 1997.
9. C.A.R. Hoare. Some properties of predicate transformers. Journal of the Association for Computing Machinery, 25(3): 461–480, 1978.
10. C. Jones. Probabilistic non-determinism. PhD thesis, University of Edinburgh, Edinburgh, 1990. Also published as Technical Report No. CST-63-90.
11. C. Jones and G. Plotkin. A probabilistic powerdomain of evaluations. In Proceedings of the 4th Annual Symposium on Logic in Computer Science, pages 186–195. IEEE Computer Society Press, 1989.
12. D. Kozen. Semantics of probabilistic programs. Journal of Computer and System Sciences, 22: 328–350, 1981.
13. D. Kozen. A probabilistic PDL. Journal of Computer and System Sciences, 30: 162–178, 1985.
14. C. Morgan, A. McIver, K. Seidel. Probabilistic predicate transformers. ACM Transactions on Programming Languages and Systems, 18: 325–353, 1996.
15. A.K. McIver, C. Morgan. Partial correctness for probabilistic demonic programs. Theoretical Computer Science, 266: 513–541, 2001.
16. J.M. Morris. Non-deterministic expressions and predicate transformers. Information Processing Letters, 61: 241–246, 1997.
17. A. Paz. Introduction to Probabilistic Automata. Academic Press, New York, 1971.
18. G.D. Plotkin. Dijkstra's predicate transformers and Smyth's powerdomains. In D. Bjørner, editor, Abstract Software Specifications, volume 86 of Lecture Notes in Computer Science, pages 527–553, 1980.
19. M.O. Rabin. Probabilistic algorithms. In J.F. Traub, editor, Algorithms and Complexity, pages 21–40. Academic Press, New York, 1976.
20. M.B. Smyth. Power domains and predicate transformers: a topological view. In J. Diaz, editor, Automata, Languages and Programming, volume 154 of Lecture Notes in Computer Science, pages 662–675, Berlin, 1983. Springer Verlag.
21. R. Tix, K. Keimel, and G. Plotkin. Semantic domains for combining probability and nondeterminism. Electronic Notes in Theoretical Computer Science, 129: 1–104, 2005.
22. A. Yao. Probabilistic computations: toward a unified measure of complexity. In Proceedings of the 18th IEEE Symposium on Foundations of Computer Science, pages 222–227, Providence, 1977.
23. M.S. Ying. Reasoning about probabilistic sequential programs in a probabilistic logic. Acta Informatica, 39: 315–389, 2003.

Concrete and Abstract Quantum Computational Logics

Maria Luisa Dalla Chiara¹, Roberto Giuntini², and Roberto Leporini³

¹ Dipartimento di Filosofia, Università di Firenze, Via Bolognese 52, I-50139 Firenze, Italy, [email protected]
² Dipartimento di Scienze Pedagogiche e Filosofiche, Università di Cagliari, Via Is Mirrionis 1, I-09123 Cagliari, Italy, [email protected]
³ Dipartimento di Matematica, Statistica, Informatica e Applicazioni, Università di Bergamo, Via dei Caniana 2, I-24127 Bergamo, Italy, [email protected]

Abstract. The theory of logical gates in quantum computation has suggested some new forms of quantum logics, called quantum computational logics. In the standard semantics of these logics, formulas denote quantum information quantities (systems of qubits, or, more generally, mixtures of systems of qubits), while the logical connectives are interpreted as logical operations defined in terms of special quantum logical gates (which have a characteristic reversible and dynamic behavior). We consider two kinds of quantum computational semantics: 1) a compositional semantics, where the meaning of any compound formula is determined by the meanings of its parts; 2) a holistic semantics, which makes essential use of the characteristic “holistic” features of the quantum-theoretic formalism. The compositional and the holistic semantics turn out to characterize the same logic. Quantum computational logics can be applied to investigate different kinds of semantic phenomena where holistic and contextual patterns play an essential role (from natural languages to musical compositions). Is it sensible to look for an abstract quantum computational semantics that is not necessarily “Hilbert-space dependent”? In this perspective, we introduce a weak form of quantum computational logic (called abstract quantum computational logic), which is axiomatizable.

1 The Quantum-Computation Environment

Consider the two-dimensional Hilbert space C2 and let B(1) = {|0⟩, |1⟩} be the canonical orthonormal basis for C2. A qubit is a unit vector |ψ⟩ of C2. An n-qubit system (or n-quregister) is a unit vector in the n-fold tensor product Hilbert space ⊗n C2 := C2 ⊗ ... ⊗ C2 (n times, where ⊗1 C2 := C2). A qumix (or mixture of quregisters) is a density operator of ⊗n C2 (where n ≥ 1). We use x, y, ... as variables ranging over the set {0, 1}, while |x⟩, |y⟩, ... range over the basis B(1). Any factorized unit vector |x1⟩ ⊗ ... ⊗ |xn⟩ (abbreviated as |x1, ..., xn⟩) is called a (classical) register. The set B(n) of all classical registers is an orthonormal basis for ⊗n C2. Let Q(⊗n C2) and D(⊗n C2) represent respectively the set of all quregisters and the set of all qumixes of ⊗n C2.

Definition 1. A register |x1, ..., xn⟩ is called true (false) iff xn = 1 (xn = 0).

On this basis, one can identify, in any ⊗n C2, two special projections, P1(n) and P0(n), representing the Truth-property and the Falsity-property, respectively. The projection P1(n) is determined by the closed subspace spanned by the set of all true registers, while P0(n) is determined by the closed subspace spanned by the set of all false registers.


By applying the "Born rule", one can define the probability that a qumix satisfies the truth-property.

Definition 2. Probability of a qumix
For any qumix ρ ∈ D(⊗n C2), p(ρ) := tr(P1(n) ρ), where tr is the trace functional.

In quantum computation, information is processed by quantum logical gates (briefly, gates): unitary operators that transform quregisters into quregisters. In the standard quantum computational semantics we use at least the following gates.

Definition 3. The negation
For any n ≥ 1, the negation on ⊗n C2 is the unitary operator Not(n) such that for every element |x1, ..., xn⟩ of the basis B(n):

Not(n)(|x1, ..., xn⟩) := |x1, ..., x_{n−1}⟩ ⊗ |1 − xn⟩.

Definition 4. The Petri-Toffoli gate
For any m ≥ 1 and any n ≥ 1, the Petri-Toffoli gate is the unitary operator T(m,n,1) defined on ⊗^{m+n+1} C2 such that for every element |x1, ..., xm⟩ ⊗ |y1, ..., yn⟩ ⊗ |z⟩ of the basis B(m+n+1):

T(m,n,1)(|x1, ..., xm⟩ ⊗ |y1, ..., yn⟩ ⊗ |z⟩) := |x1, ..., xm⟩ ⊗ |y1, ..., yn⟩ ⊗ |x_m y_n ⊕ z⟩,

where ⊕ represents the sum modulo 2.

Definition 5. The square root of the negation
For any n ≥ 1, the square root of the negation on ⊗n C2 is the unitary operator √Not(n) such that for every element |x1, ..., xn⟩ of the basis B(n):

√Not(n)(|x1, ..., xn⟩) := |x1, ..., x_{n−1}⟩ ⊗ (1/2)((1 + i)|xn⟩ + (1 − i)|1 − xn⟩),

where i := √−1.

From a logical point of view, √Not(n) can be regarded as a "tentative partial negation" that transforms precise pieces of information into maximally uncertain ones. One is dealing with a typically quantum logical operation that does not admit any counterpart either in classical logic or in standard fuzzy logics (see [3]). These gates can be uniformly defined on the set Q of all quregisters in the expected way. Furthermore, they can be naturally generalized to qumixes [6]. When our gates are applied to density operators, we write NOT, √NOT, T (instead of Not, √Not, T). The set D of all qumixes can be pre-ordered by the following relation:

Definition 6. Preorder
ρ ≼ σ iff the following conditions hold: (i) p(ρ) ≤ p(σ); (ii) p(√NOT(σ)) ≤ p(√NOT(ρ)).

This gives rise to an equivalence relation (σ ≈ τ iff σ ≼ τ and τ ≼ σ), which is a congruence with respect to the operations AND, NOT, √NOT.
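For the one-qubit case, Definitions 3 and 5 give concrete 2×2 matrices. The pure-Python sketch below (ours, illustrative) verifies that √Not(1) composed with itself is Not(1), and that applied to the bit |0⟩ it yields a maximally uncertain qubit.

```python
# Illustrative sketch: the one-qubit gates Not(1) and sqrt-Not(1) as matrices.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

NOT1 = [[0, 1], [1, 0]]
SQRT_NOT1 = [[(1 + 1j) / 2, (1 - 1j) / 2],
             [(1 - 1j) / 2, (1 + 1j) / 2]]

# sqrt-Not(1) squared is Not(1)
square = matmul(SQRT_NOT1, SQRT_NOT1)
assert all(abs(square[i][j] - NOT1[i][j]) < 1e-12 for i in range(2) for j in range(2))

# Applied to |0>, it produces a qubit whose probability of "truth"
# (squared amplitude of |1>) is exactly 1/2.
psi = [SQRT_NOT1[i][0] for i in range(2)]      # the column for input |0>
assert abs(abs(psi[1]) ** 2 - 0.5) < 1e-12
```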

2 Quantum Computational Formulas and Quantum Circuits

The minimal quantum computational language L contains a privileged atomic formula f (whose intended interpretation is the Falsity) and the following primitive connectives: the negation (¬), the square root of the negation (√¬), and a ternary conjunction ⋀ (which corresponds to the Petri-Toffoli gate). For any formulas α and β, the expression ⋀(α, β, f) is a formula of L. In this framework, the usual conjunction α ∧ β is dealt with as a metalinguistic abbreviation for the ternary conjunction ⋀(α, β, f), while α ∨ β := ¬(¬α ∧ ¬β). By the atomic complexity of a formula α (indicated by At(α)) we mean the number of occurrences of atomic formulas in α. Since At(α) determines the dimension of the Hilbert space where a qumix representing information about α should live, the space ⊗^{At(α)} C2 is called the semantic space of α (briefly indicated by Hα). Any formula α can be naturally decomposed into its parts, giving rise to a special configuration called the syntactical tree of α (indicated by STree_α). Roughly, STree_α can be represented as a sequence of levels Level_k(α), ..., Level_1(α), where: 1) each Level_i(α) (with 1 ≤ i ≤ k) is a sequence of subformulas of α; 2) the bottom level (Level_1(α)) consists of α; 3) the top level (Level_k(α)) is the sequence of all atomic occurrences in α; 4) for any i (with 1 ≤ i < k), Level_{i+1}(α) is the sequence obtained by dropping the principal connective in all molecular formulas occurring at Level_i(α), and by repeating all the atomic formulas that possibly occur at Level_i(α). By the Height of α (indicated by Height(α)) we mean the number of levels of STree_α. The syntactical tree of α uniquely determines a sequence

(G^α_1, ..., G^α_{Height(α)−1})

of gates, all defined on the semantic space of α. We call this gate-sequence the qubit tree of α. Qubit trees can be naturally generalized to qumixes. Let (G^α_1, ..., G^α_{k−1}) be the qubit tree of α. Consider the sequence of functions (ᴰG^α_1, ..., ᴰG^α_{k−1}) such that for any ρ ∈ D(Hα), ᴰG^α_i(ρ) = G^α_i ρ G^{α∗}_i (where G^{α∗}_i is the adjoint of G^α_i). This sequence is called the qumix tree of α. By an α-computation we mean a sequence (ρ_k, ..., ρ_1) of qumixes of Hα, where ρ_{k−1} = ᴰG^α_{k−1}(ρ_k), ..., ρ_1 = ᴰG^α_1(ρ_2). The qumix ρ_k can be regarded as a possible input-information concerning the atomic parts of α, while ρ_1 represents the output-information about α, given the input-information ρ_k. Each ρ_i corresponds to the information about Level_i(α), given the input-information ρ_k. How can one determine the information about the parts of α under a given input? It is natural to apply the standard quantum-theoretic rule that determines the states of the parts of a compound system. Suppose that Level_i(α) = β_{i1}, ..., β_{ir}, and let (ρ_k, ..., ρ_i, ..., ρ_1) be an α-computation. Consider red_j(ρ_i), the reduced state of ρ_i with respect to the j-th subsystem. From a semantic point of view, this state can be regarded as a contextual information about β_{ij} (the subformula of α occurring at the j-th position of Level_i(α)) under the input ρ_k. Since both qubit trees and qumix trees are determined by the syntactical tree of a given formula, one can also say that any formula α of the quantum computational language plays the role of an intuitive and "economical" description of a quantum circuit.

3 Compositional and Holistic (Concrete) Quantum Computational Semantics

In the compositional quantum computational semantics, the meaning of any molecular formula is determined by the meanings of its parts: the input-information about the top level of the syntactical tree of a formula α is always associated to a factorized state ρ1 ⊗ . . . ⊗ ρt , where t is the atomic complexity of α and ρ1 , . . . , ρt are qumixes of C2 . As a consequence, the meaning


of a molecular α cannot be a pure state, if the meanings of some atomic parts of α are proper mixtures [3]. The holistic quantum computational semantics is based on a more "liberal" assumption: the input information about the top level of the syntactical tree of α can be represented by any qumix "living" in the semantic space of α. As a consequence, the meanings of the levels of STree_α are not, in general, factorized states (and might correspond to entangled states). The main concept of our semantics is the notion of a holistic quantum computational model: a function Hol that assigns to any formula α of the quantum computational language a global meaning, which cannot in general be inferred from the meanings of the parts of α.

Definition 7. Atomic holistic model
An atomic holistic model is a map Hol_At that associates a qumix with any formula α of L, satisfying the following conditions:
(1) Hol_At(α) ∈ D(Hα);
(2) Let At(α) = n and Level_{Height(α)}(α) = q1, ..., qn. Then:
(2.1) if qj = f, then red_j(Hol_At(α)) = P0;
(2.2) if qj and qh are two occurrences in α of the same atomic formula, then red_j(Hol_At(α)) = red_h(Hol_At(α)).

Apparently, Hol_At(α) represents a global interpretation of the atomic formulas occurring in α. At the same time, red_j(Hol_At(α)), the reduced state of the compound system (described by Hol_At(α)) with respect to the j-th subsystem, represents a contextual meaning of qj with respect to the global meaning Hol_At(α). The map Hol_At (which assigns a meaning to the top level of the syntactical tree of any sentence α) can be naturally extended to a map Hol_Tree that assigns a meaning to each level of the syntactical tree of any α, following the prescriptions of the qumix tree of α.

Definition 8. Holistic model
A map Hol that assigns to any formula α a qumix of the space Hα is called a holistic (quantum computational) model of L iff there exists an atomic holistic model Hol_At s.t. Hol(α) = Hol_Tree(Level_1(α)), where Hol_Tree is the extension of Hol_At.

Given a formula γ, Hol determines the contextual meaning, with respect to the context Hol(γ), of any occurrence of a subformula β in γ (i.e. of any node β_ij of STree_γ): Hol_γ(β_ij) := red_j(Hol_Tree(Level_i(γ))). One can prove that two different occurrences of one and the same subformula in a formula γ receive the same contextual meaning with respect to the context Hol(γ). On this basis, one can define the contextual meaning of a subformula β of γ, with respect to the context Hol(γ): Hol_γ(β) := Hol_γ(β_ij), where β_ij is any occurrence of β at a node of STree_γ. Notice that formulas may receive different contextual meanings in different contexts! Given a formula γ, we call the partial function Hol_γ (which assigns meanings to the subformulas of γ) a contextual holistic model of the language. In this framework, compositional models can be described as special cases of holistic models.

Definition 9. Compositional model
A model Hol is called compositional iff the following condition is satisfied for any formula α: Hol_At(α) = Hol(q1) ⊗ ... ⊗ Hol(qt), where q1, ..., qt are the atomic formulas occurring in α.

Unlike holistic models, compositional models are context-independent.


Definition 10. Consequence in a given contextual model Hol_γ
A formula β is a consequence of a formula α in a given contextual model Hol_γ (α |=_{Hol_γ} β) iff: 1. α and β are subformulas of γ; 2. Hol_γ(α) ≼ Hol_γ(β) (where ≼ is the preorder relation defined in Definition 6).

Definition 11. Logical consequence (in the holistic semantics)
A formula β is a consequence of a formula α (in the holistic semantics) iff for any formula γ such that α and β are subformulas of γ and for any Hol, α |=_{Hol_γ} β.

We call HQCL the logic that is semantically characterized by the logical consequence relation just defined. At the same time, by compositional quantum computational logic (CQCL) we mean the logic that is semantically characterized by the class of all compositional quantum computational models. Although the basic ideas of the holistic and of the compositional quantum computational semantics are quite different, one can prove that HQCL and CQCL are the same logic [4]. We call this logic quantum computational logic (QCL). One is dealing with a nonstandard form of unsharp quantum logic, where the noncontradiction principle breaks down (⊭_QCL ¬(α ∧ ¬α)), while conjunction is not idempotent (α ⊭_QCL α ∧ α). Interestingly enough, distributivity is here violated "in the wrong direction" with respect to orthodox quantum logic: α ∧ (β ∨ γ) |=_QCL (α ∧ β) ∨ (α ∧ γ), but not the other way around! The axiomatization of QCL is an open problem.
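The failure of idempotence can be checked concretely in the compositional semantics. The pure-Python sketch below (ours, illustrative; pure states only, with the Born-rule probability of Definition 2) interprets an atomic q as the uncertain qubit (|0⟩ + |1⟩)/√2 and computes q ∧ q through the Petri-Toffoli gate: p(q ∧ q) = 1/4 < 1/2 = p(q), so the preorder condition for q |= q ∧ q fails.

```python
# Illustrative sketch: non-idempotence of conjunction via the Petri-Toffoli gate.
from math import sqrt

def kron(u, v):
    return [a * b for a in u for b in v]

def toffoli(state):                       # T(1,1,1): |x,y,z> -> |x,y, x*y XOR z>
    out = state[:]
    for idx in range(8):                  # the map on basis indices is an involution
        x, y, z = idx >> 2 & 1, idx >> 1 & 1, idx & 1
        out[idx] = state[(x << 2) | (y << 1) | ((x & y) ^ z)]
    return out

def p_true(state):                        # Born rule: weight of "last bit = 1"
    return sum(abs(a) ** 2 for i, a in enumerate(state) if i & 1)

q = [1 / sqrt(2), 1 / sqrt(2)]            # p(q) = 1/2
zero = [1.0, 0.0]
and_qq = toffoli(kron(kron(q, q), zero))  # q AND q := T(q (x) q (x) |0>)
assert abs(p_true(and_qq) - 0.25) < 1e-12
assert p_true(and_qq) < 0.5               # so q is not below q AND q in the preorder
```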

4 An Abstract Quantum Computational Logic

The concrete quantum computational semantics can be naturally generalized to an abstract semantics that preserves some basic features of the quantum computational logical approach (like reversibility and partial intensionality). We consider here the compositional version of such a semantics, based on the notion of an abstract quregister structure. In this framework, abstract quregisters are identified with some special objects (not necessarily living in a Hilbert space), while gates are reversible functions that transform quregisters into quregisters. In order to stress the relation between abstract and concrete quregister structures, we use the familiar ket-notation also for abstract quregisters. From an intuitive point of view, abstract quregisters represent pieces of information that are generally uncertain, while (abstract) registers are special examples of quregisters that store a certain information. Any (abstract) quregister is associated with a given length n, and lives in a subdomain Q(n) of the domain Q of all possible quregisters. The preorder relation ≼ is here primitive and has the following intuitive interpretation: |ψ⟩ ≼ |ϕ⟩ iff the information encoded by |ϕ⟩ is "closer to the truth" than the information encoded by |ψ⟩. Another primitive relation, called quconsistency, permits us to define an abstract notion of superposition.

Definition 12. Abstract quregister structure
An abstract quregister structure is a system

A = ⟨Q, ♣, ≼, Not, √Not, T, |0⟩, |1⟩⟩,

where the following conditions hold:
1. Q is the set of all abstract quregisters (briefly, quregisters), indicated by |ψ⟩, |ϕ⟩, ...
(a) Q = ⋃_{n≥1} Q(n), where Q(n) is the set of all quregisters of length n, indicated by |ψ⟩(n), |ϕ⟩(n), ...

(b) The cartesian product Q(m) × Q(n) is embeddable into Q(m+n). We indicate by |ψ⟩(m) ⊗ |ϕ⟩(n) the element of Q(m+n) that corresponds to the pair (|ψ⟩(m), |ϕ⟩(n)).
2. For any n ≥ 1, R(n) is the set of all registers of length n. The elements of R(n) are represented as sequences |x1, ..., xn⟩, where xi ∈ {0, 1}. The set R(1) = {|0⟩, |1⟩} is called the set of the two abstract bits.
(a) R(n) ⊆ Q(n);
(b) R(m+n) is in one-to-one correspondence with the cartesian product R(m) × R(n). We indicate by |x1, ..., xm, y1, ..., yn⟩ the register in R(m+n) that corresponds to the pair (|x1, ..., xm⟩, |y1, ..., yn⟩).
3. ♣ is a map that associates to any n ≥ 1 a binary reflexive and symmetric relation ♣n (called quconsistency) that may hold between quregisters of length n.
(a) |x1, ..., xn⟩ ♣n |y1, ..., yn⟩ implies |x1, ..., xn⟩ = |y1, ..., yn⟩;
(b) any quregister of length n is quconsistent with at least one register of length n.
Let Reg(|ψ⟩(n)) = {|x1, ..., xn⟩ : |x1, ..., xn⟩ ♣n |ψ⟩(n)}.

4.

5.

6.

7.

8.

9.

(n)

We say that |ψi is a superposition of the elements of Reg(|ψi ). ¹ is a preorder relation on Q. This permits one to define the following equivalence relation: |ψi ≈ |ϕi := |ψi ¹ |ϕi and |ϕi ¹ |ψi . (a) |0i ¹ |1i; (m) (n) (n) (b) |ψi ⊗ |ϕi ≈ |ϕi ; (m) (n) (p) (m) (n) (p) (c) |ψi ⊗ (|ϕi ⊗ |χi ) ≈ (|ψi ⊗ |ϕi ) ⊗ |χi √ Not, Not, T are maps that assume as values abstract logical gates (briefly, gates). By gate on Q(n) we mean a map G(n) that satisfies the following conditions: (a) G(n) is a injection of Q(n) into Q(n) ; (b) ≈ is a congruence with respect to G(n) . Condition (a) guarantees that gates are reversible logical operations. For any n ≥ 1, Not associates to n the gate Not(n) (defined on Q(n) ) that satisfies the following conditions: (a) Not(n) (|x1 , . . . , xn i) ≈ |x1 , . . . , xn−1 , 1 − xn i; (n) (n) (b) Not(n) (Not(n) (|ψi )) ≈ |ψi . √ √ (n) For any n ≥ 1, Not associates to n the gate Not (defined on Q(n) ) that satisfies the following conditions: √ √ (n) (1) (a) Not (|x1 , . . . , xn i) ≈ |x1 , . . . , xn−1 i ⊗ Not (|xn i); √ (n) √ (n) (n) (n) (b) Not ( Not (|ψi )) ≈ Not(n) (|ψi ); √ (n) (c) Not (|x1 , . . . , xn i)♣n |x1 , . . . , xn i; √ (n) Not (|x1 , . . . , xn i)♣n |x1 , . . . , 1 − xn i. For any m, n ≥ 1, T associates to the triplet (m, n, 1) the gate T(m,n,1) , defined on Q(m+n+1) . We put: (m) (n) (m) (n) And(|ψi , |ϕi ) := T(m,n,1) (|ψi ⊗ |ϕi ⊗ |0i); (m)

Or(|ψi

(n)

, |ϕi

) := Not(And(Not(|ψi

(m)

, Not(|ϕi

(n)

))).

The following conditions hold: (a) T(m,n,1) (|x1 , . . . , xm , y1 , . . . , yn , zi ≈ |x1 , . . . , xm , y1 , . . . , yn , xm · yn ¢ zi, where ¢ is the sum modulo 2. (b) And(|ψi , |ϕi) ≈ And(|ϕi , |ψi) (commutativity). (c) And {|ψi , And(|ϕi , |χi)} ≈ And {And(|ψi , |ϕi), |χi} (associativity). (d) And {|ψi , Or(|ϕi , |χi)} ¹ Or {And(|ψi , |ϕi), And(|ψi , |χi)} (semidistributivity). ¯ √ √ ® (1) (m+n+1) (m,n,1) ¯ (m) ® ⊗ ¯ϕ(n) ⊗ |0i)) (e) Not (|1i) ¹ Not (T (¯ψ √ (1) ¹ Not (|0i).

Concrete and Abstract Quantum Computational Logics
Maria Luisa Dalla Chiara, Roberto Giuntini, and Roberto Leporini

The notion of abstract quregister structure represents a "good" abstraction from Hilbert-space quregisters. Consider the concrete structure ⟨Q, ♣, ⪯, Not, √Not, T, |0⟩, |1⟩⟩, where:
– Q = ⋃_{n≥1} Q(⊗^n C²) is the set of all concrete quregisters;
– ♣^n is defined as follows:
  1. for any |ψ⟩^(n) = Σ_i c_i |x_{i1}, ..., x_{in}⟩: |ψ⟩^(n) ♣^n |x1, ..., xn⟩ iff for some c_i ≠ 0, |x1, ..., xn⟩ = |x_{i1}, ..., x_{in}⟩;
  2. |ψ⟩^(n) ♣^n |ϕ⟩^(n) iff there exists a register |x1, ..., xn⟩ such that |ψ⟩^(n) ♣^n |x1, ..., xn⟩ and |ϕ⟩^(n) ♣^n |x1, ..., xn⟩;
– the relation ⪯, the gates Not, √Not, T and the two bits |0⟩, |1⟩ are defined according to the definitions given in Sect. 1.
This structure satisfies our definition of abstract quregister structure.

In order to characterize semantically abstract quantum computational logic (AQCL), we first extend the minimal quantum computational language L to a richer language LA, that contains a primitive true sentence t and a new binary connective ] (called concatenation), whose intended interpretation is the abstract tensor product ⊗. Since ] does not correspond to a gate, the connectives ¬, √¬ and ⋀ will be also called, in this framework, gate-connectives.

Definition 13. A formula α of LA is called
– a bit-formula iff either α = f or α = t;
– a gate-formula iff α does not contain ];
– a register-formula iff α = β1] ... ]βn, where β1, ..., βn are bit-formulas⁴;
– a classical formula iff α does not contain √¬ and all the atomic formulas of α are bit-formulas;
– a strictly classical formula iff α is a classical formula that does not contain ].

We will use b, c, ... as metavariables for bit-formulas, while κ, κ1, ... will represent classical formulas. Furthermore we will use the following metalinguistic "truth-functions": t⊥ = f; f⊥ = t; t ⊓ t = t; t ⊓ f = f; f ⊓ t = f; f ⊓ f = f.

Definition 14. An abstract (compositional) quregister model is a pair M = (A, Qur), where A is an abstract quregister structure, while Qur is a map that associates to any formula α of atomic complexity n an abstract quregister Qur(α) of length n. The following conditions must be satisfied:
1. Qur(f) = |0⟩; Qur(t) = |1⟩;
2. Qur(¬β) = Not^(n)(Qur(β)), if At(β) = n;
3. Qur(√¬β) = √Not^(n)(Qur(β)), if At(β) = n;
4. Qur(⋀(β, γ, f)) = T^(m,n,1)(Qur(β), Qur(γ), Qur(f)), if At(β) = m and At(γ) = n.

Definition 15. Consequence and logical consequence
– β is a consequence of α in a model (A, Qur) (abbreviated as α |=_Qur β) iff Qur(α) ⪯ Qur(β).
– β is a logical consequence of α in the logic AQCL (abbreviated as α |=_AQCL β) iff for any model (A, Qur), α |=_Qur β.

⁴ More precisely, we conventionally assume that b1] ... ]bn is an abbreviation for ((... (b1]b2)] ... )]bn) (which is the canonical form of a register-formula).
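In the concrete Hilbert-space case, the key gate equations of Definition 12 can be verified numerically. The following NumPy sketch is our own illustration, not part of the paper; it assumes the standard matrices √Not = ½[[1+i, 1−i], [1−i, 1+i]] and the Toffoli gate for T, taken to match the definitions of Sect. 1. It checks that √Not composed with itself is Not on C², and that And(|x⟩, |y⟩) = T(|x⟩ ⊗ |y⟩ ⊗ |0⟩) stores x·y in the last bit.

```python
# Sketch (assumed standard gate matrices): concrete gates on C^2 and C^8.
import numpy as np

ket0, ket1 = np.array([1., 0.]), np.array([0., 1.])
NOT = np.array([[0, 1], [1, 0]], dtype=complex)
SQRT_NOT = 0.5 * np.array([[1 + 1j, 1 - 1j], [1 - 1j, 1 + 1j]])

# sqrt(Not) composed with itself behaves as Not (Definition 12, condition 7(b))
assert np.allclose(SQRT_NOT @ SQRT_NOT, NOT)

# Toffoli gate on C^8: |x, y, z> -> |x, y, (x*y) xor z>
T = np.eye(8)
T[[6, 7]] = T[[7, 6]]          # swap |110> and |111>

def ket(*bits):
    """Basis register |b1, ..., bn> as a tensor product of qubit kets."""
    v = np.array([1.])
    for b in bits:
        v = np.kron(v, ket1 if b else ket0)
    return v

for x in (0, 1):
    for y in (0, 1):
        # And(|x>, |y>) := T(|x> tensor |y> tensor |0>) ends in the bit x*y
        assert np.allclose(T @ ket(x, y, 0), ket(x, y, x & y))
```

The same style of check extends to commutativity and associativity of And on registers, since on basis registers the Toffoli gate acts classically.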


The logic AQCL is axiomatizable. We present a calculus that is sound and complete with respect to the abstract quantum computational semantics.

The rules of the AQCL-calculus. Let α ≡ β be an abbreviation for α ⊢ β and β ⊢ α.

R0. κ1 ⊢ κ2, for any strictly classical κ1 and κ2 such that κ2 is a classical logical consequence of κ1.
R1. f ⊢ κ; κ ⊢ t, for any classical κ.
R2. α ⊢ α.
R3. From α ⊢ β and β ⊢ γ, infer α ⊢ γ.
R4. α](β]γ) ≡ (α]β)]γ.
R5. α]β ≡ β.
R6. From α ≡ β, infer ¬α ≡ ¬β.
R7. ¬(b1] ... ]bn) ≡ b1] ... ]b_{n−1}]bn^⊥.
R8. ¬¬α ≡ α.
R9. From α ≡ β, infer √¬α ≡ √¬β.
R10. √¬(b1] ... ]bn) ≡ b1] ... ]b_{n−1}]√¬bn.
R11. √¬√¬α ≡ ¬α.
R12. ¬√¬α ≡ √¬¬α.
R13. From α1^(m) ≡ α2^(m) and β1^(n) ≡ β2^(n), infer α1^(m) ∧ β1^(n) ≡ α2^(m) ∧ β2^(n).
R14. ⋀(b1] ... ]bm, c1] ... ]cn, f) ≡ b1] ... ]bm]c1] ... ]cn](bm ⊓ cn).
R15. α ∧ β ≡ β ∧ α; α ∨ β ≡ β ∨ α.
R16. (α ∧ β) ∧ γ ≡ α ∧ (β ∧ γ); (α ∨ β) ∨ γ ≡ α ∨ (β ∨ γ).
R17. α ∧ (β ∨ γ) ⊢ (α ∧ β) ∨ (α ∧ γ); (α ∨ β) ∧ (α ∨ γ) ⊢ α ∨ (β ∧ γ).
R18. √¬t ⊢ √¬(α ∧ β).
R19. √¬(α ∧ β) ⊢ √¬f.

The derivability relation for AQCL (abbreviated as α ⊢_AQCL β) is defined in the expected way.

Theorem 1. α ⊢_AQCL β iff α |=_AQCL β.

The completeness proof is based on the definition of a canonical model, where the set of all quregisters of length n is identified with the set of all formulas of atomic complexity n, while registers are represented by register-formulas. A crucial point in the proof is a syntactical definition of the quconsistency relation.



Pseudorandom Number Generation Based on 90/150 Linear Nongroup Cellular Automata*,**

Sung-Jin Cho1, Un-Sook Choi2, Han-Doo Kim3, Yoon-Hee Hwang4, and Jin-Gyoung Kim5

1 Division of Mathematical Sciences, Pukyong National University, Busan 608-737, Korea, [email protected]
2 Department of Multimedia Engineering, Tongmyong University, Busan 626-847, Korea, [email protected]
3 Institute of Mathematical Sciences and School of Computer Aided Science, Inje University, Gimhae 621-749, Korea, [email protected]
4 Department of Information Security, Graduate School, Pukyong National University, Busan 608-737, Korea, [email protected]
5 Department of Applied Mathematics, Pukyong National University, Busan 608-737, Korea, [email protected]

Abstract. In this paper, we propose a new pseudorandom sequence generation algorithm, together with a more efficient 90/150 LNCA synthesis method than the method which generates sequences constructed by a quadratic function. We also give an algorithm for generating a maximal period pseudorandom sequence by synthesizing 90/150 NCA repeatedly with complement vectors.

1 Introduction

Information security is of importance, and several cryptographic techniques have been developed to perform the required security services. Among the cryptographic techniques which we can use are those based on stream ciphers. Pseudorandom sequences are the basis of these cryptosystems ([1], [2], [3], [4]). Quadratic functions are of great importance in cryptography. They have been studied and employed in different finite domains ([5], [6]). The orbits of these functions in Z_pq are used in [5], and the orbits of quadratic functions defined in GF(2^n) are used in [7]. In particular, the authors of [7] described the equivalence between the iteration of quadratic functions and Cellular Automata (CA) behavior ([8], [9]), allowing them to study the sequences produced by quadratic functions as an additive CA. Thus they presented the characterization of sequences of maximal length and their randomness. They also described an algorithm to produce pseudorandom sequences based on the orbits of quadratic functions, and the results of applying the algorithm to 90/150 Linear Nongroup CA (90/150 LNCA). But there are some problems in their algorithm for producing pseudorandom sequences. They must check whether p(x) = x(x + 1)Q(x) + 1 is irreducible or not, where Q(x) ∈ F2[x] is a primitive polynomial of degree n − 2. But such a p(x) does not exist in many cases. Even if the polynomial p(x) = x(x + 1)Q(x) + 1 is irreducible for some primitive polynomial Q(x) of degree n − 2, in many cases x(x + 1)Q(x) is not a CA-polynomial. This means that there is no 90/150 LNCA corresponding to the polynomial x(x + 1)Q(x). Moreover, they did not give an algorithm which produces the 90/150 LNCA for the CA-polynomial. In this paper, we propose a new pseudorandom sequence generation algorithm, and a more efficient 90/150 LNCA synthesis method than the method which generates sequences constructed by f(x) = x² + bx + c.
Also we give an algorithm for generating a maximal period pseudorandom sequence by synthesizing 90/150 nongroup CA (NCA) repeatedly with complement vectors.

* This work was supported by grant No. (R01-2006-000-10260-0) from the Basic Research Program of the Korea Science and Engineering Foundation.
** Corresponding author

2 CA Preliminaries

A CA consists of a number of interconnected cells arranged spatially in a regular manner [10], where the state-transition of each cell depends on the states of its neighbors. If C is a linear CA whose rule vector is <d1, d2, ..., dn>, then the state-transition matrix of C is the 90/150 tridiagonal matrix

    | d1  1   0   ···  0        0  |
    | 1   d2  1   ···  0        0  |
T = | 0   1   d3  ···  0        0  |
    | ···                          |
    | 0   0   0   ···  d_{n−1}  1  |
    | 0   0   0   ···  1        dn |

where di = 0 (resp. 1) if cell i uses rule 90 (resp. 150). Hereafter we write T as T = <d1, d2, ..., dn>, where di ∈ {0, 1}. The characteristic polynomial c(x) of T is defined by c(x) = |T ⊕ xI|, where x is an indeterminate and I is the n × n identity matrix. A polynomial is said to be a CA-polynomial if it is the characteristic polynomial of some CA [11]. In fact all irreducible polynomials are CA-polynomials [11].

Definition 2.1. ([12], [13], [14]).
i) Nongroup CA: A CA C is called a nongroup CA if the determinant of the state-transition matrix of C is 0.
ii) Attractor: A state having a self-loop is referred to as an attractor. An attractor can be viewed as a cyclic state with unit cycle length.
iii) Depth: The maximum number of state-transitions required to reach the nearest cyclic state from any non-reachable state in the CA state-transition diagram is defined as the depth of the NCA.

Definition 2.2. ([12]). Let T̄^p denote p times application of the complemented CA operator T̄. Then

T̄^p X = T^p X ⊕ (I ⊕ T ⊕ T² ⊕ ··· ⊕ T^(p−1))F,

where T is the state-transition matrix of the corresponding noncomplemented rule vector and X is an n-dimensional vector (n is the number of cells) responsible for inversion after XNORing. F has '1' entries (i.e., nonzero entries) for the CA cell positions where the XNOR function is employed, and X is the current state assignment of the cells.
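The identity of Definition 2.2 can be verified by direct simulation over GF(2). The following sketch is our own illustration, not code from the paper; the rule vector, seed state and complement vector are arbitrary choices.

```python
# Sketch: one 90/150 CA step X' = T X over GF(2), and a check of
# Definition 2.2:  T-bar^p X = T^p X  xor  (I + T + ... + T^(p-1)) F.

def step(rule, x):
    """One 90/150 CA step: cell i gets left + d_i*self + right (mod 2)."""
    n = len(rule)
    return [(x[i - 1] if i > 0 else 0) ^ (rule[i] & x[i]) ^ (x[i + 1] if i < n - 1 else 0)
            for i in range(n)]

def complemented_step(rule, x, f):
    """One complemented-CA step: T X xor F."""
    return [a ^ b for a, b in zip(step(rule, x), f)]

rule = [1, 1, 0, 0]          # T = <1,1,0,0>: rules 150, 150, 90, 90 (arbitrary)
x0   = [0, 1, 1, 0]          # arbitrary current state
f    = [0, 0, 1, 0]          # arbitrary complement vector F
p    = 5

# Left-hand side: iterate the complemented CA p times.
lhs = x0
for _ in range(p):
    lhs = complemented_step(rule, lhs, f)

# Right-hand side: T^p X  xor  (I + T + ... + T^(p-1)) F.
tp_x = x0
for _ in range(p):
    tp_x = step(rule, tp_x)
acc, tf = [0] * len(f), f[:]          # acc accumulates sum of T^k F, k < p
for _ in range(p):
    acc = [a ^ b for a, b in zip(acc, tf)]
    tf = step(rule, tf)
rhs = [a ^ b for a, b in zip(tp_x, acc)]

assert lhs == rhs
```

The identity holds for any rule vector, seed and complement vector, as a short induction on p shows.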

3 Algorithm for Finding 90/150 LNCA

In this section, we introduce an efficient method for finding 90/150 LNCA.

Theorem 3.1. ([11]). Let T = <d1, d2, ..., dn> and let C be the companion matrix of the characteristic polynomial c(x) = x^n + c_{n−1}x^(n−1) + c_{n−2}x^(n−2) + ··· + c1x + c0 of T, where ci ∈ GF(2). Let U be the upper triangular matrix satisfying TU = UC:

    | 0  0  0  ···  0  c0      |        | 1  a1  *   ···  *  *       |
    | 1  0  0  ···  0  c1      |        | 0  1   a2  ···  *  *       |
C = | 0  1  0  ···  0  c2      |,   U = | 0  0   1   ···  *  *       |
    | ···                      |        | ···                        |
    | 0  0  0  ···  1  c_{n−1} |        | 0  0   0   ···  1  a_{n−1} |
                                        | 0  0   0   ···  0  1       |

Then we obtain the following equations:

d1 = a1
d2 = a1 ⊕ a2
d3 = a2 ⊕ a3
  ···
d_{n−1} = a_{n−2} ⊕ a_{n−1}
dn = a_{n−1} ⊕ c_{n−1}                                  (3.1)

Definition 3.2. ([15]). For a given n-dimensional vector x and an n × n matrix M, let K(M, x) = (x; Mx; M²x; ··· ; M^(n−1)x). We call K(M, x) the Krylov matrix, and x is called a seed vector. K(C^t, y) is always a Hankel matrix [15], where C^t is the transpose of C.

Theorem 3.3. ([16]). The Hankel matrix has an LU factorization if and only if the following equations hold:

h1 = 1
h_i + h_{2i} + h_{2i+1} = 0   (i = 1, 2, ..., n − 1)     (3.2)

We need a theorem to reduce (3.2) to a system of linear equations.

Theorem 3.4. ([11]). Let A be the n × n matrix obtained by reducing the n polynomials

x^(i−1) + x^(2i−1) + x^(2i),   i = 1, 2, ..., n          (3.3)

modulo c(x), where c(x) is a polynomial. Then the members of the set {v | Av = (0, ..., 0, 1)^t} satisfy equation (3.2), where v = (h1, h2, ..., hn)^t.

The following algorithm finds the 90/150 LNCA for a given CA-polynomial. It is based on the results in this section.

Algorithm SynthesisOfLNCA
Input: CA-polynomial c(x)
Output: 90/150 LNCA
Step 1: Make the matrix A from (3.3).
Step 2: Solve the equation Av = (0, ..., 0, 1)^t.
Step 3: Construct a Krylov matrix H = K(C^t, v) using the seed vector v obtained in Step 2.
Step 4: Compute the LU factorization H = LU.
Step 5: Compute the CA for c(x) from the matrix U using (3.1).

Example 3.5. For the CA-polynomial c(x) = x(x + 1)(x² + x + 1) = x⁴ + x, we obtain the 4 × 4 matrix A below. Solving the equation Av = (0, 0, 0, 1)^t, we obtain a seed vector v = (1, 1, 0, 0)^t. Using v and C^t, we obtain the matrix K(C^t, v) = H = (v; C^t v; (C^t)² v; (C^t)³ v). By the LU factorization of H, we obtain the upper triangular matrix U:

    | 1 1 1 0 |        | 1 1 0 0 |
A = | 0 0 0 1 |,   U = | 0 1 0 1 |
    | 0 0 0 1 |        | 0 0 1 0 |
    | 0 1 1 1 |        | 0 0 0 1 |

Since c3 = 0, T = <1, 1 ⊕ 0, 0 ⊕ 0, 0 ⊕ 0> = <1, 1, 0, 0> by (3.1).
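The result of Example 3.5 can be cross-checked against the definition c(x) = |T ⊕ xI| using the standard three-term recurrence for tridiagonal determinants (a sketch of ours, not part of the paper): over GF(2), Δ0 = 1, Δ1 = x + d1, Δk = (x + dk)Δ_{k−1} + Δ_{k−2}, and c(x) = Δn.

```python
# Sketch: characteristic polynomial of a 90/150 rule vector over GF(2),
# via the three-term recurrence for tridiagonal determinants.
# Polynomials are coefficient lists; index k holds the coefficient of x^k.

def poly_add(p, q):
    m = max(len(p), len(q))
    p, q = p + [0] * (m - len(p)), q + [0] * (m - len(q))
    return [(a + b) % 2 for a, b in zip(p, q)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) % 2
    return out

def char_poly(rule):
    """Characteristic polynomial of T = <d1, ..., dn> over GF(2)."""
    prev, cur = [1], [rule[0], 1]            # Delta_0 = 1, Delta_1 = x + d1
    for d in rule[1:]:
        prev, cur = cur, poly_add(poly_mul([d, 1], cur), prev)
    return cur

# T = <1,1,0,0> should give c(x) = x^4 + x = x(x+1)(x^2+x+1).
assert char_poly([1, 1, 0, 0]) == [0, 1, 0, 0, 1]
```

Running the same check on the rule vectors of Tables 1 and 2 provides a quick validation of the synthesis output.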

4 Pseudorandom Number Generation Based on 90/150 NCA

In this section we propose a new pseudorandom sequence generation method based on 90/150 LNCA. In [7] they characterized b and c for the quadratic function f(x) = x² + bx + c by using the state-transition diagram, according to either Tr(c/(b² + 1)) = 0 or Tr(c/(b² + 1)) = 1. In this paper we characterize b and c without the trace function. Also we need the following matrix B, constructed by f(x) = x² + bx, to get the next state:

B = ((α^(n−1)(α^(n−1) + b))^t; ··· ; (α²(α² + b))^t; (α(α + b))^t; (1 + b)^t)

In this case, the rank of the n × n matrix B is n − 1.

Example 4.1. Let f(x) = x² + αx, where α is a root of p(x) = x⁴ + x + 1. Then B is the following matrix:

    | 1 1 0 0 |
B = | 1 0 0 0 |
    | 1 1 0 1 |
    | 1 1 0 1 |

The following theorem gives an explicit formula for the characteristic polynomial of B, where b = α is a primitive root defining GF(2^n).

Theorem 4.2. The characteristic polynomial of the matrix B constructed by a quadratic function f(x) = x² + αx is x(x + 1)Q(x), where α is a primitive root defining GF(2^n).

Proof. Let α be a primitive root defining GF(2^n). The (n − 1)th column of B is the zero vector and the nth column of B is (0, ..., 0, 1, 1)^t. Thus B has the block form

B = | B11  O |           | 0 1 |
    | B21  D |,  with D = | 0 1 |

Therefore the characteristic polynomial of B is the product x(x + 1)Q(x) of the characteristic polynomial Q(x) of B11 and the characteristic polynomial x(x + 1) of D.

Theorem 4.3. Let f(x) = x² + bx + c be a quadratic function defined in GF(2^n), with b, c ∈ GF(2^n). If b is a primitive root defining GF(2^n) and c ∉ N(B·Q(B)), where B is the matrix constructed by f(x) = x² + bx and N(B·Q(B)) is the null space of B·Q(B), then the state-transition diagram generated by f(x) has maximal cycles and the depth of the state-transition diagram is 1.

Example 4.4. Let f(x) = x² + αx be a quadratic function defined in GF(2⁴), where α is a primitive root of p(x) = x⁴ + x + 1. Fig. 1 shows the state-transition diagram for f(x). The null space of B·Q(B) = B³ + B² + B is [2, 5, 8], where [2, 5, 8] is the subspace generated by the states 2, 5 and 8. Fig. 2 shows the state-transition diagrams for the case c = 2 (∈ N(B·Q(B))) and for the case c = 3 (∉ N(B·Q(B))).

Fig. 1. State-transition diagram for f(x) = x² + αx
Fig. 2. State-transition diagrams for c = 2 and c = 3

If the sequences generated by these quadratic functions are to have maximal cycles, Q(x) must be a primitive polynomial. In [7], the characteristic polynomial of B is p(x) + 1, where p(x) is the irreducible polynomial defining GF(2^n). For Q(x) = x⁴ + x + 1, since x(x + 1)Q(x) + 1 := p(x) = (x² + x + 1)³ is not irreducible, it is impossible to construct B. By our algorithm we can overcome this problem, so we can construct a 90/150 LNCA with characteristic polynomial x(x + 1)Q(x). The period of the n-cell maximal length NCA with characteristic polynomial x(x + 1)Q(x) is 2^(n−2) − 1, where Q(x) ∈ F2[x] is a primitive polynomial of degree n − 2. The state-transition diagram of these NCA consists of two trees whose depths are always unity and whose cycle lengths are 1, and two trees whose depths are always unity and whose cycle lengths are 2^(n−2) − 1. Fig. 3 shows the state-transition diagram of the 90/150 NCA with the characteristic polynomial c(x) = x(x + 1)(x² + x + 1) = x⁴ + x considered in Example 3.5. Table 1 shows that there exists an n-cell 90/150 LNCA for the 90/150 CA-polynomial of the form x(x + 1)Q(x) (Q(x) is some primitive polynomial).

Table 1. 90/150 LNCA for x(x + 1)Q(x) (In this table, 9,5,4,1,0 stands for the polynomial x⁹ + x⁵ + x⁴ + x + 1.)

n   Q(x)                 CA Configuration
11  9,5,4,1,0            10000110011
12  10,7,6,5,2,1,0       001111010101
13  11,9,7,5,2,1,0       1111101110111
14  12,10,2,1,0          01000110010010
15  13,12,10,5,2,1,0     000101010001101
16  14,12,10,1,0         1100110011010011
17  15,12,9,1,0          00000111110100111
18  16,14,12,1,0         101100100110001101
19  17,13,12,1,0         0100101010011011100
20  18,17,12,10,9,1,0    00111100100000111000

Now we present three methods to extend the period of these 90/150 NCA. The first method for extending the period of 90/150 NCA is to generate the complemented CA corresponding to a 90/150 LNCA using a complement vector. The following theorems characterize the period and the structure of the state-transition diagram of the complemented NCA corresponding to a 90/150 LNCA using a complement vector.

Theorem 4.5. Let T be the state-transition matrix of the n-cell 90/150 LNCA C with the characteristic polynomial x(x + 1)Q(x). Take an element z ∈ N(T·Q(T)) as the complement vector F. Let C′ be the complemented CA corresponding to C using the complement vector z. Then the state-transition diagrams of C and C′ are isomorphic.

Theorem 4.6. Let T be the state-transition matrix of the n-cell 90/150 LNCA C with the characteristic polynomial x(x + 1)Q(x). Let the period of the state-transition diagram of C be r. Take an element z ∉ N(T·Q(T)) as the complement vector F. Let C′ be the complemented CA corresponding to C using the complement vector z. Then the period of the state-transition diagram of C′ is 2r.
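As a quick sanity check on the claimed diagram structure (our own sketch, not part of the paper), one can enumerate the functional graph of X ↦ TX for T = <1,1,0,0> over GF(2) and count cycle lengths; by the discussion above, for n = 4 they should be 1, 1, 3, 3 (two attractors and two 3-cycles of length 2^(n−2) − 1 = 3).

```python
# Sketch: cycle lengths of the linear CA T = <1,1,0,0>
# (characteristic polynomial x^4 + x) over GF(2).
from itertools import product

def make_step(rule):
    n = len(rule)
    def step(x):
        return tuple((x[i - 1] if i else 0) ^ (rule[i] & x[i]) ^ (x[i + 1] if i < n - 1 else 0)
                     for i in range(n))
    return step

def cycle_lengths(step, n):
    lengths, seen = [], set()
    for x in product((0, 1), repeat=n):
        # follow the orbit until it re-enters itself or hits a processed state
        orbit = []
        while x not in orbit and x not in seen:
            orbit.append(x)
            x = step(x)
        if x in orbit:                       # found a new cycle
            lengths.append(len(orbit) - orbit.index(x))
        seen.update(orbit)
    return sorted(lengths)

step = make_step([1, 1, 0, 0])
print(cycle_lengths(step, 4))   # expected: [1, 1, 3, 3]
```

The same enumeration applied to the rule vectors of Table 1 (for small n) reproduces the 2^(n−2) − 1 cycle lengths stated above.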


Fig. 3. State-transition diagram for c(x) = x(x + 1)(x² + x + 1)

Remark. The state-transition diagram of C′ has two trees, one with depth 1 and period 2, and another with depth 1 and period 2^(n−1) − 2.

Example 4.7. For the 90/150 LNCA considered in Example 3.5, the null space of T·Q(T) = T³ + T² + T is [4, 9, 10]. Thus, taking an element of [4, 9, 10] as F, the state-transition diagrams of C and C′ are isomorphic. Fig. 4 shows the state-transition diagrams of C′ for F = 4 = (0, 1, 0, 0)^t ∈ [4, 9, 10] and for F = 2 = (0, 0, 1, 0)^t ∉ [4, 9, 10].

Fig. 4. State-transition diagrams for F = 4 = (0, 1, 0, 0)^t and F = 2 = (0, 0, 1, 0)^t
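Theorem 4.6 can likewise be checked by enumeration (a sketch of ours, under the same simulation conventions as above, not code from the paper): with F = 2 = (0, 0, 1, 0)^t ∉ N(T·Q(T)), the complemented CA X ↦ TX ⊕ F should have cycle lengths 2 and 2·3 = 6, doubling the period r = 3 of the linear diagram.

```python
# Sketch: period doubling for the complemented CA X' = T X xor F over GF(2),
# with T = <1,1,0,0> and F = (0,0,1,0)^t not in N(T*Q(T)).
from itertools import product

RULE = [1, 1, 0, 0]
F = (0, 0, 1, 0)

def linear_step(x):
    n = len(RULE)
    return tuple((x[i - 1] if i else 0) ^ (RULE[i] & x[i]) ^ (x[i + 1] if i < n - 1 else 0)
                 for i in range(n))

def complemented_step(x):
    return tuple(a ^ b for a, b in zip(linear_step(x), F))

def cycle_lengths(step, n):
    lengths, seen = [], set()
    for x in product((0, 1), repeat=n):
        orbit = []
        while x not in orbit and x not in seen:
            orbit.append(x)
            x = step(x)
        if x in orbit:
            lengths.append(len(orbit) - orbit.index(x))
        seen.update(orbit)
    return sorted(lengths)

assert cycle_lengths(linear_step, 4) == [1, 1, 3, 3]   # period r = 3 for C
assert cycle_lengths(complemented_step, 4) == [2, 6]   # period 2r = 6 for C'
```

The resulting cycle lengths 2 and 6 also match the Remark above (2 and 2^(n−1) − 2 for n = 4).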

The second method for extending the period of NCA is to synthesize the n-cell 90/150 LNCA with the characteristic polynomial xp(x). The state-transition diagram of the maximal period 90/150 NCA has two trees, one with depth 1 and period 1, and another with depth 1 and period 2^(n−1) − 1. The following theorem characterizes a 90/150 LNCA with characteristic polynomial xp(x).

Theorem 4.8. Let g(x) = xp(x), where p(x) is a primitive polynomial of degree n − 1 of the form p(x) = x^(n−1) + a_{n−2}x^(n−2) + ··· + a2x² + 1. Then there exists a 90/150 LNCA with the characteristic polynomial g(x).

Fig. 5. State-transition diagram for c(x) = x(x³ + x² + 1)

Table 2. 90/150 LNCA for xp(x) (In this table, 28,3,0 stands for the polynomial x²⁸ + x³ + 1.)

n   p(x)             CA Configuration
28  27,23,22,17,0    1000100000101001111010111011
29  28,3,0           00000001111101101011110000000
30  29,2,0           110001101110001111011101100011
31  30,29,10,9,0     1000001001111000111100110111010
32  31,3,0           11010010011011110011011001001011

Corollary 4.9. xp(x) is not a CA-polynomial for a primitive polynomial of the form p(x) = x^(n−1) + a_{n−2}x^(n−2) + ··· + a2x² + x + 1.

Theorem 4.10. The depth of the state-transition diagram of an n-cell 90/150 LNCA C with the characteristic polynomial xp(x) is 1. And C has two trees, one 0-tree with a cyclic state as an attractor and depth 1, and another with depth 1 and cycle length 2^(n−1) − 1, where p(x) = x^(n−1) + a_{n−2}x^(n−2) + ··· + a2x² + a1x + 1 is a primitive polynomial with a1 = 0.

Example 4.11. Let c(x) = xp(x) = x(x³ + x² + 1). Then a 4-cell 90/150 LNCA C is T = <1, 1, 1, 0>. Fig. 5 shows the state-transition diagrams of C and of the complemented CA C′ corresponding to C with the complement vector F = 10 = (1, 0, 1, 0)^t. Table 2 shows that there exists an n-cell 90/150 LNCA for the 90/150 CA-polynomial of the form xp(x) (p(x) is some primitive polynomial) for each n (28 ≤ n ≤ 32).

The last method for extending the period of 90/150 NCA is to synthesize the complemented CA corresponding to an n-cell 90/150 LNCA using a complement vector. This method can generate long period sequences by synthesizing 90/150 NCA repeatedly with complement vectors. Since an n-cell 90/150 LNCA with the characteristic polynomial xp(x) is a maximal period 90/150 NCA, the complemented CA corresponding to the LNCA is also a maximal period NCA. The following algorithm obtains a long period sequence from a small size CA.

Algorithm MaximalPeriodSequence
Step 1: Synthesize an n-cell 90/150 LNCA with CA-polynomial xp(x) as the characteristic polynomial.
Step 2: Select a state in the larger component of the state-transition diagram as a seed vector v0.
Step 3: For F = 0 to 2^n − 1:
  Check (v0, F). If (v0, F) is in the (v0, F)-table, then STOP.
  Run the CA with (v0, F) for 2^(n−1) − 1 cycles.
  If T̄^(2^(n−1)−1) v0 is in the larger component
    then v0 ← T̄^(2^(n−1)−1) v0
    else v0 ← T̄^(2^(n−1)−2) v0.
Step 4: Go to Step 3.

Using this algorithm, for the 4-cell 90/150 NCA with the characteristic polynomial c(x) = x(x³ + x² + 1) and the 5-cell 90/150 NCA with the characteristic polynomial c(x) = x(x⁴ + x³ + 1), we can generate pseudorandom sequences whose periods are 217 and 942, respectively.
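The structure claimed in Theorem 4.10 can be checked by brute force for Example 4.11 (our own sketch, under the same simulation conventions as above, not code from the paper): T = <1, 1, 1, 0> has characteristic polynomial x(x³ + x² + 1), and its diagram should show one attractor, one cycle of length 2³ − 1 = 7, and every non-cyclic state at depth 1.

```python
# Sketch: cycle structure of the 4-cell 90/150 LNCA T = <1,1,1,0>,
# whose characteristic polynomial is x*p(x) = x(x^3 + x^2 + 1).
from itertools import product

RULE = [1, 1, 1, 0]

def step(x):
    n = len(RULE)
    return tuple((x[i - 1] if i else 0) ^ (RULE[i] & x[i]) ^ (x[i + 1] if i < n - 1 else 0)
                 for i in range(n))

states = list(product((0, 1), repeat=4))
image = {x: step(x) for x in states}

def period(x):
    """Cycle length of x if x is cyclic, else None."""
    y, k = image[x], 1
    while y != x and k <= len(states):
        y, k = image[y], k + 1
    return k if y == x else None

periods = sorted(p for x in states if (p := period(x)) is not None)
print(periods)   # one attractor (period 1) and seven states on a 7-cycle

# depth 1: every non-cyclic state reaches a cyclic state in one step
assert all(period(image[x]) is not None for x in states)
```

The larger component (the tree hanging off the 7-cycle) is the one from which Algorithm MaximalPeriodSequence draws its seed vector v0.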

5 Conclusion

In this paper, we proposed a new pseudorandom sequence generation algorithm, together with a more efficient 90/150 LNCA synthesis method than the method which generates sequences constructed by f(x) = x² + bx + c. We also gave an algorithm for generating a maximal period pseudorandom sequence by synthesizing 90/150 NCA repeatedly with complement vectors.

References
1. D. de la Guia and A. Fuster-Sabater, Cryptographic design based on cellular automata, IEEE International Symposium on Information Theory, (1997) 180.
2. S.J. Cho, U.S. Choi, Y.H. Hwang, Y.S. Pyo, H.D. Kim, K.S. Kim and S.H. Heo, Computing phase shifts of maximum-length 90/150 cellular automata sequences, in Proc. ACRI 2004, LNCS 3305 (2004) 31-39.
3. S. Nandi, B.K. Kar and P.P. Chaudhuri, Theory and application of cellular automata in cryptography, IEEE Trans. Computers, 43 (1994) 1346-1357.
4. A.K. Das and P.P. Chaudhuri, Vector space theoretic analysis of additive cellular automata and its application for pseudo-exhaustive test pattern generation, IEEE Trans. Comput., 42 (1993) 340-352.
5. L. Blum, M. Blum and S. Shub, A simple unpredictable pseudo-random number generator, SIAM Journal on Computing, 15 (1986) 364-383.
6. L. Blum and S. Goldwasser, An efficient probabilistic public key encryption scheme which hides all partial information, Adv. in Cryptology-CRYPTO'84, LNCS 196, Springer-Verlag, (1985) 289-299.
7. D. de la Guia Martinez and A. Peinado Dominguez, Pseudorandom number generation based on nongroup cellular automata, Proc. IEEE 33rd Annual 1999 International Carnahan Conference on Security Technology, (1999) 370-376.
8. S.J. Cho, U.S. Choi, Y.H. Hwang and H.D. Kim, Analysis of hybrid group cellular automata, ACRI 2006, LNCS 4173 (2006) 222-231.
9. S.J. Cho, U.S. Choi, Y.H. Hwang, H.D. Kim and H.H. Choi, Behaviors of single attractor cellular automata over Galois field GF(2^p), ACRI 2006, LNCS 4173 (2006) 232-237.
10. S. Wolfram, Statistical mechanics of cellular automata, Rev. Mod. Phys., 55 (1983) 601-644.
11. S.J. Cho, U.S. Choi, H.D. Kim, Y.H. Hwang, J.G. Kim and S.H. Heo, New synthesis of one-dimensional 90/150 linear hybrid group cellular automata, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., accepted.
12. P.P. Chaudhuri, D.R. Chowdhury, S. Nandi and C. Chattopadhyay, Additive Cellular Automata Theory and Applications, Vol. 1, IEEE Computer Society Press, California (1997).
13. S.J. Cho, U.S. Choi and H.D. Kim, Analysis of complemented CA derived from a linear TPMACA, Computers Math. Applic., 45 (2003) 689-698.
14. S.J. Cho, U.S. Choi and H.D. Kim, Behavior of complemented CA whose complement vector is acyclic in a linear TPMACA, Math. Comput. Modelling, 36 (2002) 979-986.
15. R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press (1985).
16. C.R. Johnson, D.D. Olesky and P. van den Driessche, Inherited matrix entries: LU factorizations, SIAM J. Matrix Analysis and Applications, 10 (1989) 94-104.

Towards a Domain-theoretic Model of Developmental Machines

Graçaliz Pereira Dimuro and Antônio Carlos da Rocha Costa

Escola de Informática – PPGINF, Universidade Católica de Pelotas, Rua Felix da Cunha 412, 96010-000 Pelotas, Brazil, {liz,rocha}@ucpel.tche.br

Abstract. This paper reports ongoing work about a domain-theoretic formalization of a notion of developmental computing machines. It builds on some operations defined on so-called bi-structured domains, to set the stage for a view of machine development as a self-regulated process of machine construction. The transformations defined along the two structures of the bi-structured domain of development stages, namely the transformations of the computing machines’ operational and constructional structures, are sketched in the paper, and shown to support an appropriate intuitive notion of machine development.

1 Introduction

Living beings, social organizations, and complex functional artifacts (e.g., complex computing systems) are similar in the difficulty of articulating adequate notions for their conceptual explication, and in the corresponding difficulty of getting their formal description by mathematical means. We take the viewpoint that two notions that cannot be dismissed when considering real kinds of such systems are the notions of development and evolution. Development is usually thought of as concerning modifications in individuals, leading them from states of lesser individual structural, functional and behavioral features to states of greater individual structural, functional and behavioral features, while evolution is thought of as concerning modifications in populations of individuals, similarly leading such populations from states of lesser collective structural, functional and behavioral features to states of greater collective structural, functional and behavioral features. Clearly, development and evolution should be seen as inter-related aspects of complex systems, so that the evolution of a population impacts the development of its individuals, and the development of individuals impacts the evolution of their population.

We aim at applying Jean Piaget's analysis of the development and evolution of complex systems to the development and evolution of computing machines¹, that is, we aim at connecting the concepts of machine development and machine evolution with concepts such as autonomy, adaptation and operatory equilibration. Such concepts refer to processes and structures that are still widely unexplored in research areas such as Artificial Intelligence, General System Theories and Cybernetics, so our work should be seen as placed in an epistemological space where a wide open theoretical scope still exists for any attempt at introducing formal models for such ideas.

We note that we are mainly concerned with the formal explication of the notion of machine development, the explication of machine evolution being left for future work. Also, we note that the machines with which we are concerned are computing machines, that is, machines whose main structural, functional and behavioral features refer to the processing of symbolic information, although we stress that we conceive them in tight interaction with environments characterized by both informational and material (i.e., physical) features.

We also note that we do not aim, in the present paper, to present any foundational advances besides: (1) summarizing our view that the concepts elaborated by Piaget for the explanation of the notion of (biological and psychological) development can support the establishment of a sensible notion of development for computing machines; and (2) showing that such a notion of machine development can be given a suitable expression in the language of Domain Theory [3–5].

The paper builds on our various previous results. The notion of machine development was introduced in [6], which explained it intuitively in terms of self-regulated construction processes happening in domain-like structures. In [7], we stressed the conceptual connection of such a notion of machine development with the theoretical framework of Interactive Computations [8]. The formal domain-theoretic structures that we use here to explicate our notion of machine development are the so-called bi-structured domains, introduced in [9–11]. The paper formally defines a bi-structured domain of computing machines, and uses such a domain to give a formalization for the notion of machine development in a way that is in line with the Piagetian approach mentioned above.

¹ See [1] for an account of Piaget's approach to general biological processes and structures, including an account of the notions of biological development and evolution, and [2] for an account of his approach to the psychological processes and structures involved in the general problem of cognitive development.

2 Development and Equilibration

Following Piaget [1], we consider development to be a process that is indispensable in complex systems, allowing them to progressively increase the quality of their structural, functional and behavioral features. As a consequence, development allows complex systems to progressively improve their adaptation to the environment, that is, to progressively increase the equilibrium of their exchanges with the other systems that operate in the environment. Accordingly, one says that development happens through a sequence of equilibration operations. Two main kinds of equilibration operations can be defined [2]. The first, called minor equilibration, contributes to development by increasing the quality of the system's behavioral, organizational and functional (BOF) features without altering the so-called developmental level of the system: it expands the scope of application of each of the operations present in the system's structure, in an incremental way, without altering the properties of that structure as such. The second kind, called major equilibration, contributes to development by modifying the system's structure substantially, expanding the set of operations and relations responsible for the system's BOF features and leading the system to a new developmental level. We propose that minor equilibrations be modelled domain-theoretically as chains of stages of machine development within domains of stages of machine development, which represent the development levels of developmental machines. Major equilibrations are to be modelled domain-theoretically through chains of domains within the ordered category of domains of stages of machine development, where the order models the relative expansion of a machine's structure. Formally, we operate with domains that are isomorphic to coherence spaces [5, 12, 13], which are families of coherent sets ordered by the inclusion relation (see Def. 1).
We model stages of machine development as such coherent sets, so that the processes of machine development produced by minor equilibrations are simply chains of coherent sets in those spaces. Major equilibrations, on the other hand, may be modelled by the construction process introduced in [9–11], where: (1) a powerset operation is applied to an initial coherence space B, obtaining a new coherence space ℘(B), which ensures that B is embedded in it; (2) restriction operations are performed, in order to guarantee that a subsystem ℘(B)′ of ℘(B) is well behaved with respect to given criteria.

116

G.P. Dimuro and A.C.R. Costa

Figure 1 gives an intuitive illustration of the first few major equilibrations of the domain B. The chain of restricted domains obtained as the result of a major equilibration process is the domain-theoretic support for the whole development process of the computing machine being modelled by that structure. Each of the domains in that support is a level of development, achieved by a major equilibration operation, within which minor equilibration operations occur.

[Figure 1: the powerset operation takes the domain B to ℘(B), and a restriction operation then cuts ℘(B) down to ℘(B)′.]

Fig. 1. The first few major equilibrations of the domain B.

Considered abstractly, chains of objects defined throughout a sequence of progressively equilibrated domains seem to constitute a good tentative model for an abstract notion of development process. Applied to the development of machines, as will be seen below, they are the best candidates we currently have for the formal model of developmental machines that we are looking for. We summarize our model of developmental machines as follows: (1) operations of minor equilibration take stages of development of the machine through chains of increasingly complete BOF features; (2) having achieved a development stage with totally defined BOF features, an operation of major equilibration may be applied to the developmental machine, by reflecting it into an upper powerset-like domain structure, which is constructed from the domain that represented the previous development level, and which allows further BOF development at the new level; (3) a chain of such increasingly complex stages of development allows for a development process (a sequence of minor and major equilibrations) that is able to give the developmental machine a set of BOF features that is qualitatively different from the one with which the machine began to operate.

3 Coherence Spaces and the Minor Equilibration Process

3.1 Stages of Machine Development and Coherence Spaces

We take the (very) simplifying view that a computing machine, at any stage of any level of its development, can be seen as a dynamical system m = (S, δ), where S is a set of states and δ ⊆ S × S is a (partial) transition relation defined on S. Let Δ_S denote the set of all transition relations. The development of a computing machine, as given by a succession of minor equilibration operations, can be modelled as a chain in the set M = {S} × Δ_S of stages of machine development, partially ordered by a development relation ⊑ ⊆ M × M, defined as:

∀m₁ = (S, δ₁), m₂ = (S, δ₂) ∈ M : m₁ ⊑ m₂ ⇔ δ₁ ⊆ δ₂.   (1)
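As a toy illustration (ours, not part of the paper's formal development), Eq. (1) can be rendered directly with finite sets: stages over a fixed state set S are transition relations ordered by inclusion, and a pairwise-coherence check singles out the deterministic stages discussed in Sect. 3.1. All function names below are ours.

```python
# Illustrative sketch (not from the paper itself): stages of machine
# development over a fixed state set S, modelled as transition relations
# ordered by inclusion, as in Eq. (1).

def develops_into(delta1, delta2):
    """m1 = (S, delta1) develops into m2 = (S, delta2) iff delta1 is a subset of delta2."""
    return delta1 <= delta2

def coherent(t1, t2):
    """Tokens (x, y), (x', y') are coherent iff equal or with distinct sources."""
    return t1 == t2 or t1[0] != t2[0]

def is_partial_function(delta):
    """A deterministic stage is a pairwise-coherent set of tokens."""
    return all(coherent(a, b) for a in delta for b in delta)

S = {0, 1, 2}
d0 = frozenset()                      # initial stage m0 = (S, empty relation)
d1 = frozenset({(0, 1)})
d2 = frozenset({(0, 1), (1, 2)})      # a minor-equilibration chain d0 ⊑ d1 ⊑ d2

chain = [d0, d1, d2]
assert all(develops_into(a, b) for a, b in zip(chain, chain[1:]))
assert all(is_partial_function(d) for d in chain)
assert not is_partial_function({(0, 1), (0, 2)})  # two transitions leave state 0
```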


The structure M = (M, ⊑, m₀) is a domain in the general sense of Scott's theory [3–5], where the development relation ⊑ is the domain-theoretic information order of M, and m₀ = (S, δ₀), with δ₀ = ∅, is the least element of the domain, representing the initial stage of the development process. Among the several domain structures found in the literature [3–5], coherence spaces [12, 13, 5] seem adequate for the purpose of this work, since the powerset operation (necessary for modelling the major equilibration operation between two levels of machine development) is a functor in the category of coherence spaces and continuous functions.

Definition 1. A coherence space is a structure A = (A, ⊆), where A is a family of (coherent) sets ordered under the inclusion relation, such that the following properties hold: (i) down-closure: if a ∈ A and a′ ⊆ a, then a′ ∈ A; (ii) binary completeness: if X ⊆ A and ∀a, a′ ∈ X : (a ∪ a′) ∈ A, then ⋃X ∈ A.

From any coherence space A, it is possible to recover its web [13], the basic structure from which A originates, given by |A| = ({α | {α} ∈ A}, ≈), where each α is called a token and ≈ is a reflexive and symmetric relation, called the coherence relation, defined between tokens by: α ≈ α′ ⇔ {α, α′} ∈ A.

Proposition 1. Let M̃ = {S} × Δ̃_S be the set of stages of development of a deterministic machine, where Δ̃_S is the set of partial transition functions δ̃ on S. Then: (i) D̃ = (Δ̃_S, ⊆) is a coherence space, with web |D̃| = (S × S, ≈) and coherence relation ≈ ⊆ (S × S) × (S × S) given by: (x, y) ≈ (x′, y′) ⇔ ((x, y) = (x′, y′) ∨ x ≠ x′). (ii) The structure M̃ = (M̃, ⊑, m̃₀), where ⊑ is defined analogously to (1) and m̃₀ = (S, δ̃₀ = ∅), is isomorphic to D̃.

Proof. (i) It is immediate that ≈ is a coherence relation, since it is reflexive and symmetric. It follows that Δ̃_S = {δ̃ ⊆ S × S | ∀(x, y), (x′, y′) ∈ δ̃ : (x, y) ≈ (x′, y′)} is a family of coherent sets that satisfies the conditions of Def. 1, and, thus, D̃ is a coherence space. (ii) Considering the bijection f : M̃ → D̃, defined by (S, δ̃) ↦ δ̃, the result is immediate, since f is order-preserving. ∎

Notice that analogous results hold if one considers a non-deterministic computing machine. In this case, it is immediate that:

Proposition 2. Let Δ_S be the set of partial transition relations that define the dynamics of a non-deterministic computing machine. Then: (i) D = (Δ_S, ⊆) is a coherence space, with the trivial coherence relation. (ii) M = (M, ⊑, m₀), with M = {S} × Δ_S, m₀ = (S, δ₀ = ∅) and ⊑ as given in (1), is isomorphic to D. ∎

In the following, we shall abuse notation and refer to both domains of stages of machine development, M and M̃, as coherence spaces.

3.2 Bi-structured Coherence Spaces and the Development Factors

The universe of M may be restricted by a set I of internal development factors (e.g., some behavior-enabling condition, some functional requirement, etc.). That is, it may be necessary to define the family of developmental stages that are admissible for I, obtaining a developmental structure M_I = (M_I, ⊑_I, m₀), where M_I is the subset of stages of machine development that are compatible with I and ⊑_I ⊆ M_I × M_I preserves that compatibility during the development process. I can be seen as a set of criteria for the internal consistency of the admissible structures of M. The influence of the external environment can be taken into account in a similar way, through a set of external development factors E. I and E lead the development


process in M toward a final, limit developmental stage, which is the limit of the development chain in that domain. A formal structure able to model the domain of development stages of a computing machine (at a given development level – see below) is, thus, a structure M_{I,E} = (M_{I,E}, ⊑_{I,E}, m₀), where M_{I,E} is the subset of development stages of M that are compatible with both I and E, and the development relation ⊑_{I,E} respects both types of development factors. At this point, it becomes necessary to deal with the so-called bi-structured coherence spaces introduced in [9, 10]:

Definition 2. A bi-structured coherence space is a system A = (A; Σ^in_A; Σ^ex_A), with type ⟨μ_in; μ_ex⟩, where: (i) A ≠ ∅ is the universe of a coherence space; (ii) Σ^in_A = (⊆_A, {g_{Al} : A^{μ_in(l)} → A}_{l∈L}) is the internal structure of A, determined by the information order (the inclusion relation ⊆_A defined on A), together with the functions g_{Al}, which contain the internal factors of the development process, with arities given by μ_in : L → ℕ; (iii) Σ^ex_A = ({f_{Ai} : A^{μ_ex(i)} → A}_{i∈I}) is the external structure of A, determined by the functions f_{Ai}, which contain the external factors of the development process, with arities given by μ_ex : I → ℕ.

Let A = (A; Σ^in_A; Σ^ex_A) and B = (B; Σ^in_B; Σ^ex_B) be bi-structured coherence spaces, both of the same type ⟨μ_in; μ_ex⟩. A and B are said to be homomorphic if and only if there exists a continuous² function h : A → B such that: (i) ∀l ∈ L, x₁, …, x_{μ_in(l)} ∈ A : h(g_{Al}(x₁, …, x_{μ_in(l)})) = g_{Bl}(h(x₁), …, h(x_{μ_in(l)})); (ii) ∀i ∈ I, x₁, …, x_{μ_ex(i)} ∈ A : h(f_{Ai}(x₁, …, x_{μ_ex(i)})) = f_{Bi}(h(x₁), …, h(x_{μ_ex(i)})). In that case, h is said to be a homomorphism from A to B. When the converse of (ii) also holds, h is a strong homomorphism. If h is an injective strong homomorphism, then h is said to be an embedding of A into B. The category BSCS has bi-structured coherence spaces as objects and strong homomorphisms as morphisms [9, 10].
Given that the internal and external development factors are functions defined on M = {S} × Δ_S, it is possible to take a domain M = (M, ⊑, m₀) of stages of development and make those factors explicit in its structure. Let M* = (M; Σ^I_M; Σ^E_M) be the extended domain, where Σ^I_M is the internal structure (the information order ⊆ and the internal development factors I), and Σ^E_M is the external structure (the external development factors E). Then, considering the isomorphism stated in Prop. 1, it follows that:

Proposition 3. M* = (M; Σ^I_M; Σ^E_M) is isomorphic to D = (Δ_S; Σ^I_{Δ_S}; Σ^E_{Δ_S}).

Similarly, it is possible to obtain M*_{I,E} = (M_{I,E}; Σ^I_{M_{I,E}}; Σ^E_{M_{I,E}}), which has only the developmental stages compatible with both I and E.

3.3 Sub-systems of Bi-structured Coherence Spaces

In the following, let A = (A; Σ^in_A; Σ^ex_A) and B = (B; Σ^in_B; Σ^ex_B) be bi-structured coherence spaces, both of the same type ⟨μ_in; μ_ex⟩, as introduced in Def. 2.

Definition 3. A is said to be a sub-system of B, restricted by a function p : B → B, denoted by A = p(B), if and only if: (i) A = p[B] = {p(b) | b ∈ B}; (ii) ∀l ∈ L : g_{Al} = g_{Bl}|_{A^{μ_in(l)}}, i.e., g_{Al} is the restriction of g_{Bl} to A; (iii) ∀i ∈ I : f_{Ai} = f_{Bi}|_{A^{μ_ex(i)}}, i.e., f_{Ai} is the restriction of f_{Bi} to A.

² We require that a homomorphism of bi-structured coherence spaces be at least continuous, but additional properties of stability or linearity may be desirable if the categories of coherence spaces STAB or LIN are to be considered [13].


Notice that a sub-system is not necessarily a coherence space. The operation of closure of a function on a bi-structured coherence space, with respect to a given sub-system, guarantees that the function is well behaved in that sub-system [10, 11]. The closure of a function h_{Ai} : A^{μ(i)} → A, defined on a bi-structured coherence space A, with respect to a sub-system p(A) that is itself closed under the intersection operation, is the function ĥ^{p[A]}_{Ai} : A^{μ(i)} → A, defined by:

ĥ^{p[A]}_{Ai}(X₁, …, X_{μ(i)}) = ⋂ {Y ∈ p[A] | h_{Ai}(X₁, …, X_{μ(i)}) ⊆ Y}   if X₁, …, X_{μ(i)} ∈ p[A]^{μ(i)},
ĥ^{p[A]}_{Ai}(X₁, …, X_{μ(i)}) = h_{Ai}(X₁, …, X_{μ(i)})                     otherwise.   (2)
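A minimal sketch of the closure operation (2), for a unary function and with p[A] given as an explicit finite family of token sets closed under intersection. The function names and the toy family are ours, purely for illustration:

```python
from functools import reduce

# Hedged sketch of the closure operation (2): given a sub-system p[A]
# (a finite family of sets closed under intersection, containing a top
# element), the closure of a unary function h maps arguments lying in
# p[A] to the least superset of h's value inside p[A], and leaves all
# other arguments untouched.

def closure(h, pA):
    pA = [frozenset(Y) for Y in pA]
    def h_hat(X):
        if X not in pA:
            return h(X)                      # "otherwise" branch of (2)
        supersets = [Y for Y in pA if h(X) <= Y]
        return reduce(frozenset.intersection, supersets)
    return h_hat

# Toy example (ours): a chain-shaped family over tokens {1, 2, 3}.
pA = [frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})]
h = lambda X: X | {2}                        # a function that may leave p[A]
h_hat = closure(h, pA)

assert h_hat(frozenset({1})) == frozenset({1, 2})   # argument in p[A]: clamp into p[A]
assert h_hat(frozenset({3})) == frozenset({2, 3})   # argument outside p[A]: h unchanged
```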

Definition 4. A is obtained by regulation from B, with respect to a sub-system p(B) of B, denoted by A = R_p(B), if and only if: (i) A = B; (ii) ∀l ∈ L : g_{Al} = ĝ^{p[B]}_{Bl}; (iii) ∀i ∈ I : f_{Ai} = f̂^{p[B]}_{Bi}; where ĝ^{p[B]}_{Bl} and f̂^{p[B]}_{Bi} are the closures of g_{Bl} and f_{Bi} with respect to p[B], as given in (2).

Let M* = (M; Σ^I_M; Σ^E_M) be the bi-structured coherence space of a given level of development (Prop. 3), and let M*_{I,E} = (M_{I,E}; Σ^I_{M_{I,E}}; Σ^E_{M_{I,E}}) be the domain of the development stages that are admissible for the development factors I and E (see Sect. 3.2). Then, we state the following requirements for the development factors: (i) M*_{I,E} is a sub-system of M* that is itself a coherence space; and (ii) M* = R_{I,E}(M*), that is, M* is a fixpoint of R_{I,E}.

4 CS Constructors and the Major Equilibration Process

A total stage of machine development, at a given level, represents a fully developed machine with respect to the level of development (bi-structured coherence space) being considered. Only a so-called major equilibration operation can embed the developmental machine into a higher-level bi-structured coherence space (the domain corresponding to a higher level of development), whose developmental stages allow further BOF development. In this section, coherence space transformations performed simultaneously on the universe and on the internal and external structures (called global domain transformations in [9, 10]) are used to model operations of major equilibration. These transformations are functors in the category BSCS.

4.1 Constructors of Coherence Spaces

The constructors of coherence spaces [13] may be extended to bi-structured coherence spaces. Some of them have interesting interpretations when acting on a developmental machine domain. Due to lack of space, in this subsection we just outline some of those constructors. Let D = (Δ_S; Σ^I_{Δ_S}; Σ^E_{Δ_S}) and D′ = (Δ_{S′}; Σ^I_{Δ_{S′}}; Σ^E_{Δ_{S′}}) be the bi-structured coherence spaces isomorphic to the developmental machines M* and M′* (Prop. 3). Considering, for example, the operator called tensor, ⊗, applied to D and D′, one obtains a bi-structured coherence space D ⊗ D′ determined by the web ((S × S) × (S′ × S′), ≈_⊗), with ((x₁, y₁), (x′₁, y′₁)) ≈_⊗ ((x₂, y₂), (x′₂, y′₂)) ⇔ ((x₁, y₁) ≈_S (x₂, y₂) ∧ (x′₁, y′₁) ≈_{S′} (x′₂, y′₂)), where ≈_S and ≈_{S′} are the coherence relations defined on the original webs (S × S, ≈_S) and (S′ × S′, ≈_{S′}), respectively (Prop. 1). The isomorphic domain M* ⊗ M′* constitutes a kind of "grid" of developmental machines (parallel processing with two independent transition functions).

4.2 The Operation of Major Equilibration

In the following, let A = (A; Σ^in_A; Σ^ex_A), B = (B; Σ^in_B; Σ^ex_B) ∈ BSCS be bi-structured coherence spaces of the same type ⟨μ_in; μ_ex⟩, as introduced in Def. 2.


Definition 5. A powerset constructor is a map t ≡ (t_u; t_Σin; t_Σex) : BSCS → BSCS, defined by tA = (t_u A; t_Σin Σ^in_A; t_Σex Σ^ex_A), where:

(i) t_u is a powerset universe constructor, such that t_u A = ℘(A);

(ii) t_Σin = ⟨t_⊆, t^in_F⟩ is the internal structure constructor, determined by the information order constructor t_⊆ and the internal function constructor t^in_F, such that ⟨t_⊆, t^in_F⟩ Σ^in_A = (t_⊆ ⊆_A, t^in_F {g_{Al} : A^{μ_in(l)} → A}_{l∈L}), where: (a) t_⊆ ⊆_A = ⊆_{℘(A)}, that is, t_⊆ transforms the inclusion relation defined on A into the inclusion relation defined on t_u A = ℘(A); and (b) ∀l ∈ L, g_{Al} ∈ [A^{μ_in(l)} → A] : g_{Al} ↦ (t^in_F g)_{(t_u A)l} ∈ [(t_u A)^{μ_in(l)} → t_u A], where (t^in_F g)_{(t_u A)l} is the natural extension of the function g_{Al} to t_u A:

(t^in_F g)_{(t_u A)l}(X₁, …, X_{μ_in(l)}) = {g_{Al}(x₁, …, x_{μ_in(l)}) ∈ A | x₁ ∈ X₁, …, x_{μ_in(l)} ∈ X_{μ_in(l)}},   (3)

for each x₁, …, x_{μ_in(l)} ∈ A and X₁, …, X_{μ_in(l)} ∈ t_u A;

(iii) t_Σex = ⟨t^ex_F⟩ is the external structure constructor, determined by the external function constructor t^ex_F, such that ⟨t^ex_F⟩ Σ^ex_A = t^ex_F {f_{Ai} : A^{μ_ex(i)} → A}_{i∈I}, where ∀i ∈ I, f_{Ai} ∈ [A^{μ_ex(i)} → A] : f_{Ai} ↦ (t^ex_F f)_{(t_u A)i} ∈ [(t_u A)^{μ_ex(i)} → t_u A], with (t^ex_F f)_{(t_u A)i} being the natural extension of the function f_{Ai} to t_u A, whose definition is analogous to (3).

It follows that (see [9, 10]):

Proposition 4. tA is a bi-structured coherence space, whose internal and external structures are well defined. ∎

Definition 6. Let p(A) be a sub-system of A, restricted by a function p : A → A (Def. 3). Let tB = (t_u B; t_Σin Σ^in_B; t_Σex Σ^ex_B) be the bi-structured coherence space obtained from B by the application of the powerset constructor t ≡ (t_u; t_Σin; t_Σex). A is said to be obtained from B by a major equilibration regulated by p(A), denoted by B ⇛_{p(A)} A, if and only if: (i) A = R_p(tB), that is, A is obtained by the regulation of tB with respect to p(A) = p(tB); and (ii) B is embedded in A.

Notice that if condition (i) of Def. 6 holds, then (ii) is redundant, since it is clear that there is a continuous injection f : B → tB, defined by x ↦ {x}, for all x ∈ B. However, we decided to make (ii) explicit, to guarantee that the original domain is represented in the domain obtained by a major equilibration process. Observe that, from Def. 6, Prop. 4 and Def. 4, it follows that if B ⇛_{p(A)} A, then p(A) is closed under the functions defined in Σ^in_A and Σ^ex_A.

Let D = (Δ_S; Σ^I_{Δ_S}; Σ^E_{Δ_S}) be the bi-structured coherence space that is isomorphic to a domain M* = (M; Σ^I_M; Σ^E_M) of a developmental machine. Applying the powerset constructor to D, the resulting bi-structured coherence space ℘(D) = (℘(Δ_S); Σ^I_{℘(Δ_S)}; Σ^E_{℘(Δ_S)}) has a web given by |℘(D)| = (Δ_S, ≈), where ≈ is the trivial coherence relation, that is, δ ≈ δ′ for all δ, δ′ ∈ Δ_S.

Proposition 5. ℘(D) is isomorphic to ℘(M*) = ({S} × ℘(Δ_S); Σ^I_M; Σ^E_M).

Proof. Consider the order-preserving bijection f : ℘(M*) → ℘(D), defined by (S, W) ↦ W, for every set W ∈ ℘(Δ_S) of transition relations on S. ∎

The main feature of ℘(M*) is that any of its stages of development is a dynamical system composed of a family of transition relations, behaving as a set of distributed machines. Note, however, that ℘(M*) encompasses all the stages of the previous level, since M* is embedded in ℘(M*).
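The natural-extension clause (3) of the powerset constructor can be sketched as follows (our own toy encoding, not the paper's formal construction): a k-ary function on tokens is lifted to a k-ary function on sets of tokens by applying it pointwise to all tuples drawn from its set-valued arguments.

```python
from itertools import product

# Sketch (our notation) of the natural extension, Eq. (3): a k-ary
# function g on elements is lifted to a k-ary function on sets,
# applied pointwise over the Cartesian product of its arguments.

def natural_extension(g):
    def g_lifted(*sets):
        return {g(*args) for args in product(*sets)}
    return g_lifted

# Example (ours): lift the union of two transition relations.
union = natural_extension(lambda d1, d2: frozenset(d1) | frozenset(d2))

X1 = {frozenset({(0, 1)}), frozenset()}
X2 = {frozenset({(1, 2)})}
# Pointwise union of every pair of transition relations drawn from X1, X2:
assert union(X1, X2) == {frozenset({(0, 1), (1, 2)}), frozenset({(1, 2)})}
```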


We now show how the powerset constructor, restricted by some development factor, can be used to model some examples of major equilibration. First, suppose that there is a development factor given by a function g : ℘(Δ_S) → ℘(Δ_S), defined, for all X ∈ ℘(Δ_S), by:

g(X) = X   if ⋃X ∈ Δ_S and ⋂_{δ∈X} δ = ∅;
g(X) = ∅   otherwise.   (4)

g(℘(D)) is a sub-system of ℘(D) that is a bi-structured coherence space. Its web is given by |g(℘(D))| = (Δ_S, ≈), where δ ≈ δ′ ⇔ (δ ∪ δ′ ∈ Δ_S ∧ δ ∩ δ′ = ∅). g(℘(D)) presents the same behavior as the original domain D of stages of machine development. Considering the regulation R_g(℘(D)) of ℘(D) with respect to g(℘(D)), then, from Def. 6 and Prop. 5, it follows that:

Proposition 6. M* ⇛_{g(℘(M*))} R_g(℘(M*)). ∎

Consider now another elementary development factor, given by a function h : ℘(Δ_S) → ℘(Δ_S), defined, for all X ∈ ℘(Δ_S), by X ↦ X_x̄ = {δ_x̄ ∈ Δ_S | δ ∈ X}, where δ_x̄ = {(x, y) ∈ S × S | (x, y) ∈ δ ∧ x ≠ x̄}, for some x̄ ∈ S. This means that no transition in δ_x̄ starts in the state x̄, which is then a deadlock. h(℘(D)) is a sub-system of ℘(D) that is a bi-structured coherence space, whose web is given by |h(℘(D))| = (Δ_S − {δ_x̄ | δ ∈ Δ_S}, ≈), where ≈ is the trivial coherence relation. Let R_h(℘(D)) be the regulation of ℘(D) with respect to h(℘(D)). Again, from Def. 6 and Prop. 5, it follows that:

Proposition 7. M* ⇛_{h(℘(M*))} R_h(℘(M*)). ∎
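The development factor h can be sketched in the same toy encoding (ours, purely illustrative): it forces a chosen state x̄ to become a deadlock by deleting, in every member of X, every transition that leaves x̄.

```python
# Sketch (toy encoding, ours) of the development factor h: given a set X
# of transition relations, remove every transition whose source is the
# chosen state x_bar, making x_bar a deadlock in all members of X.

def make_deadlock(X, x_bar):
    return {frozenset((x, y) for (x, y) in delta if x != x_bar)
            for delta in X}

X = {frozenset({(0, 1), (1, 2)}), frozenset({(1, 0)})}
assert make_deadlock(X, 1) == {frozenset({(0, 1)}), frozenset()}
# No remaining transition starts in state 1:
assert all(x != 1 for delta in make_deadlock(X, 1) for (x, y) in delta)
```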

We note that to be able to classify the development factors as either internal or external, one needs to consider machine models where interactions with the environment are taken into account. Also, more interesting examples of development factors are only possible with such models.

5 Conclusion

This paper consolidates part of our previous work on the notion of developmental machine. It introduces a domain-theoretic model of developmental machines, based on the so-called bi-structured coherence spaces, which seems well compatible with Piaget's general account of the notion of system development. The explication of the notion of machine evolution, however, was left for future work. Further work on the notion of development of computing machines is still necessary. Firstly, we need to adopt a dynamical model with interactive capabilities, to be able to identify interesting ways in which the internal and external factors I and E influence the equilibration constructor. Secondly, we need to internalize the mechanism of machine development in the developmental machines themselves, by way of some reflection procedure, so that a machine can itself control its own development by controlling the activation of the development factors. Finally, we need to make clear the connection between the notion of development and the notion of evolution of populations of computing machines.

Acknowledgements. This work was partially supported by FAPERGS and CNPq. The authors thank the referees for their valuable comments.

References

1. Piaget, J.: Biology and Knowledge. Edinburgh University Press (1971)


2. Piaget, J.: The Equilibration of Cognitive Structures: The Central Problem of Intellectual Development. University of Chicago Press (1985)
3. Stoltenberg-Hansen, V., Lindström, I., Griffor, E.R.: Mathematical Theory of Domains. Cambridge University Press, Cambridge (1994)
4. Gierz, G., Hofmann, K.H., Keimel, K., Lawson, J.D., Mislove, M., Scott, D.S.: Continuous Lattices and Domains. Cambridge University Press, Cambridge (2003)
5. Zhang, G.Q.: Logic of Domains. Birkhäuser, Boston (1991)
6. Costa, A.C.R.: Machine Intelligence: Sketch of a Constructive Approach. PhD thesis, CPGCC/UFRGS, Porto Alegre (1993) (in Portuguese)
7. Costa, A.C.R., Dimuro, G.P.: Interactive computation: Stepping stone in the pathway from classical to developmental computation. Electronic Notes in Theoretical Computer Science 141(5) (2005) 5–31 (Goldin, D., Viroli, M. (eds.): Proc. Workshop on Foundations of Interactive Computation, FInCo 2005, Edinburgh, 2005)
8. Goldin, D., Smolka, S., Wegner, P. (eds.): Interactive Computation: The New Paradigm. Springer-Verlag, New York (2006)
9. Dimuro, G.P.: A Global Constructive Representation of Second Order Ordered Systems in Bi-Structured Interval Coherence Spaces, with an Application in Interval Mathematics. PhD thesis, CPGCC/UFRGS, Porto Alegre (1998) (in Portuguese)
10. Dimuro, G.P., Costa, A.C.R., Claudio, D.M.: A bi-structured coherence space for a global representation of the system IR of real intervals. In: Proc. Intl. Conf. on Information Technology, ICIT'99, Bhubaneswar, 1999. McGraw-Hill, New York (1999)
11. Dimuro, G.P., Costa, A.C.R., Claudio, D.M.: A coherence space of rational intervals for a construction of IR. Reliable Computing 6(2) (2000) 139–178
12. Girard, J.Y.: Linear logic. Theoretical Computer Science 50 (1987) 1–102
13. Troelstra, A.S.: Lectures on Linear Logic. CSLI, Menlo Park (1992)

Quantum Algorithms for Graph Traversals and Related Problems

Sebastian Dörn

Institut für Theoretische Informatik, Universität Ulm, 89069 Ulm, Germany
[email protected]

Abstract. We study the complexity of algorithms for graph traversal problems on quantum computers. More precisely, we look at eulerian tours, hamiltonian tours, the travelling salesman problem and project scheduling. We present quantum algorithms and quantum lower bounds for these problems. Our results improve on the best classical algorithms for the corresponding problems. In particular, we prove that our quantum algorithms for the eulerian tour and the project scheduling problem are optimal in the query model.

1 Introduction

Quantum computation has the potential to be more efficient than classical computation for some problems, and a central goal of quantum computing is to determine when quantum computers provide a speed-up over classical computers. Two main complexity measures for quantum algorithms have been studied to date: quantum query complexity and quantum time complexity. The quantum query complexity of a quantum algorithm A is the number of quantum queries A makes to the input, and the quantum time complexity of A is the number of basic quantum operations made by A. In this paper we are interested in graph traversal problems. The study of the quantum complexity of graph problems is an active topic in quantum computing. Dürr, Heiligman, Høyer and Mhalla [DHHM04] presented optimal quantum query algorithms for the minimum spanning tree, graph connectivity, strong graph connectivity and single-source shortest path problems. Magniez, Santha and Szegedy [MSS05] constructed a quantum query algorithm for finding a triangle in a graph. Polynomial-time quantum algorithms were given by Ambainis and Špalek [AS06] for the maximum matching and network flow problems. Dörn [Doe07] presented quantum algorithms for several independent set problems. In this paper, we present quantum algorithms and quantum query lower bounds for graph traversals and related problems. Our input is a directed or undirected graph G with n vertices and m edges. We consider two query models for graphs: the adjacency matrix and the adjacency list model. In Section 3 we study the quantum query and quantum time complexity of the eulerian graph problem, in which we have to decide whether a graph G has an eulerian cycle, that is, a closed walk that contains every edge of G exactly once. We compute the quantum query complexity of the eulerian graph problem in both the adjacency matrix and the adjacency list model.
Furthermore, we show that our lower and upper bounds are tight in both graph representation models. In Section 4 we consider the hamiltonian cycle problem, an important NP-complete graph problem. A hamiltonian cycle of a graph G is a cycle which contains every vertex of G exactly once. Berzina et al. [BDFLS04] proved that the hamiltonian cycle problem requires Ω(n^{1.5}) quantum queries to the adjacency matrix. We show an O(n^{2n/(n+1)}) quantum query upper bound for this problem, using a recent quantum walk technique. In Section 5 we study the travelling salesman problem on graphs with maximal degree three, four and five. Eppstein [Epp03] constructed algorithms for the travelling salesman problem on graphs with bounded degree three and four that are faster than O(2^n). We analyse the


quantum time complexity of these algorithms and show that, with a quantum computer, the travelling salesman problem on graphs with maximal degree three, four and five can be solved quadratically faster than in the classical case. In Section 6 we consider a project scheduling problem. A digraph model can be used to schedule projects consisting of several interrelated tasks. Some of these tasks can be executed simultaneously, but some tasks cannot begin until certain others are completed. The goal is to compute the minimal project completion time. We present an optimal quantum query algorithm for computing the earliest completion time for every vertex of the network.
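The classical computation underlying the scheduling problem — the earliest completion time of every vertex of an activity network — is a longest-path sweep over the DAG in topological order; the quantum algorithm presumably replaces the inner maximizations with quantum search. A plain classical sketch for orientation (the function name and task encoding are ours):

```python
from collections import defaultdict

# Classical sketch (ours) of the computation that the scheduling section
# speeds up: the earliest completion time of every vertex of an
# activity-on-edge network, via longest paths in topological order.

def earliest_completion_times(n, edges):
    """edges: list of (u, v, duration); vertices are 0..n-1, 0 is the start."""
    succ = defaultdict(list)
    indeg = [0] * n
    for u, v, w in edges:
        succ[u].append((v, w))
        indeg[v] += 1
    ect = [0] * n
    queue = [v for v in range(n) if indeg[v] == 0]    # topological sweep
    while queue:
        u = queue.pop()
        for v, w in succ[u]:
            ect[v] = max(ect[v], ect[u] + w)           # longest path to v so far
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return ect

# Two parallel tasks (durations 3 and 5) followed by a task of duration 2:
assert earliest_completion_times(4, [(0, 1, 3), (0, 2, 5), (1, 3, 2), (2, 3, 2)]) == [0, 3, 5, 7]
```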

2 Preliminaries

2.1 Graph Theory

We denote by [n] the set {1, 2, …, n}. Let G = (V, E) be an undirected or directed graph; V = V(G) and E = E(G) denote the set of vertices and the set of edges of G. Let n = |V| be the number of vertices and m = |E| the number of edges of G. We denote by (u, v) a directed edge in G from vertex u to vertex v; the vertex u is then called adjacent to vertex v in G. The number of vertices to which v is adjacent is called the out-degree of v, denoted by d⁺_G(v). The in-degree of a vertex v is the number of edges directed to v, denoted by d⁻_G(v). A cycle of G is a sequence (v₁, …, v_k, v₁), where k ≥ 3 and v₁, …, v_k are distinct vertices of G, such that (v_i, v_{i+1}) ∈ E for i ∈ [k − 1] and (v_k, v₁) ∈ E. We consider the following two models for accessing information in digraphs:

– Adjacency matrix model: Given is the adjacency matrix A ∈ {0, 1}^{n×n} of G, with A_{i,j} = 1 iff (i, j) ∈ E. Weighted graphs are encoded by a weight matrix, where A_{i,j} is the weight of the edge (i, j); for convenience we set A_{i,j} = ∞ if (i, j) ∉ E.
– Adjacency list model: Given are the out-degrees d⁺_G(1), …, d⁺_G(n) of the vertices and, for every i ∈ V, an array of its neighbours f_i : [d⁺_G(i)] → [n]. The value f_i(j) is the j-th neighbour of i. Weighted graphs are encoded by a sequence of functions f_i : [d⁺_G(i)] → [n] × ℕ, such that if f_i(j) = (i′, w) then there is an edge (i, i′) with weight w, and i′ is the j-th neighbour of i.

For undirected graphs, we replace each directed edge (u, v) by an undirected edge {u, v}, and the out-degree d⁺_G(i) by the degree d_G(i), for every i ∈ V.

2.2 Quantum Computing

For the basic notation on quantum computing, we refer the reader to the textbook by Nielsen and Chuang [NC03]. For the quantum algorithms included in this paper we use the following two complexity measures: – The quantum query complexity of a graph algorithm A is the number of queries to the adjacency matrix or to the adjacency list of the graph made by A. – The quantum time complexity of a graph algorithm A is the number of basic quantum operations made by A. Now we give three tools for the construction of our quantum algorithms. Quantum Search. A search problem is a subset P ⊆ [N ] of the search space [N ]. With P we associate its characteristic function fP : [N ] → {0, 1} with fP (x) = 1 if x ∈ P , and 0 otherwise. Any x ∈ P is called a solution to the search problem. Let k = |P | be the number of solutions of P .


Theorem 1. [Gro96,BBHT98] For k > 0, the expected quantum query complexity for finding one solution of P is O(√(N/k)), and for finding all solutions it is O(√(kN)). Furthermore, whether k > 0 can be decided with O(√N) quantum queries to f_P.

Theorem 2. [DH96] There is a quantum algorithm for finding the maximum element in a set of N real numbers with query complexity O(√N).

Amplitude Amplification. Let A be an algorithm for a problem with small success probability at least ε. Classically, we need Θ(1/ε) repetitions of A to increase its success probability from ε to a constant, for example 2/3. The corresponding technique in the quantum case is called amplitude amplification.

Theorem 3. [BHMT00] Let A be a quantum algorithm with one-sided error and success probability at least ε. Then there is a quantum algorithm B that solves the problem of A with success probability 2/3 by O(1/√ε) invocations of A.

Quantum Walk. Quantum walks are the quantum counterparts of Markov chains and random walks. Quantum walk search provides a promising source of new quantum algorithms; see [Amb04], [MSS05], [MN05] and [BS06]. Let P = (p_{xy}) be the transition matrix of an ergodic symmetric Markov chain on the state space X. Let M ⊆ X be a set of marked states. Assume that the search algorithm uses a data structure D that associates some data D(x) with every state x ∈ X. From D(x), we would like to determine whether x ∈ M. When operating on D, we consider the following three types of cost:

– Setup cost s: the worst-case cost to compute D(x), for x ∈ X.
– Update cost u: the worst-case cost of a transition from x to y, updating D(x) to D(y).
– Checking cost c: the worst-case cost of checking whether x ∈ M by using D(x).

Magniez et al. [MNRS07] developed a new scheme for quantum search, based on any ergodic Markov chain. Their work generalizes previous results by Ambainis [Amb04] and Szegedy [Sze04]. They extend the class of possible Markov chains and improve the query complexity as follows.
Theorem 4. [MNRS07] Let δ > 0 be the eigenvalue gap of an ergodic Markov chain P and let |M|/|X| ≥ ε. Then there is a quantum algorithm that determines if M is empty or finds an element of M with cost

    s + (1/√ε) · ((1/√δ) · u + c).

In the most practical applications ([Amb04], [MSS05]) the quantum walk takes place on the Johnson graph J(n, r), which is defined as follows: the vertices are the subsets of {1, . . . , n} of size r, and two vertices are connected iff they differ in exactly one element. It is well known that the spectral gap δ of J(n, r) is 1/r.

Remark 1. Our quantum algorithms output an incorrect answer with a constant probability p. If we want to reduce the error probability to less than ε, we repeat each quantum subroutine l times, where p^l ≤ ε. It follows that we have to repeat each quantum subroutine l = O(log n) times to make the probability of a correct answer greater than 1 − 1/n. This increases the running time of all our algorithms by a logarithmic factor. Furthermore, the running time of Grover search is bigger than its query complexity by another logarithmic factor.
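The repetition count in Remark 1 is simply the smallest l with p^l ≤ ε; a minimal classical sketch (the function name and parameters are our own, not from the paper):

```python
def repetitions(p: float, eps: float) -> int:
    """Smallest l such that p**l <= eps: repeating a one-sided-error
    subroutine with failure probability p independently l times pushes
    the overall failure probability below eps."""
    assert 0.0 < p < 1.0 and 0.0 < eps < 1.0
    l, fail = 0, 1.0
    while fail > eps:
        fail *= p
        l += 1
    return l

# With constant error p = 1/2 and target eps = 1/n, l grows as log n:
n = 1024
l = repetitions(0.5, 1.0 / n)  # 10 repetitions for n = 1024
```

This is exactly the logarithmic overhead referred to in Remark 1.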


Sebastian Dörn

3 Eulerian Graph Problem

In this section we consider the eulerian graph problem: given an undirected graph G, decide if G has a closed walk that contains every edge of G exactly once. We call such a walk an eulerian tour. It is a well-known fact in graph theory that a connected graph G is eulerian iff the degree of every vertex in G is even.

Theorem 5. The quantum query complexity of the eulerian graph problem is O(√n) in the adjacency list model and O(n^1.5) in the adjacency matrix model.

Proof. In the adjacency list model, the degree of every vertex is given. We search for an odd number in the degree list. If there is a vertex in G with odd degree, then the graph is not eulerian. This simple quantum search can be done in O(√n) quantum queries to the degree list. In the adjacency matrix model, we search for a vertex with odd degree (if there is one). We use Grover search in combination with a classical algorithm for computing the parity of each row. In total, the quantum query complexity of the eulerian graph problem is O(n^1.5) in the adjacency matrix model.

By using Remark 1, we obtain the quantum time complexity of the eulerian graph problem:

Corollary 1. There is a quantum algorithm for the eulerian graph problem with time complexity O(√n log² n) in the adjacency list model and O(n^1.5 log² n) in the adjacency matrix model.

Now we show that our upper bounds are tight in the matrix and list models.

Theorem 6. The eulerian graph problem requires Ω(√n) quantum queries to the adjacency list and Ω(n^1.5) quantum queries to the adjacency matrix.

Proof. In the adjacency matrix model, we reduce the OR of n parities of length n to the eulerian tour problem. We define z := (x_{1,1} ⊕ . . . ⊕ x_{1,n}) ∨ . . . ∨ (x_{n,1} ⊕ . . . ⊕ x_{n,n}). It is a well-known fact that the computation of z requires Ω(n^1.5) quantum queries [Amb02]. Then z = 0 iff the graph G with adjacency matrix A = (x_{i,j}) has an eulerian tour. In the adjacency list model, the Ω(√n) lower bound follows by a simple reduction from Grover search.
Since the upper and the lower bound match, we have determined the precise quantum query complexity of the eulerian tour problem.
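Classically, the test that the quantum algorithm accelerates is only a parity scan over the degrees; a minimal sketch in the adjacency list model (the names are ours; connectivity of G is assumed, as in the text):

```python
def is_eulerian(adj):
    """Classical check for an eulerian tour in a connected undirected
    graph: every vertex must have even degree.  adj maps each vertex
    to the list of its neighbours (the adjacency list model)."""
    return all(len(neighbours) % 2 == 0 for neighbours in adj.values())

cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # C4: eulerian
path  = {0: [1], 1: [0, 2], 2: [1]}                    # P3: not eulerian
```

Grover search replaces this linear scan over the n degrees by O(√n) quantum queries.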

4 Hamiltonian Circuit Problem

In the hamiltonian circuit problem we are given a directed graph G and have to decide whether G has a cycle that contains every vertex of G exactly once. A hamiltonian graph is one containing a hamiltonian cycle. The hamiltonian circuit problem is analogous to the eulerian graph problem, but no simple characterization of hamiltonian graphs exists. The hamiltonian circuit problem is a well-known NP-complete problem. There is a quantum query lower bound of Ω(n^1.5) for the hamiltonian cycle problem in the matrix model, proved by Berzina et al. [BDFLS04]. We show an upper bound for this problem by using the quantum walk search technique.

Theorem 7. The quantum query complexity of the hamiltonian cycle problem is O(n^{2n/(n+1)}) in the adjacency matrix model.


Proof. We use Theorem 4. To do so, we construct a Markov chain and a database for checking whether a vertex of the chain is marked. Let G = (V, E) be a directed input graph with n vertices. Let A be a subset of [n] × [n] of size r > n; we will determine r later. The database is the edge-induced subgraph G[A] := (V, E ∩ A). Our quantum walk takes place on the Johnson graph J(n², r). The marked vertices M of J(n², r) correspond to the subsets A of [n] × [n] of size r for which G[A] contains a hamiltonian cycle of G. In every step of the walk, we exchange one element of A. We determine the quantum query costs for setup, update and checking: the setup cost for the database is O(r), the update cost is O(1), and the checking cost is zero. The spectral gap of the walk on J(n², r) is δ = Θ(1/r) for 1 ≤ r ≤ n²/2, see e.g. [BS06]. If there is a hamiltonian cycle in G, then there are at least C(n² − n, r − n) marked sets, since a hamiltonian cycle contains n edges. Therefore it holds that

    ε ≥ |M|/|X| ≥ C(n² − n, r − n) / C(n², r) ≥ Ω((r/n²)^n),

where C(a, b) denotes the binomial coefficient.

Then the quantum query complexity of the hamiltonian cycle problem is

    O(r + (n²/r)^{n/2} · √r) = O(n^{2n/(n+1)})

for the choice r = n^{2n/(n+1)}, which balances the two terms.
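Plugging s = O(r), u = O(1), c = 0, δ = 1/r and ε = Ω((r/n²)^n) into Theorem 4 gives the cost above; that r = n^{2n/(n+1)} balances the setup and search terms can be checked numerically (the function below is our own illustration, not part of the paper):

```python
def walk_cost(n, r):
    """Query cost r + (n^2/r)^(n/2) * sqrt(r) of the quantum-walk
    search for a hamiltonian cycle, up to constant factors."""
    setup = r
    search = (n ** 2 / r) ** (n / 2) * r ** 0.5
    return setup + search

n = 10
r = n ** (2 * n / (n + 1))                    # the optimal database size
setup = r
search = (n ** 2 / r) ** (n / 2) * r ** 0.5
# both terms coincide at this r, so the total cost is O(n^{2n/(n+1)})
```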

5 Travelling Salesman Problem

In this section we consider the travelling salesman problem (TSP) in graphs with maximal degree three, four and five: given a weighted graph G with bounded degree, compute a hamiltonian cycle in G with minimum total edge weight. There is a simple algorithm by Held and Karp [HK62] for computing a travelling salesman tour with running time O(2^n). To this day, it is the fastest known algorithm for the TSP.
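The Held-Karp dynamic program referred to here runs in O(2^n · n²) time; a minimal sketch over a distance matrix (our own illustration of the classical algorithm):

```python
from itertools import combinations

def held_karp(dist):
    """Held-Karp dynamic program for the TSP: dp[(S, j)] is the length
    of the shortest path that starts at vertex 0, visits exactly the
    vertex set S (a bitmask over vertices 1..n-1), and ends at j."""
    n = len(dist)
    dp = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            S = sum(1 << j for j in subset)
            for j in subset:
                dp[(S, j)] = min(dp[(S ^ (1 << j), k)] + dist[k][j]
                                 for k in subset if k != j)
    full = (1 << n) - 2                       # all vertices except 0
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))

# A 4-city example: the optimal tour 0-1-2-3-0 has length 4.
tour_len = held_karp([[0, 1, 2, 1],
                      [1, 0, 1, 2],
                      [2, 1, 0, 1],
                      [1, 2, 1, 0]])
```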

5.1 Bounded Degree Three Graphs

Eppstein [Epp03] constructed an algorithm for the TSP on graphs with bounded degree three with running time O(2^{n/3}). The general idea of this algorithm is the following (see Figure 1). Let G be a directed weighted graph with maximum degree three. Let F be the set of edges that must be used in the travelling salesman tour; these are called the forced edges. In every step of the algorithm, we choose an edge (t, v) or (t, y) adjacent to a forced edge (s, t). If we add (t, v) to F, we delete (t, y) from G, and add the two edges (x, y) and (y, z) to F. Thereby the number of forced edges is increased by three. The subproblem in which we add (t, y) to F is symmetric. This procedure is the main subroutine of Eppstein's algorithm. It is not difficult to see that we can transform this deterministic algorithm with running time O(2^{n/3}) into a probabilistic polynomial time algorithm with success probability 1/2^{n/3}. From this classical algorithm we obtain a quantum algorithm by the following two modifications: we use Grover search for finding the edges of the graph, and we apply quantum amplitude amplification [BHMT00] in order to get an algorithm which computes a travelling salesman tour with constant success probability. Then we obtain the following result:

Theorem 8. There is a quantum algorithm for the TSP on graphs with bounded degree three with running time O(2^{n/6}).


Fig. 1. Travelling salesman tour in graphs with maximal degree three

5.2 Bounded Degree Four Graphs

Now we use an idea of Eppstein [Epp03] to compute the quantum time complexity of finding a travelling salesman tour in graphs with maximal degree four. In classical computation, the fastest algorithm for this problem has running time O(1.890^n), see [Epp03].

Theorem 9. There is a quantum algorithm for the TSP on graphs with bounded degree four with running time O((27/4)^{n/6}) = O(1.375^n).

Proof. Let k be the number of degree-four vertices in the graph G with maximum degree four. The algorithm consists of the following steps: for each degree-four vertex v with adjacent edges a, b, c and d, let a be the incoming edge of the tour. We choose randomly among the three possible partitions {a, b}, {a, c} and {a, d}. We divide the vertex v into two vertices, and connect the two vertices by a forced edge. The new graph has maximum degree three, and therefore we can apply the quantum algorithm of Theorem 8. Each such division preserves the travelling salesman tour if the two edges of the tour do not belong to the same set of the partition. This happens with probability 2/3. We apply quantum amplitude amplification, and after O((3/2)^{k/2}) invocations the algorithm finds the correct solution. In total, the quantum time complexity of the algorithm is

    O(1.5^{k/2} · 2^{n/6}) = O((27/4)^{n/6}) = O(1.375^n).
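The closing arithmetic can be checked directly (taking k ≤ n in the worst case); the same computation also verifies the degree-five constant of Theorem 10 below. This numeric check is ours, not part of the paper:

```python
# Worst case k = n: 1.5^(n/2) * 2^(n/6) = ((27/4)^(1/6))^n
base4 = (1.5 ** 3 * 2) ** (1 / 6)
assert abs(base4 - (27 / 4) ** (1 / 6)) < 1e-12   # 1.3747..., i.e. O(1.375^n)

# Degree five (Theorem 10): (4/3)^(n/2) * (27/4)^(n/6) = 16^(n/6) = 2^(2n/3)
base5 = ((4 / 3) ** 3 * 27 / 4) ** (1 / 6)
assert abs(base5 - 2 ** (2 / 3)) < 1e-12          # 1.5874...
```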

5.3 Bounded Degree Five Graphs

In this subsection we consider the travelling salesman problem in graphs with maximal degree five. There is no classical algorithm with running time faster than O(2^n) for this problem. We present a quantum algorithm for the TSP on graphs with maximal degree five and running time O(1.5874^n). We use the same strategy as for bounded degree four graphs.

Theorem 10. There is a quantum algorithm for the TSP on graphs with bounded degree five with running time O(1.5874^n).

Proof. We use the proof of Theorem 9. Here we choose randomly among four possible partitions, and divide a vertex of degree five into two vertices, which we connect by a forced edge. Then the new graph has maximum degree four, and we can apply Theorem 9. In total, the quantum time complexity of the TSP algorithm for bounded degree five is

    O((4/3)^{k/2} · (27/4)^{n/6}) = O(1.5874^n).

6 Project Scheduling

A digraph model can be used to schedule projects consisting of several interrelated tasks. Some of these tasks can be executed simultaneously, but some tasks cannot begin until certain others are completed. The goal is to compute the minimal project completion time. One way to represent scheduling projects is to use a digraph model called an AOA network.

Definition 1. An AOA network N = (G, c) is a digraph G = (V, E) with an edge weight function c : V × V → R+.

Each edge in the digraph represents a task of the project; the direction of the edge is the direction of progress in the project. Each vertex in the AOA network represents an event that signifies the completion of one or more activities and the beginning of a new one. An activity A is called a predecessor of an activity B if B cannot begin until A is completed. We compute the quantum query complexity of the project scheduling problem: given an AOA network N = (G, c), compute the earliest completion time for every vertex in G. The AOA network must be an acyclic digraph, otherwise none of the tasks corresponding to the edges on a cycle could ever begin. We are interested in the earliest time point at which each event can occur. Let ET(i) denote the earliest time point at which the event corresponding to vertex i can occur. A vertex j is called an immediate predecessor of a vertex i if there is an edge from j to i. Let P(i) be the set of all immediate predecessors of vertex i.

Lemma 1. It holds that ET(1) = 0 and ET(i) = max_{j ∈ P(i)} {ET(j) + c(j, i)}.

The earliest time ET(i) for every event i to occur is the length of the longest directed path in the network from vertex 1 to vertex i.

Theorem 11. The quantum query complexity of the project scheduling problem is O(n^1.5) in the adjacency matrix model and O(√(nm)) in the adjacency list model.

Proof. We use Lemma 1 to compute the earliest time ET(i) for every vertex i ∈ {2, 3, . . . , n} in order, since the vertices are numbered in topological order.
We use the quantum algorithm by Dürr and Høyer [DH96] (see Theorem 2) to compute the maximum of ET(j) + c(j, i) over all immediate predecessors j of vertex i. The quantum query complexity of this step is O(√n) in the adjacency matrix model and O(√(d⁻_G(i))) in the adjacency list model, where d⁻_G(i) is the in-degree of vertex i. The total number of quantum queries to the adjacency matrix is O(n^1.5), and

    Σ_{i=1}^{n} √(d⁻_G(i)) ≤ √n · √(Σ_{i=1}^{n} d⁻_G(i)) = O(√(nm))

in the adjacency list model.

Theorem 12. The quantum time complexity of the project scheduling algorithm is O(n^1.5 log² n) in the adjacency matrix model and O(√(nm) log² n) in the adjacency list model.

Theorem 13. The project scheduling problem requires Ω(n^1.5) quantum queries to the adjacency matrix and Ω(√(nm)) quantum queries to the adjacency list.

Proof. The proof is a reduction from maximum finding. Let k be an integer and M be a matrix with n rows, k columns and N = kn positive entries. The quantum query lower bound for finding the maximum value in every row is Ω(√(nN)), see [DHHM04]. We construct a weighted graph G = (V, E), where the set of vertices is V = {s, v_1, . . . , v_k, u_1, . . . , u_n, t}. The edges (s, v_i) and (u_j, t) have weight 0 for all i ∈ [k] and j ∈ [n]. The
edges (v_i, u_j) get the weight M_{ji}. The graph G has n + k + 2 vertices and m = kn + k + n edges. The earliest time ET(v_i) is zero for all vertices v_i, and the earliest time ET(u_j) is the maximal weight of an edge (v, u_j) with v ∈ {v_1, . . . , v_k}, i.e. the maximum of row j of M. Then the project scheduling problem requires Ω(√(nm)) quantum queries to the adjacency list. Setting k = n, so that m = Θ(n²), the quantum query lower bound for the adjacency matrix follows.
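Classically, the recurrence of Lemma 1 is evaluated in a single pass over the vertices in topological order; a minimal sketch (the names and the edge-list input format are our own):

```python
def earliest_times(n, edges):
    """Evaluate ET(1) = 0 and ET(i) = max over immediate predecessors j
    of ET(j) + c(j, i), for vertices 1..n numbered in topological order.
    edges is a list of (j, i, c) triples, one per edge from j to i."""
    pred = {i: [] for i in range(1, n + 1)}
    for j, i, c in edges:
        pred[i].append((j, c))
    ET = {1: 0}
    for i in range(2, n + 1):
        ET[i] = max(ET[j] + c for j, c in pred[i])
    return ET

# A small AOA network: ET(4) = max(ET(2) + 4, ET(3) + 5) = 7
ET = earliest_times(4, [(1, 2, 3), (1, 3, 2), (2, 4, 4), (3, 4, 5)])
```

The quantum algorithm of Theorem 11 replaces the inner maximum over P(i) by Dürr-Høyer maximum finding.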

Conclusion and Open Problems

In this paper we presented quantum algorithms and lower bounds for graph traversal problems. We constructed optimal quantum query algorithms for the eulerian graph problem and the project scheduling problem. We showed that the travelling salesman problem in graphs with maximal degree three, four and five can be solved quadratically faster by quantum computing. Some questions remain open. Is there a quantum algorithm for the TSP with running time O(c^n) for some c < 2? The simple algorithm by Held and Karp [HK62] computes a travelling salesman tour in time O(2^n); it was published in 1962 and still yields the best complexity known today. Another interesting problem is the improvement of the lower or upper bound on the quantum query complexity of the hamiltonian cycle problem.

References

[Amb02] A. Ambainis, Quantum Lower Bounds by Quantum Arguments, Journal of Computer and System Sciences 64: pages 750-767, 2002.
[Amb04] A. Ambainis, Quantum walk algorithm for element distinctness, Proceedings of FOCS'04: pages 22-31, 2004.
[AS06] A. Ambainis, R. Špalek, Quantum Algorithms for Matching and Network Flows, Proceedings of STACS'06, 2006.
[BBHT98] M. Boyer, G. Brassard, P. Høyer, A. Tapp, Tight bounds on quantum searching, Fortschritte der Physik 46(4-5): pages 493-505, 1998.
[BDFLS04] A. Berzina, A. Dubrovsky, R. Freivalds, L. Lace, O. Scegulnaja, Quantum Query Complexity for Some Graph Problems, Proceedings of SOFSEM'04: pages 140-150, 2004.
[BS06] H. Buhrman, R. Špalek, Quantum Verification of Matrix Products, Proceedings of SODA'06: pages 880-889, 2006.
[BHMT00] G. Brassard, P. Høyer, M. Mosca, A. Tapp, Quantum amplitude amplification and estimation, in Quantum Computation and Quantum Information: A Millennium Volume, AMS Contemporary Mathematics Series, 2000.
[DHHM04] C. Dürr, M. Heiligman, P. Høyer, M. Mhalla, Quantum query complexity of some graph problems, Proceedings of ICALP'04: pages 481-493, 2004.
[DH96] C. Dürr, P. Høyer, A quantum algorithm for finding the minimum, Technical Report arXiv:quant-ph/9607014, 1996.
[Doe07] S. Dörn, Quantum Complexity Bounds of Independent Set Problems, Proceedings of SOFSEM'07 SRF, 2007.
[Epp03] D. Eppstein, The traveling salesman problem for cubic graphs, Lecture Notes in Computer Science 2748: pages 307-318, Springer, 2003.
[Gro96] L. Grover, A fast quantum mechanical algorithm for database search, Proceedings of STOC'96: pages 212-219, 1996.
[GY99] J. Gross, J. Yellen, Graph Theory and its Applications, CRC Press, London, 1999.
[HK62] M. Held, R. M. Karp, A dynamic programming approach to sequencing problems, Journal of SIAM 10: pages 196-210, 1962.
[MN05] F. Magniez, A. Nayak, Quantum complexity of testing group commutativity, Proceedings of ICALP'05: pages 1312-1324, 2005.
[MNRS07] F. Magniez, A. Nayak, J. Roland, M. Santha, Search via Quantum Walk, Proceedings of STOC'07, 2007.
[MSS05] F. Magniez, M. Santha, M. Szegedy, Quantum algorithms for the triangle problem, Proceedings of SODA'05: pages 1109-1117, 2005.
[NC03] M. A. Nielsen, I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2003.
[Sze04] M. Szegedy, Quantum speed-up of Markov chain based algorithms, Proceedings of FOCS'04: pages 32-41, 2004.

Four Ways of Logical Reasoning in Theoretical Physics and their Relationship with both Kinds of Mathematics and with Computer Science

Antonino Drago
Department of Physical Science, University of Naples “Federico II”
[email protected]

Abstract. My previous studies on scientific texts recognised the relevance of double negated sentences whose corresponding affirmative sentences lack scientific evidence. Theories of this kind begin by arguing within non-classical logic; this constitutes an alternative to the purely deductive way of organising a scientific theory, and it governs a model of organisation based on a universal problem; for instance, Lobachevsky's organisation of hyperbolic geometry is based on the problem of how many parallel lines exist. The theory then argues by means of double negated sentences, which can be grouped around two sub-problems; each group constitutes a unit of argument, ending with an ad absurdum theorem. The last unit of argument reaches a final, universal result, which is then changed into a positive mathematical formula (about the angle of parallelism). A third way of arguing can be recognised in the Lagrangian theory. This theory too is based on a problem, i.e. how to deal with constrained motions, but it argues with a massive use of the more advanced mathematics. However, even in this case a unit of argument can be recognised; it represents the conversion, in agreement with classical logic, of an ad absurdum argument into a direct one; i.e. the double negated sentences are forced into the corresponding affirmative ones, however idealist in nature the latter may be. The goal is a mathematical formula from which one can obtain the solutions of a series of mathematical sub-problems. A fourth way of arguing in theoretical physics is recognised in Descartes' geometrical optics. The relationships between all the different ways of arguing and the different kinds of mathematics are illustrated. As a verification, these ways of arguing are recognised in the four versions of the inertia principle given respectively by Newton, L. Carnot, Enriques and Cavalieri.
As an application, both ways of arguing and the kinds of mathematics belonging to computer science are circumscribed and instantiated by means of well-known theories.

1 Deductive Reasoning and the Reasoning by Means of Double Negated Statements

The deductive way of reasoning is well-known. It governs a kind of organisation of a scientific theory that was suggested by Aristotle (AO) and then inaugurated by Euclid's Elements of Geometry more than two millennia ago, and that thereafter constituted the model for lucid reasoning in science. Of course, the logic governing this kind of reasoning is classical logic; indeed, only this logic assures certain deductions from given premises. But I discovered that several scientific theories (classical chemistry, L. Carnot's calculus, S. Carnot's thermodynamics, Lobachevsky's theory of parallels, Klein's Erlanger Program, Einstein's special relativity, Heisenberg's formulation of quantum mechanics) show a different way of reasoning, because they make use of a particular kind of sentence: a double negated sentence, of the form ¬¬A, whose corresponding positive sentence A lacks scientific evidence [6]. Instances of these sentences, corresponding to the above-mentioned theories, are the following: “Matter is not divisible to a non-finite extent”, “The infinitesimals are not


chimerical beings”, “It is not true that heat is not work” (ill-synthesised by physics textbooks by means of the sentence “Heat is equivalent to work”, where the word “equivalent” represents a deus ex machina whose meaning is never explained; it is well-known that the corresponding affirmative sentence “Heat is work” is not true in physics, because there are no operative means for converting an entire quantity of heat into work), “The new hypothesis of two parallel lines does not lead to any contradiction”, “It is not true that a geometry is not a transformation group”, “The principle of relativity and the principle of the constant speed of light are only apparently irreconcilable”, “It is not true that two conjugate magnitudes are not measurable with relative [non-absolute] accurateness”. Sentences of the above kind will be called DNSs. Notice that a DNS does not belong to an AO, because it cannot be deduced from any affirmative axiom; moreover, it cannot work as an axiom inside an AO because its content is not accurately circumscribed. Indeed, a DNS cannot state anything with certainty, apart from a border to our thinking. In logical terms, the DNS ¬¬A forbids reaching the affirmative sentence A, owing to the inequality between the two contents of ¬¬A and A. In physical terms, this prohibition often becomes the version of a physical impossibility, e.g. “A motion without an end is impossible”¹. Moreover, it is impossible to translate a DNS into a mathematical formula with exact equality, because when a mathematical formula includes an exact equality, the negation of its negation reproduces the same formula; hence, in this case the law of double negation holds true. Owing to this fact, the following analysis of the kinds of reasoning will explore a purely logical domain, where mathematics is located on the borderline of this domain.
This kind of sentence matters because in the 20th century, on the basis of mathematical logic, several scholars concluded that it is not the law of the excluded middle, but rather the law of double negation, that constitutes the very borderline between classical logic and most kinds of non-classical logic ([18, 23, 29, 30])²; of these, intuitionist logic will be considered in the following as the most representative one. Hence, a DNS introduces argumentation in a logical world different from the classical one. Notice that in the philosophy of science, Leibniz suggested that two basic logico-philosophical principles underlie our theoretical activity, i.e. the non-contradiction principle and the principle of sufficient reason, the latter itself stated by means of a DNS: “Nothing is without reason, or everything has its reason, although we are not always capable of discovering this reason. . .”³ Clearly, the two principles govern the two respective organizations: AO, and a new kind of organisation still to be discovered. A first question arises: in what way does an argument that includes a DNS proceed? Let us inspect the logical developments of the more relevant theories including DNSs. Our first conclusion is: each theory is based upon a universal problem. The discursive part of S. Carnot's booklet, i.e. the part from which most of modern thermodynamics originated, starts by stating a universal problem by means of a DNS: “It is unknown whether a bound to the efficiency in converting heat into work exists” (cf. [3, 14, 16])⁴. The same occurs in Lobachevsky's booklet. After the introductory propositions nos. 1-15, in proposition no. 16 a DNS states the general problem by means of the sentence: “In the uncertainty the parallel line is not unique...”, i.e. “It is not true that the parallel line is not unique”. This kind of theory, organised on the basis of a universal problem, will be called a problem-based theory (PO).

2

3 4

Notice that the word “impossibility” belongs to modal arguing, which is linked to intuitionist logic through the so-called S4 model the author, in order to founf intuitionist logic, argues by means of all DNS’s, called by him “pseudotruths”; they give also an ad absurdum theorem, although in verbal terms only. See my [12] see [25] Modern thermodynamics born from S. Carnot’s thermodynamics by discarding a hypothesis (caloric) and moreover by changing the kind of organisation of the theory in a deductive one; but at the cost of putting as the first principle what was discovered 25 years after S. Carnot’s result, which is presented as the second principle.


In the case of each of the above-mentioned theories, an analysis of the original scientific text written by its founder gives a list of DNSs, whose sequence may preserve the entire logical thread of the presentation. This is the case of S. Carnot's text, which includes many DNSs. Its series of DNSs can be divided each time a DNS states a problem; his exposition thus separates into six sequences of DNSs, which I call units of argument. Inside each unit of argument, certain methodological principles, which are also DNSs, lead to the result of the reasoning. Moreover, the last unit of argument, like some of the other units, ends by means of an ad absurdum theorem, namely S. Carnot's celebrated ad absurdum theorem; it establishes the main result of thermodynamics, i.e. a scientific method for establishing the efficiency of all conversions of heat into work. Almost the same kind of development occurs in Lobachevsky's text on non-Euclidean geometry (see [26]⁵). The problem enounced in proposition no. 16 is subsequently subdivided, although covertly, into two sub-problems: a new definition of parallelism and a new definition of the angle of parallelism. They are solved by two corresponding units of argument, making use of ad absurdum theorems. The first unit is composed of propositions nos. 17-18; their sentences are the theses of the two respective following theorems. Together they solve the first sub-problem. The second unit of argument is composed of the following propositions nos. 19-22 together with the attached proofs. In them, some DNSs (e.g. “It is impossible that a straight line coming out of a vertex of a triangle does not meet the side opposite to this vertex”) work as methodological principles, leading to the building of a new method for solving the given sub-problem. Moreover, one easily recognises that the six propositions, except for proposition no.
21, are proved by means of ad absurdum arguments, although the author's words obscure this kind of proof; in particular, whenever ¬¬T is obtained by the proof, the corresponding proposition enounces the sentence T. From the evidence provided by the above two instances of a PO theory including DNSs, I deduce that within a non-deductive organisation the natural way to end a unit of reasoning is to make use of an ad absurdum proof; a moment of reflection confirms this conclusion, because no other evidence can be extracted from a chain of DNSs. An interpretation of this kind of unit of reasoning in the formalism of natural deduction follows (see [28]).

Notice that the weak version of this kind of proof includes no more than a DNS, ¬¬T. One would like to conclude further that ¬¬T → T, as classical logic allows; this additional step gives the strong version of an ad absurdum proof. But this logical step, by requiring the law of the double negation, is rejected by non-classical logic, which ends the proof by concluding ¬¬T; this DNS then works as a methodological principle for the subsequent unit of reasoning. I conclude that the ideal model of a PO theory, after stating the main problem, presents a sequence of units of reasoning, each one being composed of a chain of DNSs leading to an ad absurdum proof. I call this way of arguing the Carnotian way of arguing.

⁵ This theory was rationally re-constructed according to the model of a PO theory by [17].
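The asymmetry this section turns on, namely that A → ¬¬A is intuitionistically provable while ¬¬A → A requires a classical axiom, can be checked mechanically. A minimal sketch in Lean 4 syntax (our illustration, not part of the text):

```lean
-- A → ¬¬A holds intuitionistically: the proof term uses no classical axiom.
theorem dns_intro (A : Prop) : A → ¬¬A :=
  fun a notA => notA a

-- The law of double negation, ¬¬A → A, is only provable classically;
-- here it is obtained from Lean's Classical.byContradiction.
theorem dns_elim (A : Prop) : ¬¬A → A :=
  fun nnA => Classical.byContradiction nnA
```

Printing the axioms of dns_elim (with #print axioms) shows its dependence on Classical.choice, while dns_intro depends on none.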

2 A New Model of Organisation of a Scientific Theory

How does a sequence of units of reasoning, all based upon DNSs, lead to a certain result? An accurate inspection of the original texts of the above theories reveals a specific final move. The last unit of reasoning in S. Carnot's theory states the highest efficiency for any heat engine; Lobachevsky's proposition no. 22 states that his new hypothesis is not contradictory for all triangles, all parallel lines and all of space. From these two facts I deduce that the final unit of reasoning reaches a conclusion about a universal property, by means of a sentence of the kind “For all . . . a DNS holds true”, which may be represented by the formula ¬¬UT (universal thesis). If we further inspect the original texts, we see that after this move each author draws consequences from a new hypothesis, which is precisely the affirmative proposition UT. I deduce that the author, owing to the universal validity of this DNS ¬¬UT, i.e. the universal DNS concluding the last unit, thinks he can exclude any other possibility; thus, he thinks he has obtained a complete proof of the corresponding affirmative statement UT, although this logical step is not allowed by the non-classical logic to which the author previously adhered⁶. We conclude that the author, having established a universal proof, felt that he was allowed to jump to the conclusion belonging to the strong version of an ad absurdum proof, and thus to move to arguing in terms of classical logic. In more general terms, the author jumps from his search for a new scientific method (through arguments belonging to intuitionist logic) to making use of the deductive method by assuming UT as a new hypothesis. This also means that he jumps from open sentences, i.e. DNSs, to affirmative sentences, each of which can be represented by a mathematical formula which includes the equality symbol. This move can be recognised in Lobachevsky's presentation of non-Euclidean geometry.
At the end of the proof of the last proposition of the second unit of reasoning, no. 22, he states that only two hypotheses are possible as universal hypotheses regarding all space, and that “The hypothesis of two parallel lines does not lead to any contradiction”. In the following development of the theory Lobachevsky covertly changed this into the corresponding affirmative sentence, and added it as a new positive hypothesis, expressed in mathematical terms: Π(p) < π/2. Unfortunately, Lobachevsky performs this change without further explanation. The same covert move can be recognised in S. Carnot's exposition: after having negated the inequality η_irr ≥ η_rev (the efficiency of a reversible engine cannot be less than that of any irreversible engine), in the following development of the theory he argues by considering the latter magnitude as the theoretical maximum (see [3] p. 29). On the other hand, the same move was declared overtly by Einstein at the end of the first section of his celebrated paper on the PO theory of special relativity; he announces that in the subsequent text he will “upraise” a methodological “postulate” to an (axiom-)“principle of

This global move was applied by chemists too, after to have induced the Mendeleieff table; since the periodic sequence of the chemical elements presented two holes, they assumed the existence of two new elements; which actually have been thereafter discovered. This logical move is similar also to what A.A. Markov suggested as his “principle”, although he was well-aware that intuitionist logic does not allows it; it is a merely local move, ¬¬∀xP (x) → ∀xP (x) provided that two general criteria for the predicate P (x) are satisfied, i.e. to conclude an ad absurdum theorem and to be decidable (see [27]).


Antonino Drago

Table 2. The organization of a problem-based theory

Common knowledge
↓
The main problem (a DNS)
↓
Sub-problem P1 — MPs — AAA: ¬¬T1
↓
Sub-problem P2 — MPs plus ¬¬T1 — AAA: ¬¬T2
↓ (n times)
Sub-problem Pn+1 — MPs plus ¬¬Tn — AAA: ¬¬UT
↓
A local change, through ¬¬T = T, causing a global change, i.e. from a PO theory to an AO theory
⇓
Deductive development of a new theory from the hypothesis T

Legend: AAA = ad absurdum argument; MP = methodological principle; P = problem; UT = universal thesis; ↓ = inductive logical step; ⇓ = deductive logical step.

relativity” (cf. [19] p. 891); i.e. he will change a DNS into an affirmative axiom, from which he will deduce the remaining theory. By a comparative analysis of the texts forming the basis of the above-mentioned theories, I obtained the flow-chart model of a PO scientific theory (see Table 2).

3 Lagrangian Reasoning

There exists a third way of reasoning, which I call Lagrangian, for reasons which will become clear in the following. It is obtained by means of two operations performed upon the previous way of arguing, which I call Carnotian. Within an ad absurdum theorem, the Carnotian way proves a universal thesis ¬¬UT by denying a universal methodological principle, UMP (e.g., in Lobachevsky’s theory: “One can trace from a given point a straight line which intersects a given straight line at an arbitrarily small angle”; in S. Carnot’s theory: “A motion without an end is impossible”). The Lagrangian way of arguing translates the previous unit of argument into classical logic, in order to translate the UMP into a positive version and then into mathematical terms. It is illustrated in Table 3. While the left column represents the intuitionist implication ¬UT → ⊥, the second column presents the converse in classical logic of this implication. Let us recall that if M → N, the classical converse law (which is rejected by intuitionist logic) gives ¬N → ¬M. In our case

Four Ways of Logical Reasoning


Table 3. The Carnotian and the Lagrangian version of a unit of reasoning

Carnotian unit of reasoning inside intuitionist logic:
From both common knowledge and a previous unit of arguing:
Methodological Principles as DNSs: (X), (Y), (Z), . . . , one of which (X) is of a universal nature: UMP
A problem: UT?
(AAA)   ¬UT   ¬UMP   ⊥   ¬¬UT

Lagrangian translation of the same unit of arguing in classical logic and in mathematics:
¬⊥   UMP   UMP*   UT
Then UT is translated into differential equations. Also the MPs (X), (Y), (Z), . . . are translated into mathematical conditions X, Y, Z, . . . Problems are solved by means of mathematical derivations in calculus only.
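The logical skeleton of the two columns can be spelled out step by step; what follows is our own compact rendering (it is not in the original text), in which only the final move is valid classically but not intuitionistically.

```latex
% Carnotian unit (intuitionist logic): an ad absurdum argument;
% assuming the negated thesis, together with the UMP, leads to the absurd:
\neg UT \;\rightarrow\; \bot \qquad\text{hence}\qquad \neg\neg UT
% Lagrangian translation (classical logic): contraposition gives
\neg\bot \;\rightarrow\; \neg\neg UT
% and, since \neg\bot holds, double-negation elimination (valid
% classically only) yields the affirmative conclusion:
\neg\neg UT \;\rightarrow\; UT \qquad\text{hence}\qquad UMP \rightarrow UT
```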

it gives ¬⊥ → ¬¬UT, implying, according again to classical logic, UT; i.e. the converse law converts an ad absurdum argument into an affirmative argument ¬⊥ = UMP → UT, although the conclusion may be idealistic in nature. UT can be translated into mathematical formulas with equality, of course pertaining to classical logic. The same is performed on the methodological principles. Hence, after this single unit of arguing, the theory develops through mathematical deductions only.
Let us now examine Lagrange’s mechanics (see [24]⁷). We easily recognize that this theory also aims to solve a universal problem, which Lagrange himself stated could not be solved by means of Newton’s theory (p. 166): how to develop a non-specific theory of a mechanical system evolving under the impediments of some constraints. Thus, Lagrange’s theory is a PO theory. Moreover, Lagrange declares that his theory develops according to a general principle, capable of governing the whole of theoretical physics, i.e. the principle of Virtual Velocities (PVV; pp. 23, 140). Since this principle is instrumental in solving problems of constrained motions, it actually works as the methodological principle of this PO theory⁸. The physical content of this principle states, as a DNS, a more specific version of the sentence (“A motion without an end is impossible”) of which S. Carnot’s arguing made use: “No work from nothing”; or even: “It is impossible that constraints produce work greater than zero”. In the first of the several historical introductions (Sect. I, pp. 1-26) Lagrange performed the two above-mentioned steps to change this principle into a mathematical principle. As a first step, Lagrange reproduces, in verbal terms, the PVV as an affirmative sentence; he claims that his version is a generalisation of the common version (p. 22). Here we recognise the same logical operation which constitutes the concluding operation of a PO theory (¬¬A is forced into A), but applied from the beginning of the theory. However, he then manifests a scruple: the PVV “...is not so evident by itself as to be elevated to a primitive principle” (p. 23). But, instead of explaining the nature of his generalisation, he

7. The second edition differs in several points from the first one, whereas the later editions include some notes by the editor Bertrand (Engl. tr. of the second edition by A. Boissonnade and V.N. Vagliente, Kluwer, Dordrecht, 1997; It. tr. of the historical parts by D. Capecchi and A. Drago: Lagrange e la storia della meccanica, Progedit, Bari, 2005). The following quotations refer to the French second edition.
8. About the great and long debate upon the nature of the PVV see [1] and also [8].



wants to persuade the reader by trying to deduce this principle (pp. 23-25). He offers a celebrated “theorem”, whose proof reduces any set of points, acted on by forces, to one point only, moved by a composite force. But this reduction in the number of points has nothing to do with the theoretical justification of the PVV from other principles concerning unconstrained bodies. As a second move, he expressed his sentence referring to the PVV in mathematical terms (p. 25). The PVV, being essentially a DNS, is commonly translated into a mathematical formula including an inequality: ∑ᵢ fᵢ δsᵢ ≥ 0. Lagrange does not consider the inequality sign essential to the PVV. As a justification, he later deals with this question through a “theorem”. Indeed, in the following (pp. 25-26) he offers both a direct “proof” and a converse “proof” for claiming that the inequality case is included in the equality case. Clearly, this is a comfortable result, but it is wishful thinking. In the remaining part of his book, Lagrange does not raise these questions again. In other words, he considers the translation of the PVV into his formula with equality only to be a well-established principle. Indeed, Lagrange’s reduction of the PVV to both classical logic and a mathematical equality allows him to then translate it into a set of differential equations, from which he develops deductively all possible consequences in a purely mathematical manner, provided that even the formulation of the problem is translated into mathematical terms and the specific conditions are translated into suitable mathematical hypotheses. (For example, let us return to S. Carnot. He obtained the desired result η_rev ≥ η_irr by following the methodological principle restraining the thermal processes to be reversible (see [3] p. 18). In the same case-study, the axiom of Lagrangian reasoning would be the mathematical translation of the PVV, which is a specific version of S.
Carnot’s methodological principle of the impossibility of a motion without an end. Then Lagrange would translate, besides the PVV into differential equations, also reversibility into a mathematical condition, which we now know to be the uniform continuity of all functions and variables involved in the theory. By making these mathematical assumptions a priori, he would try to deduce the formula of maximum efficiency, of course in the equality case only.) After the first cycle of reasoning, Lagrange successfully applies his differential equations to an impressive series of sub-problems of all kinds; he seemingly obtains the solutions of all problems of theoretical physics. Here, Lagrangian reasoning does no more than verify, by means of a merely mathematical deduction, that the desired solution of a problem can be obtained under certain specific mathematical conditions. This way of solving physical problems is a potentially correct one, inasmuch as no physical result can be excluded by that very general principle about reality which is the PVV. Therefore, after the first unit of reasoning, Lagrangian reasoning simply performs deductions by means of the mathematics of formulas including equalities, which means within classical logic only. Hence, the first historical section constitutes the only unit of logical reasoning in the whole book. The solution is derived so easily (on p. vi of his book Lagrange himself claims to be suggesting “a regular and uniform march”) that it makes us feel we are making use of a “magic wand”, as one author puts it (see [31] p. 451). Indeed, a scholar reads the introductory parts of Lagrange’s book as a mere facilitation of the hard work he has to do in the following parts.
Then, by applying Lagrange’s equations, he or she sees the mathematical development only, which proceeds at the regular and sure pace of an entirely mathematical nature; no surprise, then, if for two centuries scholars have been unable to trace Lagrange’s reasoning back to his premises, which actually are DNSs pertaining to an unusual logic. However, the nature of Lagrange’s mechanics as a PO theory is preserved not only in the first section, where the verbal presentation includes several DNSs (see [13]), but also in its results; which are not trajectories, as in Newton’s mechanics, but invariants, i.e. DNSs, although expressed by mathematical equalities.

4 The Fourth Model of Reasoning. The Relationships of all Models of Reasoning with Mathematics

In the above we considered three ways of arguing, i.e. the Newtonian, the Carnotian and the Lagrangian. Each one can be linked to characteristic features in the basic notions of a corresponding representative theory. The first way of arguing is represented by the development of Newton’s mechanics; it makes use of both idealistic notions (such as absolute space and absolute time as axioms; AO) and idealistic mathematics (actual infinitesimals; let us call it AI). The second way of arguing is represented by the development of the original theory of S. Carnot’s thermodynamics; it makes use of both empirical notions only (in agreement with the PO nature of the theory) and operative mathematics, which is based upon potential infinity only (let us call it PI); in formal terms, constructive mathematics. The third way of arguing is represented by the theory of Lagrange’s mechanics; it makes use of both empirical notions only (i.e. constraints, the PVV; in agreement with the PO nature of the theory) and idealistic mathematics (e.g. infinitesimals, essential use of differential equations; AI). They correspond respectively to three of the four models of scientific theory I previously recognised (see [7, 10, 2]): the Newtonian, the Carnotian and the Lagrangian. The remaining model, the Cartesian, is represented for instance by the theory of geometrical optics. This theory is characterised by the two basic choices AO and no more than instrumental mathematics, PI. However, in its principles (e.g. the thin-lens law) it accepts even a point at infinity, when it is recognised by geometrical intuition; or when it is the l.u.b. (or the g.l.b.) of an approximating series of constructive numbers; or, in different terms, when it results from the use of one quantifier. All these non-constructive properties are mutually equivalent; they characterise Weyl’s elementary mathematics (see [21]⁹, [10, 11]).
Hence, this way of arguing includes idealistic notions too, but restrained to a minimum level with respect to the high level of AI inside Newtonian notions. The above four ways of arguing are easily linked to four corresponding ways of reasoning mathematically. Newtonian reasoning promptly links the idealistic mathematics of infinitesimals (AI) to idealistic axioms; e.g. when the idealistic notion of infinitesimals is linked to the idealistic version of space. Carnotian reasoning achieves a mathematical formula only after having changed the universal sentence ¬¬UT into the affirmative sentence UT, which subsequently is translated into a mathematical equality; from this step on, the theory can be developed in quantitative terms by means of formulas with equality, which however belong to constructive (operative) mathematics (PI), because no idealistic operations are allowed. By reversing into classical logic the units of argument performed by means of DNSs, Lagrangian reasoning starts from idealistic statements, which, as a next step, introduce an idealistic mathematics, i.e. the system of differential equations (AI). In Cartesian reasoning the basic mathematics is of an operative nature; but the axioms allow the introduction of the least idealistic notions of Weyl’s elementary mathematics.

5 A Verification: Four Versions of the Inertia Principle

Corresponding to the four ways of arguing, one can suggest four versions of the principle which characterises modern science with respect to ancient scientific thinking, i.e. the inertia principle; the four versions have been offered by Newton, L. Carnot, Enriques and Cavalieri respectively ([33]).

9. This mathematics is equivalent to Cavalieri’s calculus of indivisibles.



I. Newton: “All bodies at rest or in rectilinear uniform motion persevere in their state of motion unless a force changes it”. This statement begins with the most idealistic words; its subject “All bodies...” includes even the bodies which will be discovered in the future. This is a typical feature of an AO. Moreover, the statement includes the AI in determining with absolute accuracy the rest, the uniform and rectilinear motion, and the length of the path of the body (see [22, 5]).

L. Carnot: “Once a body is at rest, it cannot move by itself; and once it is set in motion, it cannot change its velocity and its direction by itself.” This sentence is manifestly a fully operative one (PI), from the very first word, “Once...”: it refers to a contingent situation, established by experimental approximations only. This version works as a methodological principle within a PO theory; indeed, this theory includes essential DNSs.

F. Enriques: the “generalised inertia principle”. It is represented through an infinitesimal act of motion (AI), in order to offer a new method for discovering the solution of the universal problem of the motion of a material point in any system of reference (PO). Indeed, Enriques states that in any kind of reference system the transition from an infinitesimal motion to a finite motion requires that the force acting on a moving material point be evaluated as if the point were stopped for an instant: “In each instant the motion of a material point occurs as if this body were allowed to move from a state of rest, provided that: 1) owing to this stopping, the positions of external bodies (which sensibly influence the phenomenon) do not suffer any modification; 2) to the stopped point, on which the acting force is applied, measured as a static force,. . . the velocity of the previous motion is added.” (see [20])

B.
Cavalieri: “I say moreover that, if we consider the motion of a bullet, when it is fired in a given direction, if there is no other force acting on it causing it to move along a different direction, it will reach the site to which the bullet’s motion in a straight line addresses it,. . . ; from that straightness it is not reasonable that the mobile body detaches itself when no other motive virtue exists to remove it. . . I moreover say that the projectile not only would proceed along a straight line. . . but also that in equal times it would go through equal spaces of the same line.” (see [4]¹⁰) It is apparent that in this sentence we are considering a constructive (actually, operative) series which approaches the limit-situation of the absence of forces, e.g. gravity. This mathematical feature characterises Weyl’s elementary mathematics. These four versions of the inertia principle show that from each way of arguing we derive a different way of seeing the same physical reality¹¹.

11

Some years after Cavalieri, Torricelli synthesised this same version by a short passage: “Let a mobile body be launched from A with a whatsoever inclination. It is clear that, without gravitys attraction, it would proceed by an uniform and rectilinear motion along the direction AB. But, because the gravity works from the interior, the body at once starts to decline, with an ever increasing deviation. . .”, cf [32] pag 156 (emphasis added), see also [9] One can look for the same four ways of arguing in thenscientific theory which since two millennia was more linked to a lucid arguing, i.e. geometry. Actually, the Newtonian way is represented by the Euclidean deductive geometry. The Carnotian way of arguing is represented, as we saw in the above, by Lobachevsky’s presentation of the hyperbolic geometry. Then the number of geometrical theories grew up enormously and it is hard analyse all the original texts in order to interpret them by means of the basic choices. However, one short way to conclude this investigation is to refer to the more applied geometries by theoretical physics. The elliptic geometry, whose choices are PI and AO, can be linked to the Descartesian way of arguing; whereas the Lagrangian way of arguing may be associated to the fourth geometry widely used by physicists, i.e. Minkowsky’s geometry, whose choices are AI (because it include in its space the lines at infinity of the light-cone) and PO (because it has to solve the problem of tracing lines in a constrained space).

6 The two Ways of Arguing in Computer Science

This last kind of arguing can be recognised in theories other than geometrical optics, i.e. probability theory, statistical mechanics and computer science. In the last case, computability theory originated in Turing’s celebrated paper. It presents a typical PO theory, i.e. no axioms, but rather a fundamental problem: which set of functions can be covered by all possible operative calculations? This theory manifests its PO nature even in the odd word used for its main sentence: Church’s “thesis”, which surely is not an axiom, as it would be in an AO; nor is it the thesis of a theorem (although, after Church proposed his statement, even Gödel tried to prove it). Mathematicians, ignoring an alternative to AO, were unable to find a better word. Actually, its nature is that of a methodological principle for a PO theory, to be correctly stated by means of a DNS: “It is impossible to obtain by effective computations a set of functions larger than the Turing-computable functions”. Computer science, in order to adhere to the physical operations performing calculations on any computer, is obliged to argue by means of an operative, or constructive, mathematics; in mathematical terms, by means of PI only. Hence, in computer science there are only two correct ways of arguing, i.e. the Carnotian way and the Cartesian way. However, in the latter the principles of the computer language can add the use of one quantifier; this use changes the mathematics into Weyl’s elementary mathematics, where both the l.u.b. and the g.l.b. of a constructive series of numbers exist. Computer science can argue through DNSs only within a PO theory and make use of constructive mathematics, PI; an instance is given by Turing’s theory or finite automata theory. Computer science can also argue within an AO by drawing consequences from some principles belonging to Weyl’s elementary mathematics.
To my knowledge, the latter case concerns two instances of theory: game theory, where Weyl himself made use of his kind of mathematics to prove the minmax theorem (see [35] and also [15]); and information theory, whose fundamental principle is the existence of a log function as the function fitting a list of basic probabilities; this is the same idea (to suggest a function fitting a set of experimental data of physics) upon which Weyl founded his kind of mathematics (see [34]).

References

1. Bailhache P., L. Poinsot: La théorie générale de l’équilibre et du mouvement des systèmes, Vrin, Paris, 1975.
2. Capecchi D., Drago A., Lagrange’s history of mechanics, Meccanica, 40 (2005) 1933.
3. Carnot S., Réflexions sur la puissance motrice du feu, Blanchard, Paris, 1824 (critical edition by R. Fox, Vrin, Paris, 1978).
4. Cavalieri B., Lo Specchio Ustorio, overo Trattato delle Settioni Coniche, Clemente Ferroni, Bologna, 1632, ch. XXXIX, p. 153 and p. 155.
5. Drago A., A Characterization of Newtonian Paradigm, in P.B. Scheurer, G. Debrock (eds.): Newton’s Scientific and Philosophical Legacy, Kluwer Acad. P., 1988, 239-252.
6. Drago A., Incommensurable scientific theories: The rejection of the double negation logical law, in D. Costantini, M.G. Galavotti (eds.): Nuovi problemi della logica e della filosofia della scienza, CLUEB, Bologna, 1991, I, 195-202.
7. Drago A., Le due opzioni, La Meridiana, Molfetta BA, 1991.
8. Drago A., The Principle of virtual works as a source of two traditions in 18th Century Mechanics, in F. Bevilacqua (ed.): History of Physics in Europe in 19th and 20th Centuries, SIF, Bologna, 1993, 69-80.
9. Drago A., La nascita del principio d’inerzia in Cavalieri e Torricelli secondo la matematica elementare di Weyl, Atti del XVII Congresso Nazionale di Storia della Fisica e dell’Astronomia, Univ. Milano Dip. Fisica Gen., Milano, 1997, 181-197.



10. Drago A., Which kind of mathematics for quantum mechanics? The relevance of H. Weyl’s program of research, in A. Garola, A. Rossi (eds.): Foundations of Quantum Mechanics. Historical Analysis and Open Questions, World Scientific, Singapore, 2000, 167-193.
11. Drago A., The introduction of actual infinity in modern science: mathematics and physics in both Cavalieri and Torricelli, Ganita Bharati, Bull. Soc. Math. India, 25 (2003) 79-98.
12. Drago A., A.N. Kolmogoroff and the Relevance of the Double Negation Law in Science, in G. Sica (ed.): Essays on the Foundations of Mathematics and Logic, Polimetrica, Milano, 2005, 57-81.
13. Drago A., Lagrange’s kind of arguing in “Mécanique Analytique”, in G.A. Sacchi (ed.): Sfogliando la Mécanique Analytique, Ist. Lombardo Accademia Lettere e Scienze, Milano (in print).
14. Drago A., Pisano R., Interpretazione e ricostruzione delle Réflexions di Sadi Carnot mediante la logica non classica, Giornale di Fisica, 41 (2000) 195-215.
15. Drago A., Finite game theory according to constructive, Weyl’s elementary, and set-theoretical mathematics, Atti Fond. Ronchi, 57 (2002) 421-436.
16. Drago A., Pisano R., Interpretation and reconstruction of S. Carnot’s Réflexions according to non-classical logic, Atti Fond. Ronchi, 58 (2003).
17. Drago A., Perno A., La teoria geometrica delle parallele impostata coerentemente su un problema (I), Per. Matem., ser. VIII, vol. 4, ott.-dic. 2004, 41-52.
18. Dummett M., Principles of Intuitionism, Oxford U.P., Oxford, 1977.
19. Einstein A., Zur Elektrodynamik bewegter Körper, Ann. der Phys., 17 (1905) 891-921.
20. Enriques F., Il problema della scienza (1906), Zanichelli, Bologna, 1985, 419-426, p. 424.
21. Feferman S., Weyl vindicatus: Das Kontinuum 70 years later, in C. Cellucci et al. (eds.): Temi e problemi della logica e della filosofia della scienza contemporanee, CLUEB, Bologna, 1988, 59-93.
22. Hanson R. N., Newton’s first Law. A Philosopher’s door in Natural Philosophy, in R.G. Colodny (ed.): Beyond the edge of certainty, Prentice-Hall, 1965, 6-28.
23. Kolmogorov A. N., On the principle “tertium non datur”, Matematicheskii Sbornik, 32 (1924/25) 646-667; Engl. transl. in J. van Heijenoort: From Frege to Gödel, Harvard U.P., Cambridge, 1967, 416-437 (sect. V.B, p. 431).
24. Lagrange J. L., Mécanique analytique, Desaint, Paris, 1788.
25. Leibniz G. W., Letter to Arnauld, 14-7-1686, Gerh. II, Q., Opusc. 402-513.
26. Lobachevsky I. N., Untersuchungen der Theorie der Parallellinien, Finkl, Berlin, 1840 (Engl. transl. as an Appendix to R. Bonola: Non-Euclidean Geometry, Dover, New York, 1909; It. tr. in S. Cicenia and A. Drago: La teoria delle parallele secondo Lobacevsky, Danilo, Naples, 1996).
27. Markov A. A., On constructive mathematics, Trudy Mat. Inst. Steklov, 67 (1962) 8-14; also in Am. Math. Soc. Translations, (1971) 98 (2) 1-9.
28. Prawitz D., Natural Deduction: A Proof-Theoretical Study, Almqvist & Wiksell, Stockholm, 1965.
29. Prawitz D., Meaning and Proof. The Conflict between Classical and Intuitionistic Logic, Theoria, 43 (1977) 6-39.
30. Prawitz D., Malmnäs P. E., A survey of some connections between classical, intuitionistic and minimal logic, in H. A. Schmidt, K. Schütte, H.-J. Thiele (eds.): Contributions to Mathematical Logic, North-Holland, Amsterdam, 1968, 215-229.
31. Scott L. D., Can a projection method of obtaining equations of motion compete with Lagrange’s equations?, Am. J. Phys., 56 (1988) 451-456.
32. Torricelli E., De Motu Projectorum, 1644, l. II.
33. Vella M. R., Le quattro versioni del principio d’inerzia, in P. Tucci et al. (eds.): Atti XXIV Congr. Naz. Storia Fisica e Astr., Avellino, 2005 (in print).
34. Weyl H., Das Kontinuum, Veit, Leipzig, 1918.
35. Weyl H., An elementary proof of von Neumann’s minmax theorem, in H.W. Kuhn, A.W. Tucker (eds.): Contributions to Game Theory, Princeton U.P., 1950, vol. 1, 19-25.

Totally d-c.e. Real Numbers*

Yun Fan¹, Decheng Ding¹, and Xizhong Zheng²,³**

¹ Department of Mathematics, Nanjing University, China
² Department of Computer Science, Jiangsu University, China
³ Theoretische Informatik, BTU Cottbus, Germany

Abstract. Computably enumerable (c.e.) reals are defined as the limits of computable increasing sequences of rational numbers. They are the first weakening of the computable reals. The differences of two c.e. reals are called d-c.e. (difference of c.e.) reals. The class of d-c.e. reals has very nice computability-theoretical as well as mathematical properties, and it can be characterized equivalently in several different ways. In this paper we explore the d-c.e. reals which are Turing equivalent only to d-c.e. reals. Such reals are called totally d-c.e. We show that, below any c.e. degree, there exists a totally d-c.e. real of c.e. degree, and there exists also a totally d-c.e. real of a properly ω-c.e. (and hence not c.e.) degree. On the other hand, we also prove that not every real of low c.e. degree is totally d-c.e.

1 Introduction

A real is called c.e. (computably enumerable) or left computable if it is the limit of a computable increasing sequence of rational numbers. Analogously, the limits of computable decreasing sequences of rational numbers are called co-c.e. or right computable. Left and right computable reals are called semi-computable. The differences of c.e. reals are called d-c.e. Ambos-Spies, Weihrauch and Zheng [1] have shown that the class of d-c.e. reals is the arithmetical closure of the c.e. reals and that it is a proper superset of the semi-computable reals. This class can also be characterized equivalently in several other ways. For example, x is d-c.e. iff there is a computable sequence (x_s) of rational numbers which converges to x weakly effectively in the sense that ∑_{s∈IN} |x_s − x_{s+1}| ≤ c for a constant c ([1]). This is the reason why d-c.e. reals are also called weakly computable. Besides, d-c.e. reals are exactly those reals which can be Solovay reduced to a random c.e. real (see [13]), where a real x is Solovay reducible to y if there are a constant c and two computable sequences (x_s) and (y_s) of rational numbers converging to x and y, respectively, such that |x_s − x| ≤ c(|y_s − y| + 2^{−s}) for all s. These facts show that the d-c.e. reals form a very robust class of reals. This paper continues the investigation of d-c.e. reals. We are mainly interested in the properties of d-c.e. reals related to Turing reduction and Turing degrees. Here the Turing reduction on reals is defined by means of their binary expansions. Let x_A := ∑_{i∈A} 2^{−(i+1)} be the real with binary expansion A. Then x_A is Turing reducible to x_B (denoted by x_A ≤_T x_B) if A ≤_T B. The reals x and y are Turing equivalent (denoted by x ≡_T y) if x ≤_T y and y ≤_T x. The Turing degree deg_T(x) of a real x is the set of all reals which are Turing equivalent to x. Without loss of generality, we consider here only the reals in the unit interval [0; 1].
Thus, a real corresponds uniquely to an infinite set of natural numbers. Therefore, it is not necessary to distinguish strictly between a real and the set of natural numbers which corresponds to its binary expansion. Several results about the Turing degrees of d-c.e. and c.e. reals can be found in the literature. For example, Zheng shows in [18] that the Turing degree of a d-c.e. real is not necessarily ω-c.e. On the other hand, Downey, Wu and Zheng [6] show that there is a ∆⁰₂ Turing degree which

* This work is supported by DFG (446 CHV 113/240/0-1) and NSFC (10420130638).
** Corresponding author, email: [email protected]



does not contain any d-c.e. real, although every ω-c.e. Turing degree contains a d-c.e. real. More recently, Ng, Stephan and Wu [12] called a degree completely weakly computable (cwc, for short) if it contains only d-c.e. reals. The existence of nontrivial cwc degrees follows from the existence of non-computable K-trivial reals, because any K-trivial real is d-c.e. and K-triviality is closed downward under Turing reduction (Downey, Hirschfeldt, Miller and Nies [8]). In [12] it is shown that there exist also reals of cwc degree which are not low (and hence not K-trivial), and that the c.e. cwc degrees are closed downward under Turing reduction. We call a real totally d-c.e. if it is Turing equivalent only to d-c.e. reals, i.e., if it is of cwc degree. It is shown in this paper that, below any given nonzero c.e. degree, there is a nontrivial totally d-c.e. real, and that there exists also a totally d-c.e. real which is of a properly ω-c.e. (and hence not c.e.) degree. On the other hand, we also prove that not every d-c.e. real of low Turing degree is totally d-c.e. The paper is organized as follows. In Section 2 we discuss the Turing degrees which contain c.e. and d-c.e. reals, respectively. The existence results for totally d-c.e. reals are proved in Section 3. In Section 4 we prove that there is a low c.e. degree which does not contain any totally d-c.e. real.
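As an aside for readers who want to experiment, the definitions above can be checked numerically on finite stages. The following Python sketch is our own toy illustration, not part of the paper; all names and the sample stage-sets are invented. It approximates two "c.e. reals" by increasing rational sequences and verifies the weakly effective convergence bound for their difference.

```python
from fractions import Fraction

def x_of(A):
    """x_A = sum_{i in A} 2^-(i+1) for a finite set A of naturals."""
    return sum(Fraction(1, 2 ** (i + 1)) for i in A)

# Finite stages of two enumerations, standing in for c.e. sets (toy data):
Y_stages = [set(), {0}, {0, 2}, {0, 2, 4}]
Z_stages = [set(), {1}, {1, 3}, {1, 3, 5}]

ys = [x_of(A) for A in Y_stages]  # increasing: approximates a c.e. real y
zs = [x_of(B) for B in Z_stages]  # increasing: approximates a c.e. real z

# The difference x = y - z is d-c.e.; its approximations x_s = y_s - z_s
# converge weakly effectively: the total variation sum_s |x_s - x_{s+1}|
# stays below a fixed constant (here y + z, as both sequences are monotone).
xs = [y - z for y, z in zip(ys, zs)]
variation = sum(abs(a - b) for a, b in zip(xs, xs[1:]))
assert variation <= ys[-1] + zs[-1]
```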

2 Turing Degrees of c.e. and d-c.e. Reals

In this section, we discuss the Turing degrees which contain c.e. or d-c.e. reals. As pointed out by Jockusch (see [16]), a c.e. real does not necessarily have a c.e. binary expansion. For example, if A is a non-computable c.e. set, then x_{A⊕Ā} is a c.e. real with a non-c.e. binary expansion A ⊕ Ā. In [1] it is even shown that there is a properly ω-c.e. set A such that x_A is still a c.e. real, where a set A is properly ω-c.e. if A is ω-c.e. but not k-c.e. for any constant k. Remember that, according to [11], a set A is called h-c.e. for a function h if there is an h-enumeration (A_s) of A, where an h-enumeration of a set A is a computable sequence (A_s) of finite sets such that A_0 = ∅, A = lim_{s→∞} A_s and |{s ∈ IN : A_s(n) ≠ A_{s+1}(n)}| ≤ h(n) for all n. A set A is ω-c.e. if it is h-c.e. for a computable function h, and it is k-c.e. if it is h-c.e. for the constant function h ≡ k. According to [2], the binary expansion of a c.e. real is strongly ω-c.e. and hence it must be h-c.e. for the function h(n) := 2^n. Here a set A is strongly ω-c.e. if there is a computable sequence (A_s) of finite sets converging to A such that n ∈ A_s \ A_{s+1} ⟹ (∃m < n)(m ∈ A_{s+1} \ A_s) for all s, n. A nice relationship between c.e. reals and c.e. sets can be established, however, if we consider Dedekind cuts. It is easy to see that the Dedekind cut of a c.e. real is a c.e. set of rational numbers. This implies that any c.e. real has a c.e. Turing degree, because the Dedekind cut of a real is Turing equivalent to the binary expansion of the real. Therefore, the binary expansions of c.e. reals are h-c.e. sets of c.e. Turing degrees for the function h(n) := 2^n. On the other hand, any c.e. Turing degree contains at least one c.e. real. Therefore, the c.e. Turing degrees (the degrees containing at least one c.e. set) and the Turing degrees containing c.e. reals are in fact the same. These degrees can simply be called c.e. degrees.
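The counting condition in the definition of an h-enumeration lends itself to a mechanical check on finite data. Here is a small Python sketch (our own illustration, with invented toy stages) that counts the mind changes |{s : A_s(n) ≠ A_{s+1}(n)}| for each n.

```python
def mind_changes(stages, n):
    """Number of s with A_s(n) != A_{s+1}(n), for a finite list of stages."""
    return sum((n in a) != (n in b) for a, b in zip(stages, stages[1:]))

def is_h_enumeration(stages, h, universe):
    """Check A_0 = {} and |{s : A_s(n) != A_{s+1}(n)}| <= h(n) on `universe`."""
    return stages[0] == set() and all(mind_changes(stages, n) <= h(n)
                                      for n in universe)

# A toy enumeration: element 0 enters, leaves, and never returns (2 changes),
# elements 1 and 2 simply enter once (1 change each).
stages = [set(), {0}, {0, 1}, {1}, {1, 2}, {1, 2}]
assert is_h_enumeration(stages, h=lambda n: 2, universe=range(3))
# With the constant bound h = 1 it fails, because of element 0:
assert not is_h_enumeration(stages, h=lambda n: 1, universe=range(3))
```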
Dunlop and Pour-El [9] have shown the following interesting property of c.e. degrees: a real x is of c.e. degree a if and only if there is a computable sequence (x_s) of rational numbers which converges to x with an a-computable modulus m such that s ≥ m(n) =⇒ |x − x_s| ≤ 2^{-n} for all s and n. This implies in particular that, if a computable sequence converges to a non-c.e. real x, then its modulus has a degree strictly bigger than deg_T(x). It is well known that a c.e. degree does not necessarily contain only c.e. sets. Analogously, we can prove that a c.e. degree can contain non-c.e. reals too. To this end, we need the following necessary condition for semi-computable reals.

Theorem 2.1 (Ambos-Spies, Weihrauch and Zheng [1]). Let A and B be c.e. sets. If x_{A⊕B̄} is semi-computable, then A and B are Turing comparable.

Totally d-c.e. Real Numbers


Theorem 2.2. Any non-computable c.e. degree contains a d-c.e. but not semi-computable real.

Proof. Let a be a non-computable c.e. degree. By Sacks' Splitting Theorem [14] there exist two incomparable c.e. degrees b_0, b_1 such that a = b_0 ∨ b_1 (the least upper bound of b_0 and b_1). Choose two c.e. sets B_0 ∈ b_0 and B_1 ∈ b_1 and define a set A := B_0 ⊕ B̄_1. Then deg_T(A) = deg_T(B_0 ⊕ B̄_1) = deg_T(B_0) ∨ deg_T(B_1) = b_0 ∨ b_1 = a. Furthermore, A is a d-c.e. set, because A = C_0 \ C_1 for C_0 := 2B_0 ∪ (2ℕ + 1) and C_1 := 2B_1 + 1. This implies that x_A = x_{C_0} − x_{C_1} and hence x_A is a d-c.e. real of the c.e. degree a. According to Theorem 2.1, x_A is not semi-computable since B_0 and B_1 are not Turing comparable.

Now we turn to the Turing degrees of d-c.e. reals. The following properties of d-c.e. reals will be useful for the later discussions.

Theorem 2.3 (Ambos-Spies, Weihrauch and Zheng [1], Zheng [19]).
1. Let A be an f-c.e. set for a computable function f. If Σ_{n∈ℕ} f(n)·2^{−n} ≤ c for a constant c, then the real x_A is d-c.e.
2. If x_{2A} is d-c.e., then A is h-c.e. for h(n) := 2^{3n}.
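The convergence condition in Theorem 2.3.1 can be checked numerically on finite data. The sketch below (illustrative Python; the approximation sequence and the convention x_A = Σ_{n∈A} 2^{−(n+1)} are our own assumptions, not fixed by the paper at this point) verifies that each mind change at position n moves x_{A_s} by exactly 2^{−(n+1)}, so an f-c.e. enumeration yields a sequence of reals of total variation at most Σ_n f(n)·2^{−(n+1)}:

```python
from fractions import Fraction

def x(A):
    """x_A = sum over n in A of 2^(-(n+1)), as an exact fraction."""
    return sum((Fraction(1, 2 ** (n + 1)) for n in A), Fraction(0))

# An invented finite approximation sequence in which position n changes
# at most f(n) = n times, so it is f-c.e. for the identity function,
# and sum_n f(n) * 2^(-n) = 2 is finite.
seq = [set(), {2}, {2, 3}, {3}, {3, 4}]
f = lambda n: n

# Each change at position n moves x_{A_s} by exactly 2^(-(n+1)), so the
# total variation is bounded by sum_n f(n) * 2^(-(n+1)).
variation = sum(abs(x(a) - x(b)) for a, b in zip(seq, seq[1:]))
bound = sum(Fraction(n, 2 ** (n + 1)) for n in range(1, 8))
assert variation == Fraction(11, 32)
assert variation <= bound
```

This bounded total variation is exactly what makes x_A expressible as a difference of two c.e. reals.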

An immediate corollary of Theorem 2.3.2 is that reals of c.e. degrees are not even necessarily d-c.e. For example, let A be a set of c.e. degree which is not ω-c.e. Then x_{2A} is a non-d-c.e. real of c.e. degree. Since the degree 0′ of the halting problem contains a non-ω-c.e. set, it contains a non-d-c.e. real too. We call a Turing degree a d-c.e. real degree if it contains at least one d-c.e. real. Notice that a Turing degree is called d-c.e. in computability theory if it contains a d-c.e. set. For any d-c.e. set A, x_A is obviously a d-c.e. real. Therefore, any d-c.e. degree is also a d-c.e. real degree. However, the converse is not true, because there is a d-c.e. real degree which does not contain any ω-c.e. set and hence does not contain any d-c.e. set ([1]). That is, the classes of d-c.e. degrees and d-c.e. real degrees are different. Let D_r be the class of all d-c.e. real degrees. The next theorem shows that the structure (D_r, ≤_T) is a non-dense upper semilattice.

Theorem 2.4.
1. If a and b are d-c.e. real degrees, then so is a ∨ b.
2. There is a minimal degree which is a d-c.e. real degree.

Proof. (1). Let a and b be Turing degrees which contain d-c.e. reals x_A and x_B, respectively. Then there are computable sequences (A_s) and (B_s) of finite sets which converge to A and B, respectively, such that, for some constant c, Σ_{s∈ℕ} |x_{A_s} − x_{A_{s+1}}| ≤ c and Σ_{s∈ℕ} |x_{B_s} − x_{B_{s+1}}| ≤ c hold. This implies that Σ_{s∈ℕ} |x_{2A_s} − x_{2A_{s+1}}| ≤ c and Σ_{s∈ℕ} |x_{2B_s+1} − x_{2B_{s+1}+1}| ≤ c and hence

  Σ_{s∈ℕ} |x_{A_s⊕B_s} − x_{A_{s+1}⊕B_{s+1}}| = Σ_{s∈ℕ} |(x_{2A_s} + x_{2B_s+1}) − (x_{2A_{s+1}} + x_{2B_{s+1}+1})|
                                              ≤ Σ_{s∈ℕ} (|x_{2A_s} − x_{2A_{s+1}}| + |x_{2B_s+1} − x_{2B_{s+1}+1}|) ≤ 2c.

That is, x_{A⊕B} is a d-c.e. real which belongs to the degree a ∨ b, and hence a ∨ b is a d-c.e. real degree too.
(2). According to [10] (the proof of Theorem 4 of [10]) there is an id-c.e. set A which has a minimal Turing degree. By Theorem 2.3.1, x_A is a d-c.e. real. That is, the degree deg(A) is minimal and a d-c.e. real degree.
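The identity behind the proof of (1), x_{A⊕B} = x_{2A} + x_{2B+1}, holds because 2A and 2B+1 are disjoint sets of positions. A quick exact-arithmetic check (illustrative Python; the convention x_S = Σ_{n∈S} 2^{−(n+1)} and the sample sets are our own assumptions):

```python
from fractions import Fraction

def x(S):
    """x_S = sum over n in S of 2^(-(n+1)), as an exact fraction."""
    return sum((Fraction(1, 2 ** (n + 1)) for n in S), Fraction(0))

def join(A, B):
    """The effective join A (+) B = 2A union (2B + 1)."""
    return {2 * n for n in A} | {2 * n + 1 for n in B}

# 2A and 2B+1 are disjoint (even vs. odd positions), so the real of the
# join splits as a sum of the two component reals.
A, B = {0, 2, 5}, {1, 2}
assert x(join(A, B)) == x({2 * n for n in A}) + x({2 * n + 1 for n in B})
```

Since the split is exact, the triangle-inequality step in the displayed estimate follows position by position.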

Fan, Ding and Zheng

3 Totally d-c.e. Reals

In the last section we discussed the Turing degrees which contain at least one d-c.e. real. Now we investigate the Turing degrees which contain only d-c.e. reals. It is well known that computable reals are Turing equivalent only to computable ones. By Theorem 2.2, no c.e. real or semi-computable real has a similar property. It is natural to ask: is there a d-c.e. real which has this property? Let's define the notion precisely as follows.

Definition 3.1. A real x is called totally d-c.e. if it is Turing equivalent only to d-c.e. reals, i.e., (∀y)(x ≡_T y =⇒ y is d-c.e.).

Trivially, any computable real is totally d-c.e. The existence of a non-computable totally d-c.e. real follows from the existence of a nontrivial K-trivial real (see [7] for details). Ng, Stephan and Wu [12] showed that a real x of c.e. degree is totally d-c.e. iff only d-c.e. reals can be Turing reducible to x. In the following, we will show constructively that below any given non-computable c.e. real there exists a non-computable totally d-c.e. real. Therefore, the class of degrees of totally d-c.e. reals is downward dense in the c.e. degrees. The notations we use in the following are standard. The reader can refer to [17] for details.

Theorem 3.2. For any non-computable c.e. real x, there exists a totally d-c.e. real y of c.e. degree which is not computable such that y ≤_T x.

Proof. For any non-computable c.e. real x, there is a non-computable c.e. set C such that x ≡_T x_C. Let f be a 1-1 computable enumeration of C and let C_s = {f(0), ..., f(s)}. We will construct a non-computable c.e. set A such that A ≤_T C and, for any B ≤_T A, x_B is a d-c.e. real. That is, A has to satisfy, for all i, e ∈ ℕ, the following requirements:

  P:   A ≤_T C;
  R_i: A ≠ Φ_i;
  Q_e: B_e = Ψ_e^A =⇒ x_{B_e} is d-c.e.,

where (Φ_i) and (Ψ_e) are computable enumerations of all computable functionals. The strategy for meeting A ≤_T C is a standard permitting method. That is, some natural number m can be enumerated into A at stage s+1 only if C permits it, in the sense that C_s ↾ (m+1) ≠ C_{s+1} ↾ (m+1), or equivalently, f(s+1) ≤ m. The non-computability of the set C guarantees that this method works. The strategy for the requirement R_i is a simple diagonalization. We choose for each requirement R_i a potential "witness" m_i not yet in A and wait for a stage s such that Φ_i(m_i)[s] ↓ = 0, and then put m_i into A. If no such stage s exists, then R_i is satisfied automatically. To keep the witnesses for different R_i distinct, we choose the witnesses for R_i from ω^[i] = {⟨x, y⟩ : x = i}. To satisfy the requirements Q_e, recall that x_{B_e} is d-c.e. if B_e is an f-c.e. set for a computable function f such that the sum Σ_{n∈ℕ} f(n)·2^{−n} is bounded by a constant c (Theorem 2.3.1). Thus, the requirements Q_e can be replaced by:

  Q_e: B_e = Ψ_e^A =⇒ B_e is p_e-c.e.,

where p_e is a polynomial with integral coefficients. By definition (see, e.g., [3, 11]), a set B_e is p_e-c.e. if there exists a computable sequence (B_{e,s}) of finite sets converging to B_e such that B_{e,0} = ∅ and |{s : B_{e,s}(n) ≠ B_{e,s+1}(n)}| ≤ p_e(n) for all n. Notice that, if the condition B_e = Ψ_e^A holds, then we have a natural computable sequence (B_{e,s}) converging to B_e defined by B_{e,s} := Ψ_{e,s}^{A_s}, where A_s is the finite set consisting of the numbers which are enumerated into A up to stage s. In this way, the requirement Q_e can be divided naturally into infinitely many sub-requirements as follows:

  N_{⟨e,n⟩}: B_e = Ψ_e^A =⇒ |{s : Ψ_{e,s}^{A_s}(n) ≠ Ψ_{e,s+1}^{A_{s+1}}(n)}| ≤ p_e(n),

where ⟨e, n⟩ := (e+n)(e+n+1)/2 + e is the Cantor pairing function.

Let u_{e,s}^A(n) denote the least t such that only the portion A ↾ t is actually used during the computation Ψ_{e,s}^A(n), if Ψ_{e,s}^A(n) ↓, and u_{e,s}^A(n) := 0 otherwise. (u_{e,s}^A is the so-called use function of the computation Ψ_{e,s}^A.) Thus, the inequality Ψ_{e,s}^{A_s}(n) ≠ Ψ_{e,s+1}^{A_{s+1}}(n) can occur only if some element less than u_{e,s}^A(n) enters A at stage s+1. In order to reduce the number of changes of Ψ_{e,s}^{A_s}(n), we try to preserve the computation Ψ_{e,s}^{A_s}(n) by allowing only elements which are bigger than u_{e,s}^A(n) to enter A after stage s, if possible. For j := ⟨e, n⟩, let r(j, s) := u_{e,s}^A(n) be the restraint function for the requirement N_j. Our strategy for N_j is simply to preserve the initial segment A_s ↾ r(j, s) at all stages. Remember that an element can be enumerated into A only if the strategy for some positive requirement R_i demands it. We arrange a priority ordering of the requirements as follows: R_0, N_0, R_1, N_1, R_2, N_2, ..., and choose the witness m_i for R_i larger than r(j, s) for all j < i. This guarantees that our strategy for a negative requirement N_j can only be injured by the actions for the positive requirements R_0, R_1, ..., R_j. Since one action for each positive requirement R_i suffices, the value Ψ_{e,s}^{A_s}(n) can change at most j+1 = (e+n)(e+n+1)/2 + e + 1 times for different s. Therefore, the requirement N_{⟨e,n⟩} is satisfied for the polynomial p_e(n) := (e+n)(e+n+1)/2 + e + 1. The formal construction is omitted here.

By Theorem 2.3.2, a totally d-c.e. real has an ω-c.e. degree. Our next theorem shows that a totally d-c.e. real can even have a properly ω-c.e. degree.

Theorem 3.3. There is a totally d-c.e. real which is of an ω-c.e. but not k-c.e. Turing degree for any constant k.

Proof. We will construct a set A which is not Turing equivalent to any k-c.e. set for any constant k. That is, A satisfies all of the following requirements:

  R_{⟨i,j,u,v⟩}: (V_{i,s}) is a j-enumeration of V_i =⇒ A ≠ Ψ_u^{V_i} or V_i ≠ Ψ_v^A,

where ((V_{i,s})_s)_i is a computable enumeration of all computable sequences of finite subsets of ℕ and (Ψ_u) is a computable enumeration of all partial computable functionals. To make the real number x_A totally d-c.e., we should guarantee that all reals x_B are d-c.e. whenever B ≤_T A. That is, A should also satisfy, for all e, n, the following requirements:

  N_{⟨e,n⟩}: B_e = Ψ_e^A =⇒ |{s : Ψ_{e,s}^{A_s}(n) ≠ Ψ_{e,s+1}^{A_{s+1}}(n)}| ≤ p_e(n),
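The Cantor pairing function ⟨e, n⟩ = (e+n)(e+n+1)/2 + e and the resulting bound p_e(n) = ⟨e, n⟩ + 1 are elementary and easy to sanity-check. A small Python sketch (illustrative; the function names are ours):

```python
def pair(e, n):
    """Cantor pairing <e, n> = (e+n)(e+n+1)/2 + e."""
    return (e + n) * (e + n + 1) // 2 + e

def unpair(code):
    """Inverse of pair: recover (e, n) from <e, n>."""
    w = 0
    # Find the diagonal w = e + n containing this code.
    while (w + 1) * (w + 2) // 2 <= code:
        w += 1
    e = code - w * (w + 1) // 2
    return e, w - e

def p(e, n):
    """The polynomial bound p_e(n) = <e, n> + 1 for requirement N_<e,n>."""
    return pair(e, n) + 1

# pair is injective with inverse unpair, and small pairs fill an initial
# segment of the natural numbers with no gaps.
assert all(unpair(pair(e, n)) == (e, n) for e in range(20) for n in range(20))
codes = sorted(pair(e, n) for e in range(20) for n in range(20))
assert codes[:10] == list(range(10))
```

For fixed e, p_e(n) is indeed a polynomial in n with integral coefficients, as the requirements demand.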

where p_e is a polynomial with integral coefficients. Since any non-ω-c.e. degree contains a non-d-c.e. real by Theorem 2.3.2, these requirements also guarantee that A is of an ω-c.e. degree.

The strategy to satisfy a requirement R_e for e := ⟨i, j, u, v⟩ is as follows. We choose a witness n which is not yet in A at the beginning and wait for some stage s_1 at which the following condition holds:

  A(n) = Ψ_u^{V_i}(n) & V_i ↾ ψ_u(n) = Ψ_v^A ↾ ψ_u(n).   (1)

Here we use the lower case ψ_u to denote the corresponding use function of the computation Ψ_u. If such a stage s_1 occurs, then, at stage s_1 + 1, we put the witness n into A, i.e., A(n) is changed from 0 to 1, and at the same time we preserve the initial segment A ↾ ψ_v ψ_u(n). Otherwise, if no such s_1 exists, then the requirement R_e is satisfied automatically and we need do nothing. If at a later stage s_2 > s_1 the condition (1) holds again, then we remove n from A at the stage s_2 + 1. This means that the initial segment A ↾ ψ_v ψ_u(n) is recovered to that of stage s_1, because no other elements m ≤ ψ_v ψ_u(n) are allowed to enter or leave A between the stages s_1 and s_2. As a result, the initial segment Ψ_v^A ↾ ψ_u(n) is recovered to that of stage s_1 too (we assume that the use functions are increasing). Since Ψ_u^{V_i}(n)[s_1] = 0 ≠ 1 = Ψ_u^{V_i}(n)[s_2], the initial segment V_i ↾ ψ_u(n) has to change between the stages s_1 and s_2. Similarly, we can put n into A again if the condition (1) holds at a stage s_3; in this case, V_i ↾ ψ_u(n) changes between s_2 and s_3 too. We perform this kind of action at most j+1 times, and this suffices to satisfy the requirement R_e.

The strategy to satisfy the requirements N_{⟨e,n⟩} is similar to that of the proof of Theorem 3.2. That is, we try to preserve the initial segment A ↾ ψ_e(n) if the computation Ψ_e^A(n) halts at some stage s. In this way, we can control the number of changes of Ψ_{e,s}^{A_s}(n). The set A is constructed again by a standard finite injury priority construction. We arrange the priority ordering as follows: N_0, R_0, N_1, R_1, .... Since each R_e for e := ⟨i, j, u, v⟩ needs to change some A(n) at most j+1 times, which can only injure the strategies of N_i for i > e, we can always guarantee that the number of changes of Ψ_{e,s}^{A_s}(n) is bounded by a polynomial p_e(n). Therefore, all requirements can be satisfied.

4 Non-Totally d-c.e. Real of Low Turing Degree

In this section we explore the non-totally d-c.e. reals. As we mentioned in Section 2, the Turing degree 0′ contains a non-d-c.e. real and hence the reals of degree 0′ are not totally d-c.e. This is even true for reals of high and even non-low_2 c.e. degrees, where a degree a is high if a′ = 0″ and low_2 if a″ = 0″. The reason is as follows. According to [5, 4], any non-low_2 degree a is not totally ω-c.e., i.e., there is a non-ω-c.e. degree b ≤ a. If x is a real of a non-low_2 c.e. degree, then there is a non-low_2 c.e. set A such that x ≡_T x_A. Let B be a non-ω-c.e. set which is Turing reducible to A. Since x_{2B} is not d-c.e., the real x_A, and hence also x, is not totally d-c.e., because, according to Ng, Stephan and Wu [12], x_A is totally d-c.e. if and only if x_B is d-c.e. for every set B ≤_T A. The next theorem shows that not every low c.e. degree contains a totally d-c.e. real.

Theorem 4.1. There is a c.e. real x which is low but not totally d-c.e.

Proof. By Theorem 2.3.2, for any set B, if x_{2B} is a d-c.e. real, then the set B must be h-c.e. for the function h(n) := 2^{3n}. Thus, to construct a non-totally d-c.e. real x, it suffices to find a non-h-c.e. set B and let x := x_{2B}. In order to guarantee that x is also low, we need only construct a low set A such that B ≤_T A, because any set below a low set is low too. Unfortunately, the real x_{2B} constructed in this way is not a c.e. real (it is not even d-c.e.). This problem can be solved by requiring A to be a c.e. set and then defining y := x_{2(A⊕B)}. In this case, y is not d-c.e. because A ⊕ B is not h-c.e. But, on the other hand, the real x := x_A is a c.e. real which is Turing equivalent to the non-d-c.e. real y. That is, the real x is c.e. and low but not totally d-c.e. Concretely, we construct a c.e. set A and a set B which satisfy the requirement

  P: B ≤_T A.

In addition, the set A should be low, i.e., A′ ≡_T ∅′. This can be achieved by satisfying, for all e, the following requirements:

  N_e: (∃^∞ s) φ_e^A(e)[s] ↓ =⇒ φ_e^A(e) ↓,


where (φ_e) is a computable enumeration of all partial computable functionals. The reason is that all requirements N_e together imply that A′(e) = lim_{s→∞} φ_e^A(e)[s] for all e, and hence A′ ≤_T ∅′ by the Limit Lemma of Shoenfield [15]. Finally, the set B has to be non-h-c.e. That is, if (C_s) is an h-enumeration of C, then C ≠ B. Here a computable sequence (C_s) is an h-enumeration of C if C_0 = ∅, lim_{s→∞} C_s = C and |{s : C_s(n) ≠ C_{s+1}(n)}| ≤ h(n) for all n. Let ((C_{e,s})_s)_e be a uniformly computable enumeration of all computable sequences of finite subsets of natural numbers. Thus, the set B has to satisfy, for all e, the following requirements:

  R_e: (C_{e,s}) is an h-enumeration of C_e =⇒ B ≠ C_e.
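The combinatorial core of the R_e strategy can be simulated on finite data: B flips its value at the witness whenever the opponent's approximation catches up, and an h-enumeration can force at most h(n)+1 flips before it is beaten for good. A toy Python sketch (the particular history of C-values is invented for illustration):

```python
def diagonalize(c_values, h_of_n):
    """Play B(n) against the successive values C_s(n) of an
    h-enumeration at a fixed witness n.  Returns the final value of
    B(n) and the number of flips performed."""
    b, flips = 0, 0
    for c in c_values:
        if b == c:            # C caught up with B at the witness:
            b = 1 - c         # flip B(n) to restore the disagreement
            flips += 1
    # One initial flip plus one flip per change of C(n):
    assert flips <= h_of_n + 1
    return b, flips

# C(n) starts at 0 and changes 3 times, respecting the bound h(n) = 3.
c_history = [0, 0, 1, 1, 0, 1]
b_final, flips = diagonalize(c_history, 3)
assert b_final != c_history[-1]   # B differs from C at the witness
assert flips == 4                 # exactly h(n) + 1 flips were needed
```

Each flip of B(n) is what, in the construction below, must be "paid for" by enumerating a fresh element of the interval I(n) into A for the permitting requirement P.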

As usual, the single requirement P can be satisfied by a standard permitting method. All other requirements are assigned priorities in the following ordering: N_0, R_0, N_1, R_1, .... To meet a single requirement R_i, we choose a natural number n_i which should witness the inequality, i.e., C_i(n_i) ≠ B(n_i). To this end, we put n_i into B as long as it is not yet in C_i. Otherwise, if n_i enters C_i, it is removed from B. At most h(n_i)+1 such changes suffice to satisfy the requirement R_i. The permitting method for P works as follows. To each witness n_i we assign an interval I(n_i) := [n_i, h(n_i)+n_i+1) of natural numbers. For different witnesses, the corresponding intervals are disjoint. Whenever we change the membership B(n_i) according to the strategy for R_i, a new element of I(n_i) is put into A. Since the interval I(n_i) has h(n_i)+1 elements, this can be done at most h(n_i)+1 times, which suffices to satisfy the requirement R_i. The point is that, if A_s ↾ (h(n_i)+n_i+1) = A ↾ (h(n_i)+n_i+1) for some stage s, then B(n_i) = B_s(n_i). That is, B(n_i) can be A-effectively computed and hence B ≤_T A. In order to meet the requirement N_e, we preserve the computation φ_e^A(e)[s] as much as we can. To this end, we define a restraint function r(e, s) := u_{e,s}^A(e) for all e and restrain the witnesses less than r(e, s) from entering A for the requirements of lower priority. To accommodate all requirements simultaneously, we must occasionally, but only finitely often, change the witness n_i. Let n_{i,s} be the approximation at the end of stage s and n_i = lim_s n_{i,s}.

The Formal Construction:

Stage s = 0. Define A_0 = B_0 = ∅, d_0 := 0, n_{0,0} = 0 and I_{0,0} = [0, h(n_{0,0})+1).

Stage s + 1. Given A_s, B_s, d_s and n_{i,s}, I_{i,s} for all i ≤ d_s, a requirement R_e requires attention if e ≤ d_s and C_{e,s+1}(n_{e,s}) = B_s(n_{e,s}). Let R_e be the requirement of the highest priority which requires attention. The requirement R_e then receives attention by the following actions:

1. Define B_{s+1}(n_{e,s}) := 1 − C_{e,s+1}(n_{e,s});
2. Put max(I_{e,s} \ A_s) into A, i.e., define A_{s+1} := A_s ∪ {max(I_{e,s} \ A_s)};
3. Define d_{s+1} := e; and
4. Initialize all requirements R_i by setting n_{i,s+1} and I_{i,s+1} undefined for all i > e.

If no requirement requires attention at this stage, then define d_{s+1} := d_s + 1, n_{e,s+1} := n and I_{e,s+1} := [n, h(n)+n+1), where e := d_{s+1} and n is the least natural number which is bigger than all numbers used in the construction so far. Notice that the choice of n_{e,s+1} guarantees in particular that n_{e,s+1} > u_i^A(i)[s] for all i ≤ e, and that the interval I_{e,s+1} is disjoint from all intervals I_{i,s} defined so far. This ends the construction.

We now show that the construction succeeds. First of all, the set A constructed is obviously a c.e. set, since we only put elements into A and never remove them. Secondly, we can prove by a simple induction that each requirement R_e is initialized at most finitely often. This means that the limits n_e := lim_{s→∞} n_{e,s} and I_e := lim_{s→∞} I_{e,s} = [n_e, h(n_e)+n_e+1) exist. With respect to the witness n_e, the requirement R_e will eventually be satisfied by receiving attention at most h(n_e)+1 times. Therefore, the set B is different from the e-th h-c.e. set C_e for all e and hence it is not an h-c.e. set. Thirdly, by the choice of n_{i,s}, a computation φ_e^A(e)[s] can be destroyed only if some m ∈ I_{i,s} enters A when R_i receives attention for some i < e. This can, however, happen only finitely often. That is, after some stage, the computation φ_e^A(e)[s] will never be destroyed again and hence the requirement N_e will eventually be satisfied. Finally, we always put some number m ∈ I(n) into A whenever the value B_s(n) is changed. Thus, for any natural number n, the value B(n) can be A-effectively computed as B_{s_0}(n) for s_0 := min{s : A_s ↾ (h(n)+n+1) = A ↾ (h(n)+n+1)}. This implies that B ≤_T A.

References

1. K. Ambos-Spies, K. Weihrauch, and X. Zheng. Weakly computable real numbers. Journal of Complexity, 16(4):676–690, 2000.
2. C. S. Calude, P. H. Hertling, B. Khoussainov, and Y. Wang. Recursively enumerable reals and Chaitin Ω numbers. Theoretical Computer Science, 255:125–149, 2001.
3. H. G. Carstens. Δ⁰₂-Mengen. Arch. Math. Logik Grundlagenforsch., 18(1–2):55–65, 1976/77.
4. R. Downey, N. Greenberg, and R. Weber. Totally ω-computably enumerable degrees and bounding critical triples. Submitted.
5. R. Downey and R. A. Shore. Lattice embeddings below a nonlow₂ recursively enumerable degree. Israel J. Math., 94:221–246, 1996.
6. R. Downey, G. Wu, and X. Zheng. Degrees of d.c.e. reals. Mathematical Logic Quarterly, 50(4/5):345–350, 2004.
7. R. G. Downey and D. R. Hirschfeldt. Algorithmic Randomness and Complexity. Springer-Verlag, 200?. Monograph to be published.
8. R. G. Downey, D. R. Hirschfeldt, J. Miller, and A. Nies. Relativizing Chaitin's halting probability. Journal of Mathematical Logic, 5:167–192, 2005.
9. A. J. Dunlop and M. B. Pour-El. The degree of unsolvability of a real number. In J. Blanck, V. Brattka, and P. Hertling, editors, Proceedings of CCA 2000, Swansea, UK, September 2000, volume 2064 of LNCS, pages 16–29, Berlin, 2001. Springer.
10. R. L. Epstein, R. Haas, and R. L. Kramer. Hierarchies of sets and degrees below 0′. In Logic Year 1979–80 (Proc. Seminars and Conf. Math. Logic, Univ. Connecticut, Storrs, Conn., 1979/80), volume 859 of Lecture Notes in Math., pages 32–48. Springer, Berlin, 1981.
11. Y. L. Ershov. A certain hierarchy of sets. I, II, III (Russian). Algebra i Logika, 7(1):47–73, 1968; 7(4):15–47, 1968; 9:34–51, 1970.
12. K. M. Ng, F. Stephan, and G. Wu. Degrees of weakly computable reals. In Logical Approaches to Computational Barriers, CiE'06, LNCS 3988, pages 413–422. Springer, 2006.
13. R. Rettinger and X. Zheng. Solovay reducibility on d-c.e. real numbers. In COCOON 2005, August 16–19, 2005, Kunming, China, LNCS, pages 359–368. Springer-Verlag, 2005.
14. G. E. Sacks. On the degrees less than 0′. Ann. of Math., 77:211–231, 1963.
15. J. R. Shoenfield. On degrees of unsolvability. Ann. of Math. (2), 69:644–653, 1959.
16. R. I. Soare. Cohesive sets and recursively enumerable Dedekind cuts. Pacific J. Math., 31:215–231, 1969.
17. R. I. Soare. Recursively Enumerable Sets and Degrees. A Study of Computable Functions and Computably Generated Sets. Perspectives in Mathematical Logic. Springer-Verlag, Berlin, 1987.
18. X. Zheng. On the Turing degrees of weakly computable real numbers. Journal of Logic and Computation, 13(2):159–172, 2003.
19. X. Zheng. Computability Theory of Real Numbers. Habilitation thesis, BTU Cottbus, Germany, February 2005.

Why is P Not Equal to NP?*

Michael Fellows¹,² and Frances Rosamond¹

¹ University of Newcastle, Callaghan NSW 2308, Australia
  {michael.fellows,frances.rosamond}@newcastle.edu.au
² Durham University, Institute of Advanced Study, Durham DH1 3RL, United Kingdom

Abstract. The question of whether the complexity classes P and NP are equal remains one of the most important open problems in the foundations of mathematics and computer science. We discuss two conjectures that seem to be reasonably accessible to intuition, and that imply P ≠ NP. One of these is a novel conjecture about the existence of intrinsically hard puzzles. We propose a way to explore this conjecture empirically in a way that may have a number of scientific payoffs.

1 Introduction

This paper is primarily concerned with issues of mathematical aesthetics and speculation. Some well-known researchers in computational complexity have predicted that the "P versus NP" question may well be with us for some time. Ogihara, for example, has suggested that it might not be resolved before the year 3000, so difficult and lacking in programmatic access does the question appear [Hem02]. Aaronson has argued that we may never have a proof one way or the other [Aar06b]. The question is so fundamental, of such practical importance, and philosophically so accessible and compelling, that even in the absence of mathematical progress, there is still interest (and usefulness, we would argue) in the philosophical appraisal of the problem, and in informed opinions about how it might turn out, what techniques might be relevant, and how difficult it might be. The (notably diverse) 2002 "poll" of expert opinion on these questions by Bill Gasarch [Gas02] can be taken as one primary reference point for discussions of this sort. The Gasarch survey has been widely noted, and used in educational settings to stimulate thinking (e.g., in courses on algorithms and computational complexity). The issue has also proved irresistible in the theoretical computer science blogosphere. Discussions that can also be taken as reference points are the blogs by Aaronson [Aar06a], and the videoblog rejoinder by Fortnow and Gasarch [FoGa06]. Aaronson has given a thoughtful list of ten reasons for believing P ≠ NP that provides some useful initial "order" to the area [Aar06a]. What we do here is simply add another opinion, not already represented in the Gasarch catalog, and (hopefully) contribute some novel points of discussion. We discuss the "P versus NP" question in relation to two other conjectures:

• The conjecture that FPT is not equal to W[1].
• A conjecture that we introduce here and term the Hard Puzzles Conjecture.

* This research has been supported by the Australian Research Council through the Australian Centre for Bioinformatics, by the University of Newcastle Parameterized Complexity Research Unit under the auspices of the Deputy Vice-Chancellor for Research, and by a Fellowship to the Durham University Institute for Advanced Studies. The authors also gratefully acknowledge the support and kind hospitality provided by a William Best Fellowship at Grey College while the paper was in preparation.


In Gasarch's survey, 61 complexity theory experts testified in favor of P ≠ NP, 9 surmised that P = NP, 4 gave the opinion that it is independent (e.g., of ZFC) and 22 offered no opinion. It can happen in mathematics that statement S1 implies statement S2, while S1 may seem "more intuitive" than S2. In considering the "P versus NP" question, it is natural to ask if there are other statements that might be even more accessible to mathematical intuition, and that might help in forming an opinion about it. In particular, are there candidate statements that might be at least as defensible, in the sense of the kind of intuitive argumentation offered in [Aar06a], as the P ≠ NP conjecture? One of the remarkable characteristics of the P versus NP question is that some of the arguments in favor of P ≠ NP can be explained to anybody (in particular, Reason 9, "The Philosophical Argument", of the Aaronson list). If such questions about computational complexity are going to be with us a long time, then it is reasonable to ask if there might be other such statements that are natural resting points for mathematical intuition. We are with the majority: we believe P ≠ NP, for the following reasons.

2 Because FPT is not Equal to W[1]

There are three natural forms of the Halting Problem, all three of which play central roles in metamathematics and contemporary theoretical computer science. These are:

The Halting Problem (I)
Instance: A description of a Turing machine M.
Question: On an empty input tape, will M halt?

The Polynomial-Time Halting Problem for Nondeterministic Turing Machines (II)
Instance: A description of a nondeterministic Turing machine M.
Question: On an empty input tape, can M halt in at most |M| steps?

This problem can also be defined with a halting time of |M|^c for any fixed constant c, without effect on the discussion.

The k-Step Halting Problem for Nondeterministic Turing Machines (III)
Instance: A description of a nondeterministic Turing machine M where the alphabet size is unlimited and the amount of nondeterminism (the number of transitions that might be possible at any computation step) is unlimited.
Parameter: k
Question: On an empty input tape, can M halt in at most k steps?

It is quite natural to focus on this "natural descent of the Halting Problems" for a number of reasons:

1. Halting Problem I is fundamental to what we can prove about unsolvable problems. It is also worth noting that Gödel's Incompleteness Theorem (in a very natural form, the form concerned with proving things about computation) follows as a one-paragraph corollary.
2. Halting Problem II is trivially complete for NP, basically a re-presentation of the definition of NP.
3. Halting Problem III is complete for W[1]. The conjecture that FPT ≠ W[1] plays a role in parameterized complexity analogous to the P ≠ NP conjecture in classical complexity.


We argue that the FPT ≠ W[1] conjecture is at least as defensible, in the sense of Aaronson, as the P ≠ NP conjecture. We argue that it displays the essential issues in a way that is easier for mathematical intuition to approach. This opinion might be disputed, but if there is to be any discussion of why P ≠ NP is true, then a claim that there is a more defensible conjecture which implies it is at least worth examining. Because the subject of parameterized complexity theory is not so well known, some background discussion is in order.

2.1 Some Background on Parameterized Complexity

Some concrete results of the theory may serve to set the stage for readers unfamiliar with the subject:

– The primordial combinatorial problem of Vertex Cover seeks to determine if it is possible to choose a set V′ ⊆ V of k nodes in a conflict graph G = (V, E) on |V| = n nodes, so that every edge of the graph is incident on at least one vertex of V′ (all the conflict edges are "covered" in the sense that if the vertices of V′ are deleted, then there are no more conflicts between the "data points" represented by V − V′). Although NP-complete, after several rounds of increasingly sophisticated algorithmic improvements developed by many researchers in the context of parameterized complexity, this can now be solved in time O(1.274^k n^2) [CKX05]. This is a result of an increasingly sharp positive toolkit of FPT design techniques. As one can easily imagine, conflict graphs are used to model issues in a vast range of computing applications.
– There is a natural hierarchy of parameterized complexity classes

  FPT ⊆ M[1] ⊆ W[1] ⊆ W[2] ⊆ ··· ⊆ W[P] ⊆ XP

with respect to which hundreds of diverse combinatorial problems have been classified, revealing the surprising fact that there are only four or five natural degrees (under parameterized reducibility) that capture "almost all" natural parameterized problems. This empirical fact is analogous to the surprising ubiquity of the NP-complete degree in classical (one-dimensional) computational complexity. It is now known that FPT = M[1] if and only if n-variable 3SAT can be solved in time O(2^{o(n)}) [DEFPR03]. The conjecture that FPT ≠ M[1] is equivalent to the Exponential Time Hypothesis introduced by Impagliazzo et al. [IPZ01] (see also the survey by Woeginger [Woe03]). A fundamental result, part of a parameterized analog of Cook's Theorem, is that the k-Step Halting Problem for Turing machines of unrestricted nondeterminism (trivially solvable in time n^{O(k)}) cannot be in FPT (that is, solvable in time O(f(k)·n^c)) unless FPT = W[1], i.e., this natural 2-dimensional form of the Halting Problem is complete for W[1] [DF99]. Determining whether a graph has an independent set of size k is complete for W[1] [DF99]. Determining whether a graph has a dominating set of size k is complete for W[2] [DF99]. Obviously, both of these problems can be solved by the brute-force approach of trying all k-subsets in time n^{O(k)}.
– The negative toolkit of parameterized complexity is producing deeper and deeper information. Recent results show that k-Dominating Set cannot be solved in time n^{o(k)} unless FPT = M[2], and k-Independent Set cannot be solved in time n^{o(k)} unless FPT = M[1] [CCFHJ+04]. We also now know that there exists a constant ε_VC > 0 such that the Vertex Cover problem cannot be solved in time O((1 + ε_VC)^k n^c) unless FPT = M[1], hence the striking improvements given by the positive toolkit for this problem must eventually end, although we currently have no concrete bounds on the barrier constant ε_VC. The positive algorithmic improvements for Vertex Cover have led directly and strikingly to improved practical algorithms for this NP-hard problem.
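For readers unfamiliar with FPT algorithms, the flavor of the positive toolkit can be conveyed by the classic O(2^k · n) bounded search tree for Vertex Cover (a standard textbook method, far simpler than the algorithm of [CKX05] cited above): pick any uncovered edge; every cover must contain one of its two endpoints, so branch both ways and decrement k. A minimal Python sketch:

```python
def vertex_cover(edges, k):
    """Return True iff the graph given by `edges` has a vertex cover of
    size at most k.  Branches on an uncovered edge: one of its endpoints
    must be in the cover, giving a search tree of depth k and at most
    2^k leaves, independent of the number of vertices."""
    if not edges:
        return True               # nothing left to cover
    if k == 0:
        return False              # edges remain but budget is exhausted
    u, v = edges[0]
    for chosen in (u, v):
        # Keep only the edges not covered by the chosen endpoint.
        rest = [(a, b) for (a, b) in edges if chosen not in (a, b)]
        if vertex_cover(rest, k - 1):
            return True
    return False

# A 4-cycle needs 2 cover vertices; a triangle with a pendant edge too.
assert vertex_cover([(0, 1), (1, 2), (2, 3), (3, 0)], 2)
assert not vertex_cover([(0, 1), (1, 2), (2, 3), (3, 0)], 1)
assert vertex_cover([(0, 1), (1, 2), (0, 2), (2, 3)], 2)
```

The running time is 2^k times a polynomial in the input size, i.e., of the form f(k)·n^c, which is precisely the FPT pattern discussed above; the sophisticated modern algorithms improve the base of the exponent, not the shape of the bound.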


– As another example of the way that parameterized complexity interacts with practical computing, there is a natural problem in the context of the implementation of programming language compilers known as Type Checking (checking the consistency of the type declarations for variables). For the logic-based language ML this was shown to be hard for EXP (a superset of NP). However, the implementors of compilers for ML noted that this result seemed to be irrelevant, “The compiler works just fine.” In the context of parameterized complexity, an explanation has emerged. The natural input distribution for the problem (ML programs written by human beings), tends to have small nesting depth of type declarations. That is, there is natural parameter k for this problem, measuring the maximum nesting depth of the type declarations, for which in practice almost always k ≤ 5. The ML implementations naively employed a type-checking algorithm that runs in time O(2k n), where n is the size of the program — thus an FPT algorithm, although not deliberately developed or known in these terms by the computing practitioners who implemented the compiler. There are many such examples where the concepts of parameterized complexity clarify and mathematically deepen naive approaches that have been adopted by programmers facing hard problems. Further background can be found in the survey of Downey [Dow03] and the recent monographs of Flum and Grohe [FlGr06] and Niedermeier Nie06. 2.2

2.2 Why the FPT ≠ W[1] Conjecture Seems More Accessible to Intuition

Aaronson’s Reason 9, “The Philosophical Argument”, might be of genuine interest to philosophers of science. This is the argument that one generally uses to explain the issue to a nonmathematician. Paraphrasing Aaronson: if P = NP, then the world would be a profoundly different place than we usually assume and experience it to be. “There would be no special value in creative leaps, no fundamental gap between solving a problem and recognizing a solution.” Wouldn’t life have evolved to take advantage of such an equality, and the biological world be very different? The Philosophical Argument interprets the mathematical conjecture physically and metaphorically, and under this translation it is quite accessible to anyone’s intuition. The heart of our argument that the FPT ≠ W[1] Conjecture is more directly accessible to intuition is that “The Philosophical Argument” seems to apply with even greater force. Most creative breakthroughs involve five key steps, or ten, not “a polynomial number”. Halting Problems I and II can be directly interpreted in terms of searching unstructured mazes. Trying to find a way out that involves a small or moderate fixed number of steps seems more directly comparable with the way the world, and mazes, are experienced. At any given moment, having many different possible choices also fits well with the way the world is experienced. The conjecture inhabits a more “finitistic” conceptual space, closer to the one that we do in fact physically inhabit, as discussed in the concluding section of Aaronson’s essay on the possible independence of the P versus NP question [Aar06b]. How do you solve the 10-step Halting Problem in O(n^9)? Aaronson’s list of ten reasons for believing P ≠ NP includes a number of other reasons that reflect accumulated mathematical experience, as well as structural complexity results.
Reason 1, The Obvious Argument, is that people have tried for a long time to find polynomial-time algorithms for canonical NP-complete problems such as Circuit Satisfiability and have consistently failed. The W[t] classes are defined in terms of weight-k circuit satisfiability, a quite natural restriction that has also received attention, although perhaps somewhat less. Here too, no algorithm essentially better than brute force has been found. Reason 2, The Empirical Argument, concerns the evidence provided by half a century of efforts by practical algorithm designers in the computer industry. This same history, viewed differently, also offers evidence of great efforts spent to devise FPT algorithms (before there was a name for this notion). Practitioners have long paid attention to parameters

Why is P Not Equal to NP?


and tried to exploit them where possible. One of the successes of the parameterized complexity framework is the way it has succeeded in capturing more sensitively the way that practical computing has actually been conducted all along. Some of the other Reasons that concern mathematical evidence, such as the surprising collapses that would occur if P = NP, are here less developed (which should make this an attractive area for investigation), but some striking things are known. The tower of parameterized intractability classes begins with: FPT ⊆ M[1] ⊆ W[1] ⊆ M[2] ⊆ W[2] ⊆ ··· and it is known that FPT is equal to M[1] if and only if the Exponential Time Hypothesis (ETH) fails [DEFPR03]. Thus FPT = W[1] would imply that n-variable 3SAT can be solved in 2^{o(n)} time. Some known lower bounds on circuit complexity carry over to provide indirect evidence that the W[t] hierarchy is proper [DF99]. Incidentally, but relevantly, we do not feel much intuitive connection to the ETH — this seems difficult (to us) to form much direct intuition about. How would a Gasarch-style survey of the ETH turn out?³


3 Because There Are Hard Puzzles

We first describe some concrete background experiences that have led to the Hard Puzzles Conjecture (HPC). We then show that the HPC implies P ≠ NP. Finally, we propose a way in which the HPC can be concretely explored, and some possible payoffs of such an empirical investigation.

3.1 Background

For many years the authors have been active in popularizing topics in theoretical computer science to elementary school children, using concrete activities such as described in the books This Is MEGA-Mathematics! [CF92] and Computer Science Unplugged [BWF96], and in the paper by Fellows and Koblitz about presenting modern cryptography to children [FK93]. The “P versus NP” question can be made concretely accessible to children in a number of ways. For example, to get the issue across concretely, one can first pass out a 2-colorable graph (without telling the children that it is 2-colorable), and explain the basic rule for a proper coloring: that vertices joined by an edge must get different colors, and the basic objective: finding a coloring with a minimum number of colors. Almost always, the solutions will improve for a few rounds, before someone discovers that a 2-color solution is possible (basically rediscovering the 2-coloring algorithm). In the second part of the presentation, one can pass out a 3-colorable graph, such as the one shown in Figure 1(b). This particular instance has proved itself in dozens of classroom visits as usually defying everyone’s efforts for most of an hour. Figure 1(a) shows a similarly “proven useful” small hard instance of Minimum Dominating Set. Based on these experiences we wish to raise a few issues about hard combinatorial problems and algorithm design that may have bearing on the P versus NP question. It seems to us remarkably easy to design relatively small hard instances for most NP-hard problems, and one can’t help sensing a kind of amorphous opacity beginning to assert itself even in these small instances: there just isn’t much structure to hang on to, and one doesn’t get much “inkling” of algorithmic possibility. It seems to us that for most of the problems for which we have P-time algorithms, one can point to such “inklings” and “partial algorithms” and “possible strategies” that begin to appear if you spend some time solving some concrete instances. Amorphous opacity is the quality of an unstructured maze. This is essentially Argument 6, The Known-Algorithms Argument, of Aaronson, but here experienced afresh in small hard instances in an elementary school classroom, using our heads as computers.

³ Experts in parameterized complexity are divided on whether perhaps M[1] = W[1]. This may be accessible to “holographic” proof techniques (if it is true).

Fig. 1. Some small hard instances of NP-hard problems used in classrooms: (a) Minimum Dominating Set, (b) 3-Coloring

In this context, the following seemingly natural conjecture concerning the existence of such hard puzzles has suggested itself.

The Hard Puzzles Conjecture (for Graph 3-Coloring). There exists a sequence of increasingly larger 3-colorable graphs G1, G2, G3, . . ., with |Gi| = i, such that ∀c ∃nc such that ∀n ≥ nc and for every algorithm φ for which: (1) |φ| ≤ c (c is thus a bound on the program size), and (2) φ computes a 3-coloring for Gn (the single “puzzle” graph Gn; there are no requirements on φ being able to solve any other puzzle), the program φ will necessarily require Ω(n^c) time to compute the 3-coloring.

We believe that, over time, this may come to be seen as a very natural conjecture (with versions, of course, for any NP-hard problem).⁴
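The 2-coloring algorithm that students typically rediscover in the first classroom activity described above is just breadth-first color propagation. A minimal sketch (our own illustration, not code from the cited classroom materials):

```python
from collections import deque

def two_color(n, edges):
    """Try to 2-color the graph on vertices 0..n-1.
    Returns a list of colors (0/1) if the graph is bipartite,
    or None if some odd cycle makes 2 colors impossible."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [None] * n
    for start in range(n):
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]   # forced by the edge rule
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # odd cycle: not 2-colorable
    return color

print(two_color(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # [0, 1, 0, 1]
print(two_color(3, [(0, 1), (1, 2), (2, 0)]))          # None
```

The contrast with the 3-coloring puzzle is the whole point: this propagation runs in linear time, while for 3 colors no comparably “forced” strategy presents itself.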

3.2 The Hard Puzzles Conjecture Implies P ≠ NP

The proof is immediate: if P = NP then there exists a single algorithm φ and two constants c0 and c1 such that: (1) |φ| ≤ c0, and (2) for every 3-colorable graph G of size n, φ takes time at most O(n^{c1}). In the conjecture, c plays both of these roles: placing a bound on the size of the algorithm, and giving us an exponent for a target polynomial running time. The conjecture essentially says that there exists a series of 3-coloring puzzles that are so intrinsically hard, that no matter how generous you are about the size c allowed for the algorithm, and for the exponent of the running time — if you go far enough out in the list of puzzles, and even if you are only looking for algorithms that solve a single puzzle, then you will be disappointed: any algorithm that is small enough will take too much time. (The reason for the restriction on the size of the algorithm should be clear, since without this restriction, for an algorithm designed to solve a single puzzle, there would always be the trivial linear-time algorithm that simply prints the solution.) Any proof of the HPC would seem to have to be extremely nonconstructive, and in any case cannot be expected anytime soon. We remark that the HPC seems to be a different, albeit related, notion to instance complexity, as defined by several authors [OKSW94,Mu00].

⁴ It might also be seen to be in harmony with the possible terrible existential destiny of mathematics sketched in Aaronson’s essay [Aar06b], where we may be confronted, perhaps forever, with key puzzles of mathematics that defy all efforts.


4 Engaging the Hard Puzzles Conjecture: A Proposal for a Stock Market of Hard Instances

If we could somehow find the hard puzzles promised by the conjecture, this could be useful and interesting. There seem to be several routes to “where the hard instances are”. We mention two that have previously been investigated:

1. Random Instances from the Phase Transition. Cheeseman et al. [CKT91] introduced the notion of phase transitions in the distribution of hard instances of NP-hard problems. One of their key examples was Graph Coloring, where they discuss polynomial-time reduction rules to simplify an instance.

2. FPT Polynomial-Time Kernelization. This is discussed in the monograph by Niedermeier [Nie06]. The hard instances are the kernelized instances. Whether a strong set of kernelization rules will take you to the phase transition boundary is an interesting unresolved issue. For example, if (G, k) is an instance of the Vertex Cover problem, then in polynomial time we can reduce to an equivalent instance (G′, k′) where k′ ≤ k and where the number of vertices in G′ is bounded by 2k′.

Thought Experiment. We propose to employ “the minds of children” in an implicit genetic algorithm. Suppose there were a Hard Instance Stock Market where (small?) investors could contribute a small, presumably hard, instance of an NP-hard problem for a small fee, such as $1. The Stock Market would evaluate the invested instances by gathering statistics on competitive puzzle-solving. Perhaps each instance in the top 10% is awarded a $60 prize, and every instance in the top 0.01% is awarded $1000. Investing in the market might allow one to freely download puzzle instances from the stock market to a cellphone. This business model would seem to combine two of mankind’s greatest addictions: puzzling, and something like gambling. Inevitably, if such a venture succeeded (and besides serving to popularize theoretical computer science on a grand scale), some developments can be predicted. If there were big money in it, then adults with computers would get involved.
Who would win in designing the best small hard instances? This is somewhat akin to the problems confronted by, e.g., chess-playing programs, and it is not at all clear that adults with computers would have an advantage over the minds of children or other natural players. This seems to be an interesting scientific question about human cognitive abilities in relation to the power of computers, one that we propose to call the reverse Turing test. A collection of small, somehow certifiably intrinsically hard instances would also seem to be useful for the testing of heuristics for hard problems. Perhaps, evidence that the HPC is true would gradually emerge.
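The kernelization route to hard instances (route 2 above) can be made concrete for Vertex Cover. The following is our own sketch of the classical Buss reduction rule; note that this simple rule only yields a kernel with O(k²) edges, whereas the 2k′-vertex bound quoted above requires the stronger Nemhauser–Trotter reduction, which we do not implement here.

```python
from collections import defaultdict

def buss_kernel(edges, k):
    """Reduce a Vertex Cover instance (G, k) with Buss's rule: a vertex
    of degree > k must be in every vertex cover of size <= k, so take it
    into the cover and decrement the budget.  Returns the kernelized
    (edges', k'), or None when the instance is recognisably a NO-instance.
    Once the rule is exhausted all degrees are <= k', so a YES-instance
    can retain at most k'^2 edges."""
    edges = [tuple(e) for e in edges]
    while True:
        deg = defaultdict(int)
        for u, v in edges:
            deg[u] += 1
            deg[v] += 1
        high = [v for v in deg if deg[v] > k]
        if not high:
            break
        w = high[0]                    # forced into the cover
        edges = [e for e in edges if w not in e]
        k -= 1
        if k < 0:
            return None                # budget exhausted: NO-instance
    if len(edges) > k * k:
        return None                    # too many edges for any k-cover
    return edges, k

# A star K_{1,5} with budget 1: the centre has degree 5 > 1, so the rule
# takes it, leaving an empty kernel with budget 0 (a YES-instance).
star = [(0, i) for i in range(1, 6)]
print(buss_kernel(star, 1))    # ([], 0)
```

Whatever survives such rules is, in the sense of the text, where the hardness of the instance actually lives.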




5 Discussion and Open Problems

In considering a discussion such as the one offered in this paper, it is necessary to confront an issue of traditional mathematical culture head on. There is a point of view held by some that any discussion of “mathematical intuition” is oxymoronic. Of course, this is an old discussion, taken up on the other side by, e.g., Lakatos [Lak76]. There are also “softer forms” of the opinion discouraging discussion, for example the one contributed by Fortnow to the Gasarch collection, where essentially it is argued that since we currently have so little programmatic traction towards settling the question, there is not enough relevant mathematical material to support a credible discussion of, say, serious intuition. (It is notable, and we think reflects this traditional mathematical taboo, that some of the responses to the Gasarch survey seem a bit flippant.) In the spirit of Aaronson’s thoughtful and serious attempt to catalog the reasons to believe P ≠ NP, we offer here an initial catalog of:

5.1 Four Reasons Why The Discussion is Interesting, Useful and Worthy

Reason 1. Inevitability and Responsibility. Like it or not, the discussion is already well underway, and will inevitably continue for the foreseeable future. Others will certainly be engaging in such discussion and speculation, because of the absolutely compelling nature of the question, and also its relative intuitive accessibility. The mathematical community might as well contribute what it can from an expert perspective, out of general responsibility to intellectual life.

Reason 2. Stimulation. The discussion is fun and irresistible (even to those who initially dismiss it, following the traditional mathematical taboos). It may serve to stimulate creative thinking about the problem, or energize and boost the morale of those spending time on it (as advocated by Ron Fagin in the Gasarch opinion catalog). Given the relative failure of mathematical approaches to date, speculative thinking on the question (and related issues) might help to stir up some fresh mathematical approaches.

Reason 3. Creative and Responsible Engineering. The fact is, the P ≠ NP conjecture is already “operationally true”, and it is far from the only mathematical conjecture that now has this status. The point here is that there is a distinction that should be made between the high church role of computational complexity theory, and what might be called its pastoral role in the world of algorithm design, where theoretical computer science attempts to minister to the needs of industry and the other sciences. (This might be considered as possibly its central mission, by analogy with theoretical physics and other theoretical sciences that have an explanatory or predictive or engineering-support mission.) At the pastoral level, the P ≠ NP conjecture is already “true enough for government work”, in the sense that on a day-to-day basis it functions (quite reasonably) to direct the efforts of algorithm designers.
Nobody at the pastoral level gets paid to continue working to find a polynomial-time constant-factor approximation algorithm after somebody else has shown that this is impossible unless P = NP. If you explained to the famous “boss” in the Garey and Johnson cartoons that, based on your high church belief that P = NP, you wanted to continue the lead role in this project, you’d be fired. The responsibility part here has to do with a more general situation that has developed: the entire enterprise of mathematical sciences and engineering has become immensely “forward leaning”. For example, the world economy now runs on the conjecture that Prime Factorization is difficult. And not just that. The companies that develop and sell elliptic curve cryptosystems have to gather investor money, and convince customers that the system is secure. Essentially they have to explain why investing in conjectures about elliptic curves is reasonable. Most investors would not have a problem investing in P ≠ NP. The creativity part here has to do with another distinction, between conjectures that have operational value, and those that do not. The Exponential Time Hypothesis and the conjecture



that FPT is not equal to W[1] have day-to-day uses at the pastoral level. It is useful to identify plausible conjectures that can help routinely to establish lower bounds on problem kernel sizes for parameterized problems that are FPT (see the discussion in Burrage et al. [BE+06], also [Fe06]), and we currently do not have such conjectures identified. Similar issues are confronted in the recent work of Harnik and Naor [HN07].

Reason 4. It is Good for the Field. Computational complexity theory might offer more job opportunities in the long run if it conceived of itself as concerned with the identification of statements that are “probably true” and that have operational value, whether provable or not, and with doing what it can to gather some evidence to support such conjectural frameworks that are useful at the pastoral level. Since most of the candidate statements imply P ≠ NP, the field is not going to find these gems by waiting for a mathematical resolution of the P versus NP question before finding the time and inclination. It is surprising that there is not a single mention of the FPT ≠ W[1] conjecture in such comprehensive surveys of the field as, for example, Fortnow and Homer [FH02], even though the pastoral role of the conjecture is so strong. The field of computational complexity theory seems to have centered itself on the P versus NP (and related) questions in a manner that one might expect for Recursion Theory II, but not necessarily for a theoretical science accountable to functional criticism in the manner of theoretical physics (which is held accountable to explanation and prediction). No other theoretical discipline has so tied itself to a central question that: (1) it apparently cannot answer, in the terms it has set for finding an answer; (2) the answer, were it proven, would surprise nobody; (3) the outcome would have absolutely no impact on practice. It would be unfortunate if the state of the field were to become the state of the question.

5.2 Two “New” Reasons for Believing P ≠ NP

We have offered here two reasons for believing P ≠ NP, the first being the conjecture that FPT is different from W[1]. We believe that this conjecture, while closely related and less well-known, is actually easier to think about and admits a defense at least as strong as that of P ≠ NP, despite being mathematically a “stronger” conjecture. The Hard Puzzles Conjecture (HPC) seems to us to arise quite naturally, and we have described its experiential origins in popularizing the P versus NP question to children. The HPC implies P ≠ NP, but it also seems compelling on intuitive grounds (at least to us). Basically the argument comes down to an “Extended Philosophical Argument”. Routinely, we experience situations where small hard puzzles defy everybody (who play the role of the many possible algorithms). More importantly, we have proposed a possible experiment towards exploring the intellectual terrain in the direction of the HPC that seems to be of independent scientific interest.

Final Note. The small hard instances depicted in Figure 1 do have some structure that can be exploited (they are really not good representatives of completely amorphous opacity): they are planar. It is known that both 3-Coloring and Minimum Dominating Set can be solved in time O(n^{O(√k)}) for planar graphs, and cannot be solved in time O(n^{o(√k)}) unless FPT = M[1] (equivalently, unless the ETH fails).

References

[Aar06a] S. Aaronson. Reasons to believe. Shtetl-Optimized: The Blog of Scott Aaronson (http://www.scottaaronson.com/blog/), posted 4 September 2006.
[Aar06b] S. Aaronson. Is P versus NP formally independent? Manuscript, 2006.



[BE+06] K. Burrage, V. Estivill-Castro, M. Fellows, M. Langston, S. Mac and F. Rosamond. The undirected feedback vertex set problem has polynomial kernel size. Proceedings IWPEC 2006, Springer-Verlag, Lecture Notes in Computer Science 4169 (2006), 192–202.
[BWF96] T. Bell, I. H. Witten and M. Fellows. Computer Science Unplugged: Off-line Activities and Games for All Ages, 231 pp., 1996. (Web available.)
[CF92] N. Casey and M. Fellows. This is MEGA-Mathematics!, 1992 (134 pp.), available for free from: http://www.c3.lanl.gov/~captors/mega-math.
[CCFHJ+04] J. Chen, B. Chor, M. Fellows, X. Huang, D. Juedes, I. Kanj and G. Xia. Tight lower bounds for certain parameterized NP-hard problems. Proceedings of the IEEE Conference on Computational Complexity (2004), 150–160.
[CKT91] P. Cheeseman, R. Kanefsky and W. Taylor. Where the really hard problems are. Proceedings IJCAI-91 (1991), 163–169.
[CKX05] J. Chen, I. Kanj and G. Xia. Simplicity is beauty: improved upper bounds for vertex cover. Manuscript, 2005.
[CM97] S. A. Cook and D. G. Mitchell. Finding hard instances of the satisfiability problem: a survey. In D. Du, J. Gu and P. Pardalos (eds.), The Satisfiability Problem, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 35 (1997), 1–17.
[DEFPR03] R. Downey, V. Estivill-Castro, M. Fellows, E. Prieto-Rodriguez and F. Rosamond. Cutting up is hard to do: the complexity of k-cut and related problems. Electronic Notes in Theoretical Computer Science 78 (2003), 205–218.
[DF99] R. G. Downey and M. R. Fellows. Parameterized Complexity. Springer-Verlag, 1999.
[Dow03] R. G. Downey. Parameterized complexity for the skeptic. Proceedings of the 18th IEEE Annual Conference on Computational Complexity (2003), 147–169.
[Fe06] M. Fellows. The lost continent of polynomial time. Proceedings IWPEC 2006, Springer-Verlag, Lecture Notes in Computer Science 4169 (2006), 276–277.
[FlGr06] J. Flum and M. Grohe. Parameterized Complexity Theory. Springer-Verlag, 2006.
[FK93] M. Fellows and N. Koblitz. Kid Krypto. Proceedings of Crypto ’92, Springer-Verlag, Lecture Notes in Computer Science Vol. 740 (1993), 371–389.
[FoGa06] L. Fortnow and W. Gasarch. Number 60: a videoblog response, in “61 Responses to Reasons to Believe.” Shtetl-Optimized: The Blog of Scott Aaronson (http://www.scottaaronson.com/blog/), posted 21 December 2006.
[FH02] L. Fortnow and S. Homer. A short history of computational complexity. (http://www.neci.jn.nec.com/homepages/fortnow), 2002.
[Gas02] W. I. Gasarch. Guest column: The P=?NP poll. SIGACT News 36 (www.cs.umd.edu/users/gasarch/papers/poll.ps), 2002.
[HN07] D. Harnik and M. Naor. On the compressibility of NP instances and cryptographic applications. Manuscript, 2006.
[Hem02] L. A. Hemaspaandra. SIGACT News complexity theory column 36. SIGACT News, 2002.
[IPZ01] R. Impagliazzo, R. Paturi and F. Zane. Which problems have strongly exponential complexity? Journal of Computer and System Sciences 63(4) (2001), 512–530.
[Lak76] I. Lakatos. Proofs and Refutations. Cambridge University Press (New York), 1976.
[Mu00] M. Mundhenk. On hard instances. Theoretical Computer Science 242 (2000), 301–311.
[Nie06] R. Niedermeier. Invitation to Fixed-Parameter Algorithms. Oxford University Press, 2006.
[OKSW94] P. Orponen, K. Ko, U. Schöning and O. Watanabe. Instance complexity. Journal of the Association for Computing Machinery, Vol. 41, No. 1 (1994), 96–121.
[Woe03] G. J. Woeginger. Exact algorithms for NP-hard problems: a survey. In M. Junger, G. Reinelt and G. Rinaldi (Festschrift eds.), Proceedings of the 5th International Workshop on Combinatorial Optimization - Eureka, You Shrink! Papers dedicated to Jack Edmonds, Springer-Verlag, LNCS (2003).

P = NP for Expansions Derived from Some Oracles

Christine Gaßner


Institut für Mathematik und Informatik, Ernst-Moritz-Arndt-Universität, F.-L.-Jahn-Straße 15a, 17487 Greifswald, Germany. [email protected]

Abstract. We consider the uniform model of computation over an arbitrary structure. We define an oracle which allows the fast deterministic decidability of all problems in the corresponding polynomial hierarchy and of all problems recognized in alternating polynomial time over this structure. Relative to this oracle, P = NP holds. By means of this oracle it is possible to expand structures over strings by an additional relation which implies P = NP. Moreover, we present an ordered structure over integers with P = NP.


1 Introduction

The uniform model of computation over arbitrary algebraic structures with two constants can be defined analogously to the BSS model over the real numbers, which was introduced by Blum, Shub, and Smale. For structures such as the ordered ring of the reals in the BSS model and the structure ({0, 1}; 0, 1; ; =) in the classical setting, questions such as P = NP are open. Only few structures of finite signature with P = NP are known. The constructions of these structures were based on an idea of Poizat to define an additional relation in order to get P = NP. In [8, 9] structures of finite signature are extended to structures over binary trees and trees of constant width, respectively. The new relations are defined on padded codes of the elements of a universal problem. In [10, 13, 19] structures over binary strings are expanded by relations derived from several hard problems. In [11, 12], for extensions of arbitrary structures, some relations are recursively defined on the length of strings. Following the constructions in [8, 9], we extend an arbitrary structure to some extension ĪK containing strings and two constants, and then we define some relation R implying that an NP-complete problem is decidable in polynomial time. In [10, 11, 12] the relations are derived from the known Satisfiability Problem. Here, the relation is defined by means of encoding and padding the elements of some oracle W which is recursively defined as an NP-complete problem recognized by a universal machine. Since each computation path can be described by a first-order formula, we can, analogously to [10, 12], show that, for any machine over the new structure, the replacements of large inputs and guesses by small strings are possible without changing the sequence of executed instructions. For this reason, we can obtain P^W = NP^W for ĪK and P = NP for the new structure. We want to mention that in [13] another idea to derive a structure with P = NP from an oracle is presented.
Some structure over binary strings with a special relation was defined by means of an arbitrary classical PSPACE-complete problem P for which P^P = NP^P was shown by Baker, Gill, and Solovay in [1]. Since the single elements are not assumed as known, another form of padding was necessary. While we double the length of the codes of the elements of some problem as in [10, 12], in [13] every element of the considered PSPACE-complete problem was transformed into an infinite number of padded strings. After introducing the computation model, we start with a statement including the result of Baker, Gill, and Solovay. For any structure IK with two constants we define some universal oracle O_IK with P_IK^{O_IK} = NP_IK^{O_IK}. It also allows that all problems recognized in alternating polynomial time over IK can be decided by an O_IK-machine in polynomial time. The mentioned oracle W is only defined for structures over strings, and it contains only tuples of characters. In the last section we transfer our result to an ordered structure over integers.

* I thank Volkmar Liebscher and Rainer Schimming for encouraging me to continue my investigations and for helpful hints.


2 The Model of Computation

Let IK = (U; d1, . . . , dk0; f1, . . . , fk1; r1, . . . , rk2, =) be a structure of finite signature with two constants, where d1 = a, d2 = b, d3, . . . , dk0 ∈ U are the constants of this structure, f1, . . . , fk1 are the operations of arities nf1, . . . , nfk1, and r1, . . . , rk2 are the relations of arities nr1, . . . , nrk2, respectively. We define the IK-machines analogously to [4, 17, 7] such that we get a natural format of abstract computers for such a structure. Every machine M is provided with registers Z1, Z2, . . . for the elements of U and with a fixed number of registers I1, I2, . . . , IkM for indices in IN \ {0}. For an input (x1, . . . , xn) ∈ U^∞ =df ∪_{i≥1} U^i, the sequence x1, . . . , xn, a, a, . . . is assigned to the registers Z1, Z2, . . .. The index registers get the content n. The labelled computation, copy, and branching instructions have the form Zj := fk(Zj1, . . . , Zjn_{fk}); Zj := dk; Z_{Ij} := Z_{Ik}; and if cond then goto i1 else goto i2; where cond can have the form Zj = Zk or rk(Zj1, . . . , Zjn_{rk}). The index registers are used in the copy instructions. We also allow Ij := 1; Ij := Ij + 1; and if Ij = Ik then goto i1 else goto i2;. Moreover, oracle machines can execute if (Z1, . . . , Z_{I1}) ∈ O then goto i1 else goto i2; for some oracle O ⊆ U^∞. The non-deterministic machines are able to guess an arbitrary number m of arbitrary elements y1, . . . , ym ∈ U in one step after the input and to assign the guesses to Zn+1, . . . , Zn+m. In any case, the size of an input (x1, . . . , xn) is, by definition, its arity n. If the output instruction is reached, then (Z1, . . . , Z_{I1}) is the output and the machine halts. Let M_IK and MN_IK be the sets of deterministic and non-deterministic IK-machines, respectively. Let M_IK(O) and MN_IK(O) be the sets of deterministic and non-deterministic IK-machines, respectively, which use the oracle O.
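The machine model just described can be made concrete with a small interpreter. The following Python sketch is purely our illustration (the names, the dictionary encoding of a structure, and the instruction tuples are ours, and only a few instruction kinds are shown): a structure supplies constants, operations, and relations, and a program is a list of labelled instructions executed over Z- and index registers.

```python
from collections import defaultdict

def run(program, structure, x, max_steps=10_000):
    """Run a deterministic machine over a structure
    IK = (U; constants; operations; relations, =) on the input tuple x.
    Z-registers hold elements of U (defaulting to the constant a = d1);
    index registers hold positive integers (initialised to n = len(x))."""
    consts, ops, rels = structure["d"], structure["f"], structure["r"]
    Z = defaultdict(lambda: consts[0])      # Z1, Z2, ... default to a
    for i, xi in enumerate(x, start=1):
        Z[i] = xi
    I = defaultdict(lambda: len(x))         # index registers start at n
    pc = 0
    for _ in range(max_steps):
        instr = program[pc]
        kind = instr[0]
        if kind == "op":                    # Zj := f_k(Zj1, ..., Zjm)
            _, j, k, args = instr
            Z[j] = ops[k](*(Z[i] for i in args))
            pc += 1
        elif kind == "const":               # Zj := d_k
            _, j, k = instr
            Z[j] = consts[k]
            pc += 1
        elif kind == "set1":                # Ij := 1
            _, j = instr
            I[j] = 1
            pc += 1
        elif kind == "if_rel":              # if r_k(...) then goto i1 else i2
            _, k, args, i1, i2 = instr
            pc = i1 if rels[k](*(Z[i] for i in args)) else i2
        elif kind == "if_eq":               # if Zj = Zk then goto i1 else i2
            _, j, k, i1, i2 = instr
            pc = i1 if Z[j] == Z[k] else i2
        elif kind == "output":              # output (Z1, ..., Z_{I1}), halt
            return tuple(Z[i] for i in range(1, I[1] + 1))
    raise RuntimeError("step budget exceeded")

# Over the structure ({0,1}; 0, 1; OR; =): compute the OR of two bits.
bits = {"d": [0, 1], "f": [lambda u, v: u | v], "r": []}
prog = [("op", 1, 0, (1, 2)),   # Z1 := OR(Z1, Z2)
        ("set1", 1),            # I1 := 1, so the output is (Z1,) only
        ("output",)]
print(run(prog, bits, (0, 1)))  # (1,)
```

The point of the uniform model is that nothing in this interpreter depends on what U is: swapping the `bits` dictionary for, say, the ordered ring of the reals changes the computable functions without changing the machine format.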
A deterministic IK-machine accepts or rejects, respectively, an input in U^∞ if the machine outputs a or b, respectively, without guessing. An IK-machine M accepts an input (x1, . . . , xn) ∈ U^∞ non-deterministically if there is some finite sequence (y1, . . . , ym) ∈ U^∞ such that M guesses exactly the elements y1, . . . , ym in one step and if M outputs a for the input (x1, . . . , xn) and these guesses. The execution of one instruction is one step of the computation. Each step can be executed in one fixed time unit. An IK-machine comes to the results in polynomially bounded time if there is a polynomial p such that, for every input (x1, . . . , xn) ∈ U^∞, the machine produces, in the deterministic case, the output or, in the non-deterministic case, one output a within at most p(n) time units. For each structure IK, let P_IK and NP_IK be the usual complexity classes of problems P ⊆ U^∞ decided or non-deterministically recognized by a machine in M_IK or in MN_IK in polynomial time. P_IK^O and NP_IK^O denote the polynomial-time complexity classes for IK-machines using the oracle O. FP_IK is the set of functions f : U^∞ → U^∞ computable over IK in polynomial time. For any structure IK, we can define the higher alternation levels of the polynomial hierarchy. A problem P ⊆ U^∞ is in Σ_IK^k (k ∈ IN) iff there exist a set P0 ∈ P_IK and polynomial functions p1, . . . , pk such that for any x ∈ U^n, the following condition holds: x ∈ P ⇔ (Q1 y^(1) ∈ U^{p1(n)}) ··· (Qk y^(k) ∈ U^{pk(n)}) (x, y^(1), . . . , y^(k)) ∈ P0. Qi stands for the existential quantifier ∃ if i is odd and for the universal quantifier ∀ if i is even.

There holds Σ_IK^k = NP_IK^{Σ_IK^{k−1}}. Π_IK^k denotes the class obtained if the alternating quantifiers start with the universal one. The polynomial hierarchy is defined by PH_IK = ∪_{k≥0} Σ_IK^k = ∪_{k≥0} Π_IK^k.



A problem P ⊆ U^∞ is in the class PAT_IK of the problems recognized in polynomial alternating time iff there exist a set P0 ∈ P_IK and some polynomial function q such that for any x ∈ U^n:

x ∈ P ⇔ (∃y_{1,1} ∈ U)(∀y_{1,2} ∈ U) ··· (∃y_{q(n),1} ∈ U)(∀y_{q(n),2} ∈ U) (x, y_{1,1}, y_{1,2}, . . . , y_{q(n),1}, y_{q(n),2}) ∈ P0.   (1)

We know that PH_IK ⊆ PAT_IK [5]. In the classical setting, we have PAT = PAT_{({0,1};0,1;;=)} = PSPACE (compare [2, 5]).


3 An Oracle O_IK with P_IK^{O_IK} = NP_IK^{O_IK}

In order to define a universal problem, we want to use the possibility to encode the machines over IK by strings in {a, b}*.

Definition 1. Let A* be the set of strings (or words) over a given alphabet A, and let ε be the symbol for the empty string. The concatenation of two strings s_1 and s_2 is denoted by s_1 s_2 as usual. For r ∈ A* and any sets S, S_1, S_2 ⊆ A*, let S_1 S_2 = {s_1 s_2 | s_1 ∈ S_1 & s_2 ∈ S_2}, rS = {r}S, and Sr = S{r}. For every s ∈ A*, let |s| be the length (the number of characters) of s. For any non-negative integer k and S ⊆ A*, let S^(≤k) = {r ∈ S | |r| ≤ k}, S^(=k) = {r ∈ S | |r| = k}, and so on.

Definition 2. Let S_code =df b²({a, b}* \ ({a, b}* b² {a, b}*))a be a set of codes such that the codes contain the substring b² as prefix only. Let Code*_IK be an injective mapping of the set of all deterministic and non-deterministic IK-machines into S_code such that every character of the programs is unambiguously translated into a string by this mapping. Let every index j of a variable or the labels i_1, i_2, ... be represented by a^j, and let every other character of the programs be represented by other suitable strings in (a{a, b}a ··· {a, b}a)^(=n_0) for one fixed integer n_0. Note that we omit the index IK since confusion is not to be expected.

In general, the strings over U are not elements of U. In such a case, we transform the strings into tuples, and we use the tuple notation (c_1, ..., c_k) ∈ U^∞ in order to point out that we shall use k registers for storing these objects.

Definition 3. For every non-empty string s = c_1 ··· c_k ∈ (U*)^(=k) ⊂ U*, let ⌈s⌉ be the representation of s in the form of a tuple (c_1, ..., c_k) ∈ U^k ⊂ U^∞; that means that ⌈c_1 ··· c_k⌉ = (c_1, ..., c_k). For any IK-machine M, let Code(M) = ⌈Code*(M)⌉.

As in the last section, we use the vector notation in a tuple for a sequence of components of the tuple. (x, ⌈c_1 ··· c_k⌉) stands for (x_1, ..., x_n, c_1, ..., c_k) and (x_1, ..., x_{l_0}, c_1, ..., c_k), respectively, and the like. The following problems can be recognized by universal oracle machines.

Definition 4. Let

    UNI_IK(O) = {(x, Code(M), ⌈b^t⌉) | x ∈ U^∞ & M ∈ M^N_IK(O) & M(x) ↓^t}

be the universal NP^O_IK-problem, where M(x) ↓^t means that M accepts x for some guesses within t steps. Let O_IK = ∪_{i∈IN} O^(i)_IK be an NP_IK-universal oracle where O^(0)_IK = ∅ and O^(i)_IK = {(x, Code(M), ⌈b^t⌉) ∈ U^i | M ∈ M^N_IK(∪_{j<i} O^(j)_IK) & t ≥ 1 & M(x) ↓^t}. For any i > 0, UNI_IK(O_IK) ∩ U^i = O^(i)_IK holds since we assume that the codes of the oracle machines do not depend on the used oracle.
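For intuition, the defining property of the code set S_code in Definition 2 — a string over {a, b} that begins with b², ends with a, and contains b² only as its prefix — can be checked mechanically. A small Python sketch (the function name is ours, not part of the paper):

```python
# S_code = b^2 ({a,b}* \ ({a,b}* b^2 {a,b}*)) a:
# a code starts with 'bb', ends with 'a', and 'bb' occurs only as the prefix.

def is_code(s: str) -> bool:
    return (
        len(s) >= 3
        and set(s) <= {"a", "b"}
        and s.startswith("bb")
        and s.endswith("a")
        and "bb" not in s[1:]       # the substring b^2 appears as prefix only
    )

assert is_code("bba")               # the shortest possible code
assert is_code("bbabababa")
assert not is_code("bbba")          # 'bb' also occurs at position 1
assert not is_code("abba")          # does not start with b^2
```

Because no code is a proper prefix of another code ending before its final `a`, such a code set supports unambiguous decoding of machine programs.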


Christine Gaßner

Proposition 1. UNI_IK(O_IK) = O_IK ∈ P^{O_IK}_IK = NP^{O_IK}_IK. ⊓⊔

Proposition 2. Any problem in PAT_IK can be reduced to O_IK by some function in FP_IK. That means that PH_IK ⊆ PAT_IK ⊆ P^{O_IK}_IK.

Proof. We use the fact that ∃y_{i,1}∀y_{i,2} H is true iff ∃y_{i,1}¬(∃y_{i,2}¬H) is true. Let M_0 decide any problem P_0 in (1) without using guesses in a time bounded by some polynomial p. Then M_{1,1} recognizes P in (1) non-deterministically if, for i ≤ q(n), the machine M_{i,1} works with the program (P_{i,1}), the machine M_{i,2} uses, for i < q(n), the program (P_{i,2}), and M_{q(n),2} outputs a for any guess y_{q(n),2} iff (z^{(q(n),1)}, y_{q(n),2}, Code(M_0), ⌈b^{p(n+2q(n))}⌉) ∉ O_IK. In any case, M_{q(n),2} computes the |Code*(M_0)| + p(n + 2q(n)) new values in polynomial time t_{q(n),2}. Its results are used by M_{q(n),1}, and so on. The desired reduction function in FP_IK assigns, to any input x ∈ U^n ⊂ U^∞, the code of M_{1,1}, which determines the code of the next machine, and so on.

Program (P_{i,1}).
1: input z^{(i−1,2)} = (x, y_{1,1}, y_{1,2}, ..., y_{i−1,1}, y_{i−1,2}); guess y_{i,1}; ...;
2: compute (Code(M_{i,2}), ⌈b^{t_{i,2}}⌉);
3: if (z^{(i−1,2)}, y_{i,1}, Code(M_{i,2}), ⌈b^{t_{i,2}}⌉) ∈ O_IK then goto 4 else goto 5;
4: output b;
5: output a.

Program (P_{i,2}).
1: input z^{(i,1)} = (x, y_{1,1}, y_{1,2}, y_{2,1}, ..., y_{i−1,2}, y_{i,1}); guess y_{i,2}; ...;
2: compute (Code(M_{i+1,1}), ⌈b^{t_{i+1,1}}⌉);
3: if (z^{(i,1)}, y_{i,2}, Code(M_{i+1,1}), ⌈b^{t_{i+1,1}}⌉) ∈ O_IK then goto 4 else goto 5;
4: output b;
5: output a. ⊓⊔

4  A Relation R with P_{IK̄R} = NP_{IK̄R}

Definition 5. Let IK = (A; d_1, ..., d_{e_0}; f_1, ..., f_{e_1}; r_1, ..., r_{e_2}, =). Let f_i be an n_i-ary operation and let r_i be an n'_i-ary relation. e_0 = 0, e_1 = 0, and/or e_2 = 0 are possible. We also permit A = ∅. Let a and b be two constants where a = d_1 if e_0 ≥ 1, b = d_2 if e_0 ≥ 2, and a, b ∈ A if |A| ≥ 2. Let 𝔸 = (A ∪ {a, b})* (written simply A below where no confusion can arise), and let IK̄ be the structure

    (𝔸; a, b, d_3, ..., d_{e_0}, ε; f_1^{(A)}, ..., f_{e_1}^{(A)}, add, sub_l, sub_r; r_1^{(A)}, ..., r_{e_2}^{(A)}, =).

a, b, d_3, ..., d_{e_0}, and ε are the only constants. add is a binary operation for adding a character to a string. sub_r and sub_l are unary operations for computing the last character and the remainder of a string, respectively. That means that these functions are defined for the strings s ∈ 𝔸, r ∈ 𝔸 \ (A ∪ {a, b}), and c ∈ A ∪ {a, b} by add(s, c) = sc, sub_l(sc) = s, sub_r(sc) = c, add(s, r) = ε, sub_l(ε) = ε, and sub_r(ε) = ε. For each i ≤ e_1, let f_i^{(A)} be an n_i-ary function of 𝔸^{n_i} into A ∪ {ε} ⊂ 𝔸 given, for all (s_1, ..., s_{n_i}) ∈ 𝔸^{n_i}, by

    f_i^{(A)}(s_1, ..., s_{n_i}) = f_i(s_1, ..., s_{n_i}) if (s_1, ..., s_{n_i}) ∈ A^{n_i}, and f_i^{(A)}(s_1, ..., s_{n_i}) = ε otherwise.

For all i ≤ e_2, let r_i be an n'_i-ary relation on A. Let r_i^{(A)}(s_1, ..., s_{n'_i}) be true in IK̄ if and only if (s_1, ..., s_{n'_i}) ∈ A^{n'_i} and r_i(s_1, ..., s_{n'_i}) is true in IK.

In the definition of the next oracle and the additional relation, we use the fact that tuples of strings can be encoded by strings in the following way.

Definition 6. For every string s ∈ A, let the value ⟨s⟩ be recursively defined by ⟨ε⟩ = a and ⟨rc⟩ = ⟨r⟩ca for all strings r ∈ A and all characters c ∈ A ∪ {a, b}. For every integer n > 1 and every tuple (s_1, ..., s_n) ∈ A^n, let ⟨s_1, ..., s_n⟩ be the string ⟨s_1⟩b² ··· ⟨s_n⟩b².
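The encoding ⟨·⟩ of Definition 6 interleaves every character with an a, so the separator b² can never occur inside an encoded component. A Python sketch over a two-character base alphabet (the function names are ours):

```python
# Sketch of the string/tuple encoding of Definition 6.
# <eps> = 'a'; <rc> = <r> + c + 'a'; tuples join the parts with 'bb'.

def encode_string(s: str) -> str:
    """Compute <s>: start from 'a' and append c + 'a' for each character c."""
    out = "a"
    for c in s:
        out += c + "a"
    return out

def encode_tuple(*strings: str) -> str:
    """Compute <s1,...,sn> = <s1> b^2 ... <sn> b^2."""
    return "".join(encode_string(s) + "bb" for s in strings)

# Every second character of <s> is an 'a', so 'bb' never occurs inside <s>
# and the tuple encoding is injective.
print(encode_string(""))      # -> 'a'
print(encode_string("xy"))    # -> 'axaya'
print(encode_tuple("x", ""))  # -> 'axabbabb'
```

This is only an illustration of the combinatorial idea; in the paper the encoded strings are additionally wrapped into tuples ⌈·⌉ over U.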


Definition 7. Let W = ∪_{i∈IN} W^(i) be an NP_{IK̄}-universal oracle where W^(0) = ∅ and

    W^(i) = {(⌈⟨x⟩⌉, Code(M), ⌈b^t⌉) ∈ (A ∪ {a, b})^i | M ∈ M^N_{IK̄R}(∪_{j<i} W^(j)) & t ≥ 1 & M(x) ↓^t}.

Let IK̄R be the expansion of IK̄ by the corresponding relation R. For l_1 > 0 and for each x ∈ A^{l_0}, let

    m̄_0 = m̄_0^{(l_0,l_1)} = l_0(l_1 + 1) − 1,   m_0 = m_0^{(l_0,l_1)} = (l_0 + 1)(l_1 + 1) − 2,
    m̄_1 = m̄_1^{(l_1,0)} = l_1 − 1,   m_1 = m_{1,x}^{(l_1,0)} = max{|x_1|, ..., |x_{l_0}|} + l_1,
    max_i = max_i^{(l_0,l_1)} = 4m_i + 5m̄_i + 2|Code*_0| + 11,

where Code*_0 is the code of some IK̄R-machine which accepts any input within 4 steps. Since we use these notations in a fixed context, we omit the indices (l_0, l_1), (l_1, 0), and x.

Note that we do not distinguish between the codes of the machines in M^N_{IK̄R}|_B and M^N_{IK̄R}. Since we know that every problem in NP_{IK̄R} can be reduced to UNI by a function in FP_{IK̄R}, we shall show P_{IK̄R} = NP_{IK̄R} by proving the following.

1. UNI can be reduced to the problem SUB-UNI by a function in FP_{IK̄R}.
2. SUB-UNI ⊂ RES-UNI.
3. SUB-UNI ∩ RES-UNI ∈ NP^W_{IK̄}.
4. NP^W_{IK̄} = P^W_{IK̄}.
5. P^W_{IK̄} ⊆ P_{IK̄R}.

We consider the computation of l_1 steps of a machine for an input x ∈ A^{l_0} and guesses. Let (z_1, ..., z_{l_0}) = (x_1, ..., x_{l_0}), and let the guesses and the computed values be in {z_{l_0+1}, ..., z_{l_0+l_1}} = {y_1, ..., y_{l_1}}. Then there are chains of strings z_{i_1} ⊂_1 ··· ⊂_1 z_{i_v} where each predecessor is the result of deleting the last character of its successor and where each z_{i_j} is the prefix of level k − j of z_{i_k} if k ≥ j. Whereas each minimal element of a maximal chain is a prefix of the other strings in the chain, no prefix of this minimal element is computed from the strings in this chain. Let us specify the notions used and introduce some useful notation.

Definition 9. For s ∈ A and k ∈ IN, let the prefix s^[k] of level k of s be recursively defined by s^[0] = s and s^[k+1] = sub_l(s^[k]). For any r, s ∈ A, let r ⊂_1 s be true iff r = s^[1] and s ≠ ε. For each tuple h = (h_1, ..., h_w) ∈ A^w, let M_h be the set {h_1, ..., h_w}; in particular, M_z is defined by z = (z_1, ..., z_{l_0+l_1}). The set M = {g_1, ..., g_v} ⊆ M_z is called a maximal chain of M_z if g_1 ⊂_1 ··· ⊂_1 g_v and there is no g ∈ M_z satisfying g ⊂_1 g_1 or g_v ⊂_1 g. Let V_z be the set of all maximal chains of M_z.
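The prefix levels s^[k], the covering relation ⊂_1, and the maximal chains of Definition 9 can be sketched directly in Python (the names and the enumeration strategy are ours; the definition itself does not prescribe an algorithm):

```python
# Sketch of Definition 9: prefix levels, the relation r <_1 s,
# and the maximal chains of a finite set of strings.

def prefix_level(s: str, k: int) -> str:
    """s^[k]: delete the last character k times (s^[0] = s)."""
    return s[:max(len(s) - k, 0)]

def covers(r: str, s: str) -> bool:
    """r <_1 s: s is nonempty and r is s without its last character."""
    return s != "" and r == prefix_level(s, 1)

def maximal_chains(strings):
    """All chains g1 <_1 ... <_1 gv in the set that cannot be extended."""
    m = set(strings)

    def extend(chain):
        succ = [h for h in m if covers(chain[-1], h)]
        if not succ:
            yield chain                      # chain has no extension in m
        else:
            for h in succ:                   # chains may branch upwards
                yield from extend(chain + [h])

    chains = []
    for g in m:
        if not any(covers(h, g) for h in m):  # g is a chain minimum
            chains.extend(extend([g]))
    return chains

print(sorted(maximal_chains({"a", "ab", "abb", "abc"})))
# -> [['a', 'ab', 'abb'], ['a', 'ab', 'abc']]
```

Note the branching: the two maximal chains share the minimal element "a", exactly the situation the sets V_{z,g} collect below.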


For any z ∈ M_z, let the minimal element min_z(z) be the smallest prefix r of z for which there is some maximal chain r ⊂_1 ··· ⊂_1 z ⊂_1 ··· in M_z such that r is the smallest element of this chain. For all h with M_h ⊆ M_z, let Min_h be the set {min_z(z) | z ∈ M_h}. Moreover, for all g ∈ Min_z, let V_{z,g} be the set of all chains M ∈ V_z containing g as minimal element, and let V_{z,g} also denote ∪_{M∈V_{z,g}} M.

Each minimal element g ∈ Min_z is an input value, a guess, or a computed value, and it is a prefix of all strings in V_{z,g}. Each computation of any value can be described by some chains of total length ≤ l_1 + 1, one of which includes an input value or, in the other case, at least one guess. Therefore, there are some i ≤ l_0 or j ≤ l_1 and some k_1 ≤ l_1 and k_2 < l_1 such that g = x_i^[k_1] or g = y_j^[k_2] for some guess y_j if g ∈ Min_x and g ∈ Min_z \ Min_x, respectively. Any part of a chain in M_z which contains only guesses and computed values has length ≤ l_1.

Lemma 1.
(1) If z_i = z_j, z_i ⊂_1 z_j, or z_j ⊂_1 z_i holds, then min_z(z_i) = min_z(z_j) = g for some string g and z_i, z_j ∈ V_{z,g}.
(2) For each minimal element g ∈ Min_x there is some i_g ≤ l_0 such that V_{z,g} ⊆ x_{i_g}^[l_1] A^(≤l_0+2l_1−1).
(3) For each minimal element g ∈ Min_z \ Min_x there are some j_g ≤ l_1 and l_g < l_1 such that V_{z,g} ⊆ y_{j_g}^[l_g] A^(≤l_1−1) and y_{j_g} is a guess. ⊓⊔

Proposition 4. UNI = RES-UNI.

Proof. We can replace each prefix y_{j_g}^[l_g] ∈ A^(>m_1) in all z ∈ V_{z,g} by some

    q_g = ⟨a^{j_g} b^{m_1+m̄_1−j_g}⟩ Code*_0 b^4 a^{k_g} ∈ A^(>m_1+m̄_1) ∩ A^(≤max_1−m̄_1)

such that R(y_{j_g}^[l_g] a^k) = R(q_g a^k) for all k ≤ l_1 − 1. Then q_{g_1} is not a prefix of q_{g_2}, and conversely, if g_1 ≠ g_2. The behaviour of any M in M^N_{IK̄R} on a given input and the new guesses within the first l_1 steps is the same as on this input and the old guesses. ⊓⊔

If each string s in a condition R(s) has polynomial length, then the branching conditions can be transformed into oracle queries in polynomial time.

Proposition 5. SUB-UNI ∩ RES-UNI ∈ NP^W_{IK̄}. ⊓⊔

If we want to make analogous replacements of prefixes in an input x, it is required that all possible different combinations of guesses are considered. Different guesses could generate different computation paths, different sets M_z \ M_x, and different relevant chains. In any case of replacements, all possible chains must keep their form. Because of (1) in Lemma 1 this condition is sufficient. Besides (2) in Lemma 1 we can use that, for any g ∈ Min_x and any x_k ∈ V_{z,g}, there is a sequence x_{j_1} = x_k^[k_1], x_{j_2} = x_{j_1}^[k_2], ..., x_{j_t} = x_{j_{t−1}}^[k_t] = x_{i_g}^[k'] in V_{z,g} where k_i ≤ l_1 + 1 and k_t, k' ≤ l_1. The relevant connections between input values can be described by the following relation. The first part of its definition for any values x_i and x_j is restricted to the existence of chains of {x_i, x_j, y_1, ..., y_{l_1}}. The transitive closure allows us to consider all common prefixes of input values which are reachable by traversing all possible chains described above.

Definition 10. For i, j ≤ l_0, let x_i ≽ x_j be true if x_i^[l_1+1] = x_j or x_i^[l_1] ∈ {x_j, x_j^[1], ..., x_j^[l_1]}. Moreover, let x_i ≽ x_k and x_k ≽ x_j imply x_i ≽ x_j.

If x_i ≽ x_j, then there is some s ∈ A^(≥l_1) ∩ A^(≤m̄_0) such that x_i = x_j^[l_1] s. Since, for any z and each minimal element g, x_k ∈ V_{z,g} implies x_k ≽ g, this relation is suitable to decompose M_x into equivalence classes p_v S_v without decomposing the sets of the form V_{z,g} ∩ M_x for any z with M_x ⊆ M_z. We define prefixes p_1, p_2, ... which satisfy V_{z,g} ∩ M_x ⊆ p_{v_g} S_{v_g} ⊆ p_{v_g} A^(≤m̄_0) and, consequently, V_{z,g} ⊆ p_{v_g} A^(≤m̄_0+l_1), such that they are suitable for the desired replacements.


Definition 11. For x ∈ A^{l_0}, let the decomposition M_x = S_0 ∪ p_1 S_1 ∪ ··· ∪ p_w S_w into the equivalence classes M_v = p_v S_v be inductively defined. Let p_0 = ε and S_0 = S_0^(a) ⊆ A^(≤1+m̄_0) and S_0 = S_0^(b) ⊆ A^(≤m_0+m̄_0), respectively, where

    S_0^(a) = {x ∈ M_x | (∃x' ∈ M_x)((x')^[l_1] ∈ A^(≤1) & x ≽ x')},
    S_0^(b) = {x ∈ M_x | (∃x' ∈ M_x)((x')^[l_1] ∈ A^(≤m_0) & x ≽ x')}.

If, for v > 0, N_v =df M_x \ ∪_{i=0}^{v−1} M_i ≠ ∅, then let i_v be the minimum of {i | x_i ∈ N_v & ¬(∃x ∈ N_v)(x_i ≽ x & x_i^[l_1] ≠ x^[l_1])}, p_v = x_{i_v}^[l_1], and S_v = {r | (∃x ∈ N_v)(x ≽ x_{i_v} & x = p_v r)}. In case of N_v = ∅, we set w = v − 1.

These decompositions allow the following two statements, where we start the decomposition with S_0^(a) in the first case and with S_0^(b) in the second one.

Proposition 6. We have NP^W_{IK̄} = P^W_{IK̄}.

Proof. If we replace each prefix p_v of x_i ∈ p_v S_v \ S_0^(a) by some new prefix q_v, where 1 < |q_j| = |q_{j'}| = 2 + m̄_0 and q_j ≠ q_{j'} for any j ≠ j', then, for suitable additional guesses, the behaviour of any M ∈ M^N_{IK̄}(W) on the strings resulting from z is the same as on the old strings in the course of the first l_1 steps. Therefore, there is some reduction function assigning some (⌈⟨x'⟩⌉, Code(M), ⌈b^{l_1}⌉) ∈ W, where x' ∈ (A^(≤2+m̄_0+m̄_0))^{l_0}, to any (x_1, ..., x_{l_0}, Code(M), ⌈b^{l_1}⌉) ∈ UNI_{IK̄}(W). This function is computable over (A; a, b, ε; add, sub_l, sub_r; =) in polynomial time. ⊓⊔

Proposition 7. The problem UNI can be reduced to SUB-UNI by a machine over (A; a, b, ε; add, sub_l, sub_r; R, =) in polynomial time.

Proof. If we replace each prefix p_v of x_i ∈ p_v S_v \ S_0^(b) by some new prefix

    q_v = ⟨a^v b^{m_0+m̄_0−v}⟩ Code*_0 b^4 a^{k_v},

where R(p_v a^k) = R(q_v a^k) for all k ≤ m_0, then the behaviour of any machine in M^N_{IK̄R} on the new strings derived from z and suitable further guesses is the same as on the old strings within the first l_1 steps. Therefore, there is some reduction function which assigns, in polynomial time, some (x', Code(M), ⌈b^{l_1}⌉) ∈ SUB-UNI to any (x_1, ..., x_{l_0}, Code(M), ⌈b^{l_1}⌉) ∈ UNI, where x' ∈ (A^(≤max_0))^{l_0} holds. ⊓⊔

If the tuple in a query with respect to W has polynomial length, then this query can be transformed into a condition with respect to R in polynomial time.

Proposition 8. P^W_{IK̄} ⊆ P_{IK̄R}.

⊓⊔

Theorem 1. P_{IK̄R} = NP_{IK̄R}. ⊓⊔

Note that, analogously to Proposition 6, we can replace the inputs of the machine M_0 ∈ M_{IK̄} deciding the problem P_0 in (1), where we use l_0 = n + 2q(n), l_1 = p(l_0), and m̄_0 = (n + 2q(n))(p(n + 2q(n)) + 1) − 1. Therefore we also get the following.

Proposition 9. PAT_{IK̄} ⊆ P^W_{IK̄} ⊆ P_{IK̄R}. ⊓⊔

5  An Ordered Structure with P = NP

Many structures are known, e.g. the ring of real numbers and the additive group of real numbers, for which we have P ≠ NP with respect to the uniform model of computation. However, if we consider these structures with order, then we do not know the relationship between P and NP. Here we show that it is possible to construct an ordered structure with P = NP.


First, we transfer our definitions for strings to the binary codes of non-negative integers. The constants a and b correspond to 1 and 0, respectively.

Let ĨN = (IN; 0, 1; shr, shl, inc ∘ shl; ≤) where shr and shl are the usual shift operations shifting all binary digits (all bits) by one bit position to the right and to the left, respectively, and inc is the usual increment operation. We have shr(n) = n div 2 (where n div m = ⌊n/m⌋), shl(n) = 2n, and inc(n) = n + 1. R in ĨN_R can be fixed by R(n) ↔ ∃r(⌈r⌉ ∈ W_ĨN & bin(n) = r1^{|r|}), where bin(z) is the binary representation of z ∈ IN and W_ĨN ⊆ {0,1}* is defined as in Definition 7, where ⟨x⟩ =df ⟨bin(x_1), ..., bin(x_{l_0})⟩. Here, let UNI = UNI_{ĨN_R}(∅) and

    SUB-UNI = {(x_1, ..., x_{l_0}, Code(M), ⌈b^{l_1}⌉) ∈ UNI | x_1, ..., x_{l_0} < 2^{max_0}},
    RES-UNI = {(x_1, ..., x_{l_0}, Code(M), ⌈b^{l_1}⌉) ∈ UNI | M(x) ∈ M^N_{ĨN_R}|_{[0, 2^{max_1})}},

where max_1 is here defined by means of m_1 = max{|bin(x_1)|, ..., |bin(x_{l_0})|} + l_1.

Let us sketch the proof of P_{ĨN_R} = NP_{ĨN_R}. Analogously to the proof of Proposition 4, a minimal element g = bin(p_g), which is here given by p_g = y_{j_g} div 2^{l_g} ≥ 2^{m_1+m̄_1} for some guess y_{j_g}, can be replaced by some small bin(q_g), where |bin(q_g)| > m_1 + m̄_1, such that R(p_g 2^k + Σ_{i=0}^{k−1} 2^i) = R(q_g 2^k + Σ_{i=0}^{k−1} 2^i) for all k ≤ l_1 − 1 and the order remains valid. If g is a prefix of level l_y < l_1 of bin(y) for some guess or computed value y, then the new value results from y by y − (y div 2^{l_y})2^{l_y} + q_g 2^{l_y}. If M_x is decomposed into the sets M_v = {p_v 2^{|bin(r)|} + r | r ∈ S_v} according to (b) in Definition 11, where ≽ refers to the binary codes and p_0 = 0, then, for all 1 ≤ v ≤ w and Z_v =df {p_v 2^{|bin(r)|} + r | r < 2^{m_0}}, there hold

(1) p_v ≥ 2^{m_0}, S_v ⊆ [2^{l_1−1}, 2^{m̄_0}), S_0 ⊆ [0, 2^{m_0+m̄_0}).
(2) If z < z' < z'', z, z'' ∈ Z_v, and |bin(z)| = |bin(z'')|, then z' ∈ Z_v.

p_v ≥ 2^{m_0+m̄_0} can be replaced by some q_v if R(p_v 2^k + Σ_{i=0}^{k−1} 2^i) = R(q_v 2^k + Σ_{i=0}^{k−1} 2^i) for all k ≤ m_0 and the following conditions hold for the possible cases: (a) p_v div 2^j = p_{v'} or (b) p_v div 2^j > p_{v'} > p_v div 2^{j+1} for some j ≤ m_0, and (c) p_v div 2^{m_0+1} ≥ p_{v'}. We set q_0 = 0. If (c), then |bin(q_v)| > |bin(q_{v'})| + m_0. In case of (a), let q_v = q_{v'} 2^j + (p_v − p_{v'} 2^j). If (b), then Z_v ∩ Z_{v'} = ∅ and, for z ∈ Z_v, z' ∈ Z_{v'}, and |bin(z)| = |bin(z')|, |bin(p_v)| = |bin(p_{v'})| + j implies z > z', and z < z' results from |bin(p_v)| = |bin(p_{v'})| + j + 1. Thus, we can choose |bin(q_v)| = |bin(q_{v'})| + j and q_v div 2^j > q_{v'}. The order also remains valid if |bin(z)| = |bin(z')| + 1. ⊓⊔
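The correspondence between the string operations and the integer operations of ĨN is elementary: shr deletes the last binary digit (like sub_l), while shl and inc ∘ shl append a 0 or a 1 (like add with b or a). A Python sketch (function names are ours):

```python
# Sketch of the operations of the structure IN~ = (IN; 0, 1; shr, shl, inc o shl; <=)
# and their reading on binary strings (a <-> 1, b <-> 0, as in the text).

def shr(n: int) -> int:
    """Shift right: delete the last binary digit (n div 2)."""
    return n >> 1

def shl(n: int) -> int:
    """Shift left: append the binary digit 0 (2n)."""
    return n << 1

def inc_shl(n: int) -> int:
    """inc o shl: append the binary digit 1 (2n + 1)."""
    return (n << 1) + 1

n = 0b1011                      # bin(n) = '1011'
assert shr(n) == 0b101          # last digit deleted
assert shl(n) == 0b10110        # '0' appended
assert inc_shl(n) == 0b10111    # '1' appended
```

In particular, appending k ones to bin(p), as in the queries R(p·2^k + Σ_{i<k} 2^i), is just k applications of inc ∘ shl.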

References

1. Baker, T., J. Gill, and R. Solovay (1975). Relativizations of the P =? NP question. SIAM J. Comput. 4, 431–442.
2. Balcázar, J., J. Díaz, and J. Gabarró (1988/90). Structural Complexity I and Structural Complexity II. Springer-Verlag.
3. Blum, L., F. Cucker, M. Shub, and S. Smale (1998). Complexity and Real Computation. Springer-Verlag.
4. Blum, L., M. Shub, and S. Smale (1989). On a theory of computation and complexity over the real numbers: NP-completeness, recursive functions and universal machines. Bulletin of the Amer. Math. Soc. 21, 1–46.
5. Bournez, O., F. Cucker, P. J. de Naurois, and J.-Y. Marion (2006). Implicit complexity over an arbitrary structure: Quantifier alternations. Information and Computation 202, 2, 210–230.
6. Cucker, F., M. Shub, and S. Smale (1994). Separation of complexity classes in Koiran's weak model. Theoretical Computer Science 133, 3–14.
7. Gassner, C. (2001). The P-DNP problem for infinite abelian groups. Journal of Complexity 17, 574–583.
8. Gassner, C. (2004). Über die Konstruktion von Strukturen endlicher Signatur mit P = NP. Preprint 1/2004.¹

¹ Preprint-Reihe Mathematik, E.-M.-Arndt-Universität Greifswald


9. Gassner, C. (2004). NP ⊂ DEC und P = NP für Expansionen von Erweiterungen von Strukturen endlicher Signatur mit Identitätsrelation. Preprint 13/2004.¹
10. Gassner, C. (2006). A structure with P = NP. CiE 2006. Computer Science Report Series of the University of Wales Swansea CSR 7, 85–94.
11. Gassner, C. (2006). Expansions of structures with P = NP. CiE 2006. Computer Science Report Series of the University of Wales Swansea CSR 7, 95–104.
12. Gassner, C. (2006). From Structures with P ≠ NP to Structures with P = NP. Preprint 10/2006.¹
13. Hemmerling, A. (2005). P = NP for some structures over the binary words. Journal of Complexity 21, 557–578.
14. Koiran, P. (1994). Computing over the reals with addition and order. Theoretical Computer Science 133, 35–47.
15. Mainhardt, G. (2004). P versus NP and computability theoretic constructions in complexity theory over algebraic structures. Journal of Symbolic Logic 69, 39–64.
16. Meer, K. (1992). A note on a P ≠ NP result for a restricted class of real machines. Journal of Complexity 8, 451–453.
17. Meer, K. (1993). Real number models under various sets of operations. Journal of Complexity 9, 366–372.
18. Poizat, B. (1995). Les Petits Cailloux. Aléas.
19. Prunescu, M. (2006). Structure with fast elimination of quantifiers. Journal of Symbolic Logic 71, 321–328.

Resolution Proofs Hidden in Mathematical and Physical Structures and Complexity

Annelies Gerber*
Department of Computer Science, University Paris 6, 75013 Paris, France
[email protected]

Abstract. Certain proofs of classical propositional logic can be implicitly contained in particular mathematical or physical structures thus encoding logical information (proofs) within such structures. The complexity of such hidden proofs as well as the computational complexity of the corresponding decision problems must therefore be reflected in the mathematical or physical structure containing a proof. Propositional resolution proofs with non-tautological clauses and no clause contraction can be naturally embedded in IRn . We give a direct analogy between certain input resolution proofs and particular closed curves in IRn . We also show that unit resolution proofs can be viewed as sequences of completely inelastic particle collisions on n-dimensional manifolds with a flat connection.

1  Introduction

Apart from examining properties of particular proof systems directly, we can also ask whether certain proofs are implicitly contained in particular mathematical or physical structures. This can lead to an analogy between a proof and such a mathematical or physical structure, where a particular proof is encoded in that structure. Analogies between proofs and geometrical structures could be of particular interest. In [3], problems associated with describing complicated mathematical objects by implicit rules or patterns, when explicit definitions are too complicated or not even feasible, are addressed. Various examples of mathematical structures encoding logical information exist, such as Boolean algebras [1], by means of which classes of logically equivalent formulas can be represented. In a series of articles, Girard [12] introduced a 'geometry of interaction' for linear logic based on C*-algebras. In [2], Carbone established a correspondence between propositional resolution proofs and diagrams in the theory of small cancellation. There are also interesting links between computational complexity and phase transitions in statistical physics [20]. For any mathematical or physical structure which implicitly contains a proof of a particular proof strategy, it necessarily has to be the case that the complexity of such a proof and of the corresponding decision problem is reflected in this structure. Therefore, mathematical structures implicitly containing proofs may lead to new (polynomial-time) reductions between the corresponding decision problems. If a (polynomial-time) reduction exists, then there also exists an algorithm computable by a Turing machine [22] with the same time complexity for the corresponding decision problem within the mathematical or physical structure (up to polynomial time).
Since each Turing machine defines a Turing-computable function, such analogies may also provide examples of computable functions being hidden in mathematical or physical structures (for a more detailed discussion of incomputability in nature see, for instance, [5]). Many different resolution strategies, such as Davis–Putnam resolution [7], exist. Resolution calculus for predicate logic was first developed in a more general form by Robinson [21]. A detailed account of predicate resolution calculus is given in [19].

* Supported by a grant from the Holcim Foundation, Switzerland.

All examples of analogies


discussed here are based on propositional resolution calculus (for a detailed account of propositional resolution see [16]). Here, we are going to present examples of how some input and unit resolution proofs are encoded in particular mathematical or physical structures in IR^n. In Section 2, we give a summary of relevant material from resolution calculus and complexity theory. In Section 3, we explain how resolution proofs, where no clause contraction occurs, can be regarded as paths in IR^n. For certain input resolution proofs we give a direct analogy with particular closed curves. Finally, we show that unit resolution proofs can be viewed as particular sequences of completely inelastic particle collisions on an n-dimensional manifold with a flat connection.

2  Propositional Resolution and Complexity

Any formula of classical propositional logic is logically equivalent to at least one formula in CNF, which only contains the logical connectives ∨, ∧ and ¬. Any formula F in CNF can be regarded as a conjunction of clauses. A clause either consists of the empty clause ⊥ containing no literals, of a single literal, or of a disjunction of a number of literals. A literal l_i is a propositional variable v_i or its negation ¬v_i, 1 ≤ i ≤ n. The complement of a literal l is ¬l and commonly denoted by l̄, whereas the complement of ¬l is l. Clauses with multiple occurrences of literals such as v_1 ∨ v_1 ∨ v_3 are un-reduced clauses, whereas clauses in which each literal or its complement occurs at most once are reduced. Resolution calculus is based on sets of clauses and one deduction rule only, namely the rule that from two parent clauses F ∨ l and G ∨ l̄ we can infer F ∨ G:

    F ∨ l    G ∨ l̄
    ----------------- (Res),
         F ∨ G

where F ∨ G is called the resolvent. We assume clause contraction to apply whenever it occurs: when we have two parent clauses of the form F ∨ G ∨ l and F ∨ H ∨ l̄ with a common sub-clause F, then {F ∨ G ∨ l, F ∨ H ∨ l̄} ⊢_Res F ∨ G ∨ H and not F ∨ F ∨ G ∨ H. Resolution derivations can be represented as trees. If a clause is used as a parent clause in several resolution steps, we can share the sub-tree leading to this clause, so that we obtain a directed acyclic graph. However, all characterisations of resolution proofs in this article are based on tree-like resolution. Classical propositional resolution is refutationally complete. Certain resolution strategies such as input resolution or unit resolution are no longer refutationally complete. In input resolution derivations, one parent clause at any resolution step always has to be an initial clause. Unit resolution is a strategy where at each resolution step one parent clause consists of a single literal only.
In [4] an important equivalence between input resolution and unit resolution with respect to resolution refutations was shown: a formula F in CNF possesses a unit resolution refutation if and only if F possesses an input resolution refutation (see also [16] for further details). For each resolution strategy, one can also define its corresponding read-once version, as introduced in [14]. Decision problems for a number of resolution strategies are NP-complete (a large number of mostly NP-complete problems is given in [8]). Iwama and Miyano [14] showed that read-once resolution is NP-complete (see also [17] for a more general proof). Kleine Büning and Zhao proved that read-once unit resolution is also NP-complete [18]. They also showed that a set of clauses admits an input resolution proof if and only if it admits a read-once input resolution proof. Unit resolution is a curious example of a resolution strategy which is P-complete [15] but whose read-once version is NP-complete [18]. Links between propositional logic and questions from proof complexity also exist. Cook and Reckhow [6] showed that there exists a propositional proof system with polynomial-sized proofs for all tautologies if and only if NP = coNP. Resolution calculus, though, cannot be


a candidate to show a potential equality since there exist exponential lower bounds for the resolution size of some classes of unsatisfiable formulas [13].
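The single deduction rule (Res) of Section 2, together with clause contraction, can be sketched in Python, representing a reduced clause as a frozenset of literals (integer i for v_i, −i for ¬v_i; this representation is ours, chosen for illustration):

```python
# One propositional resolution step on reduced, set-represented clauses.
# Literals: i stands for v_i, -i for its negation; frozenset() is ⊥.

def resolve(c1: frozenset, c2: frozenset, lit: int) -> frozenset:
    """From c1 = F ∨ lit and c2 = G ∨ ¬lit infer the resolvent F ∨ G.

    Set union performs clause contraction automatically: a sub-clause
    common to F and G appears only once in the resolvent.
    """
    assert lit in c1 and -lit in c2, "lit must occur positively in c1, negatively in c2"
    return (c1 - {lit}) | (c2 - {-lit})

# {v1 ∨ v2, ¬v2 ∨ v3} ⊢ v1 ∨ v3
r = resolve(frozenset({1, 2}), frozenset({-2, 3}), 2)   # -> the clause v1 ∨ v3

# A unit resolution refutation of {v1, ¬v1 ∨ v2, ¬v2}:
r = resolve(frozenset({1}), frozenset({-1, 2}), 1)      # -> {v2}
r = resolve(r, frozenset({-2}), 2)                      # -> ⊥ (empty clause)
assert r == frozenset()
```

In the unit refutation above, every step has a single-literal parent, and every step also has an initial clause as a parent, illustrating the equivalence of unit and input refutations mentioned above.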

3  Resolution Proofs as Closed Curves Embedded in IR^n

The idea to represent resolution proofs of certain resolution strategies over n atoms as paths in IR^n is based on two observations. Firstly, the order in which literals occur in a clause is normally irrelevant, i.e. v_1 ∨ v_2 and v_2 ∨ v_1 are regarded as the same clause. Secondly, each literal in a clause is 'connected' to the other literals by the same logical connective '∨'. This suggests that we can represent non-tautological clauses either as vectors or as points in IR^n. To a given (un-)reduced, non-tautological clause F,

    F = v_1 ∨ ··· ∨ v_1 ∨ ··· ∨ v_n ∨ ··· ∨ v_n ∨ ¬v_1 ∨ ··· ∨ ¬v_1 ∨ ··· ∨ ¬v_n ∨ ··· ∨ ¬v_n,   (1)

where v_i occurs ε_i times and ¬v_i occurs ε̄_i times, and the ε_i, ε̄_i are non-negative integers (ε_i > 0 → ε̄_i = 0 and ε̄_i > 0 → ε_i = 0), we associate a clause-vector F by setting

    F := (ε_1 − ε̄_1)e_1 + ... + (ε_n − ε̄_n)e_n.   (2)

We have v_i ↦ e_i and ¬v_i ↦ −e_i, and if the clause F is reduced, then ε_i, ε̄_i ∈ {0, 1}. Instead of a vector, we can also associate a point x_F ∈ ZZ^n ⊂ IR^n with a clause F by defining the coordinates of x_F to be given by

    x_F := (ε_1 − ε̄_1, ε_2 − ε̄_2, ..., ε_n − ε̄_n).   (3)

λj S j = 0 .

(4)

j=1

It is clear merely from complexity considerations that condition (4) is by no means sufficient for a set S of initial clauses to admit any resolution proof. Further, if clause contraction occurs during a proof, then this amounts to a non-linear operation and it needs a different mathematical approach [10]. Resolution proofs with clause contraction can theoretically still be embedded in IRn but we cannot expect to for instance determine in an efficient way whether a given curve of piecewise straight lines does in fact correspond to any resolution proof.

Resolution Proofs

173

Remark 1. The logical connective ∨ between two literals in a non-tautological clause F can be interpreted as ’+’ in IRn . If we apply ∨ to two clauses F, G such as in F ∨ G, then F + G corresponds to the non-tautological, un-reduced part of F ∨ G only. We only interpret ¬ as ’−’ in IRn when ¬ occurs in negated variables within clauses as this is sufficient for resolution calculus. If we still wish to interpret ¬ as ’−’ when applied to clauses then this is no longer compatible with de Morgan’s rules: i.e. F = v1 ∨v2 7→ F = e1 +e2 , where then −F = −e1 −e2 . Clearly, −F does not correspond to ¬F ≡ ¬v1 ∧ ¬v2 , which is interpreted as the set of vectors {−e1 , −e2 }. 3.1

Input Resolution Proofs Encoded as Closed Curves in IRn

There exist particular closed curves of piecewise straight lines in IRn which correspond to input resolution proofs with non-tautological clauses and where no clause contraction occurs. Condition (4) is a necessary condition for a set S of initial clauses to admit such proofs but it is by no means sufficient. The following theorem gives a sufficient condition in the form of a direct analogy. Theorem 1. Given a set S = {S1 , S2 , . . . , Sk } of k reduced, non-tautological initial clauses. S possesses an input resolution refutation of length m with non-tautological resolvents and where no clause contraction occurs if and only if for the corresponding set of k clause-vectors S j and a set of k non-negative integers λj , not all identically zero, holds that k X

λj S j = 0 ,

j=1

k X

λj ||S j ||2 = 2m ,

j=1

k X

λj = m + 1 ,

j=1

and if an ordering of the m + 1 clause-vectors {S 1 , . . . , S 1 , S 2 , . . . , S k , . . . , S k } (each Sj occurring with multiplicity λj ), exists, given by {S i0 , S i1 , S i2 , . . . , S im }, ||S im || = 1, for which we have l X hS ij , S il+1 i = −1 , 0 ≤ l ≤ m − 1 . (5) j=0

The corollary given below follows directly from the above theorem, whereby the existence of an appropriate ordering of the clause-vectors is now guaranteed by the existence of m + 1 'binary' k-dimensional vectors V_r, each with ||V_r|| = 1.

Corollary 1. Given a set S = {S_1, S_2, . . . , S_k} of k reduced, non-tautological initial clauses. Then S possesses an input resolution refutation of length m with non-tautological resolvents and where no clause contraction occurs if and only if there exist m + 1 k-dimensional vectors V_r = ∑_{j=1}^{k} v^j_r e_j, with v^j_r ∈ {0, 1} and ||V_r|| = 1, so that for λ^j := ∑_{l=0}^{m} v^j_l, 1 ≤ j ≤ k, we have ∑_{j=1}^{k} λ^j S_j = 0, ∑_{j=1}^{k} λ^j ||S_j||^2 = 2(∑_{j=1}^{k} λ^j − 1) = 2m, and for 0 ≤ l ≤ m − 1 it holds that

    ∑_{j=0}^{l} ∑_{r,s=1}^{k} ⟨v^r_j S_r, v^s_{l+1} S_s⟩ = ⟨R_l, ∑_{s=1}^{k} v^s_{l+1} S_s⟩ = −1 ,   where   R_l = ∑_{j=0}^{l} ∑_{r=1}^{k} v^r_j S_r .   (6)
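As an illustration (ours, not from the paper), the numeric conditions of Theorem 1 can be checked mechanically for the small unsatisfiable set S = {v1 ∨ v2, ¬v1, ¬v2}, whose input refutation has length m = 2; the function names below are our own choices for this sketch.

```python
# Sketch: checking the Theorem 1 conditions for S = {v1 v v2, ~v1, ~v2}.
# A literal v_i is encoded as e_i and ~v_i as -e_i, as in the paper's
# encoding (2); everything else here is our own illustrative naming.

n = 2  # number of propositional variables

def clause_vector(clause):
    """Encode a reduced, non-tautological clause, given as a set of signed
    integers (+i for v_i, -i for ~v_i), as a vector in IR^n."""
    v = [0] * n
    for lit in clause:
        v[abs(lit) - 1] = 1 if lit > 0 else -1
    return v

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

S = [clause_vector({1, 2}),   # v1 v v2  ->  e1 + e2
     clause_vector({-1}),     # ~v1      -> -e1
     clause_vector({-2})]     # ~v2      -> -e2

lam = [1, 1, 1]               # multiplicities lambda^j
m = sum(lam) - 1              # third condition: sum_j lambda^j = m + 1

# sum_j lambda^j S_j = 0
assert all(sum(l * s[i] for l, s in zip(lam, S)) == 0 for i in range(n))
# sum_j lambda^j ||S_j||^2 = 2m
assert sum(l * dot(s, s) for l, s in zip(lam, S)) == 2 * m
# ordering condition (5) for the ordering S_1, S_2, S_3, with ||S_{i_m}|| = 1
order = [S[0], S[1], S[2]]
assert dot(order[-1], order[-1]) == 1
for l in range(m):
    assert sum(dot(order[j], order[l + 1]) for j in range(l + 1)) == -1

print("Theorem 1 conditions hold; proof length m =", m)
```

The partial sums of the ordered clause-vectors trace exactly the closed curve of the corresponding input refutation: v1 ∨ v2 resolved with ¬v1 gives v2, then with ¬v2 gives the empty clause.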

Proof. Theorem 1 provides a direct analogy between input resolution proofs without clause contraction and certain piecewise straight lines. We give a concise version of the proof here (see [9] for further details). ⇒: By assumption, for the given set S of initial clauses an input resolution proof of length m, in which no clause contraction occurs, exists. We can then order the clauses in the same way as they appear in the corresponding linear proof-tree, as {S_{i_0}, S_{i_1}, . . . , S_{i_m}}, and associate a clause-vector S_{i_j} with each of them. Since the input proof

174

Annelies Gerber

has non-tautological resolvents and no clause contraction occurs, the clause-vectors R_l corresponding to the resolvents R_l, 1 ≤ l ≤ m, are given by R_l = ∑_{j=0}^{l} S_{i_j}. It then follows that k non-negative integers λ^j with the desired properties exist, so that R_m = ∑_{j=0}^{m} S_{i_j} = ∑_{r=1}^{k} λ^r S_r = 0 and ⟨R_l, S_{i_{l+1}}⟩ = −1, 0 ≤ l ≤ m − 1.
⇐: We define a set of vectors R_l := ∑_{j=0}^{l} S_{i_j}; then R_{l+1} = R_l + S_{i_{l+1}}. To each S_{i_{l+1}} and R_l, respectively, correspond non-tautological clauses through (2). We can show that the R_l are in fact reduced clauses, so that |R_l| = ||R_l||^2, by using R_{l+1} = R_l + S_{i_{l+1}}, ⟨R_l, S_{i_{l+1}}⟩ = −1 and ∑_{j=1}^{k} λ^j ||S_j||^2 = 2m. So no pair R_l, S_{i_{l+1}} can contain a common sub-clause. By calculating ⟨R_{l+1}, R_{l+1}⟩, using its bilinearity as well as the fact that ⟨R_l, S_{i_{l+1}}⟩ = −1 ≠ 0, 0 ≤ l ≤ m − 1, we can infer that the R_l with R_m = ⊥ are the non-tautological resolvents of an input resolution proof where no clause contraction occurs. □

The closed continuous curves of piecewise straight lines corresponding to such proofs, where the S_{i_j} correspond to initial clauses in S and the R_j to resolvents, can be illustrated as follows:

[Figure: a closed curve of piecewise straight lines through the origin 0; the successive segments are the clause-vectors S_{i_j}, S_{i_{j+1}}, S_{i_{j+2}}, . . ., and the vectors R_j, R_{j+1}, R_{j+2}, . . . from the origin to the successive vertices are the partial sums (resolvents).]

IR^n is especially suited to encode unit resolution proofs since neither tautological clauses nor clause contraction can occur during a unit resolution proof (when S consists of reduced, non-tautological initial clauses). Functions λ^j with ∑_{j=1}^{k} λ^j S_j = 0, which encode unit resolution proofs as closed curves, are not excessively complex, since the unit resolution decision problem is tractable [15].

4  Unit Resolution Proofs and Particle Collisions

Apart from representations as particular closed curves in IR^n, unit resolution proofs can also be encoded by sequences of particle collisions on an n-dimensional manifold with a flat connection. A unit resolution proof is then given by a sequence of particular, completely inelastic particle collisions. A collision of two particles is called completely inelastic if the two particles coalesce after the collision. Completely inelastic collisions are collisions with a maximal loss of kinetic energy; momentum, however, is conserved during each collision. We now associate an initial-clause-particle P_F with a non-tautological clause F, and we assume that all such initial-clause-particles have equal mass m. We define the velocity of P_F to be the clause-vector F as given by (2). Then we place a sufficiently large, finite number of copies of each initial-clause-particle on an n-dimensional manifold M with Euclidean parallelism, such as IR^n. Such a manifold is required in order to ensure that the clause-particles move in the appropriate directions before and after a completely inelastic collision, so that certain collisions can correspond to unit resolution steps. The flat torus T^n is a good example of a compact n-dimensional manifold with Euclidean parallelism. It can be regarded as the quotient space IR^n/Z^n, where opposite sides are identified. If any two clause-particles P_{V_i} and P_{V_j}, created by successive sequences of completely inelastic collisions of some initial-clause-particles, undergo a completely inelastic collision, then the momentum m_i V_i + m_j V_j of the two colliding clause-particles is conserved. The momentum of any such clause-particle P_{V_i} is given by m_i V_i = m(∑_{j=1}^{n} µ^i_j e_j), where m_i = k_i m for some integer k_i ≥ 1, and all µ^i_j ∈ {0, ±1, . . . , ±k_i}. With such a clause-particle we then associate the clause corresponding to the clause-vector k_i V_i = ∑_{j=1}^{n} µ^i_j e_j through (2).

Unit clauses in unit resolution proofs are either initial clauses of length one or resolvents of length one created during the proof. If we want to describe unit resolution proofs as sequences of particle collisions, then one of the colliding clause-particles necessarily has to correspond to a unit clause. We shall call this particle a unit-clause-particle. We now define a unit-resolution-collision between two clause-particles P_{V_i} and P_{V_j}, which corresponds to a unit resolution step for the respective clauses. Since momentum is conserved during collisions, we can infer that ||m_i V_i|| = m ⇔ m_i V_i = ±m e_s for some s ∈ {1, 2, . . . , n}.

Definition 1. A collision between two clause-particles P_{V_i} and P_{V_j} is a unit-resolution-collision if and only if either ||m_i V_i|| = m or ||m_j V_j|| = m, and ⟨m_i V_i, m_j V_j⟩ = −m^2.

Remark 2. If a unit-resolution-collision between two particles occurs, this does not ensure that the two colliding particles were previously both created exclusively by a sequence of unit-resolution-collisions. In order to ensure that a series of collisions corresponds to a unit resolution proof, we need to verify that all clause-particles involved in the collisions producing the 'proof' were themselves created exclusively by sequences of unit-resolution-collisions from some initial-clause-particles.

Theorem 2. Given a set S = {S_1, S_2, . . . , S_k} of k non-tautological, reduced initial clauses.
Then S admits a (tree-like) unit resolution proof if and only if on a given n-dimensional manifold M with flat connection there exist q ≥ 2 initial-clause-particles P_{S_{i_j}}, S_{i_j} ∈ {S_1, S_2, . . . , S_k}, all of equal mass m, and initial coordinates (x^l_{i_j}) for each P_{S_{i_j}}, 1 ≤ l ≤ n, such that a particle P_{V_q} of mass m_q and velocity V_q = 0 is created through a series of q − 1 unit-resolution-collisions amongst the initial-clause-particles and particles subsequently created solely by unit-resolution-collisions.

Proof. Theorem 2 provides a direct analogy between unit resolution proofs and particular sequences of completely inelastic collisions.
⇒: Since there exists a unit resolution proof for S, there exists a proof-tree corresponding to that proof. We associate a clause-particle P_{S_{i_j}} with momentum m S_{i_j} with each initial clause in that proof-tree. To each unit resolution step there then corresponds a unit-resolution-collision between particles created solely by unit-resolution-collisions. Since momentum is conserved during each collision, there exist an integer q ≥ 2 and a clause-particle P_{V_q} of mass m_q such that V_q = 0. In order to find the initial coordinates (x^l_{i_j}) for each initial-clause-particle P_{S_{i_j}} involved in the collisions, we can trace back the trajectories from the last created clause-particle P_{V_q} at rest (for more details see [9]).
⇐: This direction follows directly from the fact that each unit-resolution-collision corresponds to a unit resolution step for the corresponding clauses, and that only unit-resolution-collisions occur. Since a series of such collisions exists producing a clause-particle P_{V_q} at rest, this corresponds to the empty clause being derived by a series of unit resolution steps from the set S of clauses. □
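A hypothetical toy simulation of this correspondence (unit mass m = 1, vectors as plain lists; the class and function names are our own, not the paper's): the refutation of S = {v1 ∨ v2, ¬v1, ¬v2} becomes q = 3 particles undergoing q − 1 = 2 unit-resolution-collisions that end in a particle at rest.

```python
# Sketch: unit resolution as completely inelastic collisions (m = 1).
# A clause-particle carries a mass (an integer multiple of the unit mass)
# and a momentum vector; a collision adds masses and momenta.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm2(u):
    return dot(u, u)

class Particle:
    def __init__(self, mass, momentum):
        self.mass = mass          # k_i * m, with m = 1
        self.momentum = momentum  # m_i V_i

def is_unit_resolution_collision(p, q):
    """Definition 1: one colliding particle is a unit-clause-particle
    (||momentum|| = m) and the momenta have inner product -m^2 (m = 1)."""
    return (norm2(p.momentum) == 1 or norm2(q.momentum) == 1) \
        and dot(p.momentum, q.momentum) == -1

def collide(p, q):
    """Completely inelastic collision: masses coalesce, momentum is conserved."""
    return Particle(p.mass + q.mass,
                    [a + b for a, b in zip(p.momentum, q.momentum)])

# S = {v1 v v2, ~v1, ~v2}: three initial-clause-particles (q = 3).
p12 = Particle(1, [1, 1])    # v1 v v2
n1  = Particle(1, [-1, 0])   # ~v1
n2  = Particle(1, [0, -1])   # ~v2

assert is_unit_resolution_collision(p12, n1)
r1 = collide(p12, n1)        # momentum [0, 1] -> resolvent v2
assert is_unit_resolution_collision(r1, n2)
r2 = collide(r1, n2)         # momentum [0, 0] -> empty clause, particle at rest
print(r2.mass, r2.momentum)  # 3 [0, 0]
```

The final particle at rest, of mass q·m, plays the role of the empty clause, exactly as in the ⇐ direction of the proof.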

5  Conclusion

We gave an analogy between some resolution proofs and particular closed curves in IR^n. We also encoded unit resolution proofs as sequences of completely inelastic particle collisions. These examples provide direct analogies between the proofs and their mathematical or physical representation. Based on the knowledge of the mathematical or physical structures characterising certain proofs directly, it is of interest to examine how such proofs can equivalently be represented implicitly, since this can lead to implicit logical descriptions of these proofs [11]. Further, mathematical and physical structures representing proofs also implicitly contain the corresponding decision problems. For some analogies it may be interesting to investigate whether polynomial-time reductions exist between the two sets of decision problems. This could also produce examples of how particular computable functions, associated with the corresponding decision problems, are encoded mathematically. If we wish to find analogies between resolution proofs where clause contraction occurs and mathematical or physical structures, then Euclidean geometry is no longer suitable to provide 'natural' analogies. Different mathematical structures are necessary to represent such proofs [10].

Acknowledgement The author would like to thank Professor A. Carbone for valuable discussions.

References

1. Boole, G.: An investigation of the laws of thought. Walton and Maberley, London (1854). Republished by Open Court, La Salle (1954)
2. Carbone, A.: Group cancellation and resolution. Studia Logica 82 (2006) 73-93
3. Carbone, A., Semmes, S.: A graphic apology for symmetry and implicitness. Oxford Mathematical Monographs, Oxford University Press (2000)
4. Chang, C.L.: The unit proof and the input proof in theorem proving. J. Assoc. Comput. Mach. 17 (1970) 698-707
5. Cooper, S.B., Odifreddi, P.: Incomputability in Nature. Computability and Models: Perspectives East and West. Kluwer Academic/Plenum Press (2003) 137-160
6. Cook, S.A., Reckhow, R.A.: The relative efficiency of propositional proof systems. J. Symb. Logic 44 (1979) 36-50
7. Davis, M., Putnam, H.: A computing procedure for quantification theory. J. Assoc. Comput. Mach. 7 (1960) 201-215
8. Garey, M.R., Johnson, D.S.: Computers and intractability. Freeman, San Francisco (1979)
9. Gerber, A.: Analogies between propositional resolution and mathematical or physical structures. (preprint)
10. Gerber, A.: Mathematical models for resolution proofs with clause contraction. (in preparation)
11. Gerber, A.: On implicit characterisations of resolution proofs. (in preparation)
12. Girard, J.-Y.: On a geometry of interaction. NATO Adv. Sci. Inst. Ser. F Comput. Systems Sci. 139, Springer, Berlin (1995) 145-191
13. Haken, A.: The intractability of resolution. Theoret. Comput. Sci. 39 (1985) 297-308
14. Iwama, K., Miyano, E.: Intractability of read-once resolution. Proceedings Structure in Complexity Theory, 10th Annual Conference, IEEE (1995) 29-36
15. Jones, N.D., Laaser, W.T.: Complete problems for deterministic polynomial time. Theoret. Comput. Sci. 3 (1977) 105-117
16. Kleine Büning, H., Lettmann, T.: Propositional logic: deduction and algorithms. Cambridge University Press (1999)
17. Kleine Büning, H., Zhao, X.: The complexity of read-once resolution. Ann. Math. Artif. Intell. 36 (2002) 419-435
18. Kleine Büning, H., Zhao, X.: Read-once unit resolution. LNCS 2919, Springer (2004) 356-369
19. Leitsch, A.: The resolution calculus. Springer, Berlin (1997)
20. Monasson, R., Zecchina, R., Kirkpatrick, S., Selman, B., Troyansky, L.: Determining computational complexity from characteristic 'phase transitions'. Nature 400 (1999) 133-137
21. Robinson, J.A.: A machine-oriented logic based on the resolution principle. J. Assoc. Comput. Mach. 12 (1965) 23-41
22. Turing, A.M.: On computable numbers, with an application to the Entscheidungsproblem. Proc. Lond. Math. Soc. Series 2, 42 (1936/37) 230-265

Hybrid Finite Computation

Luís Mendes Gomes¹ and José Félix Costa²

¹ Department of Mathematics, University of Azores
[email protected]
² Department of Mathematics, I.S.T., Technical University of Lisbon
Centro de Matemática e Aplicações Fundamentais, Complexo Interdisciplinar, University of Lisbon
[email protected]

Abstract. Taking the simplest kind of finite state automaton, typically used in the digital setting, but whose states are continuous instead of discrete, we show that such automata can only recognize periodic infinite patterns. In our case such patterns are generated by real recursive functions, a recent trend in the analog setting, which are an extension to the reals of Kleene's recursive functions. Thus, we show that these automata can only recognize periodic real recursive functions. We also show that these are naturally approximated by Fourier series. With these results in hand, we bring together not only the concept of periodicity in real recursive function theory, and consequently Fourier series, but also automata with continuous states and their computational limits, in a mathematical characterization of hybrid finite computation.

1  Introduction

In a broad sense, hybrid computation includes all computing techniques combining some of the features of digital computation with some of the features of analog computation. Recall that digital computation has been dominated by the unifying work of Turing since the mid-1930s, while analog computation has not yet experienced such a unification. Consequently, there is also a lack of consensus about the most appropriate formal characterization of hybrid computation. But it is well known that hybrid computation occurs at the crossroads of several scientific directions: it is based on ideas coming from computer science and mathematics (see e.g. [2]), and it is also at the intersection of numerical analysis and computer algebra [4]. In the theory of analog computation, which has its roots in Claude Shannon's General Purpose Analog Computer (GPAC) [11], each state of a machine is continuous rather than discrete. Much of the recent research on this kind of computation is part of a wider program of exploring alternative approaches to classical computation. Among these approaches we have neural networks and quantum computation. We also have an idealization of numerical algorithms, where real numbers are entities in themselves rather than (finite) strings of digits. Actually, one of the most interesting and elegant approaches to analog computation was introduced by Cris Moore in his seminal paper [7], and it is analogous to Kleene's recursive function theory [5]. Real recursion theory, introduced in [7], has been considered as a model of analog computation, and it has also been used to obtain analog characterizations of classical computational complexity classes [9]. One of the operators borrowed from classical recursion theory, analog minimalization, is far from physically realizable and does not fit well with the analytic realm of analog computation.
But, as emphasized in [8], a most natural operator taken from Analysis, the operator of taking a limit, can properly be used to enhance real recursion theory, providing not only good solutions to puzzling problems raised by the original model but also the opportunity to bring together classical computation and real and complex Analysis. Automata theory is usually viewed as the study of sets of strings or ω-strings over a finite alphabet accepted by finite state machines. Recently, some work has been done to lift concepts

Hybrid Finite Computation

179

of automata theory from discrete to continuous time [10]. Instead of signals defined over discrete sequences of time instants, one considers signals defined over the non-negative reals. An interesting subclass of such signals is the set of piecewise continuous functions, because of their well-known relationship with Fourier analysis (see e.g. [15]). In this paper, we take as our starting point the real recursive function theory of [7], following the more recent work of [8]; we then define periodic real recursive functions and their relationship with Fourier series. Finally, we show that finite state automata whose states are continuous instead of discrete, considered as a version of the automata over continuous time found in [10], can only recognize such periodic real recursive functions.

2  Real recursive functions

In [7], a set of (vector-valued) functions over R^n, called R-recursive functions, was defined following the inductive approach taken for the construction of recursive functions over N that can be found e.g. in [5]. The discrete recursion operator is replaced by a differential recursion operator, the analog counterpart of classical recurrence, and thus the set of R-recursive functions generates a model of continuous-time computation. More recently, in [8], the µ-operator defined over the R-recursive functions, which is the analog counterpart of the classical minimalization operator, was replaced by an infinite limit operator. This operator is taken from Analysis and, as we can see in [8], it can properly be used to enhance the theory of recursion over the reals, providing good solutions to puzzling problems raised by the original model in [7].

Definition 1. [8] The class REC(R) of real recursive vector functions³ is generated from the real recursive scalars 0, 1, −1, and the real recursive projections I^n_i(x_1, . . . , x_n) = x_i, 1 ≤ i ≤ n, n > 0, by the following operators:

Composition: if f is a real recursive vector function with n k-ary components and g is a real recursive vector function with k m-ary components, then the vector function with n m-ary components, 1 ≤ i ≤ n,
λx_1 . . . λx_m . f_i(g_1(x_1, . . . , x_m), . . . , g_k(x_1, . . . , x_m))
is real recursive.

Differential recursion: if f is a real recursive vector function with n k-ary components and g is a real recursive vector function with n (k + n + 1)-ary components, then the vector function h of n (k + 1)-ary components which is the solution of the Cauchy problem, 1 ≤ i ≤ n,
h_i(x_1, . . . , x_k, 0) = f_i(x_1, . . . , x_k),
∂_y h_i(x_1, . . . , x_k, y) = g_i(x_1, . . . , x_k, y, h_1(x_1, . . . , x_k, y), . . . , h_n(x_1, . . . , x_k, y))
is real recursive whenever h is of class C^1 on the largest interval containing 0 in which a unique solution exists.
Infinite limits: if f is a real recursive vector function with n (k + 1)-ary components, then the vector functions h, h^inf, h^sup with n k-ary components,
h_i(x_1, . . . , x_k) = lim_{y→∞} f_i(x_1, . . . , x_k, y),
h^inf_i(x_1, . . . , x_k) = lim inf_{y→∞} f_i(x_1, . . . , x_k, y),
h^sup_i(x_1, . . . , x_k) = lim sup_{y→∞} f_i(x_1, . . . , x_k, y),

³ Hereafter, for short, we will say only real recursive functions.

180

Luís Mendes Gomes and José Félix Costa

1 ≤ i ≤ n, are real recursive.

Assembling and designating components: (a) arbitrary real recursive vector functions can be defined by assembling scalar real recursive function components into a vector function; (b) if f is a real recursive vector function, then each of its components is a real recursive scalar function.

Real recursive numbers: arbitrary real recursive scalar functions of arity 0 are called real recursive numbers. □

Constant functions 0_n, 1_n, −1_n of arity n can be derived from unary constant functions by means of projections. For example, 1_n(x_1, . . . , x_n) = 1 can be defined as 1_1(I^n_1(x_1, . . . , x_n)) = 1, and constant functions of arity one can be derived by differential recursion as follows: 0(0) = 0, ∂_y 0(y) = I^2_2(y, 0(y)); u(0) = c, ∂_y u(y) = 0(I^2_1(y, u(y))), where c = 1, −1.
The functions +, ×, −, sin, cos and λx. 1/x are real recursive functions. Let us define +(x, 0) = I^1_1(x) = x, ∂_y +(x, y) = 1_3(x, y, +(x, y)). Analogously we can get ×(x, 0) = 0_1(x), ∂_y ×(x, y) = I^3_1(x, y, ×(x, y)), and hence by composition −(x, y) = +(x, ×(−1, y)). Furthermore, the vector (sin, cos) and its components can be defined by the differential recursion
(sin, cos)(0) = (0, 1),   ∂_y (sin, cos)(y) = (I^3_3, −I^3_2)(y, sin y, cos y).
Now, for λx. 1/x, we define h(x) = 1/(x + 1) (h is defined on the interval (−1, ∞)) as follows: h(0) = 1, ∂_x h(x) = ×(−1, ×(h(x), h(x))), and then we compose h with λx. x − 1. Division is simply a composition of × and λx. 1/x (with domain equal to (0, ∞), but we can extend division to the negative numbers via a definition by cases). We can also construct other special real recursive functions: the Kronecker δ function, the signum function, the Heaviside Θ function (equal to 1 if x ≥ 0, otherwise 0), and the square-wave function s are real recursive functions.
So, it is sufficient to take the following definitions. If δ(0) = 1 and δ(x) = 0 for all x ≠ 0, then let us define δ(x) = lim inf_{y→∞} (1/(1 + x²))^y. From the function λxy. 2/(1 + e^{−xy}) − 1, we obtain
sgn(x) = lim inf_{y→∞} (2/(1 + e^{−xy}) − 1) = 1 if x > 0, 0 if x = 0, −1 if x < 0.
Let Θ(x) = (sgn(x) + δ(x) + 1)/2 and s(x) = Θ(sin(πx)). In some constructions we can use the predicate of equality eq = λxy. δ(x − y). Sometimes we will use Θ to control whether points are in given intervals: for x ∈ [a, ∞) we have the characteristic function Θ(x − a), and for x ∈ [a, b] we can define Θ_{[a,b]}(x) = Θ(x − a)Θ(b − x). Let us add that we can find real recursive numbers (computable reals in our framework) as values of real recursive functions of arity one for, say, an argument equal to 0. Of course the argument can be changed to a real recursive number t by composition of a given real recursive function with λx. x + t. In this sense e and π are computable reals: e = exp(1), π = 4 arctan(1), where arctan(0) = 0, ∂_x arctan(x) = 1/(1 + x²).
For differential recursion, the domain is restricted to an interval of continuity, thus preserving the analyticity of the functions that can be generated by this differential schema. For example, using differential recursion we cannot define functions such as λx. |x|. The possibility of operations on undefined functions is excluded: functions are strict, in the sense that for undefined arguments they are also undefined. But we can define this function by |x| = sgn(x) × x. Notice that, in Definition 1, we guarantee the existence of a solution of the differential recursion whose first derivative is continuous on the largest interval containing 0, on which it is also unique. Since [3], a definition of solution for such a differential recursion has been under discussion, in order to say precisely what a solution of the differential recursion schema is.


For example, in differential recursion it is not imposed that the functions f_i and g_i, 1 ≤ i ≤ n, are of class C^1. Several examples have been given to motivate the adequacy of the definition of a solution to such a system of differential equations (see e.g. [9]).
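The differential recursion scheme can also be explored numerically. The sketch below is ours, not the paper's: it integrates the Cauchy problem defining (sin, cos) with a crude explicit Euler method, where the function `diff_recursion` and the step count are purely illustrative choices.

```python
import math

def diff_recursion(f0, g, y, steps=200000):
    """Crude Euler stand-in for the differential recursion operator:
    solve h(0) = f0, d/dt h(t) = g(t, h(t)) up to t = y."""
    h = list(f0)
    dt = y / steps
    t = 0.0
    for _ in range(steps):
        dh = g(t, h)
        h = [hi + dt * dhi for hi, dhi in zip(h, dh)]
        t += dt
    return h

# (sin, cos)(0) = (0, 1), d/dy (sin, cos)(y) = (I^3_3, -I^3_2)(y, sin y, cos y):
# the right-hand side merely projects (and negates) the solution components.
g_sincos = lambda t, h: [h[1], -h[0]]

s, c = diff_recursion([0.0, 1.0], g_sincos, 1.0)
print(abs(s - math.sin(1.0)) < 1e-4, abs(c - math.cos(1.0)) < 1e-4)  # True True
```

The point of the example is that g only uses projections of (y, h_1, h_2), exactly as in the scheme above; all the analytic content lives in the Cauchy problem itself.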

3  Periodic real recursive functions

The classical theory of Fourier series and integrals, as well as Laplace transforms, is of great importance for physical and technical applications, since it enables us to reason about many periodic phenomena in nature; periodic functions then play the main role in modelling them. In what follows, we will carry only Fourier series, and a particular type of periodic function, over to real recursive function theory. All background on Fourier analysis can be found in [15], which will be our source of notation and terminology.

Definition 2. We say that a real recursive function f is periodic with a real recursive period z if, for every x ∈ R, f(x + z) = f(x). □

Proposition 1. If f is a periodic real recursive function with a real recursive period z (> 0), then the Fourier coefficients a_n and b_n are real recursive numbers.

Proof. For every x ∈ R and every n ∈ N, (2/z) f(x) cos(2πnx/z) and (2/z) f(x) sin(2πnx/z) are real recursive expressions. For some n ∈ N, let g_n be a real recursive function defined as follows: g_n(0) = 0, ∂_x g_n(x) = (2/z) f(x) cos(2πnx/z). So, we obtain the following real recursive expression:

    g_n(y) = (2/z) ∫_0^y f(x) cos(2πnx/z) dx

Therefore, g_n(z) = a_n is a real recursive number. Analogously for b_n. □
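Numerically, the integral g_n(z) can be approximated by a Riemann sum. The following sketch (our construction, with illustrative names) computes the coefficients of the square wave s(x) = Θ(sin(πx)), which has period z = 2; its cos-coefficients vanish for n ≥ 1, and its sin-coefficients are 2/(πn) for odd n and 0 for even n.

```python
import math

def coefficient(f, z, n, trig, steps=100000):
    """Midpoint-rule approximation of (2/z) * integral_0^z f(x) trig(2 pi n x / z) dx."""
    dx = z / steps
    return sum((2.0 / z) * f((i + 0.5) * dx)
               * trig(2 * math.pi * n * (i + 0.5) * dx / z)
               for i in range(steps)) * dx

# square wave of period 2: value 1 on [0, 1), value 0 on [1, 2)
square = lambda x: 1.0 if math.sin(math.pi * x) >= 0 else 0.0

cos1 = coefficient(square, 2.0, 1, math.cos)   # ~ 0
sin1 = coefficient(square, 2.0, 1, math.sin)   # ~ 2/pi
print(round(cos1, 4), round(sin1, 4))
```

Replacing the midpoint sum by the differential recursion for g_n, as in the proof, gives the same values in the limit.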

The necessary and sufficient condition for the application of the fundamental theorem for Fourier series tells us that the function must be periodic and piecewise smooth on R. Until now, we have said nothing about this kind of function in the real recursive function framework. So:

Proposition 2. If f_1, . . . , f_k are real recursive scalar total functions and a_1, . . . , a_k are real recursive numbers such that a_1 < . . . < a_k, then λx. Θ_{[a_1,a_2)}(x) × f_1(x) + . . . + Θ_{[a_k,+∞)}(x) × f_k(x) is a real recursive function. □
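A finite-y sketch of such piecewise gluing (ours: the theory takes y → ∞ exactly, we take y large, and we use the closed-interval Θ_{[a,b]} defined earlier rather than the half-open intervals of Proposition 2, evaluating away from the shared endpoints; tanh(xy/2) is the same function as 2/(1 + e^{−xy}) − 1 but avoids floating-point overflow):

```python
import math

def sgn(x, y=1e6):
    # 2/(1 + e^(-x*y)) - 1 == tanh(x*y/2); tanh stays bounded for large y
    return round(math.tanh(x * y / 2.0))

def delta(x, y=1e6):
    return round((1.0 / (1.0 + x * x)) ** y)

def theta(x):
    """Heaviside: 1 if x >= 0, else 0, via Theta = (sgn + delta + 1)/2."""
    return (sgn(x) + delta(x) + 1) // 2

def theta_closed(a, b, x):
    """Characteristic function of [a, b]: Theta(x - a) * Theta(b - x).
    (At a shared endpoint both pieces fire; the paper's half-open
    Theta_[a1,a2) avoids this, so we evaluate off the endpoints.)"""
    return theta(x - a) * theta(b - x)

# piecewise function: x^2 on [0, 1], 2 - x on [1, 3], 0 elsewhere
f = lambda x: theta_closed(0, 1, x) * x**2 + theta_closed(1, 3, x) * (2 - x)

print([f(x) for x in (0.5, 2.0, 4.0)])   # [0.25, 0.0, 0.0]
```

The combination is exactly the shape of Proposition 2: characteristic functions built from Θ select which total function is active on which interval.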

We call the real recursive function given just above a piecewise real recursive function if it satisfies the conditions mentioned. Thus, we are ready to build a bridge between Fourier series and real recursive function theory [8] and, in a broad sense, to periodic piecewise functions via their Fourier expansions.

Proposition 3. If f is a Fourier expansion with real recursive numbers as coefficients, then f is a real recursive expression. □

Proof. The expressions sin(2πnx/z) and cos(2πnx/z) are both real recursive, since z ≠ 0. Then

    a_n sin(2πnx/z) + b_n cos(2πnx/z)

is also a real recursive expression, because a_n and b_n, the Fourier coefficients, are real recursive numbers (see Proposition 1), as is the finite sum

    ∑_{n=1}^{⌊y⌋} a_n sin(2πnx/z) + b_n cos(2πnx/z).

Then

    lim_{y→∞} ∑_{n=1}^{⌊y⌋} a_n sin(2πnx/z) + b_n cos(2πnx/z)

is also a real recursive expression. And, finally,

    a_0/2 + lim_{y→∞} ∑_{n=1}^{⌊y⌋} a_n sin(2πnx/z) + b_n cos(2πnx/z)

is a real recursive expression, which is the expression for the Fourier expansion. □
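For the square wave of period z = 2 the coefficients are known in closed form (constant term 1/2 and sin-coefficients 2/(πn) for odd n), so the limit in the proof can be watched converging; a sketch with our own naming:

```python
import math

def partial_sum(x, y, z=2.0):
    """a0/2 + sum_{n=1..floor(y)} a_n sin(2 pi n x / z) for the square wave
    of period 2 (its cos-coefficients vanish)."""
    total = 0.5
    for n in range(1, int(math.floor(y)) + 1):
        if n % 2 == 1:  # only odd n contribute, with coefficient 2/(pi n)
            total += (2.0 / (math.pi * n)) * math.sin(2 * math.pi * n * x / z)
    return total

for y in (1, 9, 99, 999):
    print(y, round(partial_sum(0.5, y), 4))   # tends to s(0.5) = 1
```

As y grows, the partial sums approach the square wave pointwise away from its discontinuities, which is exactly the role the infinite limit operator plays in the proof.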

Definition 3. A function f is said to be partially periodic with period z if there exists z′ such that, for every x ≥ z′, f(x + z) = f(x). □

Each partially periodic function f is indeed periodic after a point z′, and between 0 and z′ we will require that there exist only finitely many discontinuities. Obviously, if z′ = 0, then f is periodic in the sense of Definition 2. In particular, we consider piecewise functions defined on the non-negative reals which are divided into two distinct parts: the first part is defined between 0 and some non-negative real x, where there is a finite number of points of discontinuity, and the second part is defined on the non-negative reals greater than x, where it is periodic. We will see that the finiteness of the points of discontinuity to the left of x, together with the periodicity of the function to the right of x, is the essential property for tackling the restriction of finite state machines in the recognition of infinite signals.
4  Automata can only recognize periodic real recursive functions

In this section, we will study the computational power of continuous automata which are able to process piecewise continuous signals. As we will see during the exposition, and in the proof of the result itself, the form of the function taken in each time interval of a piecewise continuous signal is irrelevant. So we restrict our attention to piecewise linear signals, which are particularly appropriate for a representation based on ω-words, which keeps the definition of the required automata in its simplest form. Previous work has been done with the simplest class of piecewise signals, the class of piecewise constant signals (see e.g. [10]), and also with piecewise constant derivative signals [1]. We construct a certain kind of finite state automaton, whose states are continuous instead of discrete, and show that such automata only recognize partially periodic piecewise real recursive functions, i.e. partially periodic piecewise real recursive functions with period z (> 0) which have a finite number of discontinuities between 0 and z. We assume that piecewise real recursive functions are defined over R+_0. In particular, and without loss of generality, we study partially periodic piecewise linear functions because, between two points of discontinuity, the derivative of such a function is constant and hence representable by a computable number in the classical sense (e.g. an integer). We take the approach based on signals and automata over continuous time found in [10], which inspired us. In general, a signal is a total function from R+_0 to R. The well-known square-wave and sawtooth-wave functions are examples of such signals. In what follows, we describe piecewise linear real recursive functions as signals which, in turn, are described by ω-words over Z.

Definition 4.
A piecewise linear signal s over Z is a four-tuple ⟨α, β, θ, τ⟩, where α, β and θ are ω-words over Z such that, for every i ∈ N, if α_i > 0 then θ_i > β_i; otherwise, if α_i < 0 then θ_i < β_i; otherwise, θ_i = β_i; and τ is an unbounded increasing ω-word over Z such that τ_0 = 0 and, for every i ∈ N and every t ∈ [τ_i, τ_{i+1}), the function s_i of s defined on [τ_i, τ_{i+1}) by

    s_i(t) = β_i + ∫_{τ_i}^{t} α_i dt′

is a real recursive function. □

In each interval [τ_i, τ_{i+1}), the derivative of the (linear) real recursive function s_i is denoted by α_i, the value s_i(τ_i) is denoted by β_i (i.e. the initial value), and, finally, θ_i denotes the maximum or minimum value taken by s_i on [τ_i, τ_{i+1}), according to the sign of α_i. Thus we are not providing the piecewise linear signal itself but its first derivative, which is not necessarily continuous. We denote the set of piecewise linear signals over Z by PLIN(Z). Alternatively, each piecewise linear signal s above can also be represented by

    s(t) = lim_{y→∞} ∑_{1≤i≤⌊y⌋} Θ_{[τ_i,τ_{i+1})}(t) × s_i(t).

In general, s(t) is not a real recursive function. In an operational sense, for τ_i ≤ t < τ_{i+1}: if α_i > 0, then s_i(t) increases from β_i towards θ_i, since θ_i > β_i; if α_i < 0, then s_i(t) decreases from β_i towards θ_i, since θ_i < β_i; otherwise, s_i(t) remains constant. As this construction suggests, each such signal is a discontinuous piecewise linear function with an infinite number of discontinuities. But, in particular, if β_{i+1} = θ_i for every i ∈ N, then we obtain a continuous piecewise linear signal. With the above representation for signals it is easy to say what a partially periodic piecewise linear signal is. A particular case of such signals are the periodic piecewise linear signals.

Definition 5. We say that a piecewise linear signal ⟨α, β, θ, τ⟩ over Z is partially periodic with period p if there exists i ∈ N such that, for every j > i, (α_j, β_j, θ_j) = (α_{j+p}, β_{j+p}, θ_{j+p}). □

As we can see above, the condition for a piecewise linear signal to be partially periodic is that, from some time instant on, the triple formed by the first derivative, the initial value, and the maximum (or minimum) value of the signal repeats after a given period greater than 0. Notice that in the beginning such signals exhibit a non-periodic pattern with a finite number of discontinuities, which is followed by a periodic pattern that, in our case, is a periodic piecewise linear signal. We denote by PPPLIN(Z) the set of partially periodic piecewise linear signals whose representations are generated by computable infinite sequences α, β, θ and τ. Thus, every signal in PPPLIN(Z) is a real recursive function.

2

3

4

Fig. 1. Square and triangle waves
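To make the tuple representation concrete, here is a small Python sketch (our own illustration, with hypothetical helper names; the paper only fixes the representation ⟨α, β, θ, τ⟩):

```python
# Sketch: evaluating a piecewise linear signal given by sequences
# alpha (slopes), beta (initial values) and tau (switching instants).
# All names are illustrative; the paper only defines the representation.

def make_signal(alpha, beta, tau):
    """Return s with s(t) = beta_i + alpha_i * (t - tau_i) for tau_i <= t < tau_{i+1}."""
    def s(t):
        # find the interval [tau_i, tau_{i+1}) containing t
        for i in range(len(tau) - 1):
            if tau[i] <= t < tau[i + 1]:
                return beta[i] + alpha[i] * (t - tau[i])
        raise ValueError("t outside the represented horizon")
    return s

# A square wave on [0, 4): value 1 on [0,1) and [2,3), value 0 on [1,2) and [3,4).
square = make_signal(alpha=[0, 0, 0, 0], beta=[1, 0, 1, 0], tau=[0, 1, 2, 3, 4])

# A triangle wave on [0, 4): rises 0 -> 1 on [0,1), falls 1 -> 0 on [1,2), etc.
triangle = make_signal(alpha=[1, -1, 1, -1], beta=[0, 1, 0, 1], tau=[0, 1, 2, 3, 4])
```

Evaluating `square` and `triangle` at sample points reproduces, on the represented horizon, the two waves of Fig. 1.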

Consider the simplest form of automaton, where the set of states is finite. Usually, such states are seen as abstract entities suitable to describe discrete behaviors of systems at a given level of abstraction [6]. Here, however, we take the approach of the hybrid systems community (see e.g. [12]) where, in the simplest case, we have a state variable which


is described by a continuous behavior over time. In this case, the automaton represents the evolution of the system through time, where each transition represents a change of the regime of operation (i.e. the state of the automaton), which is described by a differential equation. So, to be coherent with the signals considered above, each automaton has its states equipped with a first-order differential equation of the form ds/dt = k. Formally,

Definition 6. A continuous automaton A over Z is a triple (Q, δ, q0) where

– Q is a finite subset (of states) of {(c, b, d) ∈ Z³ : b ≤ d and c ≥ 0} ∪ {(c, b, d) ∈ Z³ : b > d and c < 0};
– δ : Q × N² → Q is a function (the transition function) such that, for every (c, b, d), (c′, b′, d′) ∈ Q and (a, a′) ∈ N² with a′ > a, δ((c, b, d), (a, a′)) = (c′, b′, d′) iff c′ ≤ 0 if c > 0, or c′ ≥ 0 if c < 0, or ((c′ ∈ Z − {0}) or (c′ = 0 and b′ ≠ b)) if c = 0;
– q0 ∈ Q (the initial state). □

By the condition imposed on the transition function in the definition above, we can see that a transition takes place only when the value of c changes, except in the case c = 0, in which c may remain 0 but then we must take b′ ≠ b. We say that a piecewise linear signal s = ⟨α, β, θ, τ⟩ over Z is accepted by (or is a solution of) A if there exists an infinite sequence (b0, c0, d0) . . . over Q such that (b0, c0, d0) is the initial state and, for every i ∈ N, bi = βi, ci = αi, di = θi and (bi+1, ci+1, di+1) = δ((bi, ci, di), (τi, τi+1)). As an example, consider the square wave and triangle wave signals in Fig. 1. It is easy to see that the corresponding continuous automata have, respectively, Qsq = {(1, 0, 1), (0, 0, 0)}, q0sq = (1, 0, 1) and δsq = {((1, 0, 1), (i, i + 1), (0, 0, 0)), ((0, 0, 0), (i + 1, i + 2), (1, 0, 1)) : i ≥ 0},

Fig. 2. Continuous automata for square and triangle waves
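The square-wave automaton can be exercised with a short sketch (the run loop is our own framing; the states and transitions are those of δsq above):

```python
# Sketch of the square-wave continuous automaton: state triples follow the
# example in the text; the run() driver is our own illustrative framing.

def delta_sq(state, interval):
    # transitions alternate between the two states on consecutive unit intervals
    if state == (1, 0, 1):
        return (0, 0, 0)
    if state == (0, 0, 0):
        return (1, 0, 1)
    raise ValueError("unknown state")

def run(delta, q0, intervals):
    """Produce the state sequence of the automaton along the given intervals."""
    states = [q0]
    for iv in intervals:
        states.append(delta(states[-1], iv))
    return states

states = run(delta_sq, (1, 0, 1), [(i, i + 1) for i in range(4)])
# the state sequence is (here: immediately) periodic with period 2
```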

and Qtri = {(0, 1, 1), (1, −1, 0)}, q0tri = (0, 1, 1) and δtri = {((0, 1, 1), (i, i + 1), (1, −1, 0)), ((1, −1, 0), (i + 1, i + 2), (0, 1, 1)) : i ≥ 0}.

Proposition 4. A piecewise linear signal s is accepted by a continuous automaton A if and only if s ∈ PPPLIN(Z). □

Proof. If s is a piecewise linear signal accepted by A, then there exists an infinite sequence (b0, c0, d0) . . . over Q such that (b0, c0, d0) is the initial state and, for every i ∈ N, δ((bi, ci, di), (τi, τi+1)) = (bi+1, ci+1, di+1). Since Q is finite, there exist i < j such that (bi, ci, di) = (bj, cj, dj). Let p = j − i. Therefore, for every n ≥ i, (bn, cn, dn) = (bn+p, cn+p, dn+p). Conversely, if s = ⟨α, β, θ, τ⟩ is a partially periodic piecewise linear signal, then there exist n0 and p > 0 such that, for every n ≥ n0, αn = αn+p, βn = βn+p, and θn = θn+p. Consider the continuous automaton A = (Q, δ, q0) where Q = {(βi, αi, θi) ∈ Z³ : i ∈ N}, δ((βi, αi, θi), (τi, τi+1)) = (βi+1, αi+1, θi+1) for every i ≥ 0, and (β0, α0, θ0) is the initial state. Then A accepts s. Since s is partially periodic, for every j ≥ n0 we have (βj, αj, θj) = (βj+p, αj+p, θj+p), so Q is finite. □


The above result shows that each continuous finite state automaton can accept only partially periodic piecewise linear signals, no matter what function we take between consecutive discontinuities. Thus, periodicity determines the computational power of continuous finite state automata.

5 Conclusions and further work

Real recursion theory, introduced in [7], has been considered as a model of analog computation. As shown in [8], the limit operator, taken from analysis, can also be used to bring together classical computation and real and complex analysis. In this paper we bring together Fourier analysis and continuous finite state automata. We show that the ingredients needed to deal with Fourier series, namely the piecewise smooth periodic functions, can be embodied in the framework of real recursive functions, originally introduced in [7] and revised and expanded in [8]. It seems clear that not all piecewise smooth periodic functions can be accepted by continuous finite state automata (see e.g. [10]). We then show that a special kind of automaton over continuous time can accept only partially periodic piecewise linear signals, for which a characterization by ω-words is possible. Further work is envisaged to explore how our model of hybrid computation can be extended to take interaction into account in a given notion of analog network (see e.g. [14]), following the definitions and suggestions for interaction introduced in [13].

References

1. E. Asarin, O. Maler, and A. Pnueli. Reachability analysis of dynamical systems having piecewise-constant derivatives. Theoretical Computer Science, 138:35–65, 1995.
2. M. S. Branicky. Universal computation and other capabilities of hybrid and continuous dynamical systems. Theoretical Computer Science, 138:67–100, 1995.
3. M. L. Campagnolo. Computational Complexity of Real Valued Recursive Functions and Analog Circuits. PhD thesis, IST, Universidade Técnica de Lisboa, July 2001.
4. J. D. Dora, A. Maignan, M. Mirica-Ruse, and S. Yovine. Hybrid computation. In Proceedings of the International Symposium on Symbolic and Algebraic Computation, ISSAC'01, pages 101–108. ACM Press, 2001.
5. S. Kleene. Arithmetical predicates and quantifiers. Transactions of the American Mathematical Society, 79:312–340, 1955.
6. M. Minsky. Computation: Finite and Infinite Machines. Prentice Hall, 1972.
7. C. Moore. Recursion theory on the reals and continuous-time computation. Theoretical Computer Science, 162:23–44, 1996.
8. J. Mycka and J. F. Costa. Real recursive functions and their hierarchy. Journal of Complexity, 20:835–857, 2004.
9. J. Mycka and J. F. Costa. The P ≠ NP conjecture in the context of real and complex analysis. Journal of Complexity, 22(2):287–303, 2006.
10. A. Rabinovich. Automata over continuous time. Theoretical Computer Science, 300:331–363, 2003.
11. C. Shannon. Mathematical theory of the differential analyser. Journal of Mathematics and Physics, 20:337–354, 1941.
12. B. Trakhtenbrot. Automata and hybrid systems. Technical report, Uppsala University, 1998.
13. B. Trakhtenbrot. Automata and their interaction: definitional suggestions. Fundamentals of Computation Theory, 1684:54–89, 1999.
14. J. V. Tucker and J. I. Zucker. Computability of analog networks. Theoretical Computer Science, 2006. To appear.
15. A. Vretblad. Fourier Analysis and Its Applications, volume 223 of Graduate Texts in Mathematics. Springer-Verlag, 2003.

Learnability of Recursively Enumerable Sets of Recursive Real-Valued Functions

Eiju Hirowatari¹, Kouichi Hirata², and Tetsuhiro Miyahara³

¹ Center for Fundamental Education, The University of Kitakyushu, Kitakyushu 802-8577, Japan, [email protected]
² Department of Artificial Intelligence, Kyushu Institute of Technology, Iizuka 820-8502, Japan, [email protected]
³ Faculty of Information Sciences, Hiroshima City University, Hiroshima 731-3194, Japan, [email protected]

Abstract. In this paper, we investigate the learnability of recursively enumerable (r.e., for short) sets of recursive real-valued functions. Unfortunately, as shown by Hirowatari and Arikawa (1997), there exists an r.e. set of recursive real-valued functions that is not learnable in the sense of Gold (1967). In contrast, in this paper, by introducing admissible and indexed r.e. sets of recursive real-valued functions, we show that an r.e. set of recursive real-valued functions whose domains are unions of open intervals is learnable if it is either admissible or indexed. Furthermore, by introducing a characteristic set of a recursive real-valued function, we show that, for an admissible r.e. set T of recursive real-valued functions, if every function in T has a characteristic set, then T is learnable.

1 Introduction

A computable real function [12, 13, 15], whose origin goes back to the classical work of Turing [14], is a model of computation with continuous data like real numbers. Recently, computable real functions have been developed in a new research field of computational paradigms [4, 6] related to analysis, mathematical logic and computability. A recursive real-valued function [8–10], which we mainly deal with in this paper, is one of the formulations of the computable real function. A recursive real-valued function is formulated as a function that maps a sequence of closed intervals converging to a real number to a sequence of closed intervals converging to another real number. Inductive inference of recursive real-valued functions was first introduced by Hirowatari and Arikawa [7] and developed by their co-authors [2, 8–10]. In their works, criteria such as RealEx and RealNum! for inductive inference of recursive real-valued functions have been formulated as extensions of criteria for inductive inference of recursive functions, such as Ex for identification in the limit and Num! for identification by enumeration [5], respectively, and their interaction has been widely studied (cf. [11]). In this paper, we pay attention to recursively enumerable sets (r.e. sets, for short) of recursive real-valued functions. Note that every r.e. set of recursive functions is identifiable in the limit, that is, Ex-inferable [5]. In contrast, for inductive inference of recursive real-valued functions, Hirowatari and Arikawa have shown that there exists an r.e. set of recursive real-valued functions which is not identifiable in the limit, that is, not RealEx-inferable [7]. Hence, the main purpose of this paper is to characterize the RealEx-inferability of r.e. sets of recursive real-valued functions.
In order to capture a natural requirement between a target recursive real-valued function and its input, we first introduce coherent, admissible and indexed r.e. sets of recursive real-valued functions. Let T be an r.e. set of recursive real-valued functions, I a rational interval and A_h an algorithm which computes h ∈ T. We say that T is coherent if there exists a computable


function that determines whether or not I is a subset of the domain of A_h. Also, we say that T is admissible if there exists a computable function that determines whether or not the intersection of I and the domain of A_h is not empty. Furthermore, we say that T is indexed if there exists a computable function that determines whether or not a pair of real numbers is an example of an h ∈ T. Note that an indexed r.e. set of recursive real-valued functions is an extension of an indexed family of recursive languages [1, 11]. Unfortunately, we show that there exists an r.e. set of recursive real-valued functions that is not RealEx-inferable, even if it is coherent, admissible and indexed. Hence, the above r.e. sets of recursive real-valued functions are insufficient to characterize RealEx-inferable r.e. sets, in contrast to the fact that every indexed family of recursive languages, which corresponds to an indexed r.e. set of recursive real-valued functions, is Ex-inferable [1]. Next, we restrict the domains of the recursive real-valued functions in an r.e. set. We show that every r.e. set of recursive real-valued functions defined on a fixed domain is RealEx-inferable. Also, we show that an r.e. set of recursive real-valued functions whose domains are unions of open intervals is RealEx-inferable if it is either admissible or indexed. Finally, we introduce a characteristic set of a recursive real-valued function, which is regarded as a finite set of good examples. We then show that, for an admissible r.e. set T of recursive real-valued functions, if every function h ∈ T has a characteristic set, then T is RealEx-inferable. This paper is organized as follows. In Section 2, we briefly review recursive real-valued functions. In Section 3, we explain inductive inference of recursive real-valued functions. In Section 4, we introduce coherent, admissible and indexed r.e. sets of recursive real-valued functions, and investigate their RealEx-inferability.
In Section 5, we introduce a characteristic set and investigate the RealEx-inferability of r.e. sets of recursive real-valued functions with characteristic sets.

2 Recursive Real-Valued Functions

In this section, we prepare some notions for recursive real-valued functions, which constitute one of the formulations of computable real functions [12, 13, 15]. Refer to the papers [7–10] for more detail. Let N, Q and R be the sets of all natural numbers, rational numbers and real numbers, respectively. By N+ and Q+, we denote the sets of all positive natural numbers and positive rational numbers, respectively. We simply call a closed interval [a, b] with rational endpoints (a, b ∈ Q such that a < b) a closed interval. Throughout this paper, h is a real-valued function from S to R, where S ⊆ R. By dom(h) we denote the domain of h, that is, dom(h) = S.

Definition 1. Let f and g be functions from N to Q and Q+, respectively, and x a real number. We say that a pair ⟨f, g⟩ is an approximate expression of x if f and g satisfy the following conditions: (1) lim_{n→∞} g(n) = 0. (2) |f(n) − x| ≤ g(n) for each n ∈ N.

Note here that f(n) and g(n) represent an approximate value of x and an error bound of x at point n, respectively. A real number x is recursive if there exists an approximate expression ⟨f, g⟩ of x such that f and g are recursive. In order to formulate a recursive real-valued function, we introduce the concepts of a rationalized domain and a rationalized function.
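Definition 1 can be illustrated with a concrete approximate expression of √2; the construction below (decimal truncations with error bound 10^−n) is our own example, not taken from the paper:

```python
from fractions import Fraction

# Sketch of Definition 1: an approximate expression <f, g> of sqrt(2),
# with f(n) a rational approximation and g(n) = 10^(-n) an error bound.
# The construction is illustrative, not taken from the paper.

def f(n):
    # decimal truncation of sqrt(2) to n digits, as an exact rational:
    # the largest k with (k / 10^n)^2 <= 2
    k = 1
    target = 2 * 100 ** n          # compare k^2 with 2 * 10^(2n)
    while (k + 1) ** 2 <= target:
        k += 1
    return Fraction(k, 10 ** n)

def g(n):
    return Fraction(1, 10 ** n)

# Since f(n) <= sqrt(2) < f(n) + g(n), we have |f(n) - sqrt(2)| <= g(n);
# this can be checked by squaring, staying in exact rational arithmetic.
```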


Definition 2. A rationalized domain of S ⊆ R, denoted by RD_S, is a subset of Q × Q+ which satisfies the following conditions: (1) Every interval in RD_S is contained in S: for each ⟨p, α⟩ ∈ RD_S, it holds that [p − α, p + α] ⊆ S. (2) RD_S covers the whole of S: for each x ∈ S, there exists a ⟨p, α⟩ ∈ RD_S such that x ∈ [p − α, p + α]; especially, if x ∈ S is an interior point, then there exists a ⟨p, α⟩ ∈ RD_S such that x ∈ (p − α, p + α). (3) RD_S is closed under subintervals: for each ⟨p, α⟩ ∈ RD_S and ⟨q, β⟩ ∈ Q × Q+ such that [q − β, q + β] ⊆ [p − α, p + α], it holds that ⟨q, β⟩ ∈ RD_S.

We denote RD_dom(h) simply by RD_h. For each S ⊆ R, RD_S is not determined uniquely, that is, there may exist several rationalized domains of S. Clearly, there exists a rationalized domain RD_S of S if and only if S can be represented as a union of intervals. For example, consider a union S of closed or open intervals with rational end points and the set RD_S of all closed rational intervals in S. Then, RD_S is a rationalized domain of S.

Definition 3. Let h be a real-valued function. A rationalized function of h, denoted by A_h, is a computable function from RD_h to Q × Q+ satisfying the following condition: for each x ∈ dom(h), let ⟨f, g⟩ be an approximate expression of x; then, there exists an approximate expression ⟨f0, g0⟩ of h(x) such that A_h(⟨f(n), g(n)⟩) = ⟨f0(n), g0(n)⟩ for each n ∈ N with ⟨f(n), g(n)⟩ ∈ RD_h. We also call a rationalized function A_h of h an algorithm which computes h.

Definition 4. A function h is a recursive real-valued function if there exists a rationalized function A_h : RD_h → Q × Q+ of h, where RD_h is a rationalized domain of dom(h). We demand that A_h(⟨p, α⟩) does not halt for all ⟨p, α⟩ ∉ RD_h. Furthermore, by RRVF we denote the set of all recursive real-valued functions.

If h is a recursive real-valued function, then there exist a rationalized domain D_h of dom(h) such that D_h = {⟨p, α⟩ | [p − α, p + α] ⊆ dom(h)} and a rationalized function A_h^{D_h} : D_h → Q × Q+ of h. From now on, we can assume without loss of generality that the rationalized domain of dom(h) is RD_dom(h) = {⟨p, α⟩ | [p − α, p + α] ⊆ dom(h)}. Then, A_h(⟨p, α⟩) always halts for each ⟨p, α⟩ ∈ Q × Q+ such that [p − α, p + α] ⊆ dom(h).

We call the functions x, −x, 1/x, e^x, log x, sin x, arctan x, x^{1/2}, arcsin x and the constant functions c_r for each recursive real number r basic functions. Here, 1/x for x = 0, log x for each x ≤ 0, x^{1/2} for each x ≤ 0, and arcsin x for each x ∈ R such that |x| ≥ 1 are undefined as usual. By BF we denote the set of all basic functions.

Definition 5. By EF we denote the smallest set which contains BF and satisfies the following condition: if h1, h2 ∈ EF, then h1 + h2, h1 × h2, h1 ◦ h2 ∈ EF. We call a function in EF an elementary function.

We can also show that every basic function is a recursive real-valued function. Furthermore, the following theorem holds.

Theorem 1 ([7]). Every elementary function is a recursive real-valued function.

Hence, we can conclude that the class of recursive real-valued functions is rich enough to express the elementary functions with recursive real coefficients.
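A rationalized function in the sense of Definition 3 can be sketched for the simple case h(x) = x² with dom(h) = R; the interval-arithmetic construction below is our own illustration (the paper only requires such a computable map to exist):

```python
from fractions import Fraction

# Sketch of Definition 3 for h(x) = x^2 on dom(h) = R: map a datum <p, a>
# of x to a datum <q, b> of h(x), using exact interval arithmetic.

def A_square(p, a):
    lo, hi = p - a, p + a                      # the input interval [p-a, p+a]
    candidates = [lo * lo, hi * hi]
    # image of [lo, hi] under squaring; the minimum is 0 when 0 is inside
    image_lo = Fraction(0) if lo <= 0 <= hi else min(candidates)
    image_hi = max(candidates)
    q = (image_lo + image_hi) / 2              # midpoint of the image interval
    b = (image_hi - image_lo) / 2              # half-width as error bound
    return q, b
```

For any x ∈ [p − α, p + α], the output box [q − β, q + β] then contains x², which is exactly the containment a rationalized function must preserve.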

3 Inductive Inference of Recursive Real-Valued Functions

In this section, we introduce our framework for inductive inference of recursive real-valued functions [8–10]. In our scientific activities, it is impossible to observe the exact value of a real number x; it is only possible to observe approximations of x. Such an approximation can be captured as a pair ⟨p, α⟩ of rational numbers such that p is an approximate value of x and α is its error bound, i.e., x ∈ [p − α, p + α]. We call such a pair ⟨p, α⟩ a datum of x.

Definition 6. An example of a pair ⟨x, y⟩ of real numbers is a pair ⟨⟨p, α⟩, ⟨q, β⟩⟩ such that ⟨p, α⟩ and ⟨q, β⟩ are data of x and y, respectively.

For each recursive real-valued function h and each real number x ∈ dom(h), we also call an example of ⟨x, h(x)⟩ an example of the function h. By EXAM we denote the set of all examples of pairs of real numbers. Furthermore, by EXAM(h) we denote the set of all examples of the function h. We can imagine an example of h as a rectangular box [p − α, p + α] × [q − β, q + β] such that p, q ∈ Q and α, β ∈ Q+.

Definition 7. A presentation of a recursive real-valued function h is an infinite sequence σ = w1, w2, . . . of examples of h in which, for each real number x ∈ dom(h) and ζ > 0, there exists an example wk = ⟨⟨pk, αk⟩, ⟨qk, βk⟩⟩ such that x ∈ [pk − αk, pk + αk], h(x) ∈ [qk − βk, qk + βk], αk ≤ ζ and βk ≤ ζ.

By σ[n] and σ(n), we denote the initial segment of n examples in σ and the n-th example wn, respectively. Furthermore, by σ[n] we sometimes denote the set of all examples in σ[n]. An inductive inference machine (IIM, for short) is a procedure that requests inputs from time to time and produces algorithms, called conjectures, that compute recursive real-valued functions. Let σ be a presentation of a function. For σ[n] = ⟨w1, w2, . . . , wn⟩ and an IIM M, by M(σ[n]) we denote the last conjecture of M after requesting the examples w1, w2, . . . , wn as inputs.

Definition 8. Let σ be a presentation of a function and {M(σ[n])}n≥1 the infinite sequence of conjectures produced by an IIM M. The sequence {M(σ[n])}n≥1 converges to an algorithm A_h if there exists a number n0 ∈ N such that M(σ[m]) equals A_h for each m ≥ n0.

Definition 9. Let h and h′ be recursive real-valued functions. We say that h′ is a restriction of h, or h is an extension of h′, denoted by h′ = h|dom(h′), if dom(h′) ⊆ dom(h) and h′(x) = h(x) for each x ∈ dom(h′). We also write h′ ⊆ h if h′ is a restriction of h. Furthermore, we write h′ ⊊ h if h′ ⊆ h and h′ ≠ h.

Since we do not distinguish a function from its extensions, we consider our learning successful even when a sequence of conjectures converges to an algorithm which computes an extension of the target function [3]. Next, we introduce the criterion RealEx for inductive inference of recursive real-valued functions, which corresponds to the standard criterion Ex [5].

Definition 10. Let h be a recursive real-valued function and T a set of recursive real-valued functions. An IIM M RealEx-infers h, denoted by h ∈ RealEx(M), if, for each presentation σ of h, the sequence {M(σ[n])}n≥1 converges to an algorithm that computes an extension of h. Furthermore, an IIM M RealEx-infers T if M RealEx-infers every h ∈ T, and T is RealEx-inferable if there exists an IIM that RealEx-infers T. By RealEx, we denote the class of all RealEx-inferable sets of recursive real-valued functions.


Definition 11. A set T of recursive real-valued functions is recursively enumerable (r.e., for short) if there exists a recursive function Ψ such that T is equal to the set of all functions computed by the algorithms Ψ(1), Ψ(2), · · · . By RealNum! [8], we denote the class of all r.e. sets of recursive real-valued functions.

Theorem 2 ([7]). Let T be an r.e. set of recursive real-valued functions on the same rational interval. Then, T is RealEx-inferable.

Note that, for an r.e. set χ of recursive real numbers, the set of all elementary functions on the same rational interval whose coefficients are in χ is r.e., so it is RealEx-inferable.

4 Learnability of Recursively Enumerable Sets of Recursive Real-Valued Functions

In this section, we characterize the RealEx-inferability of r.e. sets of recursive real-valued functions. Recall that, by Ex and Num!, we denote the classes of all Ex-inferable sets and of all r.e. sets of recursive functions, respectively. Then, it holds that Num! ⊊ Ex [5]. In contrast, for inductive inference of recursive real-valued functions, the following theorem holds.

Theorem 3 ([7]). RealNum! \ RealEx ≠ ∅.

The above theorem claims that there exists an r.e. set of recursive real-valued functions that is not RealEx-inferable. For every r.e. set T of recursive real-valued functions, there exists a recursive function Ψ such that T is equal to the set of all functions computed by the algorithms Ψ(1), Ψ(2), · · · . In this paper, we consider the sequence of algorithms A1, A2, · · · such that Ai = Ψ(i) for each i ∈ N+, and call this sequence an algorithm sequence of T. Furthermore, by h_{Ai} we denote the function computed by Ai. In this paper, we introduce the following r.e. sets of recursive real-valued functions, in order to capture a natural requirement between a target recursive real-valued function and a datum or an example.

Definition 12. Let T be an r.e. set of recursive real-valued functions, and ΓT = A1, A2, · · · an algorithm sequence of T. Then, we define the following two computable functions C_{ΓT} and A_{ΓT} from N+ × (Q × Q+) to {0, 1}:

C_{ΓT}(i, ⟨p, α⟩) = 1 if [p − α, p + α] ⊆ dom(h_{Ai}), and 0 otherwise.

A_{ΓT}(i, ⟨p, α⟩) = 1 if [p − α, p + α] ∩ dom(h_{Ai}) ≠ ∅, and 0 otherwise.

Also, we define the following computable function I_{ΓT} from N+ × EXAM to {0, 1}:

I_{ΓT}(i, ⟨⟨p, α⟩, ⟨q, β⟩⟩) = 1 if ⟨⟨p, α⟩, ⟨q, β⟩⟩ is an example of hi, and 0 otherwise,

where ⟨⟨p, α⟩, ⟨q, β⟩⟩ ∈ EXAM and hi is the recursive real-valued function computed by Ai for each i ∈ N+. We say that T is coherent if the function C_{ΓT} exists, admissible if the function A_{ΓT} exists, and indexed if the function I_{ΓT} exists.
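For a concrete toy algorithm sequence — say, A_i computing the constant function 0 on the open interval (0, i) — the three functions of Definition 12 are easily computable. The sketch below is our own illustration (the family and all names are assumptions, not from the paper):

```python
from fractions import Fraction

# Sketch of Definition 12 for a toy algorithm sequence: A_i computes the
# constant function h_i(x) = 0 with dom(h_i) = (0, i).

def dom(i):            # represent dom(h_i) by its open endpoints
    return Fraction(0), Fraction(i)

def C(i, p, a):        # coherence: is [p-a, p+a] inside dom(h_i)?
    lo, hi = dom(i)
    return 1 if lo < p - a and p + a < hi else 0

def A(i, p, a):        # admissibility: does [p-a, p+a] meet dom(h_i)?
    lo, hi = dom(i)
    return 1 if p + a > lo and p - a < hi else 0

def I(i, p, a, q, b):  # indexedness: is <<p,a>,<q,b>> an example of h_i?
    # an example needs some x in [p-a, p+a] ∩ dom(h_i) with h_i(x) = 0
    # lying in [q-b, q+b]
    return 1 if A(i, p, a) == 1 and q - b <= 0 <= q + b else 0
```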


Note that an indexed r.e. set of recursive real-valued functions is an extension of an indexed family of recursive languages [1, 11]. Unfortunately, the above r.e. sets of recursive real-valued functions are insufficient to characterize RealEx-inferable r.e. sets, as the following theorem shows.

Theorem 4. There exists an r.e. set of recursive real-valued functions that is coherent, admissible and indexed but not RealEx-inferable.

Next, we restrict the domains of the recursive real-valued functions in an r.e. set. We first investigate the case where every recursive real-valued function in an r.e. set is defined on a fixed domain S ⊆ R. We note that the set S is a union of closed or open intervals. Then, the following theorem holds.

Theorem 5. Let TS be an r.e. set of recursive real-valued functions from S to R, where S ⊆ R. Furthermore, let Γ_{TS} = A1, A2, · · · be an algorithm sequence of TS. Then, TS is RealEx-inferable.

As a special case of Theorem 5, we have the following corollary.

Corollary 1. Let T be an r.e. set of recursive real-valued functions from R to R. Then, T is RealEx-inferable.

Example 1. Let χ be an r.e. set of recursive real numbers, and PF the set of all polynomial functions f(x) = c_n x^n + c_{n−1} x^{n−1} + · · · + c_1 x + c_0, where c_0, c_1, . . . , c_n are in χ. Then, PF is RealEx-inferable.

Furthermore, we investigate the case where the domain of each recursive real-valued function in an r.e. set is restricted to a union of open intervals with rational end points. Then, the following two theorems hold.

Theorem 6. Let T be an r.e. set of recursive real-valued functions whose domains are unions of open intervals. Furthermore, let ΓT = A1, A2, · · · be an algorithm sequence of T. If T is admissible, then T is RealEx-inferable.

Theorem 7. Let T be an r.e. set of recursive real-valued functions whose domains are unions of open intervals. Furthermore, let ΓT = A1, A2, · · · be an algorithm sequence of T.
If T is indexed, then T is RealEx-inferable.
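The positive results above rest on identification-by-enumeration-style arguments. The following is our own toy sketch of that idea for a hypothetical enumerable family of constant functions on a fixed domain (simplifications: conjectures are indices rather than algorithms, and examples are rational boxes):

```python
# Sketch of learning by enumeration: conjecture the least index of a
# hypothesis consistent with all examples seen so far. The family
# "h_i is the constant function with value i" is our own illustration.

def consistent(i, examples):
    # an example ((p, a), (q, b)) fits the constant function i
    # iff i lies in the value box [q - b, q + b]
    return all(q - b <= i <= q + b for (_, _), (q, b) in examples)

def M(examples, bound=100):
    """Return the least index consistent with the examples seen so far."""
    for i in range(bound):
        if consistent(i, examples):
            return i
    return None

# Feeding ever tighter examples of the constant function 3 makes the
# conjectures converge to index 3.
stream = [((0, 1), (3, 2)), ((1, 1), (3, 1)), ((2, 1), (3, 0.5))]
conjectures = [M(stream[:n]) for n in range(1, 4)]
```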

5 Characteristic Sets of Recursive Real-Valued Functions

In this section, we investigate r.e. sets of recursive real-valued functions without restricting their domains, but instead by introducing finite sets of examples of the recursive real-valued functions to be learned. First, we introduce the concept of a characteristic set of a recursive real-valued function.

Definition 13. Let h be a recursive real-valued function. A full example of h is an example ⟨⟨p, α⟩, ⟨q, β⟩⟩ of h such that [p − α, p + α] ⊆ dom(h) and |q − h(x)| < β for each x ∈ [p − α, p + α].

Definition 14. For each S ⊆ R, by Inter(S) we denote the set of all interior points of S. Let h be a recursive real-valued function. Then, the interior function of h, denoted by h̆, is defined as follows: (1) h̆ ⊆ h. (2) dom(h̆) = Inter(dom(h)).


Definition 15. Let T be an r.e. set of recursive real-valued functions and h a function in T. A characteristic set CS of h w.r.t. T is a finite set of full examples of h satisfying the following conditions: (1) CS ⊆ EXAM(h′) implies h ⊆ h′ for each h′ ∈ T. (2) CS is a set of full examples of h̆, where h̆ is the interior function of h.

For an r.e. set T of recursive real-valued functions, there may exist a function h ∈ T which has no characteristic set w.r.t. T.

Example 2. Let C(0,1] and C[0,1) be the constant functions on the domains (0, 1] and [0, 1) such that C(0,1](s) = 0 and C[0,1)(t) = 0 for each s ∈ (0, 1] and t ∈ [0, 1), respectively. Consider the set C = {C(0,1], C[0,1)}. Then, there exists no characteristic set of the functions C(0,1] and C[0,1) w.r.t. C.

Now we consider an r.e. set T of recursive real-valued functions such that each h ∈ T always has a characteristic set w.r.t. T. Then, we have the following theorem.

Theorem 8. Let T be an admissible r.e. set of recursive real-valued functions. Furthermore, let ΓT = A1, A2, · · · be an algorithm sequence of T. If there exists a characteristic set of h w.r.t. T for each h ∈ T, then T is RealEx-inferable.

Example 3. For each i ∈ N+, we construct the following recursive real-valued function hi:

hi(x) = 1 if x > 1/2^i, and hi(x) = 0 if x ≤ 0.

Consider the set T0 = {h1, h2, · · · }. Then, the set {⟨7/2^{n+3}, 1/2^{n+3}⟩} is a characteristic set of hn w.r.t. T0 for each n ∈ N. Thus, T0 is RealEx-inferable.
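The full-example condition of Definition 13 can be checked mechanically for the functions h_i of Example 3; the sketch below is our own illustration:

```python
from fractions import Fraction

# Sketch: checking the "full example" condition of Definition 13 for the
# functions h_i of Example 3 (h_i(x) = 1 for x > 1/2^i, h_i(x) = 0 for x <= 0).

def h(i, x):
    if x > Fraction(1, 2 ** i):
        return 1
    if x <= 0:
        return 0
    return None                    # undefined on (0, 1/2^i]

def is_full_example(i, p, a, q, b):
    lo, hi = p - a, p + a
    # [p-a, p+a] must lie in dom(h_i): entirely <= 0 or entirely > 1/2^i
    if not (hi <= 0 or lo > Fraction(1, 2 ** i)):
        return False
    # h_i is constant on either branch, so it suffices to check |q - h_i(lo)| < b
    return abs(q - h(i, lo)) < b
```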

6 Conclusion

In this paper, we have investigated the learnability of r.e. sets of recursive real-valued functions by introducing coherent, admissible and indexed r.e. sets. First, we have shown that there exists an r.e. set of recursive real-valued functions that is not RealEx-inferable even if it is coherent, admissible and indexed. On the other hand, we have shown that every r.e. set of recursive real-valued functions defined on a fixed domain is RealEx-inferable. Also, we have shown that an r.e. set of recursive real-valued functions whose domains are unions of open intervals is RealEx-inferable if it is either admissible or indexed. Furthermore, by introducing a characteristic set of a recursive real-valued function, we have shown that, for an admissible r.e. set T of recursive real-valued functions, if every function in T has a characteristic set, then T is RealEx-inferable. It is a future work to find a necessary and sufficient condition for an r.e. set to be RealEx-inferable. It is also a future work to introduce concepts such as a finite tell-tale and finite elasticity [16] from language learning into the learning of r.e. sets of recursive real-valued functions and to characterize their RealEx-inferability.


References

1. D. Angluin, Inductive inference of formal languages from positive data, Information and Control 45, 117–135, 1980.
2. K. Apsītis, S. Arikawa, R. Freivalds, E. Hirowatari, C. H. Smith, On the inductive inference of recursive real-valued functions, Theoret. Comput. Sci. 219, 3–17, 1999.
3. L. Blum, M. Blum, Toward a mathematical theory of inductive inference, Information and Control 28, 125–155, 1975.
4. S. B. Cooper, B. Löwe, L. Torenvliet (eds.), New computational paradigms, Proc. 1st International Conference on Computability in Europe, LNCS 3526, 2005.
5. E. M. Gold, Language identification in the limit, Inform. Control 10, 447–474, 1967.
6. T. Grubba, P. Hertling, H. Tsuiki, K. Weihrauch (eds.), Proc. 2nd International Conference on Computability and Complexity in Analysis, 2005.
7. E. Hirowatari, S. Arikawa, Inferability of recursive real-valued functions, Proc. ALT'97, LNAI 1316, 18–31, 1997.
8. E. Hirowatari, S. Arikawa, A comparison of identification criteria for inductive inference of recursive real-valued functions, Theoret. Comput. Sci. 268, 351–366, 2001.
9. E. Hirowatari, K. Hirata, T. Miyahara, Prediction of recursive real-valued functions from finite examples, Proc. the Workshop on LLLL, JSAI, 91–97, 2005. Also to appear in LNAI.
10. E. Hirowatari, K. Hirata, T. Miyahara, S. Arikawa, On the prediction of recursive real-valued functions, Proc. Computability in Europe 2005, ILLC Publications, University of Amsterdam, 93–103, 2005.
11. S. Jain, D. Osherson, J. S. Royer, A. Sharma, Systems that learn: An introduction to learning theory (2nd ed.), The MIT Press, 1999.
12. K. Ko, Complexity theory of real functions, Birkhäuser, 1991.
13. M. B. Pour-El, J. I. Richards, Computability in analysis and physics, Springer-Verlag, 1988.
14. A. M. Turing, On computable numbers, with an application to the Entscheidungsproblem, Proc. London Mathematical Society 42, 230–265, 1936.
15. K. Weihrauch, Computable analysis – An introduction, Springer-Verlag, 2000.
16. K. Wright, Identification of unions of languages drawn from an identifiable class, Proc. COLT'89, 328–333, 1989.

A Logic for Probabilistic XML documents

Robin Hirsch and Evan Tzanis

Department of Computer Science, University College London
[email protected], [email protected]

Abstract. We introduce a probabilistic modal logic PXML for probabilistic XML documents. We prove that this logic has the finite model property and that the validity problem can be solved in EXPSPACE. The paper contains a number of worked examples.

1

Introduction

XML files usually assume a definite, crisp form. This restricts the applicability of XML files when the structures to be modelled involve uncertain information. In this paper we study the concept of probabilistic XML documents introduced in [NJ02] and we design a simple propositional probabilistic logic strong enough to cope with them. The outline of the paper is as follows. In section 2 we introduce, by way of an example, the concept of probabilistic XML documents and discuss related work. In section 3 we define a probabilistic modal logic called PXML, and in section 4 we give its semantics. In section 5 we present a number of worked examples illustrating the use of this logic. In section 6 we prove that PXML has the finite model property and in section 7 we prove that the validity problem can be decided in EXPSPACE.

2

Related Work

A well known slogan in the XML community states that XML was designed to describe data and to focus on what data is. The classic XML notation, though, is not strong enough to describe data of a probabilistic nature. An XML document contains a number of tags and their element content. A probabilistic extension of the XML structure consists of attaching probabilities to the tags. The importance and the possible readings of this extension were first introduced and studied in [NJ02,MvKA05]. As an example, we introduce the probabilistic XML document in figure 1. There are many papers combining logic and probability; the ones mentioned here are far from exhaustive. In [Koz81,Koz85], two equivalent semantics for a while-like probabilistic language are given. Papers of Halpern on probabilistic first-order logics include [AH94,Hal90,Hal91]. For probabilistic XML documents and logic, an interesting paper is [HL], which includes further references and points to a number of related problems with XML documents. During the 1990s various probabilistic temporal logics were studied, combining temporal logics with probabilities. A probabilistic extension of CTL was given in [HJ94,ASB+95], where a number of results concerning the decidability of model checking were also proved. In [FH94] a probabilistic epistemic logic was given and proved to be complete and decidable, based on a number of results of [FHM90] (we call this logic PEL, following [Koo03]). PEL can express linear combinations of probabilities. This linear restriction means that it is impossible to express in PEL such basic properties as the independence of p and q: "the probability of p ∧ q equals the probability of p times the probability of q". However, in [FHM90] a PSPACE decision procedure is given for a probabilistic language which allows multiplication and is thus able to express such properties as independence and conditional probabilities.

A Logic for Probabilistic XML documents

195

Another important extension is to allow updates of a probabilistic system. An update logic is given in [vB03]. A probabilistic logic with a modal update operator is in [Koo03, chapter 6]. A similar update system is considered in [Bacc90]. The contribution of this paper is twofold. On the one hand, the logics given above cannot express properties like: (i) the probability of p being true after three probabilistic updates is greater than 0.7 (expressed as ↓3 p > 0.7 in PXML), or (ii) arbitrarily nested applications of the probabilistic operator; for example, ↓5 (↓6 p > 0.7) > 0.5 is expressible in PXML but not in PEL. The main decidability result of the current paper is the first decidability result for a probabilistic logic which allows nesting of the probability operator and arbitrary iterations of the probability operator. On the other hand, we prove the finite model property and the decidability of PXML without first proving its completeness, as the authors of [FHM90] did for PEL. A completeness proof for our logic can be obtained along the lines of [FHM90], but many lemmas need to be rewritten to accommodate the iterative nature of our logic. We will present the completeness proof of PXML in a subsequent paper.

3

PXML

We are ready to start our formal work on probabilistic logics by defining the syntax of our logic PXML. Then we define the semantics and present examples illustrating the use of these semantics. In the following definition it might help to think of ↓ φ as 'the probability of φ being true' or 'the probability of φ being true after one probabilistic update'.

Definition 1. The language of PXML has numerical terms τ and propositional formulas φ. Assume we have two disjoint countable sets of "tags" and "contents".

Prop := tag | (tag, content)
τ := r | ↓n φ | (τ1 + τ2) | (τ1 × r)   (r ∈ ℝ, n ∈ ℕ \ {0})
φ := Prop | τ > 0 | ¬φ | φ ∧ ψ

The first two alternatives in the definition of formulas φ are called primitive formulas. We may write ↓ φ for the term ↓1 φ. Let PXML1 be the restriction of PXML obtained by only allowing ↓n when n = 1 (this is essentially the non-epistemic sublanguage of PEL). We also define the language PXML×, similar to PXML with one difference: the definition of a term of PXML× is τ := r | ↓ φ | (τ1 + τ2) | (τ1 × τ2). So PXML× allows multiplication of arbitrary terms, but it only allows the ↓n operator when n = 1. We write deg(φ) for the degree of terms and formulas, which is, roughly, the sum of the maximal numbers of nested applications of ↓ in φ.
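For concreteness, the grammar above can be encoded as a small abstract syntax tree with a degree function. This is our own illustrative sketch (node names such as Down and Gt0 are ours, not the paper's); we assume deg(↓n φ) = n + deg(φ), with the other connectives combining degrees by maximum:

```python
from dataclasses import dataclass

# Hypothetical AST for PXML (our own encoding, not from the paper).

@dataclass
class Prop:            # a tag or a (tag, content) pair
    name: object

@dataclass
class Down:            # the term ↓^n φ
    n: int
    body: object

@dataclass
class Add:             # τ1 + τ2
    left: object
    right: object

@dataclass
class Scale:           # τ × r
    term: object
    r: float

@dataclass
class Gt0:             # the formula τ > 0
    term: object

@dataclass
class Not:             # ¬φ
    body: object

@dataclass
class And:             # φ ∧ ψ
    left: object
    right: object

def deg(node) -> int:
    """Degree of a term or formula, under the convention assumed above."""
    if isinstance(node, (int, float, Prop)):
        return 0
    if isinstance(node, Down):
        return node.n + deg(node.body)
    if isinstance(node, (Add, And)):
        return max(deg(node.left), deg(node.right))
    if isinstance(node, Scale):
        return deg(node.term)
    if isinstance(node, Gt0):
        return deg(node.term)
    if isinstance(node, Not):
        return deg(node.body)
    raise TypeError(f"not a PXML node: {node!r}")

# ↓^5 (↓^6 p > 0.7) > 0.5 gets degree 5 + 6 = 11 under this convention:
nested = Gt0(Down(5, Gt0(Down(6, Prop("p")))))
```

Under this reading, the nested formula ↓5 (↓6 p > 0.7) > 0.5 mentioned above has degree 11.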

4

Semantics

We have two kinds of models: general and probabilistic.

Definition 2 (Models). A structure M for PXML or PXML× is M = (W, f, V) such that W ≠ ∅ is a set of possible worlds, V : Prop → ℘(W) assigns a set of worlds to each proposition, and f : W × W → {r ∈ ℝ : r ≥ 0} satisfies the bounded sum condition: for any w ∈ W the sum Σ_{v∈W} f(w, v) exists in ℝ. A structure M = (W, f, V) is called a probabilistic structure if, for all w ∈ W, we have Σ_{v∈W} f(w, v) = 1.

PXML-terms can be evaluated in a structure M = (W, f, V) by:

[r]^{M,w} = r, where r ∈ ℝ,
[↓φ]^{M,w} = Σ_{v : M,v |= φ} f(w, v),
[↓n φ]^{M,w} = Σ_v f(w, v) · [↓n−1 φ]^{M,v}, for n > 1,
[τ1 + τ2]^{M,w} = [τ1]^{M,w} + [τ2]^{M,w},
[τ × r]^{M,w} = [τ]^{M,w} · r.

For PXML× terms we also use [τ1 × τ2]^{M,w} = [τ1]^{M,w} · [τ2]^{M,w}. Formulas of these logics can be evaluated by (we only give the new case here): M, w |= τ > 0 iff [τ]^{M,w} > 0. If M is a structure, φ is a PXML formula (or a PXML× formula) and M, w |= φ, then (M, w) is a model of φ.

5

Examples

Probability: Given a formula φ, a probabilistic model M and a world w we can interpret [↓φ]^{M,w} as the probability of φ. Because PXML allows us to nest applications of ↓n, we can write such statements as ↓(↓p > 0.5) = 0.8, which is interpreted as "the probability that 'the probability of p is more than a half' is 0.8". The inclusion of a multiplication operator in PXML× allows us to express the independence of two formulas φ, ψ: ↓(φ ∧ ψ) = ↓φ × ↓ψ. The following formula is expressible in PXML× but not in the language introduced in [FHM90]: ↓(↓p > 0.3) × ↓q > 0.7.

Counting: The term ↓φ evaluates at a world w to Σ_{v : M,v |= φ} f(w, v). If f(w, v) = 1 uniformly, this term simply counts the number of successor worlds where φ is true. Let the accessibility relation in a Kripke-like structure be the 'child of' relation, so our Kripke structure is a family tree. The term ↓⊤, when evaluated at an individual w, represents the number of children of w. The formula ↓⊤ = 3 is true when evaluated at w iff w has exactly three children. The term ↓2 ⊤ represents the number of grandchildren of w, when evaluated at w. The formula ↓2 ⊤ = 3 is true at w if w has exactly three grandchildren. Without using the ↓2 operator we could express this as (↓(↓⊤ = 1) + 2 × ↓(↓⊤ = 2) = 3) ∧ (↓(↓⊤ > 2) = 0), but the formula ↓2 ⊤ = 3 is much simpler. Furthermore, when we allow accessibility relations with non-integral weights it will be impossible to find an equivalent expression using only ↓. It would be of some interest to investigate the application of PXML to the theory of counting modalities; the paper [DL06] seems to be closely related.

Deeper XML Trees: Consider the XML tree in figure 1 (this is part of a larger tree given in [NJ02, section 6.2]).

Fig. 1. A probabilistic XML tree, part of the tree of [NJ02, section 6.2]: the root chiefsofstates w has two chiefofstate children, with weights 0.5 and 0.2; below them are leaves such as (name, George W. Bush) with weight 0.7, (age, 55) with weights 0.35 and 0.3, and further leaves (name, Bill Clinton), (name, George Bush), (age, 54) and (age, 56).

Consider the following queries. Chiefs of state with an age equal to 55. This can be captured by the term ↓2 (age, 55). At the root, it evaluates to [↓2 (age, 55)]^{M,w} = Σ_{w′} f(w, w′) · [↓1 (age, 55)]^{M,w′} = 0.5 · 0.35 + 0.2 · 0.3 = 0.235. Thus M, w |= (↓2 (age, 55) = 0.235), which agrees with the corresponding part of the calculation in [NJ02]. Note that this query really requires the ↓2 operator and cannot be expressed in the logic PEL. Chiefs of state with a name equal to George Bush and an age equal to 55. The query "What is the probability that there is a chief of state called George Bush with an age of 55?" can be re-expressed as "What is the probability that there is a head of state definitely called George Bush and definitely 55 years old?". This corresponds to evaluating the term ↓1 ((↓1 (name, George Bush) = 1) ∧ (↓1 (age, 55) = 1)) at w. But this term evaluates to 0, since at each of the two successors of w it is not certain that the name of the chief of state is


George Bush. We can also express the query “What is the probability that there is a chief of state who is probably called George Bush and is possibly aged 55?”. Of course, this depends on what counts as probable and possible, but you could try this: ↓1 (↓1 (name, George Bush) > 0.6 ∧ ↓1 (age, 55) > 0.2) > 0.4
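The semantics of ↓n can be checked mechanically on small finite models. The sketch below is our own code; the tree data only approximates figure 1 (we read the weights off the 0.235 calculation above, since the full figure data is not all recoverable), so treat it as illustrative:

```python
# Our own sketch of evaluating ↓^n on a finite weighted tree.
# f[w] maps each successor of world w to its weight;
# val[w] is the set of propositions true at w.

f = {
    "w":  {"c1": 0.5, "c2": 0.2},   # two chiefofstate children of the root
    "c1": {"l1": 0.35},             # a leaf satisfying (age, 55), weight 0.35
    "c2": {"l2": 0.3},              # a leaf satisfying (age, 55), weight 0.3
    "l1": {}, "l2": {},
}
val = {"l1": {("age", 55)}, "l2": {("age", 55)}}

def down(n, prop, w):
    """[↓^n prop]^{M,w}: for n = 1 sum f(w, v) over successors v where prop
    holds; for n > 1 sum f(w, v) · [↓^{n-1} prop]^{M,v}."""
    if n == 1:
        return sum(wt for v, wt in f[w].items() if prop in val.get(v, set()))
    return sum(wt * down(n - 1, prop, v) for v, wt in f[w].items())

result = down(2, ("age", 55), "w")   # 0.5·0.35 + 0.2·0.3 ≈ 0.235
```

Evaluating ↓2 (age, 55) at the root reproduces the value 0.235 computed above.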

6

Finite Model Property

The complexity of the validity problem for the logic PXML1 — where ↓ applications can be nested, but ↓n is only allowed when n = 1, and all combinations of terms are linear — is shown to be NP-complete in [FH94, theorem 4.6]. A language which allows multiplication of terms but does not allow nesting of applications of ↓ is shown to be in PSPACE in [FHM90, theorem 5.3]. Our concern here is to prove the finite model property for the logics PXML× and PXML. In the following section we will deduce a complexity bound for both logics. Let θ be a satisfiable formula of either logic. By a standard unravelling argument, we can suppose that there is a tree with nodes W, edges E, root w and height r = deg(θ) such that for all x, y ∈ W, if f(x, y) > 0 then (x, y) ∈ E. Such a model is called a tree-like model with edges E. For any node v ∈ M, the set of successors of v is defined by suc_M(v) = {u ∈ M : (v, u) ∈ E}. For any node v ∈ W \ {w} there is a unique parent node π(v) satisfying (π(v), v) ∈ E, but π(w) is not defined. The set of ancestors of v is defined to be {v, π(v), π(π(v)), . . . , w}. Define the height h(v) of a node v in M by (i) h(w) = r and (ii) if h(π(v)) = i then h(v) = i − 1. The subtree generated by v in M is the subtree whose nodes are all the nodes below (or equal to) v in M, with edges, valuation and weight obtained by restriction from M. Let P be the set of primitive subformulas of θ and for i ≤ r let Pi ⊆ P be the set of primitive subformulas of θ of degree not more than i (thus Pr = P).

PXML× formulas. Let θ be a PXML× formula, let M = (W, V, f) and suppose M, w |= θ. As just remarked, we can assume that M is a tree-like model of depth r = deg(θ). We will define a new model M^p = (W^p, V^p, f^p) (the superscript p is for 'pruned') by deleting some nodes from W, restricting the valuation to W^p, and by restricting and altering the weight function f. Define an equivalence relation ∼ on suc_M(w) by

x ∼ x′ ⇐⇒ ∀λ ∈ P_{r−1} (M, x |= λ ↔ M, x′ |= λ)    (1)

Write [x] for the equivalence class containing x. Since |P_{r−1}| ≤ |θ| there are at most 2^{|θ|} equivalence classes. Let ρ be a choice function picking one member from each equivalence class, ρ : suc_M(w) → suc_M(w), such that (i) ρ(x) ∈ [x] and (ii) x ∼ x′ → ρ(x) = ρ(x′). Let W^p ⊆ W be defined by v ∈ W^p ⇐⇒ (v = w or ∃x ∈ suc_M(w) : ρ(x) is an ancestor of v), i.e. we keep the root and the subtrees generated by the chosen elements ρ(x), x ∈ suc_M(w), but delete all other nodes. Define the valuation V^p by V^p(q) = V(q) ∩ W^p, for any proposition q. For the weight function, let f^p(w, ρ(x)) = Σ_{x′∼x} f(w, x′) for x ∈ suc_M(w), and for y, z ∈ W^p, if y ≠ w then f^p(y, z) = f(y, z). This completes the definition of M^p.

Lemma 1. Let M^p be defined from M as above. Then M^p, w |= θ. For each subset X ⊆ P_{r−1} there is at most one successor y of the root w satisfying all of X but none of P_{r−1} \ X.

Proof. First we check that if ψ is any subformula of θ of degree r − 1 or less, then [↓ψ]^{M,w} = [↓ψ]^{M^p,w}. For any x ∼ x′ ∈ suc_M(w) and any φ ∈ P_{r−1} we have M, x |= φ ⇐⇒ M, x′ |= φ, by definition of ∼. By structured formula induction it follows that for any subformula ψ of θ of degree r − 1 or less, M, x |= ψ ⇐⇒ M, x′ |= ψ. Hence [↓ψ]^{M,w} = Σ_{x ∈ suc_M(w) : M,x |= ψ} f(w, x),

and then:

[↓ψ]^{M,w} = Σ_{ρ(x) ∈ suc_M(w) : M, ρ(x) |= ψ}  Σ_{x′ ∼ ρ(x)} f(w, x′)
           = Σ_{ρ(x) ∈ suc_{M^p}(w) : M^p, ρ(x) |= ψ}  f^p(w, ρ(x))  =  [↓ψ]^{M^p,w}

It follows by structured term induction that if τ is any term occurring in θ then [τ]^{M,w} = [τ]^{M^p,w}. Let λ be any primitive formula occurring in θ. If λ is a proposition then, by definition of V^p, M, w |= λ ⇐⇒ M^p, w |= λ. If λ is the primitive formula (τ > 0) then by the previous paragraph [τ]^{M,w} = [τ]^{M^p,w} and therefore M, w |= (τ > 0) ⇐⇒ M^p, w |= (τ > 0). It now follows, by a simple structured formula induction, that for any boolean combination φ of primitive subformulas of θ, M, w |= φ ⇐⇒ M^p, w |= φ. In particular, M, w |= θ ⇐⇒ M^p, w |= θ. For any X ⊆ P_{r−1}, the set of all successor nodes of the root w of M satisfying all of X but none of P_{r−1} \ X forms a ∼-equivalence class. By construction of M^p, all but one of these will be deleted in M^p.

Theorem 1. Let θ be a PXML× formula of degree r. If θ is satisfiable then it has a tree-like model M of height r where, for each node v of height i + 1 and each X ⊆ Pi, there is at most one successor node y of v such that M, y |= X but for all λ ∈ Pi \ X, M, y |= ¬λ. M has size at most 2^{|θ|·(r+1)}. The theorem is proved by repeated use of lemma 1.

PXML formulas. For PXML formulas, we cannot simply take a model and 'prune' it, as above, because two child nodes might be equivalent while their successors give different contributions to a term ↓n ψ, say, and the root formula can have access to nodes further down the tree. Instead, for PXML formulas, we make use of the fact that PXML allows only linear combinations of terms ↓n ψ. Instead of 'pruning' a tree-like model for a PXML formula θ, we replace a set of 'equivalent' successor nodes by a single weighted average. The linearity of our terms makes this work. Now we explain this more carefully. Let M, w |= θ. As above, we can assume that the edges with non-zero weight in M form a subgraph of a tree T of depth r = deg(θ) with root w and that every branch has length r. Define the parent, successors, ancestors and height of a node, as before.
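The pruning step of Lemma 1 is easy to sketch in code: successors of a node are grouped by the set of primitive subformulas they satisfy, and each class is collapsed to one representative carrying the class's total weight. The function below is our own illustration, not the paper's construction verbatim:

```python
from collections import defaultdict

# Our own sketch of the pruning in Lemma 1.  Successors of the root are
# grouped by the set of primitive subformulas (of degree < r) they satisfy;
# one representative per ∼-class survives, with the class's summed weight.

def prune_root(weights, labels):
    """weights: successor -> f(w, x); labels: successor -> frozenset of
    primitive subformulas true at x.  Returns the pruned weights f^p(w, ·)."""
    classes = defaultdict(list)
    for x in weights:
        classes[labels[x]].append(x)
    pruned = {}
    for members in classes.values():
        rep = members[0]                          # the choice function ρ
        pruned[rep] = sum(weights[x] for x in members)
    return pruned

weights = {"x1": 0.2, "x2": 0.3, "x3": 0.5}
labels = {"x1": frozenset({"p"}), "x2": frozenset({"p"}), "x3": frozenset()}
pruned = prune_root(weights, labels)

# [↓p] at the root is preserved by the pruning:
orig = sum(wt for x, wt in weights.items() if "p" in labels[x])
new = sum(wt for x, wt in pruned.items() if "p" in labels[x])
```

Here x1 and x2 are merged, and the value of ↓p at the root is unchanged, as the proof of Lemma 1 requires.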
As before, let P be the set of primitive subformulas of θ and for i ≤ r let Pi ⊆ P be the set of primitive subformulas of θ of degree not more than i. Let v be a node of M of height i where 0 < i ≤ r. Define an equivalence relation ∼ on suc_M(v) by (1), as before. Recall that there are at most 2^{|θ|} ∼-equivalence classes. We will define M^w = (W^w, f^w, V^w) from M = (W, f, V) by replacing each equivalence class by a single weighted average (here, the superscript w is for 'weighted'): W^w = (W \ suc_M(v)) ∪ {[x] : x ∈ suc_M(v)}, i.e. we replace the (possibly infinite) set of successors of v by the (finite) set of ∼-equivalence classes. Let V^w(p) ∩ (W \ suc_M(v)) = V(p) ∩ (W \ suc_M(v)), for all propositions p (i.e. no change to the valuation except on successors of v). If p occurs in θ and x ∈ suc_M(v), let [x] ∈ V^w(p) ⇐⇒ x ∈ V(p). Since p ∈ Pi, if x′ ∈ [x] we have x ∈ V(p) ⇐⇒ x′ ∈ V(p), so this definition of V^w(p) is well-defined. If p does not occur in θ the definition is arbitrary, say [x] ∉ V^w(p) in this case, for all x ∈ suc_M(v). Let y, z ∈ W^w.

I. If y ∈ W \ ({v} ∪ suc_M(v)), let f^w(y, z) = f(y, z) if z ∈ W, and 0 otherwise.

II. If y = v, let f^w(v, z) = Σ_{x′∼x} f(v, x′) if z = [x] for some x ∈ suc_M(v), and 0 otherwise.

III. If y = [x] for some x ∈ suc_M(v), let f^w([x], z) = f(π(z), z) · f(v, π(z)) / Σ_{x′∼x} f(v, x′) if π(z) ∈ [x], and 0 otherwise.

Lemma 2. Let M, w |= θ and let v ∈ M be any node of height i > 0. The structure M^w (defined above) satisfies:

1. For any world y, if y ∉ suc_M(v) and y is not an ancestor of v, then [τ]^{M^w,y} = [τ]^{M,y} (for any term τ) and M, y |= ψ ⇐⇒ M^w, y |= ψ (for any formula ψ).
2. For any subterm τ of θ of degree less than i and any x ∈ suc_M(v) we have

   [τ]^{M^w,[x]} = ( Σ_{x′∼x} f(v, x′) · [τ]^{M,x′} ) / ( Σ_{x′∼x} f(v, x′) ).

   For all subformulas φ of θ of degree less than i and all x ∼ x′ ∈ suc_M(v) we have: M, x |= φ ⇐⇒ M, x′ |= φ ⇐⇒ M^w, [x] |= φ. The map [x] ↦ {λ ∈ Pi : M^w, [x] |= λ} is an injection from suc_{M^w}(v) into ℘(Pi).
3. For any subterm τ of degree i or less occurring in θ, we have [τ]^{M^w,v} = [τ]^{M,v}. For any subformula ψ of degree i or less occurring in θ, we have M^w, v |= ψ ⇐⇒ M, v |= ψ.
4. Let u be an ancestor of v of height j ≥ i. For any subterm τ of θ of degree j or less, we have [τ]^{M,u} = [τ]^{M^w,u}. For any subformula φ of θ of degree j or less, we have M, u |= φ ⇐⇒ M^w, u |= φ.

For a detailed proof see [HT07].

Theorem 2. If a PXML formula θ is satisfiable then it is satisfiable in a tree-like model where, for each node v of height i + 1 and each X ⊆ Pi, there is at most one successor of v satisfying all of X but none of Pi \ X. Such a model has size not more than 2^{|θ|·(deg(θ)+1)}.

Proof. Let M, w |= θ. By lemma 2 (4) applied to the root w, we can replace M by a tree-like model M^w |= θ of height r and root w, and by lemma 2 (2), for each X ⊆ P_{r−1} there is at most one successor of w satisfying all of X but none of P_{r−1} \ X. Now apply lemma 2 to each successor of w in M^w and continue to work down the tree, applying the lemma to each node in turn, until we obtain a model M∗ such that the unique successor property holds at every node. By lemma 2 (4), this will not alter the truth of subformulas of θ when evaluated at the root w. Now every node has at most |℘(Pi)| ≤ |℘(P)| ≤ 2^{|θ|} successors in M∗ and M∗ has depth r. It follows that M∗ has at most 2^{|θ|·(r+1)} nodes.
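The weighted-average construction behind Theorem 2 can likewise be checked on a toy instance (our own data, not from the paper): two ∼-equivalent successors x1, x2 of v are merged into one class node, grandchild weights are rescaled as in case III, and the value of ↓2 at v is unchanged:

```python
# Our own toy instance of the averaging construction: x1 ∼ x2 are successors
# of v; each grandchild z carries (weight, whether z satisfies φ).

f_v = {"x1": 0.4, "x2": 0.6}                      # f(v, x1), f(v, x2)
kids = {"x1": {"z1": (0.5, True), "z2": (0.5, False)},
        "x2": {"z3": (0.2, True)}}

total = sum(f_v.values())                         # Σ_{x'∼x1} f(v, x')

# Case II: the class node [x1] receives the class's total weight from v.
fw_v_class = total

# Case III: f^w([x1], z) = f(π(z), z) · f(v, π(z)) / Σ_{x'∼x1} f(v, x').
fw_class = {z: wz * f_v[x] / total
            for x, zs in kids.items() for z, (wz, _) in zs.items()}
sat = {z: ok for zs in kids.values() for z, (_, ok) in zs.items()}

# [↓²φ] at v before and after the merge:
down2_orig = sum(f_v[x] * sum(wz for wz, ok in kids[x].values() if ok)
                 for x in f_v)
down2_new = fw_v_class * sum(wt for z, wt in fw_class.items() if sat[z])
```

With these numbers both sides come to 0.4·0.5 + 0.6·0.2 = 0.32, illustrating why the linearity of PXML terms makes the replacement safe.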

7

Decidability

In the 1930s Tarski worked out how to eliminate quantifiers in the theory of real closed fields, although his result only gave a non-elementary decision procedure for polynomial equations over the reals [Tar51]. Several papers, including [Col75,GV88], provided much more efficient algorithms for this decision problem. The following proposition is [Can88, theorem 3.3].

Proposition 1 (Tarski, Canny). Let x̄ be a sequence of variables and let {p(x̄) > 0 : p ∈ P} ∪ {q(x̄) ≥ 0 : q ∈ Q} be a set of polynomial inequalities with rational coefficients. There is a PSPACE algorithm¹ that determines whether there is an assignment x̄ ↦ r̄ from the variables to a sequence of real numbers r̄ satisfying each of the polynomial inequalities.

Fix some formula θ. Let Pi be the set of primitive subformulas of θ of degree i or less. A subset X ⊆ Pi is said to be i-consistent iff the formula ⋀X ∧ ⋀_{λ ∈ Pi \ X} ¬λ has a model.

¹ The size of this instance is calculated by writing each rational coefficient as m/n, where m and n are integers written as signed binary numbers.


Lemma 3. Let θ be a PXML× or a PXML formula. θ is satisfiable iff it has a tree-like model M of depth equal to the degree of θ such that, for each node v of height i + 1 in M, the map y ↦ {λ ∈ Pi : M, y |= λ} is a bijection from suc_M(v) to {X ⊆ Pi : X is i-consistent}.

Proof. By theorems 1 and 2. See [HT07] for a detailed proof.

Theorem 3. The validity problems for PXML× and PXML are decidable. There is an EXPSPACE algorithm to determine validity of either type of formula.

Proof. Let θ be a PXML× or a PXML formula. We will show how to decide if θ is satisfiable or not (validity of a formula can then be determined by checking the satisfiability of the negated formula). By propositional manipulation, we can assume that θ is a conjunction of primitive formulas (converting θ to disjunctive normal form is an NP-complete problem, so this conversion can certainly be achieved in PSPACE). Assume inductively that we have an EXPSPACE algorithm that checks whether formulas whose degree is strictly less than that of θ are satisfiable. By lemma 3, θ is satisfiable iff it is satisfiable at the root w of a tree-like model M of height equal to the degree of θ, such that for each node v of height i + 1 the map

y ↦ {λ ∈ Pi : M, y |= λ} is a bijection suc_M(v) → {X ⊆ Pi : X is i-consistent}.    (2)

Let T be a tree with root w, height equal to the degree r of θ, and where for i < r, if v is a node of height i + 1 then the successors of v are {(v, X) : X is i-consistent}. Our inductive hypothesis allows us to calculate which X are i-consistent, in EXPSPACE. This determines the tree T. By lemma 3, θ is satisfiable iff it has a model M = (T, f, V), for some valuation V and some weight function f, where f(u, v) > 0 implies that (u, v) is an edge of T; for all nodes (v, X) of height i, M, (v, X) |= ⋀X ∧ ⋀_{λ ∈ Pi \ X} ¬λ; and M, w |= θ. The last two requirements determine V: (v, X) ∈ V(p) ⇐⇒ p ∈ X, and w ∈ V(p) iff p is one of the conjuncts of θ. So far we have calculated the nodes, the height, the valuation and the successor relation in M; it remains to specify the weight function f. For each node v ≠ w in M let x_v be a variable representing the value f(π(v), v) in M. If (τ > 0) is a primitive formula in X, for some i-consistent set X, then the requirement M, v |= (τ > 0) is equivalent to a certain polynomial inequality in the variables corresponding to the nodes below v, as we explain next. The term τ evaluated at v translates to a polynomial as follows: [r]^{M,v} = r; [↓ψ]^{M,v} = Σ{x_{(v,X)} : X ⊆ Pi is i-consistent, (⋀X ∧ ¬ψ) is not consistent}; [↓n ψ]^{M,v} = Σ_{y ∈ suc(v)} x_y · [↓n−1 ψ]^{M,y}; [τ1 + τ2]^{M,v} = [τ1]^{M,v} + [τ2]^{M,v}; [r × τ1]^{M,v} = r · [τ1]^{M,v}, where r ∈ ℝ, n > 1. Note, in the second case, that we can determine in EXPSPACE whether ⋀X ∧ ¬ψ is consistent, by our induction hypothesis. In this way, M, v |= (τ > 0) corresponds to the polynomial inequality [τ]^{M,v} > 0. Similarly, M, v |= ¬(τ > 0) translates to the polynomial inequality [τ]^{M,v} ≤ 0. Thus, θ is satisfiable iff θ is satisfiable in a model M based on the tree T, as defined above, for certain values of the variables x_v (v ∈ M), iff the variables x_v (v ∈ M) satisfy a set of polynomial inequalities. By proposition 1, this can be determined by an algorithm running in space polynomial in the size of the set of polynomial inequalities, i.e. exponential space in terms of |θ|.
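As a toy instance of this reduction (our own example, not from the paper), take θ = (↓p > 0.3) ∧ ¬(↓p > 0.6), of degree 1 with P0 = {p}. The leaves of T correspond to the 0-consistent sets {p} and ∅, the term ↓p translates to the single variable attached to the leaf (w, {p}), and satisfiability amounts to the feasibility of the resulting inequalities, witnessed here by a concrete assignment:

```python
# Our own toy instance of the reduction: θ = (↓p > 0.3) ∧ ¬(↓p > 0.6).
# The 0-consistent sets are {p} and ∅; x[X] is the weight variable of
# leaf (w, X), and [↓p] translates to x[{p}].

P = frozenset({"p"})
EMPTY = frozenset()

constraints = [
    lambda x: x[P] > 0.3,                       # conjunct  ↓p > 0.3
    lambda x: not (x[P] > 0.6),                 # conjunct  ¬(↓p > 0.6)
    lambda x: all(v >= 0 for v in x.values()),  # weights are non-negative
]

# Feasibility of the inequalities, witnessed by a concrete assignment
# (in general this is delegated to the PSPACE procedure of Proposition 1):
witness = {P: 0.5, EMPTY: 0.1}
satisfiable = all(c(witness) for c in constraints)
```

For linear, degree-1 instances like this one the inequalities are linear; nested ↓n terms produce genuinely polynomial constraints, which is where Proposition 1 is needed.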

8

Conclusion

In this paper a propositional modal logic PXML was given for probabilistic XML documents with updates. Our motivation was to provide a logical formalization of the work of [NJ02]. We proved that PXML has the finite model property and that its validity problem can be solved in EXPSPACE.


References

[AH94] Martin Abadi and Joseph Y. Halpern. Decidability and expressiveness for first-order logics of probability. Information and Computation, 112(1):1–36, 1994.
[ASB+95] Adnan Aziz, Vigyan Singhal, Felice Balarin, Robert K. Brayton, and Alberto L. Sangiovanni-Vincentelli. It usually works: The temporal logic of stochastic systems. In Proc. of Conference on Computer-Aided Verification, 1995.
[Bacc90] Fahiem Bacchus. Representing and Reasoning with Probabilistic Knowledge: A Logical Approach to Probabilities. The MIT Press, Cambridge, Massachusetts, 1990.
[Can88] J. Canny. Some algebraic and geometric computations in PSPACE. In Proc. of the 20th ACM Symposium on Theory of Computing, pages 460–467, 1988.
[Col75] G. E. Collins. Quantifier elimination for real closed fields by cylindrical algebraic decomposition. In Automata Theory and Formal Languages, LNCS 33, pages 134–183, 1975.
[DL06] Stéphane Demri and Denis Lugiez. Complexity of modal logics with Presburger constraints. Laboratoire Spécification et Vérification, Research Report LSV-06-15, 2006.
[FH94] Ronald Fagin and Joseph Y. Halpern. Reasoning about knowledge and probability. Journal of the ACM, 41(2):340–367, 1994.
[FHM90] Ronald Fagin, Joseph Y. Halpern, and Nimrod Megiddo. A logic for reasoning about probabilities. Information and Computation, 87(1–2):78–128, 1990.
[Fin72] K. Fine. In so many possible worlds. Notre Dame Journal of Formal Logic, 13(4):516–520, 1972.
[Gob70] L. Goble. Grades of modality. Logique et Analyse, 13:323–334, 1970.
[GV88] D. Grigoryev and N. Vorobjov. Solving systems of polynomial inequalities in subexponential time. Journal of Symbolic Computation, 5(1–2):37–64, 1988.
[Hal90] Joseph Y. Halpern. An analysis of first order logics of probability. Artificial Intelligence, 46:311–350, 1990.
[Hal91] Joseph Y. Halpern. The relationship between knowledge, belief, and certainty. Annals of Mathematics and Artificial Intelligence, 4:301–322, 1991.
[HJ94] Hans Hansson and Bengt Jonsson. A logic for reasoning about time and reliability. Formal Aspects of Computing, 6(5):512–535, 1994.
[HT07] Robin Hirsch and Evan Tzanis. A probabilistic logic for XML documents with applications to updates. (A complete version of the current paper.) Available at: http://www.cs.ucl.ac.uk/staff/E.Tzanis/cie/paper.pdf
[HL] A. Hunter and W. Liu. Measuring the quality of uncertain information using possibilistic logic. In Quantitative and Qualitative Approaches to Reasoning with Uncertainty, LNCS 3571, pages 415–426, 2005.
[Koo03] Barteld Kooi. Knowledge, Chance, and Change. PhD thesis, University of Amsterdam, 2003.
[Koz81] Dexter Kozen. Semantics of probabilistic programs. Journal of Computer and System Sciences, 22:328–350, 1981.
[Koz85] Dexter Kozen. A probabilistic PDL. Journal of Computer and System Sciences, 30:162–178, 1985.
[MvKA05] M. van Keulen, A. de Keijzer, and W. Alink. A probabilistic XML approach to data integration. In Proceedings of ICDE 05, pages 459–470, 2005.
[NJ02] A. Nierman and H. Jagadish. ProTDB: Probabilistic data in XML. In Proceedings of VLDB 2002, Lecture Notes in Computer Science, volume 2590, pages 646–657, 2002.
[Tar51] A. Tarski. A Decision Method for Elementary Algebra and Geometry. Univ. of California Press, Berkeley, 1951.
[vB03] J. van Benthem. Conditional probability and update logic. Journal of Logic, Language and Information, 12:409–421, 2003.

Computational Power of Intramolecular Gene Assembly

Tseren-Onolt Ishdorj^{1,3}, Ion Petre^{1,2}, and Vladimir Rogojin^{1,2}

1 Computational Biomodelling Laboratory, Department of Information Technology, Åbo Akademi University, Turku 20520, Finland. {tishdorj,ipetre,vrogojin}@abo.fi
2 Turku Centre for Computer Science, Turku 20520, Finland
3 Research Group on Natural Computing, Department of CS and AI, Sevilla University, Avda. Reina Mercedes s/n, 41012 Sevilla, Spain

Abstract. The process of gene assembly in ciliates, an ancient group of organisms, is one of the most complex instances of DNA manipulation known in any organism. Three molecular operations (ld, hi, and dlad) have been postulated for the gene assembly process, [3], [1]. We propose in this paper a mathematical model for contextual variants of ld and dlad on strings: recombinations can be done only if certain contexts are present. We prove that the proposed model is Turing-universal.

1

Introduction

Ciliates are an ancient group of eukaryotes (about 2.5 billion years old). They are known to be the most complex unicellular organisms on Earth. The main special feature that distinguishes them from other eukaryotes is nuclear duality: ciliates have two types of nuclei (micronucleus and macronucleus) performing completely different functions. Micronuclei are used mainly to store genetic information for future generations, while macronuclei contain genes used to produce proteins during the life-time of a cell. Genomes are stored in these two types of nuclei in two completely different ways: micronuclear genes are highly fragmented and shuffled, with fragments (coding blocks) separated from each other by non-coding blocks, while in macronuclei each DNA molecule usually contains one gene stored in an assembled (non-fragmented) way. During sexual reproduction, coding blocks from micronuclei get assembled into macronuclear genes. For details related to ciliates and the gene assembly process we refer to [6], [14], [15]. Two models were proposed for the gene assembly process in ciliates: the intermolecular model in [7], [9], [10] and the intramolecular model in [3] and [16]. Both are based on so-called "pointers": short nucleotide sequences (about 20 bp) lying on the borders between coding and non-coding blocks. Each coding block E starts with a pointer-sequence exactly repeating the pointer-sequence at the end of the coding block preceding E in the assembled gene. It is currently believed that the pointers guide the alignment of coding blocks during the gene assembly process. The bulk of the research on the intermolecular model concentrates on the computational power of the model, in various formulations. E.g., in [7], the so-called guided recombination systems were introduced, defining a context-based applicability of the model. The authors proved that this intermolecular guided recombination system with insertion/deletion operations is computationally universal.
For this, they constructed for each Turing machine a guided recombination system, such that for each computation of the Turing machine there is a corresponding


sequence of recombinations in the guided recombination system. Crucially, the input of the recombination system has to be given in a large enough number of copies. Most of the research on the intramolecular model concentrates on the combinatorial properties of the gene assembly process, including the number and the type of operations used in the assembly, parallelism, or invariants. In this paper we initiate a study of the intramolecular model from the perspective of computability theory. Using a similar approach as in [7], we introduce a context-based version of the intramolecular model and prove that it is Turing universal. We prove that any Turing machine may be simulated through intramolecular recombination systems: for any Turing machine M there exists a recombination system G such that for any word w, w is accepted by M if and only if ϕ(w) is accepted by G, for a suitable encoding ϕ. Unlike in the intermolecular case, no multiplicities are needed here, since the intramolecular model conjectures that all useful (genetic) information is preserved on a single molecule throughout the assembly.

2

Preliminaries

We assume the reader to be familiar with the basic elements of formal languages and Turing computability [17], and of DNA computing [13]. We present here only some of the necessary notions and notation. An alphabet is a finite set of symbols (letters), and a word (string) over an alphabet Σ is a finite sequence of letters from Σ; the empty word is denoted by λ. The set of all words over an alphabet Σ is denoted by Σ∗. The set of all non-empty words over Σ is denoted by Σ+, i.e., Σ+ = Σ∗ \ {λ}. The length |x| of a word x is the number of symbols that x contains. The empty word has length 0. Given two words x and y, the concatenation of x and y (denoted xy) is the word z consisting of all symbols of x followed by all symbols of y; thus |z| = |x| + |y|. The concatenation of a word x with itself k times is denoted x^k, and x^0 = λ. We denote by |x|_S the number of letters from the subset S ⊆ Σ occurring in the word x, and by |x|_a the number of letters a in x. If w = xy, for some x, y ∈ Σ∗, then x is called a prefix of w and y is called a suffix of w; if w = xyz for some x, y, z ∈ Σ∗, then y is called a substring of w. A rewriting system M = (S, Σ ∪ {#}, P) is called a Turing machine (abbreviated TM), [17], where: (i) S and Σ ∪ {#}, where # ∉ Σ and Σ ≠ ∅, are two disjoint sets referred to as the state and the tape alphabets; we fix a symbol of Σ, denote it ⊔, and call it the "blank symbol". (ii) Elements s0 and sf of S are the initial and the final states, respectively. (iii) The productions (rewriting rules) of P are of the forms (1)–(7):

(1) si a → sj b
(2) si a c → a sj c
(3) si a # → a sj ⊔ #
(4) c si a → sj c a
(5) # si a → # sj ⊔ a
(6) sf a → sf
(7) a sf → sf

where si and sj are states in S, si ≠ sf, and a, b, c are in Σ. A Turing machine M is called deterministic if: – each word si a from the left side of rule (1) is not a subword of the left sides of rules (2)–(5), and


Tseren-Onolt Ishdorj, Ion Petre, and Vladimir Rogojin

– each subword si a from the left sides of rules (2) and (3) is not a subword of the left sides of rules (4) and (5), and vice versa, each subword si a from the left sides of rules (4) and (5) is not a subword of the left sides of rules (2) and (3), and – to each left side ui of the rules (1)–(5) there corresponds exactly one right side vi. A configuration of the Turing machine M is represented as a word #w1 si w2 # over Σ ∪ {#} ∪ S, where w1 w2 ∈ Σ∗ represents the contents of the tape, the #'s are boundary markers, and the position of the state symbol si indicates the position of the read/write head on the tape: if si is positioned to the left of a letter a, the read/write head is placed over the cell containing a. The TM M passes from one configuration to another according to its set of rules P. We say that the Turing machine M halts on a word w if there exists a computation such that, started with the read/write head positioned at the beginning of w, the TM eventually reaches the final state, i.e., if #s0 w# derives #sf # by successive applications of the rewriting rules (1)–(7) of P. The language L(M) accepted by the TM M is the set of words on which M halts. If the TM is deterministic, then only one computation is possible for each word. The family of languages accepted by Turing machines coincides with the family of languages accepted by deterministic Turing machines. Following an approach developed in a series of works (see [11], [12], [4], and [8]), we use contexts to restrict the application of molecular recombination operations [13], [1]. First, we give the formal definition of splicing rules. Consider an alphabet Σ and two special symbols #, $ not in Σ. A splicing rule (over Σ) is a string of the form r = u1#u2$u3#u4, where u1, u2, u3, u4 ∈ Σ∗. (For maximal generality, we place no restriction on the strings u1, u2, u3, u4. The cases u1u2 = λ or u3u4 = λ could be ruled out as unrealistic.)
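The configuration-rewriting view of a Turing machine described above can be sketched directly: a configuration is a string such as #s0w#, and one derivation step replaces an occurrence of a rule's left side by its right side. The following minimal Python sketch is ours and not from the paper; the toy machine (accepting words a^n by entering the final state and then consuming the tape via rules of types (1) and (6)) is purely illustrative.

```python
def step(config, rules):
    """Apply the first applicable rewriting rule (lhs -> rhs) once."""
    for lhs, rhs in rules:
        if lhs in config:
            return config.replace(lhs, rhs, 1)
    return None  # no rule applies: the computation halts

# Toy machine over Sigma = {a}; states rendered as tokens 's0', 'sf'.
# 's0a' -> 'sfa' is of the form (1); 'sfa' -> 'sf' is of the form (6).
rules = [("s0a", "sfa"), ("sfa", "sf")]

config = "#s0aaa#"          # initial configuration #s0 w# for w = aaa
while config != "#sf#":
    config = step(config, rules)
# reaching #sf# means the machine halts, so 'aaa' is accepted
```

Each `step` call mirrors one application of a rewriting rule of P to the current configuration string.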
For a splicing rule r = u1#u2$u3#u4 and strings x, y, z ∈ Σ∗ we write (x, y) ⊢r z if and only if x = x1u1u2x2, y = y1u3u4y2, and z = x1u1u4y2, for some x1, x2, y1, y2 ∈ Σ∗. We say that we splice x and y at the sites u1u2 and u3u4, respectively, and the result is z. This is the basic operation of DNA molecule recombination. A splicing scheme [5] is a pair R = (Σ, ∼), where Σ is the alphabet and ∼, the pairing relation of the scheme, satisfies ∼ ⊆ (Σ+)3 × (Σ+)3. Assume we have two strings x, y and a relation between two triples of nonempty words (α, p, β) ∼ (α′, p, β′) such that x = x′αpβx″ and y = y′α′pβ′y″; then the strings obtained by recombination in these contexts are z1 = x′αpβ′y″ and z2 = y′α′pβx″. Given a pair (α, p, β) ∼ (α′, p, β′) and two strings x and y as above, we consider just the string z1 = x′αpβ′y″ as the result of the recombination (we call it a one-output recombination), because the string z2 = y′α′pβx″ is the result of the one-output recombination with respect to the symmetric pair (α′, p, β′) ∼ (α, p, β).

2.1

Intramolecular Gene Assembly Operations

The intramolecular operations excise non-coding blocks from the micronuclear DNA molecule, interchange the positions of some portions of the molecule, or invert them, so as to obtain after some rearrangements the DNA molecule containing a continuous succession of coding blocks, i.e., the assembled gene. Contrary to the intermolecular model, all the molecular operations in the intramolecular model are performed within a single molecule. We recall below the three intramolecular operations conjectured in [3] and [16] for gene assembly; they were proved to be complete [2], i.e., any sequence of coding and non-coding blocks can be assembled into the macronuclear gene by means of these operations (for details related to the intramolecular model we refer to [1]): • ld excises a non-coding block flanked by two occurrences of the same pointer, in the form of a circular molecule, as shown in Figure 1.

Computational Power of Intramolecular Gene Assembly


• hi inverts the part of the molecule flanked by two occurrences of the same pointer, where one occurrence is the inversion of the other, as shown in Figure 2. • dlad swaps two parts of the molecule delimited by the same pair of pointers, as shown in Figure 3.

Fig. 1. Loop Recombination: (i) the molecule folds on itself, aligning the pointers of the direct repeat to form a loop; (ii) enzymes cut at the pointer sites; (iii) recombination (hybridization) takes place. As a result, the portion of the molecule inside the loop is excised in the form of a circular molecule.

Fig. 2. Hairpin Recombination: (i) the molecule folds on itself, aligning the pointers of the inverted repeat to form a hairpin; (ii) enzymes cut at the pointer sites; (iii) recombination (hybridization) takes place. As a result, the portion of the molecule inside the hairpin is inverted.

Fig. 3. Double-Loop Recombination: (i) the molecule folds on itself, aligning equal pointers of the repeated pair to form a double loop; (ii) enzymes cut at the pointer sites; (iii) recombination (hybridization) takes place. As a result, the portions of the molecule inside the loops interchange their places.

3

The Contextual Intramolecular Operations

We define the contextual intramolecular translocation and deletion operations as generalizations of the dlad and ld operations, respectively. We follow here the style of the contextual intermolecular recombination operations used in [7]. We consider a splicing scheme R = (Σ, ∼). Definition 1. The contextual intramolecular translocation operation with respect to R is defined as trlp,q(x p u q y p v q z) = x p v q y p u q z, where R contains relations (α, p, β) ∼ (α′, p, β′) and (γ, q, δ) ∼ (γ′, q, δ′) such that x = x′α, uqy = βu′ = u″α′, vqz = β′v′, xpu = x″γ, ypv = δy′ = y″γ′ and z = δ′z′. We say that the operation trlp,q is applicable if the contexts of the two occurrences of p, as well as the contexts of the two occurrences of q, are in the relation ∼. The substrings p and q are called pointers. As a result of applying trlp,q, the strings u and v, each flanked by the pointers p and q, are swapped. If a non-empty word u yields a word v by the trlp,q operation, we write u ⇒trlp,q v and say that u is recombined to v by the trlp,q operation.


Definition 2. The contextual intramolecular deletion operation with respect to R is defined as delp(x p u p y) = x p y, where R contains a relation (α, p, β) ∼ (α′, p, β′) such that x = x′α, u = βu′ = u″α′, and y = β′y′. As a result of applying delp, the string u flanked by two occurrences of p is removed, provided that the contexts of those occurrences of p are in the relation ∼. If a non-empty word u yields a word v by delp, we write u ⇒delp v and say that the word u is recombined to v by the delp operation. We define the set of all contextual intramolecular operations guided by ∼ as follows: R̃ = {trlp,q, delp | (α, p, β) ∼ (α′, p, β′), (γ, q, δ) ∼ (γ′, q, δ′) for some α, α′, β, β′, γ, γ′, δ, δ′, p, q ∈ Σ+}. Now, we define an accepting intramolecular recombination (AIR) system as a language-accepting device that captures series of dispersed homologous recombination events on a single micronuclear molecule with a scrambled gene. Definition 3. An accepting intramolecular recombination system is a quadruple G = (Σ, ∼, α0, wt), where R = (Σ, ∼) is the splicing scheme, α0 ∈ Σ∗ is the start word, and wt ∈ Σ+ is the target word. The language accepted by G is defined as L(G) = {w ∈ Σ∗ | α0 w ⇒∗R̃ wt}.
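Ignoring the context conditions that ∼ imposes (which the definitions above use to guard applicability), the pure rewriting effect of the two operations can be sketched as follows. The function names and the regex-based decomposition are ours, for illustration only.

```python
import re

def trl(w, p, q):
    """Translocation effect: x p u q y p v q z -> x p v q y p u q z,
    for the first such decomposition found (context checks omitted)."""
    pat = ('(.*?)' + re.escape(p) + '(.+?)' + re.escape(q) +
           '(.*?)' + re.escape(p) + '(.+?)' + re.escape(q) + '(.*)')
    m = re.fullmatch(pat, w)
    if m is None:
        return None
    x, u, y, v, z = m.groups()
    return x + p + v + q + y + p + u + q + z

def dele(w, p):
    """Deletion effect: x p u p y -> x p y (context checks omitted)."""
    pat = '(.*?)' + re.escape(p) + '(.+?)' + re.escape(p) + '(.*)'
    m = re.fullmatch(pat, w)
    if m is None:
        return None
    x, u, y = m.groups()
    return x + p + y
```

For example, `trl("XpAqYpBqZ", "p", "q")` swaps the segments `A` and `B` flanked by the pointer pair, and `dele("XpUpY", "p")` excises the segment `U` between the two occurrences of `p`.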

4

The Computational Power of Intramolecular Contextual Recombinations

Here we show that by using intramolecular contextual operations one can express any deterministic Turing machine. We prove that with any Turing machine M over an alphabet Σ we can associate a recombination system R over an alphabet Σ′, and with any w ∈ Σ∗ a word w′ ∈ Σ′∗, such that w ∈ L(M) iff w′ ∈ L(R). Intuitively, R simulates M in the following way: w′ encodes both the word w and all rules of M in a large enough number of copies. The large number of copies is needed because in every step of the simulation R "consumes" one rule of M, which is then never "recovered". Theorem 1. For any deterministic Turing machine M = (S, Σ ∪ {#}, P) there exists an intramolecular recombination system GM = (Σ′, ∼, α0, wt) and a string πM ∈ Σ′∗ such that for any word w ∈ Σ∗ there exists kw ≥ 1 such that w ∈ L(M) if and only if w##### πM^kw ## ∈ L(GM). Proof. Consider a deterministic Turing machine M = (S, Σ ∪ {#}, P) containing m rewriting rules in P. We identify each rule of P uniquely by an integer 1 ≤ i ≤ m, and write the rule identified by i as i : ui → vi. A configuration of the Turing machine can be represented by the string #wl sq a wr #, where a ∈ Σ, sq ∈ S and wl, wr ∈ Σ∗. We define a recombination system GM = (Σ′, ∼, α0, wt) and a string πM for the Turing machine M in the following way: Σ′ = S ∪ Σ ∪ {#} ∪ {$i | 0 ≤ i ≤ m + 1}, α0 = ####s0, wt = ####sf###,

πM = $0 ( ∏_{1 ≤ i ≤ m, p,q ∈ Σ∪{#}} $i p vi q $i ) $m+1.
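As a small illustration of this encoding (ours, not from the paper; the $-symbols are rendered as plain two-character tokens '$0', '$1', ...), πM can be assembled mechanically from the rule set:

```python
def pi_M(rules, sigma):
    """Build pi_M = $0 ( prod over rules i and contexts p, q of
    $i p v_i q $i ) $_{m+1} as a plain string.
    rules: list of pairs (u_i, v_i); sigma: Sigma together with '#'."""
    m = len(rules)
    parts = ['$0']
    for i, (_, v) in enumerate(rules, start=1):
        for p in sigma:
            for q in sigma:
                parts.append('$%d%s%s%s$%d' % (i, p, v, q, i))
    parts.append('$%d' % (m + 1))
    return ''.join(parts)

# One-rule toy machine: rule 1 is  s0a -> sfa  (v_1 = 'sfa')
enc = pi_M([("s0a", "sfa")], ["a", "#"])
```

Here `enc` begins with '$0', ends with '$2' (since m = 1), and contains one copy of rule 1 for every context pair p, q ∈ {a, #}, e.g. '$1asfaa$1' for p = q = 'a'.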


For a rewriting rule i : ui → vi of the Turing machine M and all c1, c2, d1, d2, d3, p, q ∈ Σ ∪ {#} we define the relations: (i) (c1c2, p, ui q d1d2d3) ∼ ($i, p, vi q $i) and (ii) (c1c2 p ui, q, d1d2d3) ∼ ($i p vi, q, $i). We also define the relation (iii) (###sf#, #, ###$0) ∼ ($m+1, #, #). We have to prove the following claim: a word w ∈ Σ∗ is accepted by M if and only if there is kw such that the word w##### πM^kw ## is accepted by GM. Let a word w be accepted by the given Turing machine M, by the derivation #s0w# ⇒i1 #wl1 sj1 wr1 # ⇒i2 #wl2 sj2 wr2 # ⇒i3 ... ⇒ik #wlk sjk wrk # ⇒ik+1 ... ⇒in #sf#. We prove that there is an integer kw big enough that the word w##### πM^kw ## is accepted by the recombinations α0 w##### πM^kw ## = ####s0 w##### πM^kw ## ⇒trlp1,q1 ####wl1 sj1 wr1 ##### π1 ## ⇒trlp2,q2 ####wl2 sj2 wr2 ##### π2 ## ⇒trlp3,q3 ... ⇒trlpk,qk ####wlk sjk wrk ##### πk ## ⇒trlpk+1,qk+1 ... ⇒trlpn,qn ####sf ##### πn ##, where wli, wri ∈ Σ∗, sji ∈ S and πi ∈ Σ′∗ for all 1 ≤ i ≤ n, and πi+1 differs from πi only by a substring ui which replaces a substring vi in πi. Since for each k < n there is a rule ik applicable to #wlk sjk wrk #, we have #wlk sjk wrk # = w′lk p uik q w′rk, where sjk occurs in uik, p, q ∈ Σ ∪ {#} and w′lk, w′rk ∈ (Σ ∪ {#})∗. Suppose that the string πk contains at least one copy of the substring p vik q, i.e., πk = $0 $1 ω′ $ik p vik q $ik ω″ $m $m+1. Then there are two relations in our recombination system such that the trlp,q operation is applicable to the string ###w′lk p uik q w′rk #### πk ##. Indeed, these relations are (i) (c1c2, p, uik q d1d2d3) ∼ ($ik, p, vik q $ik), and (ii) (c1c2 p uik, q, d1d2d3) ∼ ($ik p vik, q, $ik).
In this way, we obtain the string w″lk p vik q w″rk $0 $1 ω′ $ik p uik q $ik ω″ $m+1 ## = ####wlk+1 sjk+1 wrk+1 ##### πk+1 ## from the string ###w′lk p uik q w′rk #### πk ## = w″lk p uik q w″rk $0 $1 ω′ $ik p vik q $ik ω″ $m+1 ##, where w″lk = ###w′lk = w‴lk c1c2 and w″rk = w′rk #### = d1d2d3 w‴rk. In this way, for each derivation step #wk# ⇒ik #wk+1# of the Turing machine M we have the corresponding recombination step ####wk ##### πk ## ⇒trlpk,qk ####wk+1 ##### πk+1 ## in the recombination system GM. Now, we have to choose the number kw of copies of πM big enough that for each derivation step #wk# ⇒ik #wk+1# there is at least one copy of the substring vik in the substring πk. Such a number kw exists and is Turing computable: for instance, any kw ≥ n works, where n is the number of derivation steps M takes to accept the word w. In this way, if w is accepted by M by the derivation #s0w# ⇒ ... ⇒ #sf#, then we can recombine ####s0 w##### πM^kw ## to ####sf ##### πn ## by trl operations in GM. In order to accept w##### πM^kw ## in GM, we have to recombine ####sf ##### πn ## to the target wt = ####sf###. This is done by the deletion operation with relation (iii): ####sf ##### πn ## ⇒del# ####sf###.


Now, we prove that for each word ####s0 w##### πM^kw ## accepted by the recombination system GM, the Turing machine M accepts the word w too. Assume that there is a w ∈ Σ∗ such that ####s0 w##### πM^kw ## is accepted by GM for some kw > 0, but w is not accepted by M. That means that recombination operations are possible which do not correspond to the derivation rules of M, i.e., there are recombinations of the form ####w′ ##### π′ ## ⇒R̃ w″, where w′ ∈ (Σ ∪ S)∗, |w′|S = 1, π′, w″ ∈ Σ′∗, and the recombination is not of the form ###ω′′′ p ui q ω^iv #### ω^v $i p vi q $i ω^vi ## ⇒trlp,q ###ω′′′ p vi q ω^iv #### ω^v $i p ui q $i ω^vi ##, where ω′′′, ω^iv ∈ (Σ ∪ {#})∗, ω^v, ω^vi ∈ Σ′∗ and p, q ∈ Σ ∪ {#}. Such recombinations exist.

Assume that a relation of the form (i) or (ii) is applicable to the string #ω^vii c1c2 p ui q d1d2d3 ω^viii # $0 ω^ix $i p vi q $i ω^x $m+1 ##, where ω^vii, ω^viii ∈ (Σ ∪ {#})∗, ω^ix, ω^x ∈ Σ′∗ and c1, c2, d1, d2, d3, p, q ∈ Σ ∪ {#}, which is obtained from ####s0 w##### πM^kw ## only by translocations corresponding to the rules from P. Relation (iii) is not applicable to this string because it does not contain the substring ###sf#. Here we may have either:

Case del: $ = #ω^vii c1c2 p ui q d1d2d3 ω^viii # $0 ω^ix $i p vi q $i ω^x $m+1 ## ⇒delp #ω^vii c1c2 p vi q $i ω^x $m+1 ## = $′ with a relation of type (i), or $ ⇒delq #ω^vii c1c2 p ui q $i ω^x $m+1 ## = $″ with a relation of type (ii). Since relations of types (i) and (ii) both pair two occurrences of a pointer, one to the left and one to the right of the substring ##### $0 of the string $, the substring ##### $0 is deleted; we obtain either $′ or $″, and afterwards no string can be reached by recombination to which relation (iii) is applicable. Moreover, after the deletion operation under relation (i) or (ii), it is no longer possible to remove the symbol $m+1 by relations (i) and (ii): in any recombination of the strings $′ and $″ under (i) and (ii), the suffix $m+1 ## is not affected.

Case trl: $ = #ω^vii c1c2 p ui q d1d2d3 ω^viii # $0 ω^xi $i p vi q $i ω^xii $i p vi q $i ω^xiii $m+1 ## ⇒trlp,q #ω^vii c1c2 p vi q $i ω^xii $i p vi q d1d2d3 ω^viii # $0 ω^xi $i p ui q $i ω^xiii $m+1 ## = $′′′ under relations (i) and (ii), where ω^xi, ω^xii, ω^xiii ∈ Σ′∗. Assume uj is a substring of p vi q. No context is applicable to the string $′′′. Indeed, by the definition of a Turing machine above, the maximal length of a suffix of the right side of a derivation rule beginning with an S-symbol is 3 (type (3), vi = ai sji ⊔ #, or type (5), vi = # sji ⊔ ai; we write vi = a′i sji a″i a‴i, where a′i, a″i, a‴i ∈ (Σ ∪ {#})∗), and in a rule of type (7) (a sf → sf) the S-symbol is the rightmost symbol of the left side; there is no other type of rule whose left side has an S-symbol as its rightmost symbol. Thus we consider sji = sf, i.e., we have the substring p vi q $i = p a′i sf a″i a‴i q $i. Relations (i) and (ii) are not applicable. Indeed, to the right of the S-symbol we would need at least 4 symbols different from $i in order to satisfy the left conditions of (i) and (ii) (i.e., (c1c2, p, ui q d1d2d3) and (c1c2 p ui, q, d1d2d3)). Similarly, we can show that to the left of the S-symbol we would need at least 3 symbols different from $i in order to satisfy the left conditions of relations (i) and (ii). There is no other place in the string $′′′ where the left conditions of (i) and (ii) are satisfied, i.e., relations (i) and (ii) are not applicable once a translocation involving the symbols $i has been used.

There are no other recombinations possible under the relations (i), (ii) and (iii). It follows that as soon as a recombination not corresponding to a rule from P is used, the target wt cannot be reached; i.e., the word w##### πM^kw ## is accepted by GM if and only if w is accepted by M. ⊓⊔


5


Final Remarks

In [7] the equivalence between a Turing machine language and a set of multisets of words was explored. Since we are working with the intramolecular model, we can prove here a universality result in the standard way, showing the equivalence of two families of languages. Acknowledgments. The work of T.-O.I. is supported by the Center for International Mobility (CIMO) Finland, grant TM-06-4036, and by the Academy of Finland, project 203667. The work of I.P. is supported by the Academy of Finland, project 108421. The work of V.R. is supported by the Academy of Finland, project 203667. V.R. is on leave of absence from the Institute of Mathematics and Computer Science of the Academy of Sciences of Moldova, Chisinau MD-2028, Moldova. We are grateful to Artiom Alhazov for useful discussions.

References

1. Ehrenfeucht, A., Harju, T., Petre, I., Prescott, D. M., and Rozenberg, G., Computation in Living Cells: Gene Assembly in Ciliates, Springer (2003).
2. Ehrenfeucht, A., Petre, I., Prescott, D. M., and Rozenberg, G., Universal and simple operations for gene assembly in ciliates. In: V. Mitrana and C. Martin-Vide (eds.) Words, Sequences, Languages: Where Computer Science, Biology and Linguistics Meet, Kluwer Academic, Dordrecht (2001), pp. 329–342.
3. Ehrenfeucht, A., Prescott, D. M., and Rozenberg, G., Computational aspects of gene (un)scrambling in ciliates. In: L. F. Landweber, E. Winfree (eds.) Evolution as Computation, Springer, Berlin, Heidelberg, New York (2001), pp. 216–256.
4. Galiukschov, B. S., Semicontextual grammars, Matematika Logica i Matematika Linguistika, Tallinn University (1981), 38–50 (in Russian).
5. Head, T., Formal language theory and DNA: an analysis of the generative capacity of specific recombinant behaviors. Bull. Math. Biology 49 (1987), 737–759.
6. Jahn, C. L., and Klobutcher, L. A., Genome remodeling in ciliated protozoa. Ann. Rev. Microbiol. 56 (2000), 489–520.
7. Kari, L., and Landweber, L. F., Computational power of gene rearrangement. In: E. Winfree and D. K. Gifford (eds.) Proceedings of DNA Based Computers V, American Mathematical Society (1999), pp. 207–216.
8. Kari, L., and Thierrin, G., Contextual insertions/deletions and computability. Information and Computation 131 (1996), 47–61.
9. Landweber, L. F., and Kari, L., The evolution of cellular computing: Nature's solution to a computational problem. In: Proceedings of the 4th DIMACS Meeting on DNA-Based Computers, Philadelphia, PA (1998), pp. 3–15.
10. Landweber, L. F., and Kari, L., Universal molecular computation in ciliates. In: L. F. Landweber and E. Winfree (eds.) Evolution as Computation, Springer, Berlin, Heidelberg, New York (2002).
11. Marcus, S., Contextual grammars, Revue Roumaine de Mathématiques Pures et Appliquées 14 (1969), 1525–1534.
12. Păun, Gh., Marcus Contextual Grammars, Kluwer, Dordrecht (1997).
13. Păun, Gh., Rozenberg, G., and Salomaa, A., DNA Computing: New Computing Paradigms, Springer-Verlag, Berlin (1998).
14. Prescott, D. M., The DNA of ciliated protozoa. Microbiol. Rev. 58(2) (1994), 233–267.
15. Prescott, D. M., Genome gymnastics: unique modes of DNA evolution and processing in ciliates. Nat. Rev. Genet. 1(3) (2000), 191–198.
16. Prescott, D. M., Ehrenfeucht, A., and Rozenberg, G., Molecular operations for DNA processing in hypotrichous ciliates. Europ. J. Protistology 37 (2001), 241–260.
17. Salomaa, A., Formal Languages, Academic Press, New York (1973).

Turing Degrees & Topology Iraj Kalantari and Larry Welch Department of Mathematics, Western Illinois University, Macomb IL 61455 USA

Abstract. This paper continues our study of computable point-free topological spaces and the metamathematical points in them (see [4], e.g.). For us, a point is the intersection of a sequence of basic open sets whose closures are compact and nested. We call such a sequence a sharp filter. A function fF from points to points is generated by a function F from basic open sets to basic open sets such that sharp filters map to sharp filters. We restrict our study to those functions that have at least all computable points in their domains. We follow Turing's approach in stating that a point is computable if it is the limit of a computable sharp filter; we then define the Turing degree Deg(x) of a general point x in an analogous way. While a result of J. Miller [6] shows that not all points in all of our spaces have Turing degrees, we show that a certain class of points does. We further show that in IRn all points have Turing degrees, and that these degrees are the same as the classical Turing degrees of points. We also prove the following: For a point x that has a Turing degree and lies either on a computable tree T or in the domain of a computable function fF, there is a sharp filter on T or in dom(F) that converges to x and has the same Turing degree as x. Furthermore, all possible Turing degrees occur among the degrees of such points for a given computable function fF or complete, computable, binary tree T. For each x ∈ dom(fF) for which x and fF(x) have Turing degrees, Deg(fF(x)) ≤ Deg(x). Finally, the Turing degrees of the sharp filters convergent to a given x are closed upward in the partial order of all Turing degrees.

1

Introduction

The computable numbers were first described by Turing in [9] as the real numbers whose binary decimal expansions are computable. Various other representations of computable numbers have been considered, including decimal expansions in other base systems, continued fractions, and the like. The set of computable numbers is the same in all of these representations. In fact, if to each real number we assign a degree of unsolvability, commonly called a Turing degree, equal to the Turing degree of its binary decimal expansion, it turns out that the Turing degree of its expansion in any other base system, or of its continued fraction expansion, is the same; that is, the Turing degree of a real number is stable with respect to its representation. (Of course, some of the rationals have two binary decimal expansions, but when this occurs both expansions are computable. Since no other real number has more than one binary decimal expansion, each real number has a unique Turing degree.) There is a problem with decimal representations of real numbers, though: given a sequence of computable real numbers converging to a computable real number, we cannot in all cases make a finite determination of the decimal representation of the limit. In particular, if a computer program produces a binary decimal output that begins 0.111 . . ., it may not be clear from the program code whether the 1s continue forever to make the final output equal to 1. If this does not occur, we will see it in the output sooner or later. But if the sequence of 1s produced is infinitely long, there might not be a finite way of proving that it


is. In fact, decimal expansions are even more mischievous than this, for they can cause serious interference with an intuitively acceptable definition of computable functions. When using, say, base 10 decimal expansions to name numbers, multiplication by 3 is not a computable function. Turing therefore suggested representing a real number as the unique member of the intersection of a sequence of nested finite open intervals that shrink in length to 0. In a sense, therefore, he suggested using a point-free way of naming points; that is, he suggested taking intervals as the principal objects and naming points with sequences of intervals. This does not overcome the basic difficulty of determining, say, whether 0.111 . . . = 1, but it can be useful for the study of computable functions and computable real analysis, because the basic arithmetic operations (and many others) are computable in this setting. In several of our previous papers we have studied computable topological spaces from a similar point-free perspective, taking basic open sets to be the primitive objects and letting a point be the unique member of the intersection of a (properly formed) nested sequence of basic open sets. In this paper we define and investigate Turing degrees of the points so obtained. In Section 2 we give the necessary background from our previous work. Section 3 sets forth the definition of our Turing degrees of points. We are able to conclude there that all points of a particular type in any of our spaces have Turing degrees. We expand that result in Section 4 to show that the Turing degree of a point in IR, as defined using our point-free approach, is the same as the Turing degree of its binary expansion, so that all points of IR, and indeed of IRn, have Turing degrees. We then visit a result of J. Miller to note that not all points in all spaces of our type have Turing degrees.
In Section 5 we study the Turing degrees of points in the domain or range of a computable quantum function. Several authors, including Rettinger and Zheng (see [7]), have done extensive research into the computability of real numbers, with particular reference to monotone computability. Their work has led them to investigate k-monotone and other reducibilities. In this paper we do not address those issues, but look at some of the more basic aspects of Turing reducibility. Next, by using trees whose branches are nested sequences of basic open sets that converge to distinct points, we can prove some interesting theorems about our spaces. If we assume our basic open sets have some of the properties characteristic of the usual computable subbasis of IR that consists of intervals with rational endpoints, we can make still further observations. Sections 6 and 7 introduce the concepts we need related to trees and subbases. In Section 8 we note that if a point is the limit of a branch of one of our computable trees, then its Turing degree is the same as that of the branch. We also show that all possible Turing degrees are represented among the Turing degrees of the points to which the branches of a complete, computable, binary tree converge, and that the same is true for the points in the domain of a computable function, provided all computable points are in that domain. Finally, in Section 9 we show that the Turing degrees of all nested sequences of open sets converging to a point are closed upward in the partial ordering of all Turing degrees.
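The contrast drawn above can be made concrete: on nested-interval names, multiplication by 3 is computed interval-by-interval, with no need to ever commit to a decimal digit of the limit. The sketch below is ours (representing an open interval with rational endpoints as a pair of `Fraction`s) and is only illustrative.

```python
from fractions import Fraction
from itertools import islice

def third():
    """A nested sequence of rational intervals naming the real number 1/3."""
    k = 1
    while True:
        eps = Fraction(1, 2 ** k)
        yield (Fraction(1, 3) - eps, Fraction(1, 3) + eps)
        k += 1

def times3(intervals):
    """Multiplication by 3, computed directly on interval names."""
    for a, b in intervals:
        yield (3 * a, 3 * b)

# The images form a nested sequence of intervals shrinking to 1 = 3 * (1/3),
# even though the binary expansion of the limit begins 0.111...
for a, b in islice(times3(third()), 10):
    assert a < 1 < b
```

Each output interval is produced from finitely much of the input name, which is exactly what fails for digit-by-digit decimal representations.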

2

Background, Definitions & Basics

In this section, we recall some key definitions and results from our previous work [1] and [3], and fix the basic topological and recursion-theoretic properties of the spaces of our study at the end.

Basic properties of our spaces. Throughout this paper, we study ⟨X, ∆⟩ where X is a first countable, connected space containing at least two points, and ∆, its subbasis, is comprised of basic open sets each of which is connected and has compact closure. These assumptions imply that the space X is regular, second countable, and of second category. Fundamental examples of these spaces are IRn, for any n ≥ 1, with appropriate subbases. Familiar examples are IR with the subbasis of open intervals with rational endpoints, IR2 with the subbasis of open rectangles whose corners have rational coordinates and whose sides are parallel to the


axes, and IR2 with the subbasis of open balls with rational radii whose centers have rational coordinates.

A convenient notation. We adopt the notation α ⊂∘ β to mean cl(α) ⊆ β (α's closure is a subset of β). Equivalently, we also write β ⊃∘ α.

Points, sharp filters, and enumerations. In this subsection, we summarize our machinery for capturing (computable) points through (computable) sharp filters, specify our topological and recursion-theoretic settings, and describe an acceptable enumeration capturing all computable sharp filters. Our use of sharp filters to name points can be viewed as an application of the concept of representation encountered in the 'Type 2 Theory of Effectivity' that has been developed by Weihrauch and others (see [10]). A sharp filter can also be thought of as an example of what Spreen calls a strong base that converges to a point (see [8]); but where Spreen assumes the prior existence of points, we take a point-free approach. So we define a sharp filter in a way that makes it a close cousin of Martin-Löf's maximal approximation (see [5]). But where a maximal approximation is a maximal filterbase in a topological basis, and hence generates a maximal filter in the topology, a sharp filter is a discrete, nested filterbase which also generates a maximal filter in the topology. The nesting is given by property (1) of the definition below; property (2) is essentially the same as what Martin-Löf uses to distinguish an approximation from a maximal approximation.

Definition 1. Let X be a topological space with subbasis ∆ = {δn : n ∈ ω}. A sequence A = {αi : i ∈ ω} of basic open sets is a sharp filter in ∆ if (1) (∀i)(αi+1 ⊂∘ αi), and (2) (∀β, γ ∈ ∆)[(β ⊂∘ γ) ⇒ ∃i[(αi ∩ β = ∅) ∨ (αi ⊆ γ)]]. We say α resolves ⟨β, γ⟩ if (β ⊂∘ γ) ⇒ [(α ∩ β = ∅) ∨ (α ⊆ γ)]. We refer to a pair such as ⟨β, γ⟩ with β, γ ∈ ∆ as a target on X. (Note that if we do not have β ⊂∘ γ, then α resolves ⟨β, γ⟩ trivially.)
When a sequence {αi : i ∈ ω} satisfies clause (2) for fixed β and γ, we say that it resolves ⟨β, γ⟩. We will refer to property (2) as the resolution property. Let π0, π1 : ω → ω be computable functions such that, for the standard 'pairing function' ⟨·, ·⟩, if n = ⟨n0, n1⟩, then π0(n) = n0 and π1(n) = n1. We say γ ∈ ∆ resolves targets 0 to n on X if γ resolves ⟨δπ0(i), δπ1(i)⟩ for 0 ≤ i ≤ n. We say A converges to x, or x is the limit of A, and write A ↘ x, if ⋂i αi = {x}.

Definition 2. ⟨X, ∆⟩ is said to be semi-computably presentable if ∆ = {δn : n ∈ ω} is countable and for α, β ∈ ∆, the predicates 'α ⊆ β', 'cl(α) ⊆ β', and 'α ∩ β = ∅' are decidable.

Let ⟨X, ∆⟩ be a semi-computably presentable space with ∆ = {δi : i ∈ ω}, and let A = {αi : i ∈ ω} be a sharp filter in ∆. Then A is computable if there is a computable function f : ω → ω such that for every i, αi = δf(i). For x ∈ X, we say x is computable if there is a computable sharp filter A with A ↘ x.

Correspondences and functions. Here we recall the basic definitions for correspondences and functions and the interrelationship between them.

Definition 3. Two sharp filters A = {αi : i ∈ ω} and B = {βi : i ∈ ω} are equivalent if A and B converge to the same point; that is, if [(∀i)(∃j)(αi ⊂∘ βj)] ∧ [(∀i)(∃j)(βi ⊂∘ αj)]. In such a case we write A ≡ B.

Definition 4. A partial function F : ∆X → ∆Y is a quantum correspondence if (1) F is monotone, and (2) (∀B a computable sharp filter in ∆X)(∃A a computable sharp filter in ∆X)[(A ≡ B) ∧ ((∀i)F(αi)↓) ∧ (F(A) is a sharp filter in ∆Y)]. If F is also partial computable, we say it is a computable quantum correspondence.

In [1] we thoroughly study computable quantum correspondences and describe the motivation for their concept and the satisfying behavior of such objects despite their anomalies.
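For the familiar subbasis of IR by open intervals with rational endpoints, the three predicates of Definition 2 are decidable by exact rational arithmetic, and computable sharp filters are easy to exhibit. The sketch below is ours; the encoding of a basic open set as a pair of rationals is our choice of presentation.

```python
from fractions import Fraction
from itertools import islice

# A basic open set of IR: an open interval with rational endpoints, as a pair.
def subset(i, j):        # the predicate 'α ⊆ β'
    return j[0] <= i[0] and i[1] <= j[1]

def cl_subset(i, j):     # the predicate 'cl(α) ⊆ β', i.e. α ⊂∘ β
    return j[0] < i[0] and i[1] < j[1]

def disjoint(i, j):      # the predicate 'α ∩ β = ∅' (for open intervals)
    return i[1] <= j[0] or j[1] <= i[0]

def sharp_filter_zero():
    """A computable sharp filter converging to the point 0."""
    k = 1
    while True:
        eps = Fraction(1, 2 ** k)
        yield (-eps, eps)
        k += 1

A = list(islice(sharp_filter_zero(), 6))
assert all(cl_subset(A[i + 1], A[i]) for i in range(5))   # nesting, property (1)
```

Here the nesting property (1) is checked exactly; the resolution property (2) holds because the intervals shrink to a single point.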

Turing Degrees & Topology


Definition 5. A correspondence F is honest if for every sharp filter A in X (computable or not), with A ⊆ dom(F), there is a sharp filter B ⊆ A where F(B) is a sharp filter in Y.

In [2] we show that while computable dishonest correspondences exist, every computable quantum correspondence is replaceable with an 'equivalent' honest computable quantum correspondence.

Definition 6. For a computable quantum correspondence F, we refer to the function it generates on the space, fF, as a computable quantum function.

Trees and points. In this subsection we define what a tree is in our setting. It is important to point out that in our study of trees both the 'addresses' of nodes (strings of 0s and 1s) and the contents of the nodes (basic open sets) play crucial roles.

Definition 7. Let Σ be the set of all finite binary strings on the set {0, 1}. We usually denote members of Σ by σ or τ. The length of σ is denoted by lh(σ). If σ is a substring of τ, we write σ ⊆str τ. If σ is lexicographically before τ, we write σ ≤lex τ.
A tree of sharp filters T is the range of a partial function Θ : Σ → ∆ satisfying the following conditions:
(1) Θ(∅)↓;
(2) for all σ, τ ∈ Σ, if Θ(τ)↓ and σ ⊆str τ, then Θ(σ)↓ and Θ(τ) ⊂◦ Θ(σ);
(3) if σ ⊄str τ, τ ⊄str σ, Θ(σ)↓ and Θ(τ)↓, then Θ(σ) ∩ Θ(τ) = ∅; and
(4) if b : ω → Σ is a total function such that for every n, lh(b(n)) = n, Θ(b(n))↓, and Θ(b(n+1)) ⊂◦ Θ(b(n)) (i.e., if Θ ∘ b is a branch through T), then Θ ∘ b is a sharp filter. (Note that this condition requires that b(n) is a substring of b(n+1).)
T is a complete tree of sharp filters if Θ is a total function.
For an infinite branch b = {βi : i ∈ ω} through T, which is a sharp filter and converges to a point, we denote that point by xb. For notational convenience, for σ ∈ Σ we denote Θ(σ) by θσ; thus we have T = {Θ(σ) : σ ∈ Σ′} = {θσ : σ ∈ Σ′}, where Σ′ = dom(Θ) is a subset of Σ and a tree under the ordering ⊆str. For a tree T, let T̄ = {xb : b is a branch in T}.
Definition 8. T is a Π⁰₁ tree if there is a computable procedure which determines, for each n ∈ ω and each finite sequence ⟨δ0, …, δn⟩ of members of ∆, whether there is σ ∈ dom(Θ) with lh(σ) = n such that (∀τ ⊆str σ)(∀k ≤ n)[(lh(τ) = k) ⇒ (θτ = δk)], and if so, which computes σ.

3 Computable Trees & Turing Degrees

In our setting, when in the metalanguage we refer to a 'point', we mean the equivalence class of all of the sharp filters converging to that point. In this section, we define a filter-based Turing degree of a (metalanguage) point and show that any pseudo-irrational point (a point that is not on the boundary of any δ ∈ ∆) has a Turing degree.

Definition 9. Let ∆ = {δn : n ∈ ω}. For some f : ω → ω, let A = {δf(i) : i ∈ ω} be a sharp filter. Then deg(A), the Turing degree of A, is the Turing degree of the set {⟨i, f(i)⟩ : i ∈ ω}. (Here ⟨·, ·⟩ is a Cantor pairing function.)

This of course is the classical definition of the Turing degree of a set; the following definition is the one whose consequences we will pursue in the rest of this paper. Because a point, in our point-free setting, is simply an equivalence class of sharp filters, the degree of a point is best defined in terms of the degrees of those sharp filters, and this definition seems to be natural. In particular, as we shall see in the next section, it agrees with the generally accepted definition of Turing degree for points in ℝ.
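As an aside, the Cantor pairing function invoked in Definition 9 is easy to implement; the sketch below (our own, with hypothetical names pair/unpair) gives the pairing ⟨·, ·⟩ together with the projections π0, π1 used earlier.

```python
import math

def pair(x, y):
    """Cantor pairing: a bijection from ω × ω onto ω."""
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z):
    """Inverse of pair; returns (pi0(z), pi1(z))."""
    w = (math.isqrt(8 * z + 1) - 1) // 2   # largest w with w(w+1)/2 <= z
    y = z - w * (w + 1) // 2
    return w - y, y
```

The set {⟨i, f(i)⟩ : i ∈ ω} of Definition 9 is then simply {pair(i, f(i)) : i ∈ ω}.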


Iraj Kalantari and Larry Welch

Definition 10. Let x ∈ X. The filter-based Turing degree of x is Deg(x) = min{deg(A) : (A is a sharp filter) ∧ (A ↘ x)}, provided this is well-defined.

One of the theorems of this section shows that in a general space of our type many points do indeed have Turing degrees. As a preliminary definition we give the following.

Definition 11. Let x ∈ X. We define χx : ∆X → {0, 1} by χx(δ) = 0 if x ∉ δ, and χx(δ) = 1 if x ∈ δ.

In the first of the next three results, for a given point x, we construct a 'canonical' sharp filter converging to it that is Turing reducible to χx. Then we show that χx is Turing reducible to any sharp filter converging to x if x is pseudo-irrational, and we conclude that every pseudo-irrational point has a Turing degree in our filter-based sense.

Theorem 1. For any x ∈ X, there is a canonical sharp filter Ax converging to x such that Ax ≤T χx.
Proof Sketch. Choose a sharp filter Ax = {αn : n ∈ ω} such that χx(αn) = 1 and αn resolves the first n targets. •

Theorem 2. If x is pseudo-irrational, then x has a Turing degree. Indeed, for such a pseudo-irrational x, we have Deg(x) = deg(χx).
Proof Sketch. If B is any sharp filter converging to x, then χx ≤T B. Hence Ax ≡T χx, so Deg(x) = deg(Ax) = deg(χx). •

4 On the Turing Degree of a Point

We can apply the ideas of the previous sections to any computable metric space. In this section we present contrasting results about Turing degrees in such spaces. First, we apply our approach to ℝⁿ with the subbasis ∆ℝⁿ comprised of all n-dimensional rectangles whose corners have rational coordinates and whose sides are parallel to the axes. We use our filter-based definition of the Turing degree of a point, as per Definition 10, and prove that every x in ℝⁿ has a Turing degree in our sense. This fact becomes even more pleasing when we show that our filter-based Turing degree of x, Deg(x), is exactly the same as the classical Turing degree based on the binary expansion of x, deg(x).

Do all points of all spaces of our type have Turing degrees? At the end of this section, we examine a result of J. Miller [6] that demonstrates that there is a point f in the space of all continuous functions on [0, 1] that does not have a Turing degree in our sense.

Theorem 3. Every real number has a Turing degree.
Proof Sketch. Every rational number is computable, and by Theorem 2 every irrational number has a Turing degree. •

Corollary 1. For each n = 1, 2, …, every point of ℝⁿ has a Turing degree.
Proof Sketch. We show this by induction on n. We have seen in Theorem 3 that this is true of ℝ. Suppose it is true of ℝⁿ and consider ℝⁿ⁺¹. Let x ∈ ℝⁿ⁺¹. If all coordinates of x are irrational, then x is pseudo-irrational, so x has a Turing degree. Otherwise, if the pth coordinate of x is rational, project x along the pth coordinate axis and let y be the resulting point in ℝⁿ. Choose a sharp filter A0 ↘ y such that deg(A0) = Deg(y). Next, construct a sharp filter B0 ≤T A0 such that B0 ↘ x. Finally, notice that for any sharp filter B converging to x, A0 ≤T B. Thus Deg(x) = deg(B0). •



Definition 12. For any x ∈ [0, 1], choose a0, a1, a2, … ∈ {0, 1} such that x = Σ_{i=0}^∞ ai · 2^{-i}, and let fx : ω → {0, 1} be the function such that for each i, fx(i) = ai. Define the degree of the binary expansion of x to be the Turing degree of fx; that is, let deg(x) = deg(fx).

Remark. Note that for every x ∈ [0, 1], deg(x) is well-defined. This is because if x has more than one binary expansion, then it has exactly two binary expansions, both computable, so that deg(x) = 0.

It is natural to ask if Deg(x) = deg(x) for every x ∈ ℝ. In the next theorem, we show that this equality is in fact the case by proving that it holds for x ∈ [0, 1].

Theorem 4. For all x ∈ [0, 1], Deg(x) = deg(x).
Proof Sketch. We first note that if x ∈ ℚ then Deg(x) = 0 = deg(x). Now suppose x is irrational. We can use the binary expansion of x to obtain a sharp filter A = {αn : n ∈ ω} such that each αn is an interval of length slightly greater than 2^{-n}. Conversely, we can start with a sharp filter B = {βn : n ∈ ω} and use the fact that the lengths of the intervals βn shrink to 0 to find the binary expansion of x. •

This theorem in essence reveals that our definition of the Turing degree of a point is on the mark for spaces like ℝⁿ. In some spaces, though, it is inadequate, since J. Miller [6] has shown that there are spaces in which not all points have Turing degrees in our sense. The outline of his counterexample is as follows. Let M be the metric space C[0, 1] (the continuous functions on the interval [0, 1]) with the metric d(f, g) = max_{x∈[0,1]} |f(x) − g(x)|. Let QC[0,1] be the set of all polygonal functions in M whose line segments have rational coordinates at their endpoints. For a given f ∈ QC[0,1] and a given r ∈ ℚ⁺ (where ℚ⁺ is the set of positive rational numbers), let Bf,r = {g ∈ M : d(f, g) < r}. Let ∆M = {Bf,r : f ∈ QC[0,1] ∧ r ∈ ℚ⁺}. Then ⟨M, ∆M⟩ is a semi-computably presented resolvable space. The following theorem shows that not all points in this space have Turing degrees in our sense.

Theorem 5 (J. Miller [6]). There is a function f ∈ ⟨M, ∆M⟩ such that no sharp filter converging to f has least Turing degree among all such sharp filters, so Deg(f) is undefined.
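The first direction of the proof sketch of Theorem 4 (binary expansion to sharp filter) can be made concrete. The padded-dyadic-interval scheme below is our own choice of how to make each αn "slightly longer" than 2^{-n} while keeping the closures strictly nested; it is a sketch, not the paper's construction.

```python
from fractions import Fraction
import math

def alpha_from_bits(bits, n):
    """Interval of length (3/2)*2^-n around the real 0.b1 b2 b3 ...:
    the level-n dyadic interval [k/2^n, (k+1)/2^n] determined by the
    first n bits, padded by 2^-(n+2) on each side."""
    k = 0
    for b in bits[:n]:
        k = 2 * k + b
    pad = Fraction(1, 2 ** (n + 2))
    return (Fraction(k, 2 ** n) - pad, Fraction(k + 1, 2 ** n) + pad)

# bits of a sample non-dyadic real: b_i = 1 iff i is a perfect square
bits = [1 if math.isqrt(i) ** 2 == i else 0 for i in range(1, 40)]
```

Reading off a bit in the converse direction amounts to waiting for an interval βn short enough to fit strictly inside one level-n dyadic interval, which is where the irrationality of x is used.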

5 Turing Degrees, Points & Functions

Now that we have defined Turing degrees of sharp filters and points, and have seen that pseudo-irrational points in any of our spaces and all points in ℝⁿ have Turing degrees, we shall investigate what can be said about the degrees of points that lie in the domains or ranges of computable quantum functions.

Theorem 6. Let F be a computable quantum correspondence from X to Y, and let x ∈ dom(fF) be a point whose Turing degree exists. Then there is a sharp filter A ⊆ dom(F) such that A ↘ x and deg(A) = Deg(x).
Proof Sketch. Assume a computable enumeration of dom(F) is given. Let B be a sharp filter converging to x such that deg(B) = Deg(x). Construct A in stages, using B as an oracle to select appropriate members of dom(F) as they are enumerated, so that A ≡ B. •

From this theorem we can derive an observation about the Turing degree of a point in the range of a computable quantum correspondence and its relation to the Turing degree (if any) of its preimage.

Corollary 2. Let F be a computable quantum correspondence from X to Y, and suppose x ∈ X, y ∈ Y are such that Deg(x) and Deg(y) exist and fF(x) = y. Then Deg(y) ≤ Deg(x).
Proof Sketch. We may assume F is honest. Let x ∈ dom(fF) and, as per the previous theorem, pick A ↘ x such that deg(A) = Deg(x) and A ⊆ dom(F). Then F(A) is a sharp filter converging to y, and clearly Deg(y) ≤ deg(F(A)) ≤ deg(A) = Deg(x), since F is partial computable. •


6 Free Trees

In our previous papers we required that a tree of sharp filters T = {θσ : σ ∈ Σ} satisfy the conditions that each σ ∈ Σ be a string on the alphabet {0, 1} and that if σ, τ ∈ Σ, σ ≠ τ, and lh(σ) = lh(τ), then θσ ∩ θτ = ∅. In this paper (and most likely in our next papers too) we abolish these requirements, which we had adopted only for convenience. In their place we require only that our alphabet be computable, and we will occasionally use the following definition.

Definition 13. Let α ∈ ∆. T is a free tree of sharp filters in α if it is the range of a partial function Θ : Σ → ∆ satisfying the following conditions:
(1) Θ(∅) = α;
(2) for all σ, τ ∈ Σ, if Θ(τ)↓ and σ ⊆str τ, then Θ(σ)↓ and Θ(τ) ⊂◦ Θ(σ); and
(3) if b : ω → Σ is a total function such that for every n, lh(b(n)) = n, Θ(b(n))↓, and b(n) ⊆str b(n+1), then b̂ = Θ ∘ b is a sharp filter.
For such a b as in clause (3), with βi = Θ ∘ b(i) for i ∈ ω, we refer to (β0, β1, …) as an infinite branch of T, and write (β0, β1, …) ∈ T to indicate this fact. Similarly, for any b : ω → Σ, total or not, with βi = Θ ∘ b(i) for i ∈ {0, …, n}, we write (β0, β1, …, βn) ∈ T, and refer to (β0, β1, …, βn) as a finite branch of T. (The reader is asked to compare this with the definition of a tree in Section 2.)

Definition 14. Let T = {θσ : σ ∈ Σ} be a free tree of sharp filters. A partial function f : Σ → ω is a separation function for T if Σ ⊆ dom(f) and for every σ ∈ Σ, f(σ) > lh(σ) and there is an open set Gσ ⊆ X such that
(1) for every τ ∈ Σ with lh(τ) = f(σ) and τ ⊇str σ, θτ ⊆ Gσ; and
(2) for every τ ∈ Σ with lh(τ) = f(σ) and τ ⊉str σ, θτ ⊆ (X − Gσ)°.
(Here and elsewhere, for A ⊆ X, A° indicates the interior of the set A.)
A free tree that has a separation function is separated, and a free tree with a computable separation function is computably separated.

Definition 15. Let b̂ = Θ ∘ b be a branch through T, where b : ω → Σ − {∅}.
We write b̂ = {βn : n ∈ ω} to mean βn = Θ ∘ b(n) for every n ∈ ω. If b(n) = σ and βn = Θ ∘ b(n) = α, we say βn = α at σ.

Theorem 7. Let b̂0 = Θ ∘ b0 and b̂1 = Θ ∘ b1 be branches through a separated tree T with b̂0 ≠ b̂1. Then ⋂b̂0 ≠ ⋂b̂1.
Proof Sketch. Consider b̂0 and b̂1 as infinite lists of nested basic open sets going up the tree T. Use a separation function for T to show that the sets listed in b̂0 are disjoint from those of b̂1 above a certain level in the tree. •

Corollary 3. For b0, b1, σ0, and m as in (the proof of) the previous theorem, if b′0 : ω → Σ − {∅} is an extension of b0↾m and b′1 : ω → Σ − {∅} is an extension of b1↾m such that b̂′0 = Θ ∘ b′0 and b̂′1 = Θ ∘ b′1 are branches through T, then ⋂b̂′0 ∈ Gσ0 and ⋂b̂′1 ∈ X − Gσ0.
Proof Sketch. ⋂b̂′0 ∈ b̂0(m − 1) and ⋂b̂′1 ∈ b̂1(m − 1). •

7 Dense and Unconfined Subbases

Let ∆ = {δn : n ∈ ω}. Because of the decidability of the relation 'α ⊆ β', we may, if we wish, assume the enumeration of ∆ is one-to-one.

Definition 16. ∆ is a dense subbasis for X if the enumeration {δn : n ∈ ω} is one-to-one and ∆ has the following property: (∀α, β ∈ ∆)[α ⊂◦ β ⇒ (∃γ ∈ ∆)[α ⊂◦ γ ∧ γ ⊂◦ β]].



Definition 17. If ∆ is a dense subbasis for X, let Ins : (ω × ω) × ω → ω be a total computable inserting function with the following properties for all i, j, r ∈ ω (where ⟨·, ·, ·⟩ is the usual Cantor encoding):
(1) if δi ⊃◦ δj, then δi ⊃◦ δIns(i,j;r) ⊃◦ δIns(i,j;r+1) ⊃◦ δj; and
(2) if ⟨i, j, r⟩ < ⟨i′, j′, r′⟩, then Ins(i, j; r) < Ins(i′, j′; r′).
It should be noted that because of the use of Cantor encoding such an inserting function is one-to-one and has computable range (this will be relevant in our forthcoming argument in Theorem 11). Thus, if δi ⊃◦ δj, then δi ⊃◦ δIns(i,j;0) ⊃◦ δIns(i,j;1) ⊃◦ δIns(i,j;2) ⊃◦ ··· ⊃◦ δIns(i,j;r) ⊃◦ ··· ⊃◦ δj.

Remark. We shall use Ins to conduct successive interpolations between adjacent members of a sharp filter, so as to produce another sharp filter which is a supersequence of the given one, in which certain information is encoded by the interpolants.

Definition 18. ∆ is an unconfined subbasis for X if the enumeration {δn : n ∈ ω} is one-to-one and ∆ has the following properties: (1) X ∉ ∆; and (2) (∀α ∈ ∆)(∃β ∈ ∆)[α ⊂◦ β].

It is clear that ∆ℝⁿ, the subbasis we adopted for ℝⁿ in Section 4, is a dense, unconfined subbasis.
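For the interval subbasis on ℝ, an inserting function with property (1) of Definition 17 can be sketched geometrically. The construction below is our own; it works directly on intervals and deliberately ignores the index-monotonicity property (2), which concerns the Cantor coding of indices rather than the geometry.

```python
from fractions import Fraction

def ins(outer, inner, r):
    """r-th interpolant between outer = (a, b) and inner = (c, d),
    where the closed interval [c, d] lies inside the open (a, b).

    Moving each endpoint a fraction t = (r+1)/(r+2) of the way towards
    inner yields a chain  outer ⊃◦ ins(..,0) ⊃◦ ins(..,1) ⊃◦ ... ⊃◦ inner."""
    (a, b), (c, d) = outer, inner
    t = Fraction(r + 1, r + 2)
    return (a + (c - a) * t, b - (b - d) * t)
```

Inserting, say, f(n) such interpolants between αn and αn+1 leaves the limit point unchanged while coding f into the enlarged sequence; this is the mechanism the Remark above alludes to.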

8 Turing Degrees, Trees & Functions

In this section we investigate the Turing degrees of points on computably separated trees in the first two theorems, and apply our findings to establish a further result on domains of computable quantum functions. The first theorem of this section guarantees that points on a computably separated Π⁰₁ tree have Turing degrees, and that the degree of such a point is equal to the degree of the branch converging to it.

Theorem 8. Let T be a computably separated Π⁰₁ tree of sharp filters and let b = {βn : n ∈ ω} be an infinite branch through T converging to xb. Then xb has a Turing degree, and in fact Deg(xb) = deg(b).
Proof Sketch. Let A be a sharp filter converging to xb. Let f be a computable separation function for T. We show that b ≤T A by using A as an oracle to enumerate b, since b is the unique branch of T that converges to xb. At level n of T, we use f together with A to determine how the branches extend past level n, and therefore to determine which basic open set at level n is on b. •

Next, we show that a complete tree of sharp filters is rich in Turing degrees.

Theorem 9. Let T be a complete, computable, computably separated binary tree of sharp filters. Then for any Turing degree a, there is a point x lying in T̄ with Deg(x) = a.
Proof Sketch. A complete binary tree has branches of all Turing degrees. •

Finally, we use Theorem 9 to show that any computable quantum function has a domain rich in Turing degrees as well.

Theorem 10. Let F be a computable quantum correspondence from X to Y. Then for every Turing degree a and every α ∈ ∆ there is a point x ∈ α ∩ dom(fF) such that Deg(x) = a.
Proof Sketch. Build a complete computable binary tree in dom(F) ∩ α. By Theorem 9 that tree has a branch of Turing degree a. •

In a space X containing points without Turing degrees, none of those points can be on a computably separated Π⁰₁ tree (as per Theorem 8).
Therefore the tree-based techniques of this paper (presented in this section) are not intricate enough to capture such enigmatic points.


9 Upward Closure

In this section, we show that, for spaces with a dense subbasis ∆, the Turing degrees of the sharp filters converging to a given point are closed upward. Given the definition of the Turing degree of a point, this result states that if x has a Turing degree, then the degrees of the sharp filters converging to x form an upper cone in the Turing degrees.

Using the standard definition of '⊗' (that is, if g, h : ω → ω, then g ⊗ h is the function from ω to ω with g ⊗ h(2n) = g(n) and g ⊗ h(2n+1) = h(n)), we have the following.

Theorem 11. Let ∆ be dense. Let A be a sharp filter converging to a point x ∈ X, and let f : ω → ω be a total function. Then there is a sharp filter C converging to x such that A is a subsequence of C and C ≡T A ⊗ f.
Proof Sketch. Let A = {αn : n ∈ ω}. Exploiting the fact that ∆ is dense, use the function Ins of Definition 17 to encode f(n) via the insertion of a basic open set between αn and αn+1. Let C be the resulting sharp filter, so C dovetails A with these encodings. •

Corollary 4. Let ∆ be dense. The Turing degrees of the sharp filters converging to a point x ∈ X are closed upward.
Proof Sketch. Given x ∈ X, a sharp filter A converging to x, and a set B whose Turing degree is greater than or equal to deg(A), form a sharp filter C converging to x such that C ≡T A ⊗ χB ≡T B. •
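The join operation ⊗ used above is the standard recursion-theoretic interleaving of two functions; as a tiny sketch (ours, with the hypothetical name join):

```python
def join(g, h):
    """g ⊗ h : even arguments read from g, odd arguments from h."""
    return lambda n: g(n // 2) if n % 2 == 0 else h(n // 2)
```

Either component is recovered from g ⊗ h by restricting to even or odd arguments, which is why deg(g ⊗ h) is the least upper bound of deg(g) and deg(h).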

References
1. Kalantari, I. & Welch, L., Point-free topological spaces, functions, and recursive points; filter foundation for recursive analysis. I, Annals of Pure and Applied Logic, 93 (1998), 125–151.
2. Kalantari, I. & Welch, L., Recursive and nonextendible functions over the reals; filter foundation for recursive analysis. II, Annals of Pure and Applied Logic, 98 (1999), 87–110.
3. Kalantari, I. & Welch, L., A blend of methods of recursion theory and topology, Annals of Pure and Applied Logic, 124 (2003), 141–178.
4. Kalantari, I. & Welch, L., Specker's theorem, cluster points, and computable quantum functions, in Logic in Tehran (A. Enayat, I. Kalantari, M. Moniri, eds.), Lecture Notes in Logic, 26 (2006), Association for Symbolic Logic, 134–159.
5. Martin-Löf, P., Notes on constructive mathematics, Almqvist and Wiksell, 1970.
6. Miller, J., Degrees of unsolvability of continuous functions, Journal of Symbolic Logic, 69 (2004), 555–584.
7. Rettinger, R. & Zheng, X., On the hierarchy and extension of monotonically computable real numbers, Journal of Complexity, 19 (2003), 672–691.
8. Spreen, D., On effective topological spaces, Journal of Symbolic Logic, 63 (1998), 185–221.
9. Turing, A.M., On computable numbers, with an application to the "Entscheidungsproblem", Proc. London Math. Soc., Ser. 2, 42 (1936), 230–265; (corr. ibid. 43 (1937), 544–546).
10. Weihrauch, K., Computable Analysis, Springer-Verlag, Berlin/Heidelberg, 2000.

The Fabric of Small Turing Machines

Gregory Lafitte¹ and Christophe Papazian²

¹ Laboratoire d'Informatique Fondamentale de Marseille (LIF), CNRS – Université de Provence, 39 rue Joliot-Curie, 13453 Marseille Cedex 13, France, [email protected], http://www.lif.univ-mrs.fr/~lafitte
² Laboratoire I3S, CNRS – Université de Nice Sophia-Antipolis, Les Algorithmes, Bât. Euclide B, 2000 route des Lucioles, BP 121, 06903 Sophia Antipolis Cedex, France, [email protected]

Abstract. We study the behaviour of small one-tape Turing machines started on a blank tape. Rado's Busy Beaver problem, stated in 1962, asks for the most complex Turing machine (one that halts when started on a blank tape) among a class of machines, typically the class of all n-states m-symbols machines. In this paper, we solve Rado's Busy Beaver problem for the class of two-states three-symbols Turing machines and provide a complete, detailed description of the structure of this class. In the process, we analyse the fabric of n-states m-symbols Turing machines where n + m ≤ 6, and conjecture some elements of the structure of the fabric of the classes of Turing machines with a fixed, but unbounded, number of transitions. Finally, we present small Turing machines with a complex and (for some of them) unintelligible behaviour.

1 Introduction

Turing machines are one of the basic discrete objects, or combinatorial structures, of choice for studying computation in a theoretical setting. When studying Turing machines, several questions inevitably come to mind: what is the power of having a supplemental state? a supplemental symbol? or a supplemental transition? These questions tackle the problem of the organisation of computation, and particularly how each basic quantum of computation (e.g., a transition, for Turing machines) relates to other quanta in order to implement a small algorithm, i.e., one with a small number of quanta of computation. For theoretical computer science (but also for applied fields of computer science), an important quest is to find a small Turing machine that computes a certain function, solves a certain problem, or implements a certain algorithm. Many fields, such as sensor networking, nowadays search for algorithms with minimal energy or memory usage.

Embarking on this quest, we have systematically studied one-tape Turing machines with n states and m symbols (n + m ≤ 6), started on a blank tape. We follow the quest initiated by Tibor Rado in 1962, which he coined the Busy Beaver problem: finding, among a class of machines (typically the class of all n-states m-symbols machines), the halting Turing machine that leaves the biggest number of non-blank cells, or takes the biggest number of steps, when started on a blank tape. Finding the most complex Turing machine, systematically for any class, is undecidable. Nevertheless, many researchers have contributed to solving this problem, most notably Allen Brady, Heiner Marxen and Jürgen Buntrock. One of the striking aspects of the Busy Beaver function is its ability, if we could compute some of its values, to settle a whole collection of mathematical conjectures. Our study underlines a certain structure of such Turing machine classes and gives rise to several conjectures about the fabric of such classes.
In this paper, we solve the Busy Beaver problem for two states and three symbols and give the best lower bounds found so far for the



Busy Beaver functions for the classes of three-states three-symbols and two-states five-symbols Turing machines. These lower bounds and exact values (for the two-states three-symbols case) show both the limitations and the extreme power of small Turing machines. The main underlying difficulty in analysing a given class of machines thoroughly is the uncomputability of the Busy Beaver functions. We thus rely on the experimental aspect of our study (i.e., using supercomputers) to be able to analyse and describe an uncomputable phenomenon as a whole. This experimental search gives us insights into the structure of the fabric and raises theoretical questions showing us what phenomena could be analysed and proven about the fabric.

We start by presenting the theoretical aspects of our study and by reviewing the history of results. We then describe our analysis of the fabric of some small Turing machine classes and obtain the exact values of the Busy Beaver functions for the two-states three-symbols class, as well as a complete detailed description of the structure of this class. Finally, we present some small Turing machines with a complex and (for some of them) unintelligible behaviour.

2 A Theoretical and Historical Study of the Fabric

2.1 Rado's Busy Beaver Problem

In the sixties, Tibor Rado, a professor at the Ohio State University, thought of a simple uncomputable function besides the standard halting problem for Turing machines. Given a fixed finite number of symbols and states, select those Turing machines which eventually halt when run with a blank tape. Among these machines, find the maximum number of non-blank symbols left on the tape when they halt. Alternatively, find the maximum number of time steps before halting. This function is well-defined but uncomputable. Tibor Rado called it the Busy Beaver function. Before formally defining the Busy Beaver functions, we need to consider specific classes of Turing machines.

Definition 1. The class C(n, m) is the set of all one-tape Turing machines (with allowed tape head movements left and right) with n states and m symbols, such that the halting state is not among the n states, no transition starting from the halting state exists, and the blank symbol is one of the m symbols. For small values of n and m, a compact symbolic notation is also used for C(n, m); in what follows we simply write, for instance, C(6, 7) for the class of six-states seven-symbols machines.

Definition 2. The general Busy Beaver functions Σ and σ are defined by:

  Σ : P(T) → N,  C ↦ max_{M∈C} ΣM, where ΣM is the number of non-blank symbols left on the tape by machine M after halting, when run with a blank tape;

  σ : P(T) → N,  C ↦ max_{M∈C} σM, where σM is the number of time-steps made by M before halting, when run with a blank tape;

where T is the set of all one-tape Turing machines. Champions for a class C are machines in C that witness the values of Σ(C) and/or σ(C). Best lower bounds known are also called champions until they are dethroned. Σ(C(3, 2)) is thus the maximum number of non-blank symbols left on the tape by halting one-tape Turing machines with 3 states (plus the halting state) and 2 symbols, when run with a blank tape. The term "Busy Beaver function" will refer to either (n, m) ↦ Σ(C(n, m)) or (n, m) ↦ σ(C(n, m)).



In 1962, Tibor Rado published an article [Rad62] about it. With his student Shen Lin, he actually tackled the three-states two-symbols problem. The study resulted in a dissertation for Lin in 1963 and an article [LR65] in 1965. There have been several other articles since, notably [Bra66], [Bra83], [Dew84], [Her88], [MB90] and [MS90]. In [Dew84], A. K. Dewdney discusses efforts to calculate the Busy Beaver function. This is a very interesting endeavor for a number of reasons, the first one being that the Busy Beaver function measures the capability of computer programs as a function of their size. Dewdney describes successful attempts to calculate the initial values Σ(C(1, 2)), Σ(C(2, 2)), Σ(C(3, 2)), … of Σ, despite the fact that Σ is an uncomputable function.

Theorem 1. The following functions are uncomputable: (n, m) ↦ ς(C(n, m)); n ↦ ς(C(n, m)) (where m is fixed); m ↦ ς(C(n, m)) (where n is fixed), where ς is either Σ or σ.

The Busy Beaver functions are actually linked to the Chaitin-Kolmogorov function K, which gives the algorithmic information theoretic complexity of a given integer (see [LV97]). The Busy Beaver function can then be seen as the function ΣK which maps k to the greatest x such that K(x) ≤ k. With this in mind, one better understands the importance of the Busy Beaver function and the global impact that computing its initial values would have on the rest of computer science.

The Busy Beaver function is also of considerable metamathematical interest. In principle, it would be extremely useful to know larger values of ΣK(C(n, m)). For example, this would enable one to settle the Goldbach conjecture and the Riemann hypothesis, and in fact any conjecture which can be refuted by a numerical counterexample. Let P be a computable predicate of a natural number, so that for any specific natural number n, it is possible to compute by a k-states k′-symbols Turing machine MP whether P(n) is true or false.
How could one use the Busy Beaver function to decide whether the conjecture that P is true for all natural numbers is correct? An experimental approach is to use a fast computer to check whether P is true, say, for the first billion natural numbers. In order to convert this empirical approach into a proof, it would suffice to have a bound on how far it is necessary to test P before settling the conjecture in the affirmative if no counterexample has been found (and of course rejecting it if one is discovered). ΣK provides this bound, since it suffices to examine the first ΣK(C(k + O(1), k′)) natural numbers in order to decide whether P is always true. As Gregory Chaitin [Cha87] puts it: "Note that the algorithmic information content of a famous conjecture P is usually quite small; it is hard to get excited about a conjecture that takes a hundred pages to state."

For all these reasons, it is really quite fascinating to contemplate the successful efforts which have been made to study and classify small Turing machine classes (and thus to calculate some of the initial values of Σ(C(n, m))). To obtain a finer view of the structure of the fabric of Turing machines, we can consider the Turing machine classes where we limit the number of possible transitions.

Definition 3. The class Ct(n, m) is the set of all one-tape Turing machines (with allowed tape head movements left and right) with n states, m symbols, and at most t transitions that do not end with the halting state, such that the halting state is not among the n states, no transition starting from the halting state exists, and the blank symbol is one of the m symbols. For small values of n, m and t we again use a compact notation; for instance, C36(6, 7).

We will see in the following section that Σ(Ct(n, m)) has quite a complex structure. Using the same techniques as before, we can show that (n, m, t) ↦ Σ(Ct(n, m)) is uncomputable.

Theorem 2. The following functions are uncomputable: (n, m, t) ↦ ς(Ct(n, m)); (n, t) ↦ ς(Ct(n, m)) (where m is fixed); (m, t) ↦ ς(Ct(n, m)) (where n is fixed), where ς is either Σ or σ.

This naturally prompts the question: with n states, m symbols and t transitions, does one get the best value possible for Σ (or σ) with t transitions? This very interesting question yields the following open question:



Conjecture 1. The functions t ↦ (nt, mt), where ς(Ct(nt, mt)) = max_{(n,m)∈N²} ς(Ct(n, m)) and ς is either Σ or σ, are uncomputable.

Note that these two functions obviously do not grow faster than any computable function and would thus be, if one day the conjecture is confirmed, very peculiar natural uncomputable functions.

2.2 History of Results

For the history and current results, see also [Mic04], [Kud96], [Rog96], [Bra], [Mar] and [Mic].

Known Busy Beaver values and best lower bounds (Σ, σ), with the year of each record:

Symbols \ States:  2                              3                              4               5                        6
2                  4, 6 (1962)                    6, 21 (1965)                   13, 107 (1983)  ≥4098, ≥47·10^6 (1990)   ≥12·10^864, ≥3·10^1730 (2001)
3                  9, 38 (1988)                   ≥95524079, ≥4345·10^12 (2006)  ?               ?
4                  ≥2050, ≥39·10^5 (2005)         ?                              ?
5                  ≥172·10^9, ≥7069·10^18 (2006)  ?

Legend: an entry "x, y" gives the exact Busy Beaver result (Σ, σ) for the class; "≥x, ≥y" gives the best lower bounds known; "?" marks classes for which no value is known. The chart also marks, with U, classes in which a universal machine is known, and flags classes where Collatz-like behaviour has been detected.

3 The Fabric of Some Small Turing Machines Classes

Each class C(x, y) has (2(x+1)y)^{xy} different machines. We will now see the inner structure, or fabric, of the smallest classes. The proofs of the values of the Busy Beaver functions for the classes C(2, 2), C(3, 2) and C(4, 2) were found by Rado, Lin [LR65] and Brady [Bra83]. Here, we aim at showing the overall general structure. However, for classes C(5, 2), C(3, 3), C(2, 4) and greater, the structure is not completely known, but we give serious hints about its general shape. Finally, we give in this paper the first exhaustive study of the structure of C(2, 3). Hence, we provide a proof of the Busy Beaver function values for this class.

3.1 The Trivial Class C(2, 2)

This is the simplest case: 2 states and 2 symbols. It can be summarized by figure 1. Firstly, we want to remove any symmetry in the structure. We consider permutations of states and permutations of symbols, and for any machine M, we define the set M̄ of all machines that are obtained from M by such permutations. We consider a total order on states and symbols. For each such set, we solely consider the machine for which, in the sequence of transitions obtained on the empty tape, a state (respectively a symbol) is only used if all smaller states (respectively symbols) have already been used.

Fig. 1. Structure of C(2, 2)

Furthermore, we take the first transition (state A on 0) as writing 1, going right and entering state B. The machine so obtained makes one transition, writing one non-blank symbol and reaching the undefined transition (state B on 0). This is the center node of the graph and the root of the tree. Each edge represents a different way of defining the current unknown transition. In this way, we obtain new (more complex) machines which end on other undefined transitions. In this tree, no edge leads to machines with a bad (non-halting) behaviour. As can be seen from this figure, there are only 10 different machines (the leaves), and the champion writes 4 non-blanks in 6 steps.
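The champion values just quoted (Σ = 4 non-blanks in σ = 6 steps) can be checked independently by brute force over all (2·(2+1)·2)^{2·2} = 20736 raw machines of C(2, 2), before any symmetry reduction. The sketch below uses our own encoding conventions; in particular, the transition into the halting state still writes a symbol, moves the head, and counts as a step.

```python
from itertools import product

def run(machine, limit):
    """Simulate a 2-states 2-symbols machine (halting state = 2) on a
    blank tape; return (steps, non-blanks) if it halts within `limit`."""
    tape, state, head, steps = {}, 0, 0, 0
    while state != 2 and steps < limit:
        write, move, nxt = machine[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        state = nxt
        steps += 1
    if state != 2:
        return None  # treated as non-halting
    return steps, sum(1 for v in tape.values() if v != 0)

# every way of defining one transition: write 0/1, move left/right, go to A/B/halt
options = [(w, mv, nxt) for w in (0, 1) for mv in (-1, 1) for nxt in (0, 1, 2)]

best_steps = best_ones = count = 0
for t in product(options, repeat=4):   # one option per (state, symbol) pair
    count += 1
    machine = dict(zip([(0, 0), (0, 1), (1, 0), (1, 1)], t))
    result = run(machine, 50)
    if result:
        steps, ones = result
        best_steps = max(best_steps, steps)
        best_ones = max(best_ones, ones)
```

The step limit of 50 is generous: the true maximum for halting machines in this class is 6 steps, so anything still running after 50 steps never halts.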

The Fabric of Small Turing Machines

3.2 The Smallest Classes: C(3,2) and C(2,3)

Fig. 2. Structure of C(3,2)


Analyzing the structure of C(3,2), we obtain a much more complex tree, as can be seen from figure 2. This class contains 1041 complete¹ machines (the leaves of the tree). As can be seen there, C(3,2) is the only class that does not have a machine that is both champion for the number of steps and champion for the number of non-blanks. This may be due to the fact that the very limited number of possible transitions forces computations in a very particular way; we do not yet have enough proven results for bigger classes to be affirmative. As we provide in this paper the first exhaustive study of C(2,3), we give a detailed description of this class in section 4.

3.3 The C(4,2) and C(2,4) Classes

Classes C(4,2) and C(2,4) are much larger than the previous ones, hence much more complex. However, the class C(4,2) has been thoroughly studied, as every machine in this class can be proven to halt or not. This is not the case for the class C(2,4), which happens to be the smallest class containing machines with a "chaotic" behaviour. It is not yet known how to deal with such machines, as no way has yet been found to predict their behaviour. Such a complex behaviour can be an obstacle to proving that a machine halts, and this is the main obstacle to a proof of the exact values of the Busy Beaver functions on larger classes. The non-computability of the Busy Beaver function already hinted at that fact. Unpredictability arises in C(4,2) with machines containing at least 7 defined transitions.

Figure 3 shows the results of the Busy Beaver competition in the class C(4,2). There are 145213 halting machines in this class. We represent every halting machine using as coordinates the number of steps before halting and the number of non-blanks written.

Fig. 3. Halting machines of C(4,2)

Figure 4 shows the results of the Busy Beaver competition in the class C(2,4). As one can see, the numbers are larger than for the previous class. We observe that most of the large champions are in a small zone around the line y = √x (scales are logarithmic). This also holds for larger classes, as the tape-head of a machine during computation tends to go repeatedly from one end of the tape to the other, writing only a few more symbols each time it reaches an end. This is the most common behaviour. We have found 95075 halting machines in this class. This is the first class for which the Busy Beaver problem has not already been solved: there is still a possibility of obtaining a much greater champion.

Fig. 4. Halting machines of C(2,4)

¹ By complete, we mean a machine that is completely defined and has exactly one halting transition.


Gregory Lafitte and Christophe Papazian

We remark that Σ(C(2,4)) > Σ(C(4,2)) and σ(C(2,4)) > σ(C(4,2)). Such a relation of domination seems to hold for all larger classes. Let #(C(x, y)) be the number of different machines that halt in the class C(x, y). Our results point to the following conjecture.

Conjecture 2. For all x ≥ 2 and all y > x:
#(C(x, y)) < #(C(y, x)),
Σ(C(x, y)) > Σ(C(y, x)),
σ(C(x, y)) > σ(C(y, x)).
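Using only figures reported in this paper, the conjecture can be spot-checked on the smallest comparable pair; a sketch with the values for C(2,3) and C(3,2) hard-coded from the text:

```python
# Values taken from this paper: number of complete (halting) machines
# in tree normal form, Busy Beaver non-blanks, and Busy Beaver steps.
data = {
    (2, 3): {'count': 630,  'non_blanks': 9, 'steps': 38},
    (3, 2): {'count': 1041, 'non_blanks': 6, 'steps': 21},
}

x, y = 2, 3  # the conjecture requires x >= 2 and y > x
a, b = data[(x, y)], data[(y, x)]
assert a['count'] < b['count']            # fewer machines ...
assert a['non_blanks'] > b['non_blanks']  # ... but bigger records
assert a['steps'] > b['steps']
print("Conjecture 2 holds for C(2,3) vs C(3,2)")
```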

3.4 A More Complex Class: C(3,3)

Fig. 5. Halting machines of C(3,3)

In C(3,3), 7,298,526 halting machines were obtained. Figure 5 shows all our current results. One can observe new strange behaviours, such as a machine that halts after 2,315,619 steps while writing only 31 non-blanks. This machine stands in the orange zone, near the curve y = log x. This zone should become more crowded as we look at larger classes; intuitively, it is the zone of the machines that deal with non-unary information. Studying larger classes, we observe more and more complex behaviours, which diminishes the chances of one day settling the halting problem on those classes. These complex behaviours populate new areas of such a graph for larger and larger classes.

3.5 Summary of our Experimental Results

Figure 6 shows all current results for the classes C(x, y) with x + y ≤ 7. Each column shows the results of one class. The level n in one column corresponds to C_n(x, y) (machines with n non-halting transitions defined). In each level, the two main numbers show the maximum number of non-blanks and the maximum number of steps of this subclass. If there are two other small numbers, it means that the two records are not obtained by a single machine, but by two different machines. For example, in C_3(3,2) (C(3,2) with 3 non-halting transitions), the best result is 4 non-blanks and 6 steps, and one machine halts after 6 steps, writing 4 non-blanks. In the same class, with 4 non-halting defined transitions, the best result is 5 non-blanks and 17 steps, but no machine holds both records: one machine halts after 8 steps, writing 5 non-blanks, and another halts after 17 steps, writing only 4 non-blanks. The blue line in each column separates the fifth and the sixth levels, for readability. The triangles mark the smallest classes that reach the record for a particular number of non-halting transitions. For example, C(2,2) with 3 non-halting transitions obtains a record of 4 non-blanks and 6 steps; this is not only the record for C(2,2) but the record for all machines with exactly 3 non-halting defined transitions, whatever the number of states and symbols. In such a case, there is a little triangle. This represents the first values of the functions of conjecture 1. Finally, the pink-yellow bar is the level of provability: results under the pink limit are proven by our algorithms, while results over the limit are only the best bounds known.

Fig. 6. Summary of our experimental results


4 The Complete Structure of C(2,3)

As can be seen from figure 7, the class C(2,3) seems quite similar to the class C(3,2). However, there are only 630 complete machines in this class, and the champion obtains much better results both in the number of steps and in the number of non-blanks. This is a general behaviour that we observe when comparing C(x, y) and C(y, x) with x < y. Figure 7 is an illustration of the proof that the machine halting with 9 non-blanks after 38 steps is the champion of C(2,3). We have shown that the machines in this figure are the only ones that halt. For most of the other machines, the fact that they do not halt can be trivially proven, but for some, tailored-by-hand tests are required. All of these tests (trivial and tailored-by-hand) have been implemented, and our program can output the long proof that the above-mentioned machine is the champion. The reader should remember that we have put a linear order on states and symbols, preventing us from having two machines that are symmetric in structure. Moreover, the transition that leads to the final state always writes a one and goes right; thus, from our 630 machines in this class, we can build 630 other halting machines that write a blank just before halting. Under those hypotheses, we have:

Fig. 7. Structure of C(2,3)

630 halting machines: Looking carefully at those machines, we can say that the Busy Beaver champion of this class is the machine that halts after 38 steps leaving 9 non-blank symbols on the tape. But we have many other results. For example, there is a machine that halts on an empty tape (with only blank symbols) after 13 steps, and no other machine halts on an empty tape after more steps.
1967 non-halting machines: Among this set, we have 1628 linear machines (the head eventually evolves with a periodic behaviour, moving infinitely to the left or to the right of the tape), 201 static machines (the head always comes back to the same place in a periodic way), 115 "square-root" machines (the head goes infinitely often from the left end of the tape to the right end and back, increasing the size of the tape after each trip) and 23 "logarithmic" machines (counter-like machines that use some binary coding on the tape). For a complete detailed description of the structure of C(2,3), see the appendix.
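The static machines are the easiest of these to certify mechanically: if the exact configuration (state, non-blank tape content, head position) ever repeats, the machine provably never halts. The sketch below (our own illustration, with a made-up two-state shuttling machine) implements only this simplest certificate; linear, square-root and logarithmic behaviours need the more refined arguments described above.

```python
def repeats_configuration(machine, max_steps=1000):
    """Return True if the full configuration repeats, i.e. the
    machine provably loops forever; False if it halts, or if the
    step budget runs out without a repeat."""
    tape, state, pos = {}, 'A', 0
    seen = set()
    for _ in range(max_steps):
        config = (state, pos, tuple(sorted(tape.items())))
        if config in seen:
            return True
        seen.add(config)
        write, move, state = machine[(state, tape.get(pos, 0))]
        if write == 0:
            tape.pop(pos, None)  # keep the tape sparse: store non-blanks only
        else:
            tape[pos] = write
        pos += move
        if state == 'H':
            return False
    return False

# A trivial static machine: the head shuttles between two blank cells.
static = {('A', 0): (0, +1, 'B'), ('B', 0): (0, -1, 'A')}
print(repeats_configuration(static))  # -> True
```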

5 Some Small Turing Machines with a Complex Behaviour

Figure 8 and the appendix present machines with a peculiar behaviour. They are represented as graphs: transitions are labeled edges and states are nodes. The initial state is doubly circled. Each color stands for one symbol (black is the blank symbol). For example, a transition from state A to state B reading blank, writing red and going right is drawn as a black edge from node A to node B with a label indicating the written symbol and the direction. The machines are labeled with C for chaotic behaviour, E for counters and H for halting (but complex) machines.

Fig. 8. Chaotic and complex machines

Appendix: Structure of C(2,3)

The following description presents the complete structure of C(2,3): from the halting machines to the non-halting machines with their different kinds of behaviour.


Halting machines: the following table gives, for each number of steps s (rows) and each number t of non-blank symbols left on the tape (columns), the number of different machines that halt after s steps leaving t non-blanks.

s \ t :  1   2   3   4   5   6   7   8   9
  6   : 12  24
  7   :  4  36  40
  8   : 11  34  86  24
  9   :  6  17  41  26
 10   :  1  18  28  45  14
 11   :  1   4   8  10   9
 12   :  1   4  14  13  11   6
 13   :  1   1   3   9   8   1
 14   :      5   2   8  12   3
 15   :          1   1   1       1
 16   :          1   1   1   2
 17   :          1   1       1
 18   :                  1   2   2
 19   :                      1   1
 20   :                  1   2
 21   :              1
 22   :              1
 24   :                          1
 26   :                      2
 29   :                              1
 38   :                                  1
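As a sanity check, the halting counts above can be totalled per number of steps; a quick sketch with the numbers transcribed from this appendix:

```python
# machines of C(2,3) halting after s steps, summed over all non-blank counts t
halting_per_steps = {
    6: 36, 7: 80, 8: 155, 9: 90, 10: 106, 11: 32, 12: 49, 13: 23,
    14: 30, 15: 4, 16: 5, 17: 3, 18: 5, 19: 2, 20: 3, 21: 1,
    22: 1, 24: 1, 26: 2, 29: 1, 38: 1,
}
total = sum(halting_per_steps.values())
print(total)  # -> 630, the announced number of halting machines
assert max(halting_per_steps) == 38  # the champion's step count
```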

We now present the structure of the machines that never halt starting from a blank tape.

Linear machines: after a finite number of steps, a linear machine has a periodical behaviour: every s steps it does the same thing and makes an overall move of o cells on the tape. (One notes that s and o obviously have the same parity.) The machines with o = 0 are the static machines. The counts below give, for each overall move o and each period s, the number of different linear machines of C(2,3).

o = −4: 1 with s = 4, 1 with s = 54;
o = −3: 4 with s = 3, 39 with s = 7, 14 with s = 9, 4 with s = 11;
o = −2: 6 with s = 2, 139 with s = 4, 112 with s = 6, 47 with s = 8, 11 with s = 10, 8 with s = 12, 2 with s = 14, 1 with s = 16, 1 with s = 18;
o = −1: 202 with s = 1, 115 with s = 3, 149 with s = 5, 19 with s = 7, 5 with s = 9, 2 with s = 11, 1 with s = 13;
o = 0 (static): 86 with s = 2, 91 with s = 4, 12 with s = 6, 8 with s = 8, 4 with s = 12;
o = +1: 3 with s = 1, 158 with s = 3, 154 with s = 5, 38 with s = 7, 11 with s = 9, 6 with s = 11, 2 with s = 13, 1 with s = 15;
o = +2: 3 with s = 2, 57 with s = 4, 142 with s = 6, 65 with s = 8, 13 with s = 10, 8 with s = 12, 1 with s = 14, 1 with s = 16;
o = +3: 45 with s = 7, 28 with s = 9, 4 with s = 11, 4 with s = 13, 1 with s = 15.

Square-root machines: the head of a square-root machine goes back and forth between the left end and the right end of the already used part of the tape, increasing the tape usage at each trip. When the head is moving leftward (respectively rightward), it does so in a periodical way: every τˡ (respectively τʳ) steps, the machine does the same thing and makes an overall move of σˡ (respectively σʳ) cells on the tape. The 115 square-root machines of C(2,3) split as follows (count: leftward τˡ/σˡ, rightward τʳ/σʳ):

71: 1/−1, 1/+1;  18: 2/−2, 2/+2;  11: 1/−1, 3/+1;  7: 3/−1, 1/+1;  5: 1/−1, 2/+2;  2: 2/−2, 1/+1;  1: 4/−2, 5/+1.

Logarithmic machines: these machines behave like counters; some information is encoded in base b, and most of the time the machine keeps increasing its counter by one. There are four different logarithmic behaviours in C(2,3): 14 machines count in big endian (most significant bit on the left) in base 2; 4 machines count in little endian in base 2; 4 machines count in big endian in base 3; and 1 machine counts in big endian in base 2 using two cells for each bit: each time the head increases the tape usage on the left, one least significant bit is added on the right. Among the non-halting machines, this last machine has the most complex behaviour.

Peculiar machines: in figure 9, we present three peculiar C(2,3) machines from the aforementioned categories (from left to right): the slowest linear machine, the slowest square-root machine and the most complex logarithmic machine.

Fig. 9. Three peculiar C(2,3) machines

Acknowledgements

We are thankful to Allen Brady, Heiner Marxen, Pascal Michel, and Terry and Shawn Ligocki for their enthusiastic remarks on the champions discovered and their own contributions to


the Busy Beaver game. Our experimental search would not have been possible without the help of the IBM BladeCenter of Jose Rolim's group at the University of Geneva, where the first author was once a postdoc. We acknowledge the tremendous help and support in many ways of Pasquale Di Cesare, Alessandro Curioni, Jean-Louis Lafitte and Michel Roethlisberger. We thank Melvyn Lafitte for all the great ideas and his humongous support. Nothing in this paper would have been possible without the encouragements of the whole Sycomore research team, especially Jacques Mazoyer, Julien Cervelle, Marianne Delorme, Bruno Durand, Nicolas Ollinger, Gaétan Richard and Guillaume Theyssier. Jacques Mazoyer was no doubt the person who introduced us all to the Busy Beaver problem.

References

[Bra] Brady, A. H.: Busy beaver problem of Tibor Rado ('Busy Beaver Game'). http://www.cs.unr.edu/~al/BusyBeaver.html
[Bra66] Brady, A. H.: The conjectured highest scoring machines for Rado's σ(k) for the value k = 4. IEEE Transactions on Electronic Computers, vol. EC-15, 1966, p. 802–803.
[Bra83] Brady, A. H.: The determination of the value of Rado's noncomputable function σ(k) for four-state Turing machines. Mathematics of Computation, vol. 40, 1983, p. 647–665.
[Cha87] Chaitin, G. J.: Computing the Busy Beaver Function. In: Open Problems in Communication and Computation, p. 108–112. Springer-Verlag, New York, 1987.
[Dew84] Dewdney, A. K.: Computer recreations: A computer trap for the busy beaver, the hardest-working Turing machine. Scientific American, vol. 251, no. 2, August 1984, p. 19–23.
[Her88] Herken, R.: The Universal Turing Machine: A Half-Century Survey. Oxford University Press, Oxford, England, 1988.
[Kud96] Kudlek, M.: Small deterministic Turing machines. Theoretical Computer Science, vol. 168, no. 2, 1996, p. 241–255.
[LR65] Lin, S. and Rado, T.: Computer studies of Turing machine problems. Journal of the Association for Computing Machinery, vol. 12, no. 2, April 1965, p. 196–212.
[LV97] Li, M. and Vitányi, P.: An Introduction to Kolmogorov Complexity and its Applications. Springer, 1997.
[Mar] Marxen, H.: Busy beaver. http://www.drb.insel.de/~heiner/BB/
[MB90] Marxen, H. and Buntrock, J.: Attacking the busy beaver 5. Bulletin of the EATCS, vol. 40, 1990, p. 247–251.
[Mic] Michel, P.: Busy beavers. http://www.logique.jussieu.fr/~michel/
[Mic04] Michel, P.: Small Turing machines and generalized busy beaver competition. Theoretical Computer Science, vol. 326, no. 1–3, 2004, p. 45–56.
[MS90] Machlin, R. and Stout, Q. F.: The complex behavior of simple machines. Physica D, vol. 42, 1990, p. 85–98.
[Rad62] Rado, T.: On non-computable functions. Bell System Technical Journal, vol. 41, May 1962, p. 877–884.
[Rog96] Rogozhin, Y.: Small universal Turing machines. Theoretical Computer Science, vol. 168, no. 2, 1996, p. 215–240.

On a Constructive Proof of Completeness of the Implicational Propositional Calculus

Domenico Lenzi
Dipartimento di Matematica "E. De Giorgi", Università degli Studi, 73100 Lecce, Italy
[email protected]

Abstract. In this paper we present a complete axiomatization of the implicational fragment of the classical Propositional Calculus and a constructive proof of its weak completeness, which is an adaptation of the classical one due to Kalmár. In the Notes and Comments we show that, by adding two simple axioms that also contain the negation connective, we get a complete axiomatization of the classical Propositional Calculus with implication and negation.

1 Preliminaries and remarks

The well-formed formulas (briefly, wfs) that we consider in this paper are the statement forms of Propositional Calculus built up from a set of propositional variables (statement letters) by appropriate applications of the unique connective implication ⊃ (the implicational Propositional Calculus). Here we will use ⊃ also as an operation. For easier reading, in a wf we shall often replace pairs of round brackets with square ones. On the set of wfs we consider a formal theory whose set L0 of axioms is given by the wfs below (a, b, c and similar letters represent arbitrary wfs). Modus Ponens (briefly, MP) is the unique inference rule.

a ⊃ (b ⊃ a); (1)
[a ⊃ (b ⊃ c)] ⊃ [(a ⊃ b) ⊃ (a ⊃ c)]; (2)
[(a ⊃ b) ⊃ b] ⊃ [(b ⊃ a) ⊃ a]. (3)

If we add to L0 a set of axioms H and b1, b2, ..., bh = b is a proof in this new theory, we write H ` b (` b, if H = ∅) and say that b1, b2, ..., bh = b is a proof of b from H. We write K, a1, a2, ..., ap ` b whenever H = K ∪ {a1, a2, ..., ap}. Sometimes H ` b can be derived from suitable results, without having an explicit formal proof; often this is unsatisfactory. Therefore, whenever we have a formal proof b1, b2, ..., bh = b of b, we say that b is a constructive theorem from H and that b1, b2, ..., bh = b is a constructive proof of b. We point out that until the end of this section we will not use the axiom schema (3). The (constructive) theorem a ⊃ a is well known (see [5], Lemma 1.7, p. 31). Moreover, the Deduction Theorem (briefly, DT) holds (i.e.: if H is a set of wfs, and a and b are wfs, then H, a ` b implies H ` a ⊃ b; cf. [5], p. 32). In the following we will consider only constructive theorems and constructive proofs. If b, c and d are wfs, by DT and MP we immediately have the following theorems:

(b ⊃ c) ⊃ [(c ⊃ d) ⊃ (b ⊃ d)]; (4)
(b ⊃ c) ⊃ [(d ⊃ b) ⊃ (d ⊃ c)]; (5)
[b ⊃ (c ⊃ d)] ⊃ [c ⊃ (b ⊃ d)]; (6)
[c ⊃ (c ⊃ d)] ⊃ (c ⊃ d); (7)
b ⊃ [(b ⊃ c) ⊃ c]. (8)
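Each of the schemas (1)-(3) and the theorems (4)-(8) is, in particular, a classical two-valued tautology, which can be confirmed by brute force over all valuations; a quick sketch (Python is our own choice of language, with ⊃ read as material implication):

```python
from itertools import product

def imp(p, q):          # material implication
    return (not p) or q

formulas = [
    lambda a, b, c: imp(a, imp(b, a)),                                   # (1)
    lambda a, b, c: imp(imp(a, imp(b, c)), imp(imp(a, b), imp(a, c))),   # (2)
    lambda a, b, c: imp(imp(imp(a, b), b), imp(imp(b, a), a)),           # (3)
    lambda b, c, d: imp(imp(b, c), imp(imp(c, d), imp(b, d))),           # (4)
    lambda b, c, d: imp(imp(b, c), imp(imp(d, b), imp(d, c))),           # (5)
    lambda b, c, d: imp(imp(b, imp(c, d)), imp(c, imp(b, d))),           # (6)
    lambda c, d, e: imp(imp(c, imp(c, d)), imp(c, d)),                   # (7)
    lambda b, c, d: imp(b, imp(imp(b, c), c)),                           # (8)
]

assert all(f(*v) for f in formulas for v in product([False, True], repeat=3))
print("(1)-(8) are all classical tautologies")
```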


We point out that (4) represents the so-called law of hypothetical deduction. If a and b are wfs, we set a ≤ b (see [1], p. 4) whenever ` a ⊃ b. Then (see (1) and theorems (4), (5), (6)) we get:

a ≤ b ⊃ a; (9)
if b ≤ c, then c ⊃ d ≤ b ⊃ d; (10)
if b ≤ c and c ≤ d, then b ≤ d; (11)
if b ≤ c, then d ⊃ b ≤ d ⊃ c; (12)
b ⊃ (c ⊃ d) ≤ c ⊃ (b ⊃ d). (13)

By ` a ⊃ a and by (11), ≤ is a reflexive preorder relation on the set of wfs. Hence we will also consider the equivalence relation ≅ associated with ≤. Thus, since (13) also holds with the two antecedents interchanged, we get:

a ⊃ (b ⊃ c) ≅ b ⊃ (a ⊃ c). (14)

Remark 1. Let b and i be wfs with ` i. Then the properties below follow immediately:

` b ⊃ i; therefore b ≤ i; (15)
if ` i ⊃ b (i.e. i ≤ b), then ` b; (16)
i ⊃ b ≅ b. (17)

By (13), (a ⊃ b) ⊃ (a ⊃ c) ≤ a ⊃ [(a ⊃ b) ⊃ c]. Furthermore, by (10), from b ≤ a ⊃ b we get (a ⊃ b) ⊃ c ≤ b ⊃ c, and then a ⊃ [(a ⊃ b) ⊃ c] ≤ a ⊃ (b ⊃ c) by (12). Therefore, by transitivity, (a ⊃ b) ⊃ (a ⊃ c) ≤ a ⊃ (b ⊃ c). Hence by (2) we get:

a ⊃ (b ⊃ c) ≅ (a ⊃ b) ⊃ (a ⊃ c). (18)

We see that ≅ is a congruence with respect to ⊃. In fact, if a ≅ b, then by (10) a ⊃ c ≅ b ⊃ c and by (12) c ⊃ a ≅ c ⊃ b. Whence the claim.

2 The completeness proof

Now we set a ∨ b = (a ⊃ b) ⊃ b. Since ≅ is a congruence with respect to ⊃, ≅ is also a congruence with respect to ∨ considered as an operation on the set of wfs. As an immediate consequence of the axiom schema (3), we have:

a ∨ b ≅ b ∨ a. (19)

Therefore, since d ⊃ (c ⊃ d) is an axiom, and hence ` d ⊃ (c ⊃ d), by (17) of Remark 1 we obtain:

[(c ⊃ d) ⊃ d] ⊃ d ≅ [d ⊃ (c ⊃ d)] ⊃ (c ⊃ d) ≅ c ⊃ d. (20)

We point out that by the axiom schema (3) and by theorem (7), the following property (the so-called Peirce's law) holds:

` [(c ⊃ d) ⊃ c] ⊃ c. (21)

Moreover, by (18) we immediately get the following property:

a ⊃ (b ∨ c) ≅ (a ⊃ b) ∨ (a ⊃ c). (22)


Remark 2. If a ≤ c and b ≤ c, then (by (10) and (19)) a ∨ b = (a ⊃ b) ⊃ b ≤ (c ⊃ b) ⊃ b ≅ (b ⊃ c) ⊃ c ≤ (c ⊃ c) ⊃ c ≅ c (see (17) of Remark 1). Then, if F is the set of wfs, [a ∨ b]≅ is the least upper bound of [a]≅ and [b]≅ under the order relation induced on the quotient set F/≅ by ≤. This means that the quotient structure (F/≅, ∨/≅) is a semilattice; hence, for any a, b, c ∈ F, (a ∨ b) ∨ c ≅ a ∨ (b ∨ c).

Lemma 1. If b, c and d are wfs, we have the following theorems:

(b ∨ d) ⊃ [(c ⊃ d) ⊃ ((b ⊃ c) ⊃ d)]; (23)
(b ⊃ d) ⊃ [(b ⊃ c) ∨ d]; (24)
(c ∨ d) ⊃ [(b ⊃ c) ∨ d]; (25)
(b ⊃ d) ⊃ [(b ∨ d) ⊃ d]. (26)

Proof. (23): Since b ∨ d = (b ⊃ d) ⊃ d, by DT and MP the assertion is an easy consequence of (4).
(24): Since (b ⊃ d) ⊃ [(b ⊃ c) ∨ d] ≅ (b ⊃ d) ⊃ [(d ⊃ (b ⊃ c)) ⊃ (b ⊃ c)], by (4) we get the assertion.
(25): c ≤ b ⊃ c ≤ (b ⊃ c) ∨ d and d ≤ (b ⊃ c) ∨ d; thus c ∨ d ≤ (b ⊃ c) ∨ d. Whence the claim.
(26): By (14) we have (b ⊃ d) ⊃ [(b ∨ d) ⊃ d] ≅ (b ∨ d) ⊃ [(b ⊃ d) ⊃ d] = (b ∨ d) ⊃ (b ∨ d). Whence the claim.

Proposition 1. Let a be a wf whose statement letters are taken from B1, B2, ..., Bi, ..., Bn. Moreover, let us assign the truth value 1 to the letters B1, B2, ..., Bi and the truth value 0 to the remaining ones. Then we have the following two properties:¹

If a takes the value 1, then B1, B2, ..., Bi ` a ∨ Bi+1 ∨ ··· ∨ Bn; (27)
If a takes the value 0, then B1, B2, ..., Bi ` a ⊃ (Bi+1 ∨ ··· ∨ Bn). (28)

Proof. The proof is by induction on the number of occurrences of connectives in the wf a. If a is without connectives, the claim is trivial: in this case a is a statement letter, and the claim reduces to B1 ` B1 and ` B1 ⊃ B1. Otherwise a = b ⊃ c, and we only have to prove that the claim holds whenever it is true for b and c.

Case 1: a takes the value 0. In this case b takes the value 1 and c takes the value 0. Therefore we have B1, B2, ..., Bi ` b ∨ Bi+1 ∨ ··· ∨ Bn and B1, B2, ..., Bi ` c ⊃ (Bi+1 ∨ ··· ∨ Bn). Thus the claim is an immediate consequence of (23), with d replaced by Bi+1 ∨ ··· ∨ Bn.

Case 2a: a takes the value 1, with b taking the value 0. Then B1, B2, ..., Bi ` b ⊃ (Bi+1 ∨ ··· ∨ Bn), and the claim is an immediate consequence of (24).

Case 2b: a takes the value 1, with c taking the value 1. Then B1, B2, ..., Bi ` c ∨ Bi+1 ∨ ··· ∨ Bn, and the claim is an immediate consequence of (25).

We conclude with the following proposition.

Proposition 2. Let a be a tautology. Then ` a.

¹ In the sequel the brackets will be omitted by association on the right with respect to ∨. We remark that in property (28) the wf a ⊃ (Bi+1 ∨ ··· ∨ Bn) is meaningful: in fact, since a takes the value 0, at least one statement letter of a takes the value 0.


Proof (see in [2] the proof of Proposition 1.13, p. 36, due to Kalmár). Since a is a tautology, it takes the value 1 for any truth value assignment to its statement letters. Let B1, B2, ..., Bi, ..., Bm−1, Bm be a representation of the statement letters of a and consider an arbitrary truth value assignment to these letters; moreover, consider another truth value assignment, which differs from the first one only on Bm. Without loss of generality, we assume that B1, B2, ..., Bi are the letters different from Bm taking the value 1 and Bi+1, ..., Bm−1 the letters different from Bm taking the value 0. Thus, by Proposition 1 and by DT, we have:² B1, B2, ..., Bi ` Bm ⊃ (a ∨ Bi+1 ∨ ··· ∨ Bm−1) and B1, B2, ..., Bi ` Bm ∨ a ∨ Bi+1 ∨ ··· ∨ Bm−1. Then by (26) of Lemma 1 and by DT we have: B1, B2, ..., Bi ` a ∨ Bi+1 ∨ ··· ∨ Bm−1. If m = 1, the proof is complete. If not, we repeat the same process with respect to m − 1. Therefore in m steps we get ` a.

Notes and Comments. The axiom schema (1) above represents the so-called law of simplification; moreover, from our axiomatization both the law of hypothetical deduction (4) and Peirce's law (21) follow. Therefore a constructive proof can also be obtained as an immediate consequence of a constructive proof due to A. Tarski, with a contribution of P. Bernays (see [4], Theorem 29, p. 145, and [3]). For a related proof see [6]. Nevertheless, the above proof becomes the first part of a constructive proof of the embedding theorem of an implicational algebra into (the implicational part of) a suitable boolean algebra (see [2]).

If we consider the wfs built with the connectives ⊃ and ∼ (the negation), then we can easily obtain a complete axiomatization of the classical Propositional Calculus by adding to the above axiom schemas (1), (2) and (3) two axiom schemas, which we present in the following synthetic manner:

∼a ⊃ b ≅ (a ⊃ b) ⊃ b. (29)

We easily get the properties below. Property (32) is a particular case of (30); (33) follows from (29) and (20), since ≅ is a right congruence with respect to ⊃; (34) follows from (33); (35) follows from (34) and (32); (36) follows from (30) and (35); property (37) follows from (36) and (14).

∼a ⊃ b ≅ ∼b ⊃ a; (30)
` ∼a ⊃ (a ⊃ b), by (14) and by a ⊃ (∼a ⊃ b) ≅ a ⊃ (∼b ⊃ a); (31)
∼∼a ⊃ a ≅ ∼a ⊃ ∼a; (32)
(∼a ⊃ b) ⊃ b ≅ a ⊃ b; (33)
(∼a ⊃ ∼b) ⊃ ∼b ≅ (∼b ⊃ ∼a) ⊃ ∼a, hence a ⊃ ∼b ≅ b ⊃ ∼a; (34)
` a ⊃ ∼∼a, hence ∼∼a ≅ a; (35)
∼b ⊃ ∼a ≅ ∼∼a ⊃ b ≅ a ⊃ b; (36)
` a ⊃ (∼b ⊃ ∼(a ⊃ b)); (37)
` (a ⊃ b) ⊃ ((∼a ⊃ b) ⊃ b). (38)
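All of these properties are sound for the classical two-valued semantics, with ∼ read as negation; this can be confirmed mechanically by brute force over valuations. A sketch for a representative sample (Python is our own choice; the comments refer to the property numbers):

```python
from itertools import product

def imp(p, q):
    return (not p) or q

def equiv(f, g):
    """Two formulas in (a, b) agree on every classical valuation."""
    return all(f(a, b) == g(a, b) for a, b in product([False, True], repeat=2))

neg = lambda p: not p
assert equiv(lambda a, b: imp(neg(a), b), lambda a, b: imp(neg(b), a))     # (30)
assert equiv(lambda a, b: imp(imp(neg(a), b), b), lambda a, b: imp(a, b))  # (33)
assert equiv(lambda a, b: imp(neg(b), neg(a)), lambda a, b: imp(a, b))     # (36)
assert all(imp(a, imp(neg(b), neg(imp(a, b))))                             # (37)
           for a, b in product([False, True], repeat=2))
assert all(imp(imp(a, b), imp(imp(neg(a), b), b))                          # (38)
           for a, b in product([False, True], repeat=2))
print("the negation properties hold classically")
```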

Properties (31), (35), (37) and (38) ensure that the axiom schemas (1)-(3), together with the two axiom schemas condensed in (29), give a complete axiomatization of the classical Propositional Calculus with implication and negation (see [5], pp. 35-36: Lemma 1.12 and Proposition 1.13).

References

1. Diego A.: Sur les algèbres de Hilbert. Gauthier-Villars, Paris (1966).

² By (19) and by Remark 2, Bm ∨ a ∨ Bi+1 ∨ ··· ∨ Bm−1 ≅ a ∨ Bi+1 ∨ ··· ∨ Bm−1 ∨ Bm.


2. Lenzi D.: A prime ideal free proof of the embedding theorem of implicational algebras into boolean algebras. To appear in "Proceedings of Unilog 2005".
3. Łukasiewicz J., Tarski A.: Untersuchungen über den Aussagenkalkül. Comptes rendus des séances de la Société des Sciences et des Lettres de Varsovie (1930).
4. Łukasiewicz J.: Selected Works. Series "Studies in Logic". North-Holland, Amsterdam (1970).
5. Mendelson E.: Introduction to Mathematical Logic. Van Nostrand, Princeton (1968).
6. Mints G.: A Short Introduction to Intuitionistic Logic. Kluwer Academic/Plenum Publishers (2000).

On the Computational Complexity of the Theory of Complete Binary Trees

Zhimin Li¹,², Libo Lo²,³, and Xiang Li¹

¹ Institute of Computer Science, Guizhou University, Guiyang 550025, China
² School of Mathematics and Computer Science, Guizhou University for Nationalities, Guiyang 550025, China. [email protected]
³ Faculty of Mathematics, Beijing Normal University, China

Abstract. The first-order theory of complete binary trees is decidable by quantifier elimination (see [2]). In this paper, by using the game FR_n(A, A), we demonstrate that the first-order theory of complete binary trees can be decided within double exponential Turing time 2^{2^{cn}} and Turing space 2^{dn} (n is the length of the input; c and d are suitable constants).

1 Introduction

Tree-like structures are used in many fields of computer science, such as functional and logic programming, constraint solving, database theory, data type specification and knowledge representation. In this paper, we are interested in the first-order theory of complete binary trees. J. Liu, D. Liao and L. Lo proved the quantifier elimination theorem for complete binary trees in [2]. However, they did not discuss the computational complexity of their procedures. Whenever reasoning about a class of data structures is involved, it is interesting to know the inherent computational complexity of this reasoning. This may be crucial in practical implementations of theorem provers, constraint solvers and logic programming systems. Thus, it is important and significant to study the complexity of the theory of complete binary trees. Our main tool in this paper is the game FR_n(A, B), and our main result is that the first-order theory of complete binary trees can be decided within double exponential Turing time 2^{2^{cn}} and Turing space 2^{dn} (n is the length of the input; c and d are suitable constants).

2 Preliminaries

2.1 The theory of complete binary trees

Definition 1. The theory of complete binary trees T is constituted by axioms in the first-order language L = {E, R}, where E denotes a binary relation and R denotes a constant. We present these axioms as follows:

– Theory of graphs: (1) No loops: ∀x ¬(xEx); (2) Undirected graph: ∀x∀y (xEy ↔ yEx).
– Theory of trees: (1) There is a unique root node, denoted R, with exactly two neighbours: ∃y1 y2 ((y1 ≠ y2) ∧ (REy1 ∧ REy2) ∧ ∀z (REz → (z ≡ y1 ∨ z ≡ y2))); (2) Acyclic graph: ¬σ3, ¬σ4, ..., ¬σn, ..., where σn (n ≥ 3) denotes the formula ∃y1 y2 ... yn ((∧_{i≠j} yi ≠ yj) ∧ (y1Ey2) ∧ ... ∧ (ynEy1)).


– Complete binary trees: for every x ≠ R, there exist exactly three distinct nodes adjacent to x: ∀x (x ≠ R → ∃y1 y2 y3 ((xEy1 ∧ xEy2 ∧ xEy3) ∧ ∀z (zEx → (z ≡ y1 ∨ z ≡ y2 ∨ z ≡ y3)))).

Definition 2. We say that a model U = <A, E, R> is a complete binary tree model if it satisfies the above theories, where A is the set of nodes, xEy denotes that x and y are adjacent, and R is the unique root node of the tree. We also have the following simple results:

– A graph is a tree iff for every pair x, y of distinct nodes it contains exactly one path from x to y. We define a distance d on A: d(x, y) is the length of the path from x to y.
d(x, y) = n ↔ ∃z1 z2 ... z_{n−1} ((∧_{i≠j} zi ≠ zj) ∧ (∧_{i=1}^{n−1} x ≠ zi) ∧ (∧_{i=1}^{n−1} y ≠ zi) ∧ (xEz1) ∧ (z1Ez2) ∧ ... ∧ (z_{n−1}Ey)).
– Every first-order sentence in the theory of complete binary trees is equivalent to a Boolean combination of the basic sentences; the set of basic sentences is {d(x, y) = n, d(R, x) = m : n, m = 0, 1, ...}.

We define the closed ball B(x, k) = {y ∈ A : d(x, y) ≤ k}, the ball centered at x of radius k ∈ ℕ. ||x|| = d(x, R) denotes the norm of x.
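For concreteness, the distance d can be computed when nodes are given a heap-style numbering (root 1, children of node i numbered 2i and 2i + 1 — a convention of ours, not used in the paper): walk the node with the larger index up to the parent until the two nodes meet at their lowest common ancestor.

```python
def dist(x, y):
    """Path length between nodes x, y >= 1 of the complete binary
    tree under heap numbering (the parent of node i is i // 2)."""
    d = 0
    while x != y:
        if x > y:      # the larger index is at least as deep
            x //= 2
        else:
            y //= 2
        d += 1
    return d

print(dist(1, 1), dist(2, 3), dist(4, 5), dist(1, 7))  # -> 0 2 2 2
```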

– Every first-order sentences in the theory of complete binary trees is equivalent to a Boolean combination of the basic sentences, the set of basic sentences is {d(x, y) = n, d(R, x) = m, n, m=0,1,....}. We define the closed ball B(x, k) = {y ∈ A : d(x, y) ≤ k}, this ball centered at x of radius k ∈ IN. kxk = d(x, R), denotes the norm of x. 2.2

The Game F Rn (A, B)

Ferrante and Rackoff gave in [4] a version of the Ehrenfeucht game which leads to a decision procedure for a suitable model. This procedure involves a series of tests of a sentence, which may be true or false when we substitute a sequence of elements of the model for the bound variables. To constrain such a sequence of elements to a certain range, we need to associate a norm with each element. Let U = <A, E, R> be a model. For every element a ∈ A, the norm of a is denoted by ||a||, and is usually a natural number. Suppose that in the sentence φ0 = (Q1 x1) ... (Qn xn) ψ(x1, ..., xn) we have already substituted a1, ..., ak ∈ A for the variables x1, ..., xk, obtaining a sentence φk = (Q_{k+1} x_{k+1}) ... (Qn xn) ψ(a1, ..., ak, x_{k+1}, ..., xn). The next step is to substitute a_{k+1} for x_{k+1}. We want to control the norm of a_{k+1} within a certain range, so we introduce a bounding function H(n − k, k, m) = h ∈ ℕ, where ||a1||, ..., ||ak|| ≤ m. The number of elements of A with norms less than a fixed number, together with the bounding function, determines the number of steps needed for the decision procedure, so we always want to keep them as low as possible. Let ā_k ≡_n b̄_k denote that ā_k and b̄_k satisfy the same set of sentences with quantifier depth less than or equal to n. In many models the discussion of the relation ā_k ≡_n b̄_k is rather difficult. It happens that a stronger (finer) relation ā_k E_{h,k} b̄_k is sometimes simpler to use, because its definition may be simpler.


With the help of a norm, a bounding function and an equivalence relation we can now play our game F Rn (A, B). We are given two models A and B and two player I and II. In the game F Rn (A, B) each player makes n moves. This time player II has n + 1 equivalence relations in mind. The game begins at ∅En,0 ∅. Suppose that at the end of the kth move the players have chosen (a1 , ..., ak ) ∈ Ak and (b1 , ..., bk ) ∈ B k satisfying a ¯k En−k,k ¯bk . For any ak+1 ∈ A chosen by player I, according to the equivalence relation player II can find an element bk+1 ∈ B(similarly if player I chooses bk+1 , then player II chooses ak+1 )such that a ¯k+1 En−k−1,k+1¯bk+1 . Player II wins the game if and only if a¯n E0,n¯bn → a ¯n ≡0 ¯bn Hence the winning strategy for player II is to satisfy the equivalence relation Eh,k at each step k. The following Propositions 1 and 2 encompasses the game F Rn (A, B). It is implicit in [4], more explicit in Luo [1]. It gives conditions that allow to the reduce the satisfaction by a relational structure of a prenex sentence to the satisfaction by this relational structure of a sentence with bounded quantifiers. Proposition 1. [4]. Let A, B be two models. Suppose that there is a family of equivalence relation En,k for player II to play the game F Rn (A, B). This means that an equivalence relation En,k is defined between the elements of A, B such that (i) a ¯k E0,k ¯bk → a ¯k ≡0 ¯bk for 1 ≤ k ≤ n. (ii) Given a ¯k Er+1,k ¯bk and for every ak+1 ∈ B such that a ¯k+1 Er,k+1¯bk+1 for 1 ≤ r ≤ n. And vice versa. Then we have (i) a ¯k+1 Er,k+1¯bk+1 → a ¯k ≡r ¯bk for any a ¯k ∈ Ak and a ¯k ∈ B k . (ii) T hn (A) = T hn (B). In some cases A = B and for any chosen sequences a1 , ... , ak and b1 , ... , bk where a¯k En,k b¯k and ||b1 ||, ... , ||bk || ≤ m, and for any ak+1 chosen by player I in A player II can find bk+1 ∈ A such that a ¯k+1 En−k−1,k+1¯bk+1 and ||bk+1 || ≤ H(n − k, k, m). 
Then we have a decision procedure for the sentences of Th(A) with quantifier depth n, as in the following theorem.

Proposition 2. [4] Let A be a model. Let FR_n(A, A) be an Ehrenfeucht game with equivalence relations E_{n,k}, norm ||a||, and bounding function H(n, k, m) satisfying the following conditions:
(i) If ā_k E_{n+1,k} b̄_k and ||b_1||, ..., ||b_k|| ≤ m, then for every a_{k+1} ∈ A there exists a b_{k+1} ∈ A such that ā_{k+1} E_{n,k+1} b̄_{k+1} and ||b_{k+1}|| ≤ H(n, k, m).
(ii) For every non-negative real number r the number of elements a ∈ A with norm ||a|| ≤ r is F(r).
Then the number of steps needed for deciding a first-order sentence φ with quantifier depth n is at most Π_{i=1}^{n} F(L_i), where L_0 = 1 and L_{i+1} = H(n − i, i, L_i) for i = 0, ..., n − 1. Here a step means a testing procedure of an instance of φ without variables and quantifiers.
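The bound of Proposition 2 can be computed mechanically once H and F are fixed. The sketch below (function names are ours) instantiates it with the values derived for complete binary trees in Section 3, namely H(n, k, m) = 2^{n+1} + m + 1 and F(r) = 2^{r+1}:

```python
# Sketch: evaluate the step bound of Proposition 2 for a given bounding
# function H and counting function F.
def step_bound(n, H, F):
    """Return (L, bound) with L_0 = 1, L_{i+1} = H(n-i, i, L_i), and
    bound = prod_{i=1}^{n} F(L_i)."""
    L = [1]
    for i in range(n):
        L.append(H(n - i, i, L[i]))
    bound = 1
    for i in range(1, n + 1):
        bound *= F(L[i])
    return L, bound

# Values for complete binary trees (Lemma 2 and the proof of Theorem 1).
H = lambda n, k, m: 2 ** (n + 1) + m + 1
F = lambda r: 2 ** (r + 1)

L, bound = step_bound(4, H, F)   # quantifier depth n = 4
```

For n = 4 this yields L = [1, 34, 51, 60, 65] and a step bound of 2^214, consistent with the doubly exponential bound of Theorem 1 below.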

3 The Game FR_n(A, A) Over T_c and the Upper Bound for Th(T_c)

In this section, T_c = (A, E, R) denotes the model of complete binary trees, and Th(T_c) denotes the first-order theory of complete binary trees.

Zhimin Li, Libo Lo, and Xiang Li

3.1 The main definition of the equivalence relation E_{n,k} on A

We now specify the definitions of the relations E_{n,k} that we need here. Denote by T(a, k) a subtree with root a ∈ A and depth k ∈ IN, and let T(a, b, k) = T(a, k) ∪ T(b, k), where d(a, b) ≤ k. We write T(a, b, k) ♥ T(a′, b′, k) when T(a, b, k) ≅ T(a′, b′, k) and PW(a, a′) ≅ PW(b, b′), where PW(x, y) is the path from x to y. Let a_1, a_2, b_1, b_2 ∈ A and n ∈ IN. We write (a_1, a_2) ≈_n (b_1, b_2) if
(1) either d(a_1, a_2) > 2^n and d(b_1, b_2) > 2^n,
(2) or d(a_1, a_2) = d(b_1, b_2) ≤ 2^n and T(a_1, a_2, 2^n) ♥ T(b_1, b_2, 2^n).
Let n, k ∈ IN and ā_k, b̄_k ∈ A^k. Then ā_k E_{n,k} b̄_k if
(1) for all i ∈ [1, k], (R, a_i) ≈_{n+1} (R, b_i);
(2) for all i, j ∈ [1, k], i ≠ j implies (a_i, a_j) ≈_n (b_i, b_j).

3.2 The Game FR_n(A, A) over T_c

The quantifier elimination process starts from the innermost part of a formula. It finds an equivalent quantifier-free formula ψ(x_1, ..., x_n) for every formula (∃x)φ(x_1, ..., x_n) with φ quantifier-free. This allows us to eliminate all of the quantifiers one by one. The Ehrenfeucht game starts from the outermost part of the formula. It changes a formula with quantifiers into an equivalent set of formulas with fewer quantifiers. After taking away all the quantifiers we obtain a set of quantifier-free formulas. The validity of the original formula depends on the truth value of each of these quantifier-free formulas.

Lemma 1. Let ā_k, b̄_k ∈ A^k be such that ā_k E_{0,k} b̄_k. Then ā_k ≡_0 b̄_k for all k ∈ [1, n], that is, ā_k and b̄_k satisfy the same set of quantifier-free sentences.

Proof. By the definition of ā_k E_{0,k} b̄_k, we have:
(i) For all i ∈ [1, k], (R, a_i) ≈_1 (R, b_i), that is, either ||a_i|| > 2 and ||b_i|| > 2, or ||a_i|| = ||b_i|| ≤ 2 and PW(R, a_i) ≅ PW(R, b_i).
(ii) For all i ≠ j ∈ [1, k], we have (a_i, a_j) ≈_0 (b_i, b_j), that is, either d(a_i, a_j) > 1 and d(b_i, b_j) > 1, or d(a_i, a_j) = d(b_i, b_j) = 1 and T(a_i, a_j, 1) ♥ T(b_i, b_j, 1).
It is obvious that (1) if a_i ≡ a_j, then b_i ≡ b_j; (2) if a_i E a_j, then b_i E b_j and T(a_i, a_j, 1) ♥ T(b_i, b_j, 1). Thus ā_k and b̄_k satisfy the same set of quantifier-free sentences. □

Lemma 2. Let ā_k, b̄_k ∈ A^k satisfy ā_k E_{n+1,k} b̄_k with ||b_i|| ≤ m for all i ∈ [1, k], and let a_{k+1} ∈ A. Then there exists b_{k+1} ∈ A such that ā_{k+1} E_{n,k+1} b̄_{k+1} and ||b_{k+1}|| ≤ 2^{n+1} + m + 1.

Proof. The hypothesis ā_k E_{n+1,k} b̄_k means that
(H1) For all i ∈ [1, k], either ||a_i|| > 2^{n+2} and ||b_i|| > 2^{n+2}, or ||a_i|| = ||b_i|| and PW(R, a_i) ≅ PW(R, b_i).
(H2) For all i, j ∈ [1, k], either d(a_i, a_j) > 2^{n+1} and d(b_i, b_j) > 2^{n+1}, or d(a_i, a_j) = d(b_i, b_j) ≤ 2^{n+1} and T(a_i, a_j, 2^{n+1}) ♥ T(b_i, b_j, 2^{n+1}).


Then a_{k+1} is given, and we are looking for b_{k+1} such that ā_{k+1} E_{n,k+1} b̄_{k+1}, that is, such that (H1) and (H2) above are satisfied and such that, moreover, we have
(C1) Either ||a_{k+1}|| > 2^{n+1} and ||b_{k+1}|| > 2^{n+1}, or ||a_{k+1}|| = ||b_{k+1}|| ≤ 2^{n+1} and PW(R, a_{k+1}) ≅ PW(R, b_{k+1}).
(C2) For all i ∈ [1, k], either d(a_i, a_{k+1}) > 2^n and d(b_i, b_{k+1}) > 2^n, or d(a_i, a_{k+1}) = d(b_i, b_{k+1}) ≤ 2^n and T(a_i, a_{k+1}, 2^n) ♥ T(b_i, b_{k+1}, 2^n).
We break this into three cases.

Case 1: ||a_{k+1}|| ≤ 2^{n+1}. Then we choose b_{k+1} such that ||a_{k+1}|| = ||b_{k+1}|| ≤ 2^{n+1} and the symmetric centre of a_{k+1} and b_{k+1} is the root R of the complete binary tree. So we have ||a_{k+1}|| = ||b_{k+1}|| ≤ 2^{n+1} and PW(R, a_{k+1}) ≅ PW(R, b_{k+1}), and condition (C1) is satisfied. For condition (C2), let i ∈ [1, k].
Subcase 1.1: ||a_i|| ≤ 2^{n+2}. Then, by (H1), ||a_i|| = ||b_i|| and PW(R, a_i) ≅ PW(R, b_i). Condition (C2) is satisfied.
Subcase 1.2: ||a_i|| > 2^{n+2}. Then, by (H1), ||b_i|| > 2^{n+2}, so d(a_i, a_{k+1}) ≥ ||a_i|| − ||a_{k+1}|| ≥ 2^{n+2} − 2^{n+1} > 2^n and d(b_i, b_{k+1}) ≥ ||b_i|| − ||b_{k+1}|| ≥ 2^{n+2} − 2^{n+1} > 2^n. So condition (C2) is satisfied too.

Case 2: ||a_{k+1}|| > 2^{n+1} and there is an h ∈ [1, k] such that a_{k+1} ∈ B(a_h, 2^n), that is, d(a_h, a_{k+1}) ≤ 2^n. In a complete binary tree there is an isomorphism ρ : T(a_h, 2^n) → T(b_h, 2^n). We can find b_{k+1} ∈ T(b_h, 2^n) such that b_{k+1} = ρ(a_{k+1}). Next we prove that ||b_{k+1}|| > 2^{n+1}, by contradiction. Suppose ||b_{k+1}|| ≤ 2^{n+1}; then ||b_h|| ≤ ||b_{k+1}|| + d(b_h, b_{k+1}) ≤ 2^{n+1} + 2^n < 2^{n+2}. By condition (H1) we then have ||a_h|| = ||b_h||, and in addition T(a_h, 2^n) ≅ T(b_h, 2^n), so ||a_{k+1}|| = ||b_{k+1}|| ≤ 2^{n+1}. This is a contradiction. Therefore condition (C1) holds. For condition (C2), let i ∈ [1, k]. If i = h, then d(a_h, a_{k+1}) = d(b_h, b_{k+1}) ≤ 2^n and T(a_h, a_{k+1}, 2^n) ♥ T(b_h, b_{k+1}, 2^n). If i ≠ h, we consider the two subcases below.
Subcase 2.1: d(a_i, a_h) > 2^{n+1}. Then, by (H2), d(b_i, b_h) > 2^{n+1}, so d(a_i, a_{k+1}) ≥ d(a_i, a_h) − d(a_h, a_{k+1}) > 2^{n+1} − 2^n = 2^n, and similarly d(b_i, b_{k+1}) > 2^n.
Subcase 2.2: d(a_i, a_h) ≤ 2^{n+1}. Then, by (H2), d(b_i, b_h) = d(a_i, a_h) ≤ 2^{n+1}. So either d(a_i, a_{k+1}) = d(b_i, b_{k+1}) > 2^n, or d(a_i, a_{k+1}) = d(b_i, b_{k+1}) ≤ 2^n and T(a_i, a_{k+1}, 2^n) ♥ T(b_i, b_{k+1}, 2^n). Thus condition (C2) is satisfied.

Case 3: ||a_{k+1}|| > 2^{n+1} and, for all i ∈ [1, k], d(a_i, a_{k+1}) > 2^n. Let b_j ∈ {b_i} have maximal norm among ||b_1||, ..., ||b_k||. We choose b_{k+1} such that ||b_{k+1}|| = ||b_j|| + 2^n + 1 ≤ 2^n + m + 1, so d(b_i, b_{k+1}) ≥ ||b_{k+1}|| − ||b_i|| = 2^n + 1 + ||b_j|| − ||b_i|| ≥ 2^n + 1 > 2^n.

Now we give an upper bound on ||b_{k+1}||:
(For Case 1): ||b_{k+1}|| ≤ 2^{n+1}.
(For Case 2): ||b_{k+1}|| ≤ ||b_h|| + d(b_h, b_{k+1}) ≤ m + 2^n.
(For Case 3): ||b_{k+1}|| ≤ 2^n + m + 1.
In all cases we get ||b_{k+1}|| ≤ 2^{n+1} + m + 1. □

3.3 The Upper Bound for the Theory of Complete Binary Trees

We now turn to the upper complexity bound for the theory of complete binary trees.

Lemma 3. Let m_0 = 1 and m_{k+1} = H(n − k, k, m_k) for k = 0, ..., n − 1. Then we have m_k ≤ 2^{n+2} + k + 2 for n ≥ 2.

Proof. By Lemma 1 and Lemma 2, the hypotheses of Proposition 2 are satisfied, with m_0 = 1 and H(n, k, m) = 2^{n+1} + m + 1. By induction on k, we easily get m_k = 2^{n−k+2}(2^k − 1) + k + 1 ≤ 2^{n+2} + k + 2. □

Theorem 1. It takes at most 2^{2^{cn}} steps to decide a sentence φ with quantifier depth n in Th(T_c).

Proof. From Proposition 2 it follows that for deciding φ we need to check Π_{k=0}^{n−1} F(m_k) quantifier-free formulas, where m_0 = 1 and m_{k+1} = H(n − k, k, m_k) for k = 0, ..., n − 1. The number of elements a ∈ A with norm ||a|| ≤ m_k is F(m_k) = 2^{m_k + 1}. Hence the total number of checking steps is at most

    Π_{k=0}^{n−1} 2^{m_k+1} ≤ Π_{k=0}^{n−1} 2^{2^{n+2}+k+2} = 2^{Σ_{k=0}^{n−1}(2^{n+2}+k+2)} = 2^{n·2^{n+2} + n(n−1)/2 + 2n} ≤ 2^{n·2^{n+2} + n²} ≤ 2^{5n·2^n} ≤ 2^{2^{cn}},

where c is a suitable constant. □

We now turn to an upper bound on the space complexity of the first-order theory Th(T_c). The next lemma is easily proved by induction:

Lemma 4. Let (Q_1 x_1)...(Q_n x_n)ψ(x_1, ..., x_n) be a prenex form of a formula over T_c = ⟨A, E, R⟩, where k ∈ [1, n], Q_k ∈ {∀, ∃}, and ψ(x_1, ..., x_n) is a quantifier-free formula. Let m_k = 2^{n+2} + k + 2. Then T_c = ⟨A, E, R⟩ |= (Q_1 x_1)...(Q_n x_n)ψ(x_1, ..., x_n) if and only if T_c = ⟨A, E, R⟩ |= (Q_1 x_1 ≤ m_1)...(Q_n x_n ≤ m_n)ψ(x_1, ..., x_n), where the bounded quantifiers range over elements of norm at most m_k.

It follows from the above lemma that the principal measure of complexity is the number of quantifiers in the prenex form of a formula. To decide a formula of length n, we never need to consider trees higher than 2^{dn}, where d is a suitable constant. An arbitrary tree of height 2^{dn} may have up to 2^{2^{dn}} vertices and can be represented by an incidence matrix in space 2^{2^{dn}}. Here we use the complexity measure based on deterministic Turing machines. The working tape consumes up to log 2^{2^{dn}} = 2^{dn} cells. Therefore, we have the following theorem:

Theorem 2. It takes at most 2^{dn} space to decide a sentence φ with quantifier depth n in Th(T_c).

Acknowledgement. The first author thanks his supervisors, Professors Xiang Li and Libo Luo, for their helpful guidance and discussions.

References
1. Libo Lo, On the computational complexity of the theory of Abelian groups, Annals of Pure and Applied Logic 37 (1988) 205–248. North-Holland.
2. J. Liu, D. Liao, L. Lo, Quantifier elimination for complete binary trees, Acta Mathematica Sinica (in Chinese), Vol. 56, No. 1 (2003).
3. A. Ehrenfeucht, An application of games to the completeness problem for formalized theories, Fund. Math. 49 (1961) 129–141.
4. J. Ferrante and C. Rackoff, The computational complexity of logical theories (Springer, Berlin, 1979).

A Multiple-Evaluation Genetic Algorithm for Numerical Optimization Problems

Chih-Hao Lin and Jiun-De He

Department of Management Information Systems, Chung Yuan Christian University, No. 200, Jhongbei Rd., Jhongli City 320, Taiwan
Email: [email protected]; [email protected]

Abstract. This paper proposes a novel genetic algorithm, named the multiple-evaluation genetic algorithm (MEGA). By mimicking genetic engineering on biological organisms, the MEGA uses gene-evaluation and inheritance mechanisms to improve both the exploration and exploitation abilities. The proposed gene-evaluation mechanism individually evaluates the influence of each gene and is widely applied in the crossover and mutation operators. The proposed inheritance mechanism clones characteristics of the ancestors and records them on inheritance genes. The MEGA modifies the traditional coding, crossover and mutation procedures and solves several well-known numerical problems. Experimental results show that the proposed algorithm is more efficient and effective than several existing algorithms.

1 Introduction

In recent years, genetic algorithms (GAs) have received significant attention regarding their potential as a novel optimization technique, originally motivated by the Darwinian principle of evolution. Since John Holland first proposed the GA in 1975, the field has grown quickly and the technique has been successfully applied to a wide range of real-world problems of considerable complexity. Although several studies emphasize the efficiency of GAs [1], few pay attention to making optimal use of Holland's Schema Theorem and the building block hypothesis [2]. For example, Yao et al. proposed fast evolutionary programming (FEP) to address the premature convergence deficiency [3]. The FEP modified the mutation operation by using a Cauchy instead of a Gaussian mutation as its primary search operator. The results showed that the FEP could solve some multimodal optimization problems and converge to near-optimal solutions. To improve the slow finishing deficiency, R. Hinterding proposed a Gaussian mutation with self-adaptation in 1995 [4], and Z. Tu and Y. Lu later developed the stochastic genetic algorithm (StGA) in 2004 [5]. All of these methods were efficient in convergence. The StGA in particular can not only achieve high accuracy but also reduce the computational effort. However, the majority of GA modifications use mathematical methods to improve the local search phase [1][6]. By imitating the biological process of genetic engineering, this paper proposes a multiple-evaluation GA (abbreviated as "MEGA"). The main objective is to explore the gene contribution contained in a chromosome and to inherit the advantages of ancestors without reducing the population diversity. The performance of the proposed MEGA was evaluated by carrying out optimization on 11 well-known benchmark functions.
Compared with the observations reported in the literature, the MEGA is able to achieve more accurate results and a considerable reduction in computing effort in almost all the cases. That is, the proposed MEGA maintains superior efficiency and effectiveness on such benchmark problems.

2 Multiple-Evaluation Genetic Algorithm (MEGA)

This paper proposes an efficient optimization algorithm, named the multiple-evaluation genetic algorithm (MEGA). The emphases of this paper are to develop (1) a novel chromosome structure that introduces memory genes to record the inheritance characteristics, (2) a new evaluation mechanism to calculate the influence of each gene, and (3) a modified recombination scheme that consists of evaluation-based crossover and mutation operations to improve both the exploration and exploitation abilities.

2.1 Coding Mechanism

Because of the numerical property of the test functions, the coding mechanism represents decision variables as solution genes, which join together to form the variable vector X_j = (x_{j1}, x_{j2}, ..., x_{jn}) ∈

(II) The functions

    δ(x) = 1 if x = 0, 0 otherwise,    and    Θ(x) = 1 if x ≥ 0, 0 otherwise,

are in H_1.
(III) [Mycka and Costa, 2004, Loff et al., 2007] If a function f is in H_i, then I[f] is in H_{max(i,1)}.
(IV) [Mycka, 2003] There are real recursive tuple coding functions in H_3, i.e., for every n and 1 ≤ i ≤ n, we have γ_n, γ_{n,i} ∈ H_3 such that γ_n(γ_{n,1}(x), ..., γ_{n,n}(x)) = x, γ_{n,i}(γ_n(x_1, ..., x_n)) = x_i and γ_{n,i}(0) = 0.

One can conclude from (I, II) above that the functions given by

    |x| = (2Θ(x) − 1)x    and    lt(x, y) = Θ(y − x) − δ(y − x) = 1 if x < y, 0 otherwise,

are also in H_1, and from (IV) that γ_m^{−1}(x) = (γ_{m,1}(x), ..., γ_{m,m}(x)) is in H_3.
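The input–output behaviour of these basic functions can be transcribed directly. The sketch below (ours) only mirrors the values of δ, Θ, |x|, lt — and the sg used later in the proof of Proposition 3 — and says nothing about their membership in H_1, which requires the real recursive definitions:

```python
def delta(x):
    # delta(x) = 1 if x = 0, and 0 otherwise
    return 1 if x == 0 else 0

def theta(x):
    # Theta(x) = 1 if x >= 0, and 0 otherwise
    return 1 if x >= 0 else 0

def absolute(x):
    # |x| = (2*Theta(x) - 1) * x
    return (2 * theta(x) - 1) * x

def lt(x, y):
    # lt(x, y) = Theta(y - x) - delta(y - x): 1 if x < y, 0 otherwise
    return theta(y - x) - delta(y - x)

def sg(y):
    # sg(y) = 2*Theta(y) - 1 - delta(y): the sign of y
    return 2 * theta(y) - 1 - delta(y)
```

Note that the identity lt(x, x) = 0 forces Θ(0) = 1 here, which is also what makes sg(0) = 0.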

Definition 4. The sup and inf operators are given by

    sup[f](x̄) = sup_{y∈IR} f(x̄, y)    and    inf[f](x̄) = inf_{y∈IR} f(x̄, y).

More exactly, sup[f](x̄) (or inf[f](x̄)) is the value z such that for all y ∈ IR some w > 0 verifies z − w = f(x̄, y) (resp. z + w = f(x̄, y)). This means that sup[f](x̄) is undefined if f(x̄, y) is undefined for some y. In order to show that REC(IR) is closed for sup and inf, we use a function similar to the remainder function on the natural numbers in order to create a periodic function, and take the supremum or infimum limit of that function.

Definition 5. (y mod z) is the number in [0, |z|) — or (−|z|, 0] if y < 0 — such that y = n|z| + (y mod z) for some n ∈ ZZ.

Proposition 3. mod is in H_1.

Proof. Set (y mod z) = U^2_1(I[h](y, sg(y)|z|, yz + 1)), where

    h(y, z) = (y, z) if |y| < |z|, and (y − z, z) otherwise; i.e., h(y, z) = (y − z × (1 − lt(|y|, |z|)), z),

and sg(y) = 2Θ(y) − 1 − δ(y). □
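The effect of the iteration in the proof of Proposition 3 can be sketched with an ordinary loop (a sketch only: the real recursive construction runs the iteration I[h] for a fixed number of steps, whereas here we simply subtract until the remainder is small enough):

```python
def real_mod(y, z):
    # (y mod z): the number in [0, |z|) -- or (-|z|, 0] if y < 0 -- with
    # y = n*|z| + (y mod z) for some integer n (Definition 5).  As in the
    # proof of Proposition 3, we repeatedly subtract sg(y)*|z| until the
    # remainder is smaller than |z| in absolute value.
    step = abs(z) if y >= 0 else -abs(z)
    while abs(y) >= abs(z):
        y -= step
    return y
```

For example, real_mod(7, 3) gives 1 and real_mod(-7, 3) gives -1, matching the sign convention of Definition 5.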

To build the (n + 1)th level, we take the functions in the previous level and their limits, and close the resulting set under the remaining operators. The hierarchy thus becomes organized by the rank of the infinite limit operators.

A Functional Characterisation of the Analytical Hierarchy


Proposition 4. If f ∈ H_n, then sup[f], inf[f] ∈ H_{n+2}.

Proof. Given an (n+1)-ary function f ∈ H_n we define the new periodic function F ∈ H_{max(n,1)}, given by F(x̄, y, z) = f(x̄, (y mod z)), and then we set

    f^s_±(x̄) = lim_{z→∞} lim sup_{y→∞} F(x̄, ±y, z).

Now we have sup[f](x̄) = max(f^s_+(x̄), f^s_−(x̄)), where max(x, y) = lt(x, y) × y + (1 − lt(x, y)) × x. We proceed in the same way for inf[f]. □

4 The analytical hierarchy

We present the analytical hierarchy of predicates, and relate it with the η-hierarchy.

Definition 6. The analytical hierarchy of predicates consists of three IN-indexed families of predicates over natural numbers and functions of natural numbers:
1. Σ^1_0 is the class of predicates that can be given using number quantifiers over a recursive predicate, and Π^1_0 = Σ^1_0.
2. Σ^1_{n+1} is the class of predicates given by ∃y φ(x̄, y, ā), with φ in Π^1_n.
3. Π^1_{n+1} is the class of predicates given by ∀y φ(x̄, y, ā), with φ in Σ^1_n.
4. ∆^1_n = Σ^1_n ∩ Π^1_n.
We write Σ^1_ω to stand for ∪_{n∈IN} Σ^1_n, and in the same way for Π^1_ω and ∆^1_ω.

We will use the following result abundantly.

Proposition 5. (a) Σ^1_{n+1} is closed for existential quantification over functions.
(b) Π^1_{n+1} is closed for universal quantification over functions.
(c) Σ^1_{n+1} and Π^1_{n+1} are closed for existential and universal quantification over natural numbers.
(d) If P ∈ Σ^1_n then some P*, also in Σ^1_n, is such that ∀a P ⇐⇒ ∀x P*.
(e) If P ∈ Π^1_n then some P*, also in Π^1_n, is such that ∃a P ⇐⇒ ∃x P*.

Definition 7. We say that a function f : IR^m → IR^n is in Σ^1_k if the (n+m)-ary predicate of expression z̄ = f(x̄) is in Σ^1_k. Similarly for Π^1_k and ∆^1_k.

We assume below that the surjection from functions of natural numbers to real numbers mentioned on page 248 allows us to obtain, for a given number n, the first n digits of the coded real number.

Proposition 6. The functions 1^n, 1̄^n, 0^n, U^n_i, +, ×, / and floor, as well as the predicates of equality and inequality over the reals, are in ∆^1_0.

Proof. We'll begin by showing that there is a recursive way to decide the predicate over the reals given by the expression 'x and y are not different up to the nth digit', which we write x =_n y. An algorithm to decide this predicate needs to resolve the ambiguity of the representation of a real number by binary expansion, and we can make it work the following way: given two real numbers x, y and a natural number n, we obtain the first n digits of the two reals and verify whether they are the same. If they are, then we decide that x =_n y. If the digits are not equal, we consider the first differing digit — one is 0 and the other 1 — and check whether the digits after the 0 are all 1s and the digits after the 1 are all 0s.^5 If so, then we decide that x =_n y, and we decide that x ≠_n y otherwise. The predicate of real-number equality is then given by ∀n x =_n y, which is in Π^0_1 ⊂ ∆^1_0. For the function +, we define a predicate, of expression z =_n x + y, that

^5 E.g. x = 101.110000 and y = 101.101111, where the first different digit is underlined.


Bruno Loff

decides if z = x + y for the first n digits of z, x and y. This function computes the sum of the truncations of x and y to the nth fractionary digit and checks whether the resulting rational number coincides with z up to the nth digit, using the method shown above. If so, the function is valued 1, and 0 otherwise. Now we have that z = x + y if and only if ∀n z =_n x + y, which is ∆^1_0. The proof is similar for the remaining operations, except that the number of required significant digits varies. □

A single real number can code any finite tuple of real numbers by alternating the digits of the real numbers in the tuple (this is, incidentally, how the γ functions of Proposition 2 work). In this sense, we write y_{n,i} to stand for the ith real number in the n-ary tuple coded by y (again, y_{n,i} = γ_{n,i}(y)). For an m-ary tuple ȳ, we write ȳ_{n,i} to stand for the tuple ((y_1)_{n,i}, ..., (y_m)_{n,i}). Then it is not hard to see that if some n-ary predicate P is in ∆^1_n,^6 then the (n+1)-ary predicate P* given by P*(ȳ, n) ⇐⇒ ∀i ≤ n P(ȳ_{n,i}) is also in ∆^1_n.

Proposition 7. All real recursive functions belong to the analytical hierarchy, in the sense of Definition 7.

Proof. The result is proved by induction on the structure of REC(IR) presented in Proposition 1. Proposition 6 gives us the result for the atomic functions. Proposition 5 will suffice for the remaining operators. If the real recursive functions f and g are in Σ^1_n, then C[f, g] is in Σ^1_n, since:

    z̄ = C[f, g](x̄) ⇐⇒ ∃ȳ z̄ = f(ȳ) ∧ ȳ = g(x̄).

Let f be a real recursive n-ary function with n components in Σ^1_n. Then I[f] is in Σ^1_n, since z̄ = I[f](x̄, y) if and only if

    ∃w̄ ∃n [n = ⌊y⌋ ∧ w̄_{n,1} = f(x̄) ∧ (∀i ≤ n)[w̄_{n,i+1} = f(w̄_{n,i})] ∧ z̄ = w̄_{n,n}].

If f is a real recursive (n+1)-ary function in Σ^1_n, then ls[f] ∈ Π^1_{n+5} ⊆ Σ^1_{n+6}, since z̄ = lim sup_{y→∞} f(x̄, y) if and only if

    ∀δ > 0 ∃ε ∀ε* > ε ∃s > ε* ∀s* > ε* [f(x̄, s) > f(x̄, s*) ∧ |z̄ − f(x̄, s)| < δ].

We can do identically for lim inf. We also have that

    z̄ = lim_{y→∞} f(x̄, y) ⇐⇒ ∀δ > 0 ∃ε ∀ε* > ε [|z̄ − f(x̄, ε*)| < δ],

resulting in l[f] ∈ Π^1_{n+3} ⊆ Σ^1_{n+4}. If f_1, ..., f_n are in Σ^1_n, then v[f_1, ..., f_n] is trivially also in Σ^1_n. □
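The digit test x =_n y used in the proof of Proposition 6 can be sketched on explicit digit prefixes. The representation and names here are ours: each number is given as the list of its first n binary digits, and the test accepts exactly when the prefixes agree or differ only by the dyadic ambiguity 0111... = 1000... described in that proof:

```python
def eq_n(xd, yd):
    # xd, yd: the first n binary digits of x and y (lists of 0/1).
    # Decide x =_n y as in the proof of Proposition 6.
    if xd == yd:
        return True
    i = next(k for k in range(len(xd)) if xd[k] != yd[k])
    # lo holds the 0 at the first differing position, hi the 1.
    lo, hi = (xd, yd) if xd[i] == 0 else (yd, xd)
    # Accept iff the digits after the 0 are all 1s and those after the 1
    # are all 0s (two binary expansions of the same dyadic rational).
    return all(d == 1 for d in lo[i + 1:]) and all(d == 0 for d in hi[i + 1:])
```

On the example of footnote 5, the digit strings of 101.110000 and 101.101111 first differ at the fifth digit, and the test accepts.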

It was proven by Loff et al. [2007, Proposition 4.9] that defining a function f ∈ H_n with iteration instead of differential recursion requires n + 7 nested limits.

Corollary 1. For every natural number n we have H_n ⊆ Σ^1_{6n+43}.

Definition 8. The characteristic of a predicate P over IN^m × IR^n is the total function χ_P : IN^m × IR^n → {0, 1} such that χ_P(ā, x̄) = 1 if and only if P(ā, x̄) holds. We say that such a predicate P has a real recursive characteristic f if f is a real recursive function such that, for every ā ∈ IN^m, x̄ ∈ IR^n, χ_P(ā, x̄) = f(ā, x̄). We write P ∈ H_k if there is a real recursive characteristic of P in H_k.

^6 Remember that Σ^1_{n−1} ∪ Π^1_{n−1} ⊂ ∆^1_n, which can be proven by adding extra quantifiers to predicates in Σ^1_{n−1} ∪ Π^1_{n−1} [see Odifreddi, 1989, p. 381].


Proposition 8. All predicates in the analytical hierarchy have real recursive characteristics.

Proof. Mycka and Costa [2004, p. 855] show that all Π^1_1 predicates have real recursive characteristics in at most H_6, and so all predicates in ∆^1_0 ⊂ Π^1_1 have real recursive characteristics. We now show that if P is an (n+1)-ary predicate with a real recursive characteristic χ_P, then there are real recursive characteristics of the predicates given by ∀y P(x̄, y) and ∃y P(x̄, y). We have shown in Proposition 4 that if a function is real recursive, then so are its supremum and infimum over the positive or negative infinite interval. So we have that ∀y P(x̄, y) if and only if inf[χ_P](x̄) = 1, and that ∃y P(x̄, y) if and only if sup[χ_P](x̄) = 1. This way we conclude that all analytical predicates have real recursive characteristics. □

Proposition 9. If f is an n-ary vector function with m real components such that Γ_f, given by

    Γ_f(z̄, x̄) = 1 if z̄ = f(x̄), and 0 otherwise,

is in H_i, then f is in H_{max(i+2,3)}.

Proof. Remember the tuple coding functions from Proposition 2. We get an (n+1)-ary function

    Γ*_f(x̄, z) = z × Γ_f(γ_m^{−1}(z), x̄) = z if z = γ_m(f(x̄)), and 0 otherwise.

Consider

    Γ**_f(x̄) = sup_{z∈IR} Γ*_f(x̄, z) + inf_{z∈IR} Γ*_f(x̄, z);

should f(x̄) be defined, one has that Γ**_f(x̄) = γ_m(f(x̄)), and if f(x̄) is undefined, then we have Γ_f(γ_m^{−1}(Γ**_f(x̄)), x̄) = 0. So set

    f(x̄) = γ_m^{−1}(Γ**_f(x̄)) / Γ_f(γ_m^{−1}(Γ**_f(x̄)), x̄). □
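The bookkeeping in the proof of Proposition 9 can be illustrated numerically. In the toy sketch below (entirely ours), the supremum and infimum over IR are replaced by a maximum and minimum over a finite grid of candidates, and the partial function f(x) = x − 2 for x ≥ 0 stands in for an arbitrary f. The point is only to show how sup + inf recovers the single nonzero value of Γ*, and how the final division leaves the result undefined exactly where f is undefined:

```python
def gamma_f(z, x):
    # Characteristic of the graph of the toy partial function
    # f(x) = x - 2, defined only for x >= 0.
    return 1 if (x >= 0 and z == x - 2) else 0

def recover_f(x, candidates):
    # Gamma*(x, z) = z * Gamma_f(z, x): zero except at z = f(x).
    gamma_star = [z * gamma_f(z, x) for z in candidates]
    # sup + inf picks out f(x) whether it is positive or negative,
    # since all the other entries are 0.
    value = max(gamma_star) + min(gamma_star)
    # Dividing by Gamma_f leaves the result undefined (an exception
    # here) when f(x) is undefined.
    return value / gamma_f(value, x)

grid = [z / 2 for z in range(-20, 21)]   # finite stand-in for IR
```

For instance, recover_f(5, grid) returns 3.0 and recover_f(1, grid) returns -1.0, while recover_f(-3, grid) raises a ZeroDivisionError, mirroring the undefinedness of f(-3).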

Corollary 2. For every natural n we have ∆^1_n ⊆ H_{2n+8}.

Our first main result follows from Corollaries 1 and 2:

Theorem 1. REC(IR) is the class of functions given by an analytical predicate, i.e.,

    REC(IR) = {f | the predicate given by z̄ = f(x̄) is in ∆^1_ω}.

Now we prove a well-known theorem. As is now expected, results about real recursive functions imply their counterparts in the analytical hierarchy.

Proposition 10. The analytical hierarchy does not collapse, i.e., there is no number n such that ∆^1_ω ⊆ ∆^1_n.

Proof. If the analytical hierarchy collapsed to the level ∆^1_n for some n, then one could find a universal analytical predicate Ψ ∈ ∆^1_n, with a real recursive characteristic χ_Ψ ∈ H_{2n+6}. By Proposition 9 one concludes that the η-hierarchy collapses to level H_{2n+8}. It was shown in [Loff et al., 2007, Theorem 6.2] that the η-hierarchy does not collapse, and so neither does the analytical hierarchy. □

5 Analogue of Post's theorem

While there is no Post theorem relating the analytical hierarchy with natural recursive functions, we show below an analogue of Post's theorem that relates the analytical hierarchy and real recursive functions. We begin by defining relativised versions of REC(IR) and of the η-hierarchy. We will use infima and suprema instead of infinite limits because they are conceptually closer to quantification.

Definition 9. Let F be a set of vector functions over the real numbers. The class of real recursive vector functions relativised to F, REC(IR, F), is given by

    REC(IR, F) = [1^n, 1̄^n, 0^n, U^n_i, +, ×, /, F; C, I, sup, inf, v].

If P is a predicate over the reals, then the class of real recursive vector functions relativised to P is REC(IR, {χ_P}). This definition is justified by the following equivalence, which states that suprema and infima can be used to obtain infinite limits.

Proposition 11. REC(IR, F) = [1^n, 1̄^n, 0^n, U^n_i, +, ×, /, F; C, I, l, li, ls, v].

Proof. One trivially has that REC(IR, F) ⊆ [1^n, 1̄^n, 0^n, U^n_i, +, ×, /, F; C, I, l, li, ls, v], by Proposition 4. We must now show that REC(IR, F) is closed for l, ls, li. If f is an (n+1)-ary function in REC(IR, F), then the (n+1)-ary function h, given by

    h(x̄, y) = sup_{z>y} f(x̄, z) = sup_{z∈IR} f(x̄, y + z²),

is also in REC(IR, F). This is enough to show the closure, since

    lim sup_{y→∞} f(x̄, y) = inf_{y∈IR} sup_{z>y} f(x̄, z)

and one can obtain l and li from ls [see Mycka, 2003]. □

Now we define a relativised η-hierarchy, by counting the number of suprema and infima needed to define a function in REC(IR, F). We will distinguish between even and odd levels of the hierarchy. The even levels will be obtained by allowing one application of sup or inf, and the odd levels will be the closure of the even levels under the remaining operators.

Definition 10. The nth level of the η-hierarchy relativised to a set of real functions F, H^F_n, is inductively defined by

    H^F_0 = {1^n, 1̄^n, 0^n, U^n_i, +, ×, /} ∪ F
    H^F_{2n+1} = [H^F_{2n}; C, I, v]
    H^F_{2n+2} = {f, sup[f], inf[f] | f ∈ H^F_{2n+1}}.

Since y = χ_Q(x̄) ⇐⇒ (y = 1 ∧ Q(x̄)) ∨ (y = 0 ∧ ¬Q(x̄)), we get the following.

Proposition 12. Q ∈ ∆^1_n if and only if χ_Q ∈ ∆^1_n.

The new hierarchy gives the following analogue of Post's (1948) theorem:


Theorem 2. For every n > 1:
(1) If Q ∈ ∆^1_n and χ_P ∈ H^Q_2, then P ∈ ∆^1_{n+1}.
(2) If P ∈ ∆^1_{n+1}, then χ_P ∈ H^Q_2 ∩ H^R_2 for some Q ∈ Π^1_n and R ∈ Σ^1_n.

Proof. To prove (1), we show that every function in H^Q_2 is in ∆^1_{n+1}. By Propositions 6 and 12 we conclude that H^Q_0 ⊆ ∆^1_n ⊂ Σ^1_n. The proof shown for Proposition 7 is sufficient to show that functions obtained by composition, iteration or aggregation — the functions in H^Q_1 — are in Σ^1_n. Now suppose a function is given by inf[f] for some function f ∈ Σ^1_n. See that

    z = inf_{y∈IR} f(x̄, y) ⇐⇒ ∀y[z ≤ f(x̄, y)] ∧ ∀t > z ∃u[t > f(x̄, u)]
    ⇐⇒ ∀y∀t∃u∃v∃w[v > 0 ∧ w > 0 ∧ z + v = f(x̄, y) ∧ (t > z ⇒ t − w = f(x̄, u))]

gives a predicate in Π^1_{n+1}, and we can do similarly for sup. So H^Q_2 ⊆ Π^1_{n+1}. But if χ_P ∈ Π^1_{n+1} then both P and ¬P are in Π^1_{n+1}, and so we get P ∈ ∆^1_{n+1}. To prove (2), take P in ∆^1_{n+1}. This means that P(x̄) ⇐⇒ ∃y Q(x̄, y) ⇐⇒ ∀y R(x̄, y) for some Q ∈ Π^1_n, R ∈ Σ^1_n. So immediately we get χ_P = sup[χ_Q] ∈ H^Q_2 and χ_P = inf[χ_R] ∈ H^R_2. □

6 Concluding remarks

We have seen that the inductive closure of some very basic functions under the operations of solving differential equations and taking infinite limits gives us exactly the same expressive power as the analytical hierarchy. Effectively, this trivialises the proof that some given function is real recursive. For instance, χ_Q is real recursive simply because

    z = χ_Q(x) ⇐⇒ (z = 1 ∧ ∃a∃b ax = b) ∨ (z = 0 ∧ ¬∃a∃b ax = b)

gives an analytical predicate. Alas, the analogue of Post's theorem that we obtained is not as good as one would wish: an equivalence would be better. We cannot seem to settle the following question: if a predicate P is in ∆^1_{n+1}, can we find a predicate Q in ∆^1_n such that P ⇐⇒ ∀Q? This would provide the intended result.

7 Acknowledgements

I would like to thank José Félix Costa for being a dedicated teacher, mentor and friend, as well as an admirable, awe-inspiring and honest man.

References
1. Peter Clote. Handbook of Computability Theory, volume 140 of Studies in Logic and the Foundations of Mathematics, chapter 17 – Computation Models and Function Algebras, pages 589–681. Elsevier, 1999.
2. Ezzat Ramadan Hassan and Witold Rzymowski. Extremal solutions of a discontinuous scalar differential equation. Nonlinear Analysis, 37(8):997–1017, September 1997.
3. Ker-I Ko. Complexity Theory of Real Functions. Birkhäuser, 1991.
4. Bruno Loff, José Félix Costa, and Jerzy Mycka. Computability on reals, infinite limits and differential equations. Applied Mathematics and Computation, 2007. In press.
5. Cris Moore. Recursion theory on the reals and continuous-time computation. Theoretical Computer Science, 162(1):23–44, 1996.


6. Yiannis Moschovakis. Abstract first order computability I and II. Transactions of the American Mathematical Society, 138, 1969.
7. Yiannis Moschovakis. Descriptive Set Theory. North-Holland, 1980.
8. Jerzy Mycka. µ-recursion and infinite limits. Theoretical Computer Science, 302:123–133, June 2003.
9. Jerzy Mycka and José Félix Costa. Real recursive functions and their hierarchy. Journal of Complexity, 20(6):835–857, December 2004.
10. Piergiorgio Odifreddi. Classical Recursion Theory, volume 125 of Studies in Logic and the Foundations of Mathematics. North-Holland, 1989.
11. Rodrigo Pouso. On the Cauchy problem for first order discontinuous ordinary differential equations. Journal of Mathematical Analysis and Applications, 264(1):230–252, December 2001.

How can Natural Brains Help us Compute?

Carlos Lourenço^{1,2}

^1 Faculty of Sciences of the University of Lisbon - Informatics Department, Campo Grande, 1749-016 Lisboa - Portugal, [email protected], http://www.di.fc.ul.pt/˜csl
^2 Instituto de Telecomunicações - Security and Quantum Information Group, Av. Rovisco Pais, 1, 1049-001 Lisboa - Portugal

Abstract. A model of biologically inspired natural computing is reviewed. Recurrent neural networks are set up so as to take advantage of emergent spatiotemporal chaotic regimes. Seminal work explaining the emergence of complexity in initially homogeneous physical and biological systems can be attributed to Alan Turing himself. Dynamical complexity provides a variety of computational modes and rich input-output relations in a dynamical perturbation scheme. Our model is initially proposed as an ’operational’ device most suitable for the processing of spatially distributed input patterns varying in continuous time. Formalizations leading to hypercomputation can be envisaged.

1 Introduction

There is an obvious similarity between our title and that of Barry Cooper's recent article [1] concerning computing with natural paradigms, as well as the related subject of computing by Nature itself. But we specifically intend to pay homage to Alan Turing's efforts in understanding computation taking place in that special natural system which is the brain. Thus our focus here is on that particular system, which we take as inspiration for a computing paradigm. We would also like to bring out a third way by which Turing –albeit probably unintentionally– may have contributed to the field of computing as a whole. The first 'Turing way' needs little discussion: it is embodied in the Turing machine and is practically synonymous with classical computation. The second Turing way has been appreciated only recently and can be viewed as an early proposal of connectionist methods, in the form of so-called 'unorganized machines' [2] (see also the review of Turing's anticipation of neural networks in [3]). It must be said that this incursion of Turing into connectionism went largely unnoticed. In these first two ways, Turing sought analogues of human reasoning. In doing so, he actually tried to capture some aspects of the brain's workings. The aforementioned third way can only now begin to be appreciated in view of the growing interest in natural computing paradigms and methods. The scientific community, from both the natural and the computational sciences, regards an increasing number of 'naturally' occurring phenomena as qualifying as computation. These may provide additions to the collection of computational paradigms, where e.g. the neural networks referred to above have a more established status. The apparent lack of a formal definition of computation that might encompass all the alternative forms of 'unconventional' computing, as compared to the solid definitions of classical computability, has been a matter of criticism.
The formal difficulties can sometimes be compensated for by the promise, or actual demonstration, of practical applications in which some operational advantage is obtained over classical digital machines. Our approach in this paper will be of the latter, 'operational', kind. This does not mean we cannot briefly point, e.g., to the hypercomputational possibilities of the model(s) under discussion. 'Hypercomputation' is used here in the sense of computing non-Turing-computable functions, which is something we regard as theoretically possible with the models we propose.


Carlos Lourenço

Our work is within the context of dynamical systems, which does allow a number of reasonable formal definitions of continuous-time analog dynamical computation, be it exact computation or otherwise. Generalizations of classical concepts such as input data, memory, program, and output may be less than obvious, but are nonetheless possible. Notwithstanding, such formalization will not be attempted in the present publication. At this point we return to Turing's third contribution to computation, which at the time appeared to have no direct implication for computation itself. Turing's last groundbreaking contribution to science, shortly before his premature death, was his attempt at explaining morphogenesis in living systems. To be technically more precise, it was actually an attempt at explaining the presence of spatial patterns in living tissue, and the slow time-variation thereof. One seminal paper, "The chemical basis of morphogenesis" [4], was published during his lifetime. That article seems to have had a more profound influence among theoretical biologists (at least among those who could understand the math) and, lately, among chemists and physicists, than it has had on computer science. As it stands, it is not even considered relevant for computer science, apart from the need for computer-aided numerical simulations that it brought about. In a broad sense, computation might have been served by, say, Turing having achieved a proper explanation of morphogenesis in neural tissue. This would have, at least partially, contributed to the understanding of the physical substrate of 'computation' in living beings. However, that was not quite the case. A proper explanation of morphogenesis could indeed only be achieved with the advent of molecular biology and the discovery of DNA.
Yet, Turing did provide a powerful explanation of the emergence of patterns in an initially homogeneous spatially extended system, as he proposed the so-called reaction-diffusion mechanism. The latter was presented as an explanation of naturally occurring patterns based solely on the laws of physics, but could also serve as a recipe for the technological creation of such patterns if desired. Experimental verification of Turing's principles can be found e.g. in [5]. Reaction-diffusion may be briefly described as the reaction between an activator substance and an inhibitor, accompanied by the spatial diffusion of both substances at different rates. For appropriate values of the reaction and diffusion rates, the interplay between both mechanisms can give rise to spatial patterns, denoting an inhomogeneity in, say, the activator's concentration. Notably, the original equations describing the system contain no explicit term pointing to the spatial structures that do arise. For instance, emergent correlations and observed wavelengths are not explicit beforehand. Such quantities tend to be intrinsic, i.e., dependent upon the substances and associated intensive parameters, and not (at least in a first approximation) upon imposed geometry or boundary conditions. What Turing provided was one of the first rigorous explanations of emergence itself, in terms acceptable to the natural science community. The fact that emergent observable quantities are mostly intrinsic in reaction-diffusion systems provides a most elegant example of self-organization. In other systems, such as in hydrodynamics, emergent structures may depend upon externally imposed geometry and boundary conditions [6]. Turing was primarily interested in explaining essentially static patterns. However, he did consider concepts such as the state of the system (which is implicit in the mathematical description itself) and the evolution thereof, hence dynamics.
This use of dynamics would concern mainly the slow evolution from a homogeneous state to some 'final' pattern. Regularity was sought, be it along the spatial or the temporal dimension. As another example of this regularity, simple traveling waves were acknowledged as a possible solution of the dynamical equations. More recently, non-convergent solutions have also been considered, including the extreme case of spatiotemporal chaos, where the system may present different degrees of (ir)regularity along both the spatial and the temporal dimensions. The real world is nonlinear, and Turing made a seminal contribution to the description of emergence in this world. Originally, the reaction-diffusion system's dynamical evolution does not seem to have been proposed as a computational model or computational paradigm per se. However, in our research on computation, we acknowledge the influence of the explanatory trend initiated by Turing concerning the emergence of complex dynamics in the natural world. We are particularly interested in evaluating the relevance of complex spatiotemporal dynamics for the computations that living organisms might perform. On the other hand, we seek to propose actual computation paradigms inspired by those observations, which might therefore be classified as 'natural computing'. Our approach, chiefly operational at the start, falls most naturally into the category of practical computation with natural paradigms. However, as noted above, the road is open for a formalization of the proposed type of analog computation. Namely, exact analog computation can be contemplated at the formal level. For now, let us call our starting point a Baconian one [1], that is, observing Nature itself as a first step.

2 The Chaotic Brain — What use Could it Have?

Around twenty years ago, the discovery of putative chaotic electrical signals in the brain [7, 8] elicited a discussion of the possible functional role of chaos in cognition. Advantages such as flexibility, or the possibility of performing a nonlinear search for some data or concept, were highlighted on the basis of very general arguments. In our own work, chaos is taken as a fact, and the question is then what actual use it may have, if any, for living brains. Furthermore, we ask how 'natural computing' paradigms might be proposed as inspired by this observation of biology.

2.1 A Model

The computational model presented in [9] (see also references therein) has the double aim of explaining biological cognitive phenomena and proposing a possible computing device or paradigm. Models such as this one are continuous-time recurrent neural networks in which a range of complex spatiotemporal phenomena can be observed. Such complex behavior occurs due to nonlinear properties of the nodes and, especially, due to the system being spatially extended. The degree of complexity depends upon parameters such as the system's size and details of the connectivity. Spatiotemporal chaos is one of the possible regimes. Most interesting in view of computing are the parameter regions for which a certain temporal and spatial coherence is kept among nodes (or 'neurons'), that is, a form of low-dimensional spatiotemporal chaos. These have been called by some authors the 'edge of chaos' regions. The equations describing the essential aspects of the neurons' dynamics in [9] happen to be a discretization of the Ginzburg-Landau equation for oscillating reaction-diffusion systems. This normal-form approach abstracts away most of the details of chemical systems and is convenient for describing generic populations of (diffusively) coupled oscillating units. In our case, it was a first approach in trying to capture the essential dynamical features of neural populations. Assessment of the model's computational capabilities initially includes analytical investigation of its dynamical structure. This is complemented by numerical simulations (on a digital computer...) which are instrumental in obtaining practical results.
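A minimal sketch of such a lattice (with arbitrary parameters, not those used in [9]): a ring of diffusively coupled units obeying an Euler-discretized complex Ginzburg-Landau equation. With 1 + c1·c3 < 0 the homogeneous oscillation is Benjamin-Feir unstable, so small random perturbations grow into nontrivial spatiotemporal behavior while the nonlinearity keeps amplitudes bounded.

```python
import random

random.seed(1)
N, dt, steps = 32, 0.01, 5000
c1, c3 = 2.0, -2.0   # 1 + c1*c3 = -3 < 0: Benjamin-Feir unstable (illustrative values)

# Start from small random perturbations of the homogeneous zero state.
A = [complex(0.01 * random.gauss(0, 1), 0.01 * random.gauss(0, 1)) for _ in range(N)]

for _ in range(steps):
    # Discrete Laplacian on a ring (periodic boundary conditions).
    lap = [A[(j - 1) % N] - 2 * A[j] + A[(j + 1) % N] for j in range(N)]
    A = [A[j] + dt * (A[j]
                      - (1 + 1j * c3) * abs(A[j]) ** 2 * A[j]
                      + (1 + 1j * c1) * lap[j])
         for j in range(N)]

amps = [abs(a) for a in A]
print("mean |A| =", sum(amps) / N)
```

The linear term amplifies the seed noise, the cubic term saturates it near unit amplitude, and the complex diffusive coupling mediates the spatial interplay between the oscillating units.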

2.2 Computing with the Model

The idea which is reviewed and expanded in [9] consists in exploring the unstable periodic orbit (UPO) structure of chaos. In dynamical terms, chaos is a 'reservoir' containing a countably infinite number of UPOs. Such UPOs cannot be spontaneously observed. However, by using suitable control methods, they can be stabilized from within chaos in a flexible way and via perturbations of very small magnitude. One fruitful approach consists in viewing each of these UPOs as a computational mode (or 'program') which can be selectively stabilized according to the requirements of computational tasks. A dynamical systems viewpoint is adopted, in which the input data, the so-called program, and the output data are all real functions of continuous time. This does not exclude the particular case of discrete or symbolic data, or that of static input. An essential feature of the computing model is that a transient response (or possibly a permanent response, in the case of static input) is measured and interpreted as the result of applying some function to the input data. In dynamical terms, the input data constitute a time-dependent perturbation of the main system. Given this setting, the computed function is actually an operator. Through further processing stages, the device can also be made to compute a (scalar-valued) functional of the input data, or simply a discrete-valued functional of the same data. The computational task chosen for illustration in [9] is the processing of spatiotemporal visual input patterns. The intrinsic dynamics of the original system, viewed either as each of the UPOs or as the collection thereof, is itself spatiotemporal and therefore has certain spatiotemporal symmetries. The exploration of the interplay between these symmetries and those of the input patterns is a key aspect of the practical application of this computing paradigm. The reader is referred to [9] for more details. Let us also note that this type of computation could be viewed as an instance of what is nowadays called "reservoir computing", for which we may cite Echo State Networks [10] and Liquid State Machines [11] as application-oriented examples. The latter are probably philosophically closer to a 'black-box' model of computation.
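The stabilization of a UPO from within chaos by tiny parameter perturbations can be illustrated on the simplest chaotic system, the logistic map. The proportional feedback below is an OGY-style sketch with a hand-tuned gain and capture window; it is not the control scheme of [9].

```python
r = 3.9                      # chaotic regime of the logistic map x -> r*x*(1-x)
x_star = 1 - 1 / r           # unstable period-1 orbit (fixed point) embedded in chaos
gain, window = 10.0, 0.01    # hand-tuned proportional feedback on the parameter r

x, max_kick = 0.3, 0.0
for _ in range(5000):
    kick = 0.0
    if abs(x - x_star) < window:       # act only when the orbit visits the target
        kick = gain * (x - x_star)     # tiny perturbation of the parameter r
        max_kick = max(max_kick, abs(kick))
    x = (r + kick) * x * (1 - x)

print("final distance to UPO:", abs(x - x_star))
print("largest perturbation of r:", max_kick)
```

The perturbation of r stays below 0.1 (under 3% of its value), yet the unstable period-1 orbit becomes an attractor of the controlled dynamics: the chaotic 'reservoir' is steered into one of its embedded modes on demand.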

2.3 A More 'Neural' Model

The diffusive nature of the connectivity in [9], along with a general lack of biological detail, albeit theoretically justifiable, meets resistance among purist neurobiologists. To test our ideas in a more biologically realistic setting, and also with the purpose of exploring novel spatiotemporal chaotic regimes, we turned to the model in [12]. The latter incorporates neurophysiological features more closely. Although it is not central to our discussion, we note that this model could have a physical implementation, e.g. in the form of an electronic analog machine. When comparing the computational power of any such physical embodiment with that of the theoretical model which abstracts it, one would expect issues such as measurement and parameter precision to imply a lowering of the computational power of the physical version. The essential 'UPO reservoir' property is once again established in [12] for the chaotic attractor. Preliminary examples of rather simple computation with this model are presented in [13], along with the proposal of different versions of 'chaotic' computing via a dynamical perturbation scheme. The processing of more complex patterns is a possibility for subsequent exploration. Although more realistically neural, the present model can be compared in dynamical terms with the markedly 'reaction-diffusion' model of Section 2.1. In the present neurons, the role of activation is assumed by neural excitation, whereas the role of inhibition is assumed by neural inhibition. The existence of an interplay between neural excitation and inhibition is well known in biology. Here it turns out to be essential in the generation of complex spatiotemporal patterns, which are indeed a mixture of regular and irregular behavior at different time-scales. The careful balancing of excitation and inhibition, as well as an appropriate setup of network connections and delays in signal transmission, provide a range of possible behaviors to be explored in view of computation.
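As a toy illustration of how the interplay of fast excitation and slower inhibitory recovery sustains a rhythm, one can integrate the FitzHugh-Nagumo equations, a classic two-variable caricature of neural excitability (not the model of [12]); the parameter values are the textbook ones for the oscillatory regime.

```python
# FitzHugh-Nagumo: fast excitatory variable v, slow recovery variable w.
a, b, eps, I = 0.7, 0.8, 0.08, 0.5   # classic oscillatory regime
v, w, dt = -1.0, 1.0, 0.01

vs = []
for step in range(40000):            # integrate up to t = 400 (forward Euler)
    dv = v - v ** 3 / 3 - w + I      # fast excitation with cubic saturation
    dw = eps * (v + a - b * w)       # slow inhibitory recovery
    v, w = v + dt * dv, w + dt * dw
    if step >= 20000:                # discard the initial transient
        vs.append(v)

print("v range on the limit cycle:", min(vs), max(vs))
```

With these parameters the resting state is unstable and the trajectory settles on a relaxation limit cycle: excitation drives v up, the slow inhibitory variable catches up and pulls it back down, and the cycle repeats, the simplest instance of the excitation-inhibition balance invoked above.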

How can Natural Brains Help us Compute?

2.4 A Digression: Chemical Computers

Over the years, practical applications of reaction-diffusion principles have been proposed as 'chemical computers', namely featuring variants of the Belousov-Zhabotinskii reaction [14, 15]. In [14], elementary image processing is performed by perturbing chemical waves with light. In [15], logic gates are built out of chemical waves. These approaches differ from ours in that they feature a local type of processing, whereas we seek global dynamical responses to given input data. Also, information flow and the actual dynamical regimes that may be present in neural networks tend to be richer than in the simpler reaction-diffusion systems. Moreover, the 'gate design' approach of [15], for instance, is a re-implementation of standard digital circuitry, although in a novel substrate. Regarding the essence of computation, no new paradigm is actually proposed. A clarifying distinction can also be made between our use of a global dynamics and the local processing in certain models which can be related to chemical computers, such as the Excitable Lattice model [16]. In the latter, particle-like waves represent quanta of information. Binary collisions between particle-like waves are used as implementations of logic gates, thus in close agreement with the basic idea illustrated e.g. in [15]. Our model in [13] is not originally intended to directly implement logic gates. However, an actual implementation of the XOR function is provided as an arbitrary illustration of yet another possible usage of the device. Let us recall that the primary purpose of our model is the processing of more complex spatiotemporal patterns, where some obvious advantage might be obtained over classical digital processing. In our case, whichever function is computed (including the XOR and other Boolean functions), the processing is done globally by the neural population.
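The idea that a fixed, untrained dynamical 'reservoir' plus a simple readout can compute XOR globally can be sketched in reservoir-computing style. Everything below (the unit count, weight ranges, and exact-fit readout) is an illustrative choice, not the network of [13]: the inputs are nonlinearly expanded by random tanh units, and only a linear readout is fitted, here by solving the 4x4 Gram system exactly.

```python
import math, random

random.seed(0)
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0.0, 1.0, 1.0, 0.0]                      # XOR targets

# Fixed random 'reservoir': 30 tanh units, never trained.
M = 30
W = [(random.uniform(-2, 2), random.uniform(-2, 2), random.uniform(-2, 2))
     for _ in range(M)]
H = [[math.tanh(w0 * x0 + w1 * x1 + b) for (w0, w1, b) in W] for (x0, x1) in X]

# Minimum-norm linear readout via the 4x4 Gram system (H H^T) alpha = y.
G = [[sum(hi * hj for hi, hj in zip(H[i], H[j])) for j in range(4)] for i in range(4)]
A = [row[:] + [y[i]] for i, row in enumerate(G)]     # augmented matrix
for c in range(4):                                    # Gauss-Jordan with pivoting
    p = max(range(c, 4), key=lambda r: abs(A[r][c]))
    A[c], A[p] = A[p], A[c]
    for r in range(4):
        if r != c:
            f = A[r][c] / A[c][c]
            A[r] = [x1 - f * x2 for x1, x2 in zip(A[r], A[c])]
alpha = [A[i][4] / A[i][i] for i in range(4)]
w_out = [sum(alpha[i] * H[i][m] for i in range(4)) for m in range(M)]

preds = [sum(wm * hm for wm, hm in zip(w_out, row)) for row in H]
print([round(p) for p in preds])   # recovers XOR: [0, 1, 1, 0]
```

XOR is not linearly separable in its two inputs, but it becomes linearly readable from the global state of the random nonlinear expansion, mirroring the point that the whole population, not a local gate, does the work.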

3 Discussion

We propose a computational model which tries to capture the essentially dynamical, nonlinear way by which Nature itself 'computes' whatever it may be that it computes most of the time. Our model supports a basic digital processing mode if required (as is certainly the case with Nature for certain tasks). However, it preserves a fully analog computation power, to be subsequently explored. Dynamical regimes as complex as spatiotemporal chaos are not avoided as if they were a nuisance. Rather, they are explicitly taken advantage of. A setting within neural networks is adopted, although one markedly deviating from standard presentations of such networks. It is a valid endeavor to try to generalize concepts from classical computation into this new analog dynamical context. Even if a direct translation of concepts is not possible, questions such as the assessment of computational power remain very relevant, both in practical usage and in an exact analog computation setting. In retrospect, we also appreciate the seminal contribution of Alan Turing himself to the rigorous description of complexity in the natural world, eventually leading to our own work.

Acknowledgments. The author acknowledges the partial support of Fundação para a Ciência e a Tecnologia and EU FEDER via Instituto de Telecomunicações, the former Center for Logic and Computation and the project ConTComp (POCTI/MAT/45978/2002), and also via the project PDCT/MAT/57976/2004. Research performed under the scope of project RealNComp (PTDC/MAT/76287/2006).

References

1. Cooper, S.B.: How can Nature help us compute? In Wiedermann, J., Tel, G., Pokorný, J., Bieliková, M., Stuller, J. (eds.): SOFSEM 2006: Theory and Practice of Computer Science – 32nd Conference on Current Trends in Theory and Practice of Computer Science, Merin, Czech Republic, January 21–27, 2006. Springer Lecture Notes in Computer Science, Vol. 3831. Springer-Verlag, Berlin Heidelberg New York (2006) 1–13
2. Turing, A.M.: Intelligent machinery, National Physical Laboratory Report, 1948. In Meltzer, B., Michie, D. (eds.): Machine Intelligence, Vol. 5, Edinburgh University Press, Edinburgh (1969) 3–23. Reprinted in Ince, D.C. (ed.): Collected Works of A. M. Turing: Mechanical Intelligence, chapter Intelligent Machinery, North-Holland, Amsterdam London (1992)
3. Teuscher, C., Sanchez, E.: A revival of Turing's forgotten connectionist ideas: Exploring Unorganized Machines. In French, R.M., Sougné, J.J. (eds.): Connectionist Models of Learning, Development and Evolution – Proceedings of the Sixth Neural Computation and Psychology Workshop, NCPW6, Liège, Belgium, September 16–18, 2000. Springer-Verlag, London (2001) 153–162
4. Turing, A.M.: The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London, Series B – Biological Sciences 237 (1952) 37–72. Reprinted in Saunders, P.T. (ed.): Collected Works of A. M. Turing: Morphogenesis, North-Holland, Amsterdam London (1992)
5. Dulos, E., Boissonade, J., Perraud, J.J., Rudovics, B., De Kepper, P.: Chemical morphogenesis: Turing patterns in an experimental chemical system. Acta Biotheoretica 44 (1996) 249–261
6. Nicolis, G.: Introduction to Nonlinear Science. Cambridge University Press (1995)
7. Babloyantz, A., Salazar, J., Nicolis, C.: Evidence of chaotic dynamics of brain activity during the sleep cycle. Physics Letters A 111 (1985) 152–156
8. Rapp, P., Zimmerman, I., Albano, A., de Guzman, G., Greenbaun, N.: Dynamics of spontaneous neural activity in the simian motor cortex: The dimension of chaotic neurons. Physics Letters A 110 (1985) 335–338
9. Lourenço, C.: Attention-locked computation with chaotic neural nets. International Journal of Bifurcation and Chaos 14 (2004) 737–760
10. Jaeger, H.: The "echo state" approach to analyzing and training recurrent neural networks. GMD Report 148, German National Research Center for Information Technology (2001)
11. Maass, W., Natschläger, T., Markram, H.: Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Computation 14 (2002) 2531–2560
12. Lourenço, C.: Dynamical computation reservoir emerging within a biological model network. To appear in Neurocomputing, 2007, doi:10.1016/j.neucom.2006.11.008
13. Lourenço, C.: Structured reservoir computing with spatiotemporal chaotic attractors. To appear in Verleysen, M. (ed.): 15th European Symposium on Artificial Neural Networks (ESANN 2007), April 25–27, 2007
14. Kuhnert, L., Agladze, K.I., Krinsky, V.I.: Image processing using light-sensitive chemical waves. Nature 337 (1989) 244–247
15. Steinbock, O., Kettunen, P., Showalter, K.: Chemical wave logic gates. Journal of Physical Chemistry 100 (1996) 18970–18975
16. Adamatzky, A.: Universal dynamical computation in multidimensional excitable lattices. International Journal of Theoretical Physics 37 (1998) 3069–3108

A Purely Arithmetical, yet Empirically Falsifiable, Interpretation of Plotinus' Theory of Matter

Bruno Marchal

IRIDIA, Université Libre de Bruxelles
[email protected]

Abstract. The self-analysis abilities of formal theories or theorem provers are outstanding. We show that the study of self-observing "ideal" machines leads to natural arithmetical interpretations of the hypostases that Plotinus discovered by looking inward. Those corresponding to his "Matter Theory" are compared with the logic of current empirical physics.

1 Incompleteness and Mechanism

There is a vast literature where Gödel's first and second incompleteness theorems are used to argue that human beings are different from, if not superior to, any machine. The most famous attempts have been given by J. Lucas in the early sixties and by R. Penrose in two famous books [53, 54]. Such arguments are not well supported. See for example the recent book by T. Franzén [21]. There is also a less well known tradition where Gödel's theorems are used in favor of the mechanist thesis. Emil Post, in a remarkable anticipation written about ten years before Gödel published his incompleteness theorems, already discovered both the main "Gödelian motivation" against mechanism, and the main pitfall of such argumentations [17, 55]. Post is the first discoverer1 of Church Thesis, or Church Turing Thesis, and Post is the first one to prove the first incompleteness theorem from a statement equivalent to Church thesis, i.e. the existence of a universal—Post said "complete"—normal (production) system.2 In his anticipation, Post concluded at first that the mathematician's mind, or the logical process, is essentially creative. He adds: "It makes of the mathematician much more than a clever being who can do quickly what a machine could do ultimately. We see that a machine would never give a complete logic; for once the machine is made we could prove a theorem it does not prove" (Post's emphasis).

1 A case can be made that Babbage could have discovered it too, through his invention of a functional notation capable of describing his analytical engine. According to Lafitte (1932), the old Babbage was more proud of his language than of his analytical engine [32].
2 The informal but rigorous proof can be given in a footnote: let N be the set of natural numbers {0, 1, 2, 3, ...}. Let us say that a function from N to N is computable if we can describe in a finite way how to compute it, in a language with a checkable grammar. Church thesis asserts the existence of a universal language, that is, a language in which we can describe all, but not necessarily only, the computable functions from N to N. Suppose now the existence of a complete theory T about arithmetic or, more easily, about machines or codes. We will get a contradiction. Given the checkability of the grammar, we can enumerate all the codes C0, C1, C2, C3, ... accepting one input, in the universal language. If T is a complete theory, T would be able to decide, for each i, whether Ci is defined on each n or not. From this, by the use of the complete theory T, we can enumerate the total (always defined) computable functions f0, f1, f2, f3, ..., which, by Church thesis, are all among the functions defined by the codes C0, C1, C2, C3, .... But then the diagonal function g defined by g(n) = fn(n) + 1 is computable. Thus, there is a number k such that g = fk. But then g(k) = fk(k) and, by definition, g(k) = fk(k) + 1. Given that the fk are total functions, fk(k) is a well defined number, and we can subtract it on both sides, so that 0 = 1. So, either there is no universal language and thus no universal machine capable of understanding them—and Church thesis is false—or there is no complete theory for numbers or machines.
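The diagonal step of footnote 2 can be mimicked concretely: given any listing claimed to contain all total computable functions, g(n) = fn(n) + 1 differs from the n-th listed function at argument n. The finite listing below is of course only a toy stand-in for the enumeration extracted from the hypothetical complete theory T.

```python
# A toy 'enumeration' standing in for f0, f1, f2, ... produced by a
# purportedly complete theory T (hypothetical; any total functions work).
fs = [lambda n: 0,
      lambda n: n,
      lambda n: n * n,
      lambda n: n + 7]

def g(n):
    """The diagonal function g(n) = f_n(n) + 1."""
    return fs[n](n) + 1

# g differs from every listed function at its own index,
# so g cannot occur anywhere in the enumeration.
for k in range(len(fs)):
    assert g(k) != fs[k](k)

print([g(k) for k in range(len(fs))])   # -> [1, 2, 5, 11]
```

Since g is clearly computable whenever the enumeration is, the only escape routes are the two stated in the footnote: either there is no universal language, or no complete theory.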


But Post quickly realized that a machine could do the same deduction for its own mental acts, and admits that: "The conclusion that man is not a machine is invalid. All we can say is that man cannot construct a machine which can do all the thinking he can. To illustrate this point we may note that a kind of machine-man could be constructed who would prove a similar theorem for his mental acts." This has probably constituted his motivation for lifting the term "creative" to his set-theoretical formulation of mechanical universality [56]. To be sure, an application of Kleene's second recursion theorem (see [30]) can make any machine self-replicating, and Post should have said only that man cannot both construct a machine doing his thinking and prove that such a machine does so. This is what remains of a reconstruction of the Lucas-Penrose argument: if we are machines we cannot constructively specify which machine we are, nor, a fortiori, which computation supports us. Such analysis begins perhaps with Benacerraf [4] (see [41] for more details). In his book on the subject, Judson Webb argues that Church Thesis is a main ingredient of the Mechanist Thesis. Then, he argues that, given that incompleteness is an easy consequence of Church Thesis (one double diagonalization step; see above), Gödel's 1931 theorem, which proves incompleteness without appeal to Church Thesis, can be taken as a confirmation of it. Judson Webb concludes that Gödel's incompleteness theorem is a very lucky event for the mechanist philosopher [70, 71].
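The self-replication granted by Kleene's second recursion theorem is familiar to programmers as a quine, a program whose output is exactly its own source text; a standard two-line Python instance:

```python
s = 's = %r\nprint(s %% s)'
print(s % s)
```

Running it prints precisely its own two lines. By the recursion theorem the same self-reference can be wrapped around any further behavior, which is what makes Post's "machine-man proving a similar theorem for his mental acts" coherent.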
Torkel Franzén, who concentrates mainly on the negative (generally antimechanist) abuses of Gödel's theorems, notes, after describing some impressive self-analysis of a formal system like Peano Arithmetic (PA), that: "Inspired by this impressive ability of PA to understand itself, we conclude, in the spirit of the metaphorical "applications" of the incompleteness theorem, that if the human mind has anything like the powers of profound self-analysis of PA or ZF, we can expect to be able to understand ourselves perfectly". Now, there is nothing metaphorical in this conclusion if we make clear some assumption of classical (platonist) mechanism, for example under the (necessarily non-constructive) assumption that there is a substitution level at which we are Turing-emulable. We would not personally notice any digital functional substitution made at that level or below [38, 39, 41]. The second incompleteness theorem can then be conceived as an "exact law of psychology": no consistent machine can prove its own consistency from a description of herself made at some (relatively) correct substitution level—which exists by assumption (see also [50]). What is remarkable, of course, is that all machines having enough provability abilities can prove such psychological laws, and, as Franzén singles out, there is a case for being rather impressed by the profound self-analysis of machines like PA and ZF or any of their consistent recursively enumerable extensions.3 This leads us to the positive—open minded toward the mechanist hypothesis—use of incompleteness. Actually, the whole of recursion theory, mainly intensional recursion theory [59], can be seen in that way, and this is still more evident when we look at the numerous applications of recursion theory in theoretical artificial intelligence or in computational learning theory. I refer the reader to the introductory paper by Case and Smith [14], or to the book by Osherson and Martin [46].
In this short paper we will have to consider machines having both provability abilities and inductive inference abilities, though actually we will need only trivial such inference abilities. I call such machines "Löbian" for the prominent rôle of Löb's theorem, or formula, in our setting; see below. Now, probably due to the abundant abuses of Gödel's theorems in philosophy, physics and theology, negative feelings about any possible application of incompleteness in those fields could have developed. Here, on the contrary, it is our purpose to illustrate that the incompleteness theorems and some of their generalisations provide a rather natural purely arithmetical interpretation of Plotinus' Platonist, non-Aristotelian, "theology", including his "Matter Theory". As a theory bearing on matter, such a theory is obviously empirically falsifiable: it is enough to compare empirical physics with the arithmetical interpretation of Plotinus' theory of Matter. A divergence here would not refute Plotinus, of course, but only the present arithmetical interpretation.

3 I identify Peano Arithmetic, Zermelo Fraenkel set theory and other axiomatisable theories with their theorem provers. A theorem by Craig can justify this move; see Boolos and Jeffrey [7]. Thus I will say that p is proved by PA, instead of the usual "p is proved in PA".


This will illustrate the internal consistency and the external falsifiability of some theology. Here the term "theology" can be interpreted in some general, albeit not necessarily physicalist, sense of "theory of everything", or "truth theory", including what subjects can prove, or know, or guess about themselves and their possible neighborhoods, and what is true about them but which they cannot prove, yet may still guess, or not. This could hopefully help to eventually unify fundamental fields like some axiomatic theologies, theoretical physics, theoretical computer science and number theory.4 By incompleteness, a machine's (pure) theology could already be defined as true computer science minus the computer's computer science. This will be made more precise below by the use of Solovay's theorem [61].
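For reference, Löb's theorem and its link to the second incompleteness theorem can be stated compactly; these are standard facts of provability logic, phrased here independently of [61]. Writing $\Box p$ for "$p$ is provable by the machine (e.g. PA)", Löb's theorem says:

```latex
\vdash \Box(\Box p \to p) \to \Box p
```

Taking $p = \bot$ gives $\vdash \Box(\neg\Box\bot) \to \Box\bot$: a machine that proves its own consistency is inconsistent, which is Gödel's second incompleteness theorem. Solovay's arithmetical completeness theorem then states that the modal logic GL (the basic modal logic K plus the Löb formula as an axiom scheme) proves exactly the propositional schemata about provability that PA itself can verify, which is what makes the machine's self-referential discourse amenable to a complete modal description.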

2 Plotinus and Machine's Methodologies

Plotinus makes clear that his methodology belongs to those among the platonists who are both rationalist and mystic (looking inward). His basic methodology consists in self-analysis together with rational analysis and communication, in a way inspired mainly by Plato and Aristotle. Our own methodology will consist in studying what a "platonist" universal machine (the definition is given below), having enough deducibility and inferability abilities, can discover by self-analysis. It is hard to be fair to Plotinus, third century A.D., by summing up his work in a paragraph.5 At the same time, Plotinus is clear enough, and so different from the current widespread Aristotelian conception of reality, that this project will not suffer too much from the obvious simplification asked of it, for which I apologize in advance. It is hoped that the self-observing machine's discourses will appear to be close to those of neoplatonists like Plotinus or Proclus [63, 69]. Plotinus' view of "reality" can be given in terms of three main hypostases. According to MacKenna, those hypostases are generalized and abstract notions of Persons [34]. Each hypostasis can be considered as a view of "reality" or "truth" from a personal, although "divine" (i.e. true but non-effective; see below), point of view. The three main primary hypostases are: the One, the Divine Intellect or Intelligible Realm (Plato's Nous), and the All-Soul or Universal-Soul. And there are also what I will call, to be short, the two secondary "hypostases": the Intelligible Matter and the Sensible Matter. If we take into account the differences between the discursive, terrestrial, discourses and the true or divine possible discourses, this makes an a priori total of 5 × 2 = 10 "hypostases", in a more general sense than Plotinus' use of that word.6
Note that Plotinus considers that the One and Matter, associated with the secondary “hypostases”, are respectively above and below the realm of “existing things” or “authentically existing things”, which concerns mainly the divine intellect’s ideas (Plato’s Nous). The One can be considered as the ineffable, not necessarily effective, transcendental origin or source of everything. The word “origin” is closer to a mathematical or arithmetical origin than to a spatiotemporal cause7. As a person, the One can be considered as sustaining a degenerate zero-person point of view, comparable to Nagel’s point of view from nowhere [51]. The One is thus the ultimate fundamental “reality”, responsible for the existence of anything capable of existence. It contains implicitly the other primary hypostases, sometimes described as different phases of the One. The One is also called “Good”, in the sense that it will serve as a sort of universal attractor of the (terrestrial) souls. This implies a kind of two-way cosmogony. The One originally produces the divine intelligible, which produces the universal and discursive souls, which in turn produce, by contemplation, Nature and eventually

4. The number-theoretical aspect of computer science is beyond the scope of the present paper, but the basic bridge has been provided by the work of Davis, Robinson, Putnam and Matiyasevich, which singles out the existence of universal diophantine polynomials; see [47]. Note that Diophantus is a contemporary of Plotinus; both were taught by Hypatia in Alexandria one century later [18].
5. We have used the new, unabridged translation of Plotinus’ Enneads by MacKenna (Larson Publications, 1992), together with the classical translation by A. H. Armstrong, 1966 (Loeb Classical Library, Harvard University Press), and the older French translation by Émile Bréhier (Collection Les Belles Lettres, Paris, 1924).
6. Note that Plotinus strictly reserves the term “hypostasis” for the three primary, and divine, hypostases.
7. See [72] for a physicist’s argument that the origin of the physical laws can hardly be physical.

266

Bruno Marchal

Matter—most of the time identified with “Evil” by the Platonists; but, from the inside personal views, it looks the other way around. Somehow the terrestrial souls feel as if they were extracting themselves from Evil-Matter to tend toward the One. This is probably why Plotinus was an optimistic philosopher; Porphyry once called him the happy Plotinus. The second hypostasis, the “Intelligible” or “Divine Intellect”, is mainly Plato’s Nous, i.e. his world of intellectual or immaterial ideas. It is related to the logos, as either terrestrial or divine verbs or discourses. What is considered as “existing”, or “authentically existing”, are the ideas belonging to the divine intellect. Plotinus privileges some passages of Plato’s Parmenides in making the intelligible realm second to the One, and argues, against Aristotle, that the ultimate One cannot be a thinking subject. Plotinus argues indeed that thinking already needs some more primitive reality, by having to divide it into a thinking subject and an object of thinking. The third hypostasis, the “All-Soul”, appears as a way of combining both the One and the Intellect, and thus allowing them to participate in one principle. Somehow the All-Soul is a version of an intellect which better keeps its ground through a direct link with the ineffable One, and as such inherits part of its ineffability. The All-Soul is responsible for the existence of (subjective) time and for the creation of Nature and eventually Matter through a process of contemplation. This is the opposite of the Aristotelian metaphysics, where somehow mind and person arise or emerge from the organisation of some primitive or primary matter8. As a rationalist, Plotinus does not hide some difficulties entailed by his approach, notably concerning the rôle of the soul and its place relative to the secondary “hypostases”.
The material or secondary “hypostases” correspond to the “Two Matters” of the second Ennead9 (II, 4): the intelligible matter and the sensible matter, most of the time described as the matter “there”, meaning in the divine Nous, and the matter “here”, meaning that it has a sensible component for the terrestrial, intellectual, or discursive souls. Matter itself is then, following Aristotle, described by indeterminateness and/or privation. Plotinus departs from Aristotle by literally defining matter, exclusively and quasi-axiomatically, by this very notion of privation or indeterminateness. This makes matter prone to acquire or to represent distinctive and possibly alternate incidental (contingent) qualities, but in such a way that matter itself remains invariant and separated from any of those qualities. This makes Matter literally the opposite, or the negation, of the intelligible. Plotinus refers to Plato’s Timaeus for the need here of a “bastard” or “spurious” reasoning to operate on that theoretical Unintelligibility. I will illustrate that, thanks to their outstanding self-reference powers, the correct or honest Löbian machines cannot escape the discovery of an arithmetical version of each of those hypostases, mainly as intensional variants of provability10. Such intensional variants are made necessary by the incompleteness phenomena. Such variants will include the “material” hypostases. They will be described in a precise way through their transparent arithmetical interpretations, and they will justify some precise and empirically testable logics of observability, where the “spurious reasoning” should lead to an arithmetical measure of probability or credibility. In fact each arithmetical hypostasis will give rise to weak logics11 structuring differently, “from inside”, or from “personal points of view”, the arithmetical reality.
Now, physicists have been led in the last century to non-boolean logics of observability, known as quantum logics [15, 16, 26, 49, 3]. Such logics capture many counter-intuitive propositions for which we were not prepared by the observation of the “macro-world”, which provides an apparently canonical boolean phase space relating classical physics to classical logic. Like the whole of quantum physics, such logics are not easy to interpret, but our approach is mainly formal, so that we will avoid any premature interpretation problems. This is made possible by a result of R. Goldblatt (see [23]) showing that a minimal (propositional) quantum logic MQL can be translated into the classical modal logic known as B. Theorem 1: MQL proves A iff B proves tmql(A).

8. See [41]; see also [10].
9. See [13, 68] for a larger treatment.
10. A case can be made, through an analysis of his insightful treatise on number (Ennead VI, 6), that Plotinus could have welcomed such or a similar enterprise.
11. A propositional logic is weak when the set of its theorems is properly included in the set of the classical tautologies.

Plotinus’ Theory of Matter

267

B is generated by the closure of the set {K, □p → p, p → □◇p} under the modus ponens rule MP and the necessitation rule NEC (derive □p from a derivation of p). K is the “Kripke formula” □(p → q) → (□p → □q), and tmql is a translation, called quantization [57], from quantum propositional formulas to classical modal formulas: the quantization tmql(p) of an atomic formula p is given by the classical modal formula □◇p, where “◇p” is an abbreviation of ¬□¬p. The quantization of ¬A is given by the box □ applied to the negation of the quantization of A, and quantization commutes with conjunction. This result is similar to the presumption by Gödel [22] that the typical “introspective knower” modal logic S4, which follows from {K, □p → p (incorrigibility), □p → □□p (introspection)} by application of the modus ponens and necessitation rules, formalizes soundly and completely, in the classical frame, the Heyting–Brouwer (propositional) intuitionist logic INT. The result has been proved by McKinsey and Tarski [48]: Theorem 2: INT proves A iff S4 proves tint(A), and has been made still stronger by Grzegorczyk [25]: Theorem 3: INT proves A iff S4Grz proves tint(A), where S4Grz is the system S4 with the addition of the Grz formula □(□(p → □p) → p) → p. tint(A) is Gödel’s 1933 translation [22]: tint(¬p) = ¬□p, tint(p → q) = □p → □q, tint(p ∧ q) = p ∧ q, tint(p ∨ q) = □p ∨ □q. This is very appealing for the present approach, mainly due to a theorem by Solovay generalising Gödel’s incompleteness theorem by two startling completeness theorems: the modal logics G and G* completely formalise the provable provability logic and the true provability logic, respectively. Omitting the usual substitution rules, here is a formal presentation of G, built on classical propositional calculus. L denotes the formula of Löb12 [33]. Note the absence of the necessitation rule NEC in G*. G is given by:

AXIOMS:
  K:   □(A → B) → (□A → □B)
  4:   □A → □□A
  L:   □(□A → A) → □A

RULES:
  MP:  from A and A → B, derive B
  NEC: from A, derive □A

G* is given by:

AXIOMS:
  Any theorem of G
  T:   □A → A

RULES:
  MP:  from A and A → B, derive B

Let us define an arithmetical realisation R as a function which assigns to each propositional letter p, q, r, ... an arithmetical sentence. An arithmetical interpretation i of a modal formula is given recursively by a realisation R for the atomic letters, i.e. i(p) = R(p); i commutes recursively with the boolean connectives; and i(□p) = Bew(⌜i(p)⌝), where Bew is Gödel’s celebrated provability (beweisbar, in German) arithmetical predicate. Solovay’s first completeness theorem asserts that G proves A if and only if PA proves i(A) for any arithmetical interpretation, i.e. for any arithmetical realisation R of the atomic letters. The second completeness theorem asserts that G* proves A iff i(A) is true for any realisation R, in the usual number-theoretical sense, i.e. true in the so-called standard model of PA. G captures the sentences provable by the machine; and G* captures the true ones, including the non-provable ones. The propositions belonging to the set difference G* \ G are still inferable by the machine. Indeed, G* can be shown to be decidable. Solovay proved that G* proves a formula F if and only if G proves that the conjunction of the “reflection formulae” □g → g, where □g ranges over the boxed subformulas of F (boxed means having the shape □x), implies F; see [61]. Some of those provability/truth splittings are inherited by some of the intensional variants of provability. Indeed, with □p interpreted as Bew(⌜p⌝) for an arithmetical proposition p, the variants □p ∧ p, □p ∧ ◇p, and □p ∧ ◇p ∧ p are all truly equivalent to □p (as G* can prove), but none of them is, for all arithmetical p, provably so by the machine. This makes them obey different modal logics, having different weak-logic interpretations. Plotinus’ hypostases will be (re)defined arithmetically through the use of those intensional nuances. This can be translated into the language of some universal machine similar to Peano arithmetic, and Goldblatt’s theorem will give us a way to measure the degree of plausibility of this arithmetical version of Plotinus, by comparing the arithmetical interpretation of Plotinus’ matter hypostases with quantum logic. Note that we do not allow the presence of free variables in the scope of the provability predicate. We content ourselves here with the “propositional” provability logic, for which those completeness theorems hold. See [6] for proofs that the quantified versions of G and G* are as undecidable as they can possibly be.

12. See [44] for the importance of the Löb formula, which is a genuine generalisation of Gödel’s second incompleteness theorem, in a setting similar to that of the present paper.
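The syntactic ingredients just used (the quantization tmql of Theorem 1, Gödel's translation tint of Theorems 2 and 3, and the reduction of G*-provability to G-provability via reflection formulae) can be illustrated on a small formula AST. The following Python sketch is only an illustration; the tuple representation and the function names are mine, not the author's:

```python
# Formulas as nested tuples: ("atom", name), ("not", A), ("and", A, B),
# ("or", A, B), ("imp", A, B), ("box", A), ("dia", A).

def t_mql(f):
    """Goldblatt's quantization: atoms go to box-dia, negation to box-not,
    and the translation commutes with conjunction."""
    if f[0] == "atom":
        return ("box", ("dia", f))
    if f[0] == "not":
        return ("box", ("not", t_mql(f[1])))
    if f[0] == "and":
        return ("and", t_mql(f[1]), t_mql(f[2]))
    raise ValueError("quantization is defined on the {not, and} fragment")

def t_int(f):
    """Goedel's 1933 translation into S4, following the clauses quoted in
    the text (atomic letters are left unchanged)."""
    if f[0] == "atom":
        return f
    if f[0] == "not":
        return ("not", ("box", t_int(f[1])))
    if f[0] == "imp":
        return ("imp", ("box", t_int(f[1])), ("box", t_int(f[2])))
    if f[0] == "and":
        return ("and", t_int(f[1]), t_int(f[2]))
    if f[0] == "or":
        return ("or", ("box", t_int(f[1])), ("box", t_int(f[2])))
    raise ValueError("unknown connective")

def boxed_subformulas(f, acc=None):
    """Collect the subformulas of shape ("box", ...), without duplicates."""
    if acc is None:
        acc = []
    if f[0] == "box" and f not in acc:
        acc.append(f)
    for part in f[1:]:
        if isinstance(part, tuple):
            boxed_subformulas(part, acc)
    return acc

def g_star_query(f):
    """Solovay's reduction: G* proves f iff G proves the implication from
    the conjunction of the reflection formulas (box g -> g, for each boxed
    subformula box g of f) to f."""
    refl = [("imp", b, b[1]) for b in boxed_subformulas(f)]
    if not refl:
        return f
    conj = refl[0]
    for r in refl[1:]:
        conj = ("and", conj, r)
    return ("imp", conj, f)
```

For instance, for the reflection instance □p → p (which belongs to G* but not to G), g_star_query returns the tautology (□p → p) → (□p → p), which G proves; this is how its G*-provability is certified.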

3 Weak Computationalism

Our strategy consists in interviewing a platonist, “sufficiently introspective”, chatty universal machine. By saying that the machine is “platonist”, we mean that the machine asserts (or proves, believes, etc.) the principle of excluded middle, among the classical tautologies. The letters p, q, ... will always represent arithmetical propositions. Those correspond to the first-order logical formulas built from the usual symbols of formal arithmetic. By saying that the machine is universal, we mean that the machine is able to prove all true Σ1-propositions p, i.e. arithmetical propositions which are provably equivalent (by the machine) to a proposition of the shape ∃xP(x), where P is a primitive recursive arithmetical predicate. Put another way, it means that for any Σ1-proposition p the proposition p → □p is true for the machine. It can be shown that such a machine has the full power of a universal Turing machine. By saying that the machine is “sufficiently introspective” we mean that, for any Σ1-proposition p, p → □p is not only true for the machine but actually provable by the machine. Given that the Gödelian provability predicate, represented by the box □, is itself Σ1, the machine is able to prove □p → □□p for any p. By a “chatty” machine, we mean that the machine is so programmed that it dovetails on its proofs, and thus sooner or later asserts all its provable propositions. Being classical and sufficiently rich, such a machine is Löbian, and its provability predicate, definable in its own language, is correctly formalized by the logics G and G*. By “interviewing” some machine, we are implicitly assuming a very weak version of the computationalist hypothesis, or digital mechanism, in cognitive science. But a priori the machine itself is not supposed to assume the computationalist hypothesis. Still, during the interview itself, we will have to translate it explicitly into the language of the machine.
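Dovetailing, as used by the chatty machine above, is the standard trick for enumerating the union of infinitely many (possibly infinite) enumerations without getting stuck in any single one. A minimal Python sketch (the helper names and the toy streams are mine; a real chatty machine would dovetail on proof enumerations):

```python
from itertools import count, islice

def dovetail(streams):
    """Dovetail over an infinite sequence of (possibly infinite) iterators:
    at stage n a new stream is activated and every active stream advances
    one step, so each element of each stream appears sooner or later."""
    active = []
    source = iter(streams)
    for n in count(1):
        try:
            active.append(iter(next(source)))
        except StopIteration:
            pass  # no more streams to activate
        for it in list(active):
            try:
                yield next(it)
            except StopIteration:
                active.remove(it)  # this stream is exhausted

def multiples(k):
    # Toy stand-in for "the assertions of machine k": the multiples of k.
    return (k * i for i in count(1))

streams = (multiples(k) for k in count(1))
print(list(islice(dovetail(streams), 10)))  # -> [1, 2, 2, 3, 4, 3, 4, 6, 6, 4]
```

Every multiple of every k is eventually produced, even though no single stream is ever finished: this is exactly the sense in which the chatty machine asserts, soon or later, each of its infinitely many theorems.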
Some thought experiments can justify that the available verifiable sort of reality, for a universal machine, is determined by the true Σ1-sentences13. So we get a computationalist version of G and G* by adding the axiom p → □p to the logics G and G*. This gives the corresponding logics V and V*, which have been proved sound and complete for the (respectively provable and true) logic of provability and consistency of the Σ1-sentences by A. Visser [67]. Our interview is an infinite conversation, made finite through the use of Solovay’s and Visser’s theorems. To say that G (resp. V) proves □p → □□p means that the chatty machine asserts □p → □□p

13. Mainly the Universal Dovetailer Argument. The argument shows that the computationalist hypothesis entails self-duplicability, from which follows a notion of first-person indeterminacy. It is shown that whatever means are used, if any, to quantify that indeterminacy, the quantification remains invariant under some transformations, making physical predictions rely on a (relative) measure on true Σ1-propositions [38, 39, 41–45].


for any arithmetical (resp. Σ1 arithmetical) interpretation of the formula p. It means for example that the machine tells us □⌜1 + 1 = 1⌝ → □⌜□⌜1 + 1 = 1⌝⌝, and ⌜1 + 1 = 1⌝ can be substituted by any (false or true) arithmetical formula. For reasons of simplicity, we will confine ourselves to machines “talking arithmetic”; but it is easy to generalize the result to richer machines, like a theorem prover for ZF (Zermelo–Fraenkel set theory), or even to non-mechanical entities like axiomatized versions of second-order logic with the (infinitary) ω-rule. This follows from results easily accessible in the 1993 book by Boolos [6]. Actually we can associate, hopefully functorially14, a Plotinian theology to each Löbian machine, but we could confine ourselves to the “theology of a Peano Arithmetic machine”, with PA seen as a generic, typical, simple Löbian machine. All the sound recursively enumerable extensions of PA admit the same formal modal theology. To be sure, G and G* remain sound and even complete for much more general Löbian entities; see again [6] for more information.

4 The Arithmetical Interpretation of Plotinus’ Hypostases

Each hypostasis will be interpreted by a set of arithmetical sentences. Plotinus’ One is interpreted by Arithmetical Truth, i.e. the set of all true arithmetical sentences. If we were interviewing ZF, we would need the more complex set-theoretical truth. In any case, it follows from Tarski’s theorem that such a truth set is not definable by the machine on which that truth bears. Nevertheless, she can already, though indirectly, point to her truth set by some sequence of approximations, and there is indeed a sense in which Löbian machines are able to prove their own “Tarski theorem”, illustrating again the self-analysis power of those theorem-proving machines. See Smullyan’s book [60] for a sketch of that proof and references therein. In this sense we recover the ineffability of the “One”, and it is natural to consider arithmetical truth as the (non-physical) cause and ultimate reality of the arithmetical machine. This is even more appealing for a neoplatonist than for a mere platonist, given the return of the neoplatonists to the Pythagorean roots of platonism [52]. The atomic verifiable “physical” propositions will be modelled by the Σ1-sentences. Note that the machine can define the restricted, computationalist, notion of Σ1-truth. Plotinus’ discursive intellect is interpreted by the machine’s Gödelian provability predicate Bew itself. The corresponding set of arithmetical sentences is the set of modal sentences all of whose arithmetical interpretations are provable (by PA, or some Löbian machine M). This set is captured by the modal logic G, by the first half of Solovay’s theorem. By the second half of Solovay’s theorem, there is a notable second set to consider: the set of modal formulas whose arithmetical interpretations are true15, as singled out by the modal logic G*. This provides a natural arithmetical interpretation of the “divine intellect”. G plays the rôle of discursive reason, or of science about oneself as seen (or conjectured) as a finitely formal entity.
G* plays the rôle of the whole (propositional) truth about the machine, including what is true but unprovable by the machine. This corresponds to a notion of true inference, as G* is decidable, and thus trivially “correctly inferable”, although it needs an act of faith, as the machine can prove to herself16. It is remarkable that, in this setting, reason (G) is included in faith (G*), so that only bad faith can fear reason. This is coherent with the scientific attitude of the pagan neoplatonist, and thus rationalist, theologians. The corona set G* \ G can represent pure theology. This is a set, closed under the modus ponens rule, of propositions that are true but unbelievable, unprovable, unassertable by any self-referentially correct machine. Plotinus believed that the divine intellect has self-referentially complete knowledge, but we cannot follow him here: only the discursive machine, captured by G, has self-referential but incomplete knowledge. G* has complete knowledge, but not about itself, only about the machine it talks about. Actually, in the mechanist or arithmetical setting, Plotinus’ criticism of Aristotle’s attribution of self-thinking to the One can be repeated at the level of the divine intellect. This can perhaps be considered as a serious departure between the Löbian entities and Plotinus.

14. In that case we have to consider some interpretability logics [19]. Those logics constitute refinements of the provability logics. Algebraic approaches should help here [35].
15. In the usual sense of elementary school. Equivalently, p is true if p is satisfied by the “standard model of PA”.
16. A “rich” Löbian machine, like ZF, can prove the whole (propositional) theology of a less rich Löbian machine, like PA, but still has to make an act of faith to lift that theology onto itself.


It is important to realize that both G and G* talk about the machine in a third-person way. This corresponds to a situation where a computationalist practitioner reasons about himself after betting on some level of digital formal self-description, for example in terms of a giant rational complex matrix (represented in the arithmetical language, say), hopefully a reasonable approximation of his “brain” quantum state, whatever that brain is supposed to be. Plotinus’ All-Soul, at least in its discursive form, is captured by Plotinus, according to Bréhier, through the classical, traditional, Theaetetical way of defining knowledge as true justified opinion: to know p is to believe (prove) p with p true [11]. Such a stratagem is much debated [12], and it can be related to both the mechanist hypothesis and the use of dreams in metaphysics [36, 37, 2, 40, 41]. This arithmetical “All-Soul” cannot be represented directly in the (arithmetical) language of the machine. For example, Tarski’s theorem on the non-definability of truth forbids defining the knower by some expression like Bew(⌜p⌝) ∧ True(⌜p⌝), given that the truth predicate “True” bearing on the machine cannot be defined by the machine. More generally, it can be shown that no predicate of knowledge (obeying for example the S4 modal axioms) can be defined formally in any self-referentially correct way [28, 62]. But we can define, for each arithmetical p, the knowledge of p by the provability of p and, simply, p. In the formal mathematical setting this has been done independently by Boolos, Goldblatt, and Kuznetsov & Muravitsky in the USSR [31, 5, 24]. Eventually Artemov made this “definition” a thesis and defended that it is comparable, as a thesis, to Church’s thesis (which bears on computations), for the notion of informal proof [1]. It has indeed much in common with Brouwer’s “unformalizable” notion of the creative subject [8, 9, 65, 41], which plays some rôle in the foundations of intuitionist mathematics.
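The Theaetetical move just described (knowing p is proving p together with p being the case) is a purely syntactic rewriting at the modal level. A hedged Python sketch, on a small tuple AST (the representation and the function name are mine, for illustration only):

```python
def knower(f):
    """Rewrite every provability box into the Theaetetical knower:
    box A becomes (box A) and A, applied recursively, innermost first.
    Formulas are nested tuples such as ("atom", "p"), ("box", A),
    ("imp", A, B)."""
    if f[0] == "atom":
        return f
    if f[0] == "box":
        inner = knower(f[1])
        return ("and", ("box", inner), inner)
    return (f[0],) + tuple(knower(a) for a in f[1:])

p = ("atom", "p")
# The knower applied to "box p" yields "(box p) and p".
print(knower(("box", p)))
```

Note that the rewriting happens outside the machine's language: as the text explains, no such knowledge predicate can be defined inside the machine itself; the translation is applied formula by formula, by us.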
Now, PA, in particular, is a sufficiently simple machine that we know it is correct, and self-referentially correct when talking about itself. So, is it not obvious that □p → p is always (for all p) true? Yes, it is, but PA can neither prove nor know that. G* does indeed prove that □p is equivalent to □p ∧ p, for any arithmetical realisation of p, but G does not prove it. For many arithmetical sentences, such an equivalence belongs to the corona of true but unbelievable, and thus unknowable, but still inferable, truths. It is also obvious that (□p ∧ p) → p, so we see that □p ∧ p defines an “incorrigible” first-person notion of knowledge, and it has been proved that the logic S4Grz, see above, is sound and complete for both the provable and the true points of view (see [6]). G and G* have exactly the same discourse about the first person, as if the knower, attached in this way to PA, confused truth and provability, in a manner similar to an intuitionist philosopher [6]. This can be made precise: by using Gödel’s translation (see above) we do indeed find an arithmetical interpretation of intuitionist logic. Semantical considerations about the Grz formula can be used to argue that this knower logic is related to a subjective (antisymmetric, possibly bifurcating and fusing) temporal logic, quite natural in view of Plotinus’s idea that the “Soul” generates time. An intensional variant of the provability logic like S4Grz also provides a tool for reconstructing diverse uses of Gödel’s theorem in the philosophy of mind. See [41] for an analysis of Lucas’s “errors” and Benacerraf’s reconstruction, through intensional variants of G and G*. Many confusions in this field can be recast in terms of a confusion between the third-person point of view (treated by G and/or G*) and first-person, singular or plural, points of view, treated through the S4Grz variants or those presented below.
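The semantical considerations alluded to above can be experimented with on finite Kripke models. The following minimal evaluator is only an illustration (the frame, the valuations, and the helper names are mine): the strict linear order below is transitive and without infinite ascending chains, the kind of frame for which G is complete, and it validates the Löb axiom L while refuting the reflection axiom □p → p, which is exactly the G versus G* gap discussed in the text.

```python
def holds(f, w, R, V):
    """Truth of formula f at world w in the Kripke model (R, V).
    R is a set of (u, v) pairs; V maps atom names to sets of worlds."""
    tag = f[0]
    if tag == "atom":
        return w in V.get(f[1], set())
    if tag == "not":
        return not holds(f[1], w, R, V)
    if tag == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if tag == "and":
        return holds(f[1], w, R, V) and holds(f[2], w, R, V)
    if tag == "box":
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    if tag == "dia":
        return any(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(tag)

# The strict linear order 0 < 1 < 2.
R = {(0, 1), (0, 2), (1, 2)}
p = ("atom", "p")
lob = ("imp", ("box", ("imp", ("box", p), p)), ("box", p))
refl = ("imp", ("box", p), p)

# The Lob axiom holds at every world, for several sample valuations.
for V in ({"p": set()}, {"p": {2}}, {"p": {0, 1, 2}}):
    assert all(holds(lob, w, R, V) for w in (0, 1, 2))
# Reflection fails: at the last world, box p holds vacuously while p is false.
print(holds(refl, 2, R, {"p": set()}))  # -> False
```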

5 Arithmetical Quanta and Qualia

Plotinus’ theory of matter is mainly a recasting of Aristotle’s theory in the platonist framework. Plotinus defined matter as the receptacle of the contingent and the possible, making it essentially indeterminate. His platonist constraints force him to distinguish matter “there”, which appears to be definable and intelligible, from matter “here”, which has a sensible counterpart somehow related to the (discursive or not) soul(s). Now, by incompleteness, any possibility is, from the machine’s point of view, already something indeterminate. The logic G, for example, is closed under necessitation, and the logic G* \ G is closed under possibilitation, i.e. if A is provable in G*, then automatically the arithmetical possibility of A, i.e. the consistency of A, ◇A, i.e. ¬□¬A, is provable in G* too, and is never provable in G. No formula of the shape ◇# is ever provable by G. Thus, a natural way to address the logic of certainty in this indeterminate frame will consist in defining a new intensional variant □p ∧ ◇p. This can be shown equivalent to □p ∧ ◇t, where t represents some classical tautology (or arithmetical “tautology”, like “0 = 0”). Actually this can be generalized by using ◇◇t instead of ◇t, or even transfinitely by ◇αt, with α denoting a constructive ordinal, so that the material hypostases are infinite in number, but


for the sake of simplicity I will treat only the simple case of ◇t. The use of ◇αt can be related to the autonomous progressions; see [20] for an introduction. Motivations which are not based on Plotinus, but on thought experiments in the mechanist philosophy of mind, can provide supplementary reasons to quantify the “material indetermination” by such new arithmetical connectors. The new box, defined by □p ∧ ◇t, gives rise to a logic called Z [41, 42]. Unlike S4Grz, and due to the presence of the true but unprovable consistency ◇t, the logic Z, like the logic G, splits into a provable part and an unprovable-but-true part, naturally named Z and Z*. The candidates for the arithmetical interpretation of the “intelligible matter” hypostases are the sets of arithmetical interpretations of the logics Z (discursive, terrestrial) and Z* (true, divine). To get the sensible-matter hypostases, which are more “soul-like”, we have to reapply the Theaetetical idea, and define once more a new box, by (□A ∧ A) ∧ ◇t. This gives again a pair of splitting logics, X and X*. To be sure, the status of those logics is still rather mysterious, but we are not done yet: we must recall that we are interviewing not only a platonist, universal, sufficiently introspective machine, but also a computationalist one, whose possible “mind states” have to belong to recursively enumerable sets of states (see Section 3 above; see [44] for further motivations). This means we have to consider the intensional variants no longer of G (and G*) but of V, or G1, that is, G + “p → □p” (see above). The corresponding logics will be denoted by the same names + “1”, as a reminder that they are intensional variants of G1 = V; the “1” recalls the computationalist Σ1 restriction. So we get S4Grz1, which again is a non-splitting logic, that is (cryptically) S4Grz1 = S4Grz1*, and the split logics17 Z1, Z1*, X1, X1*.
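The two “material” boxes just introduced are, again, purely syntactic rewritings of the provability box. A hedged Python sketch on a small tuple AST (the representation, the function names, and the choice of a fixed atom for t are mine, for illustration only):

```python
T = ("atom", "t")  # stands for a fixed arithmetical tautology such as 0 = 0

def z_box(a):
    # Z variant: box a and dia t (provability plus consistency).
    return ("and", ("box", a), ("dia", T))

def x_box(a):
    # X variant: (box a and a) and dia t (the knower plus consistency).
    return ("and", ("and", ("box", a), a), ("dia", T))

def rewrite(f, box):
    """Replace every box of f by the given intensional variant,
    innermost occurrences first.  Formulas are nested tuples such as
    ("atom", "p"), ("box", A), ("imp", A, B)."""
    if f[0] == "atom":
        return f
    if f[0] == "box":
        return box(rewrite(f[1], box))
    return (f[0],) + tuple(rewrite(a, box) for a in f[1:])
```

For instance, rewrite(("box", ("atom", "p")), z_box) yields ("and", ("box", ("atom", "p")), ("dia", ("atom", "t"))), the Z reading of □p; passing x_box instead produces the sensible-matter reading.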
The question we must address now is how close those logics are to a quantum logic of observation, as the empirical world seems to dictate. Let us define the modal logical system B−. It is the same system as the system B described above, except that the substitution rule is weakened, so that in the formula p → □◇p the letter p can be substituted only by sentence letters; also, B− is not closed under the necessitation rule. We have the following theorem [41, 42]: Theorem 4: B− provides a sound logical system for arithmetic (or any Löbian entity), its theorems being provable in the three logics S4Grz1, Z1*, and X1*. This can be shown to be enough for defining three different arithmetical interpretations of some quantum logic. This is done in the Goldblatt way, where the quantization of an atomic formula p is given by □◇p (see above; see also [57]). This, optimistically perhaps, could reflect the physicists’ doubts about which quantum logic would be correctly operating in nature [64, 58]. We have not yet been able to verify typical “quantum questions”, like the question of (ortho)modularity, the violation of Bell’s inequality, etc. Some more technical considerations provide hints that a quantum “not” can be interpreted in one of those arithmetical “quantum logics” in a manner similar to Rawling and Selesnick [57]. Although all those propositional logics are decidable, translating everything into G makes most such questions still intractable today. As far as we know, S4Grz1, Z1* and X1* provide the closest possible arithmetical interpretations of possible quantum logics18. Much work remains in order to dig into the significance of such an, embryonic for sure, arithmetical and physical, in some neoplatonist sense, reality. The main interest of such an arithmetical “introspective” approach lies in the fact that the gaps Z1* \ Z1 and X1* \ X1 provide a kind of quantum logic for the non-communicable beliefs, i.e.
the non-provable true statements corresponding to what is “physically” (in a sense close to Plotinus) self-inferable and measurable, like intensities and qualities, providing good candidates for a machine notion of qualia. If this happens to be correct, the quanta would appear to be definable as sharable (machine-communicable)

17. The logics Z, Z*, Z1, and Z1* have recently been axiomatized [66]. Unfortunately the techniques used cannot be lifted to the X logics.
18. We currently hope to get, from this arithmetical quantization, a sufficiently well-behaved arithmetical projection operator, providing some relevant Temperley–Lieb algebra [29], so as to justify the exploitability of a universal quantum computer in the vicinity of every Löbian machine, relative to its most probable computational histories, as seen from some first-person-plural point of view, i.e. through some arithmetical version of the Intelligible Matter hypostasis. To be sure, the quantum entanglement phenomenon remains hard to capture arithmetically, for the same reason that traditional quantum logic cannot represent the needed tensor product; but see [27] for some possible nuances on that question.


qualia. It should give a path from bits to qubits, with the advantage of justifying both the communicable and the uncommunicable assertions in the realm of what is observable. This could help prevent mechanism from its common person-elimination interpretation. An absence of justification of some universal quantum machine from the Löbian self-observing machine, or a mathematical proof that there is none, or any empirical difference between this arithmetical physics and the empirical physics, would refute, not Plotinus, but the present arithmetical interpretation.

References

1. S. Artemov. Kolmogorov’s logic of problems and a provability interpretation of intuitionistic logic. In R. Parikh, editor, Proceedings of the Third Conference on Theoretical Aspects of Reasoning about Knowledge (TARK 90). Morgan Kaufmann Publishers, 1990.
2. E. Barnes. The causal history of computational activity: Maudlin and Olympia. The Journal of Philosophy, pages 304–316, 1991.
3. J. L. Bell. A new approach to quantum logic. Brit. J. Phil. Sci., 37:83–99, 1986.
4. P. Benacerraf. God, the Devil, and Gödel. The Monist, 51:9–32, 1967.
5. G. Boolos. Provability, Truth, and Modal Logic. Journal of Philosophical Logic, 9:1–7, 1980.
6. G. Boolos. The Logic of Provability. Cambridge University Press, Cambridge, 1993.
7. G. S. Boolos and R. C. Jeffrey. Computability and Logic. Cambridge University Press, Cambridge, 1989. Third edition.
8. L. E. J. Brouwer. Leven, Kunst en Mystiek. Waltman, Delft, Holland, 1905. (La Vie, l’Art et le Mysticisme.)
9. L. E. J. Brouwer. Consciousness, philosophy and mathematics. In P. Benacerraf and H. Putnam, editors, Philosophy of Mathematics, pages 90–96. Cambridge University Press, Cambridge, second edition, 1983. First edition: Prentice-Hall, 1964.
10. M. Burnyeat. Idealism in Greek Philosophy: What Descartes Saw and Berkeley Missed. Philosophical Review, 3:3–40, 1982.
11. M. Burnyeat. The Theaetetus of Plato. Hackett Publishing Company, Indianapolis, Cambridge, 1990. Translation by M. J. Levett.
12. M. Burnyeat. Socrate et le jury: de quelques aspects paradoxaux de la distinction platonicienne entre connaissance et opinion vraie. In M. Canto-Sperber, editor, Les paradoxes de la connaissance, essais sur le Ménon de Platon, pages 237–251. Éditions Odile Jacob, Paris, 1991.
13. W. J. Carroll. Plotinus on the Origin of Matter, pages 179–207. Volume 8 of Wagner [68], 2002.
14. J. Case and C. Smith. Comparison of Identification Criteria for Machine Inductive Inference. Theoretical Computer Science, 25:193–220, 1983.
15. M. L. Dalla Chiara. Quantum Logic and Physical Modalities. Journal of Philosophical Logic, 6:391–404, 1977.
16. M. L. Dalla Chiara. Quantum logic. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, Vol. III, pages 427–469. D. Reidel Publishing Company, Dordrecht, 1986.
17. M. Davis, editor. The Undecidable. Raven Press, Hewlett, New York, 1965.
18. M. Dzielska. Hypatia of Alexandria. Revealing Antiquity 8. Harvard University Press, Cambridge, Massachusetts, 1995. Translated by F. Lyra.
19. S. Feferman. Arithmetization of Metamathematics in a General Setting. Fundamenta Mathematicae, XLIX:35–92, 1960.
20. T. Franzén. Inexhaustibility: a Non-Exhaustive Treatment. Association for Symbolic Logic, Massachusetts, 2004.
21. T. Franzén. Gödel’s Theorem: an Incomplete Guide to its Use and Abuse. Cambridge University Press, Massachusetts, 2005.
22. K. Gödel. Eine Interpretation des intuitionistischen Aussagenkalküls. Ergebnisse eines Mathematischen Kolloquiums, 4:39–40, 1933.
23. R. I. Goldblatt. Semantic Analysis of Orthologic. Journal of Philosophical Logic, 3:19–35, 1974. Also in Goldblatt 1993, pages 81–97.
24. R. I. Goldblatt. Arithmetical Necessity, Provability and Intuitionistic Logic. Theoria, 44:38–46, 1978. Also in Goldblatt 1993, pages 105–112.
25. A. Grzegorczyk. Some relational systems and the associated topological spaces. Fundamenta Mathematicae, LX:223–231, 1967.
26. G. M. Hardegree. The Conditional in Quantum Logic. In P. Suppes, editor, Logic and Probability in Quantum Mechanics, volume 78 of Synthese Library, pages 55–72. D. Reidel Publishing Company, Dordrecht, Holland, 1976.
27. C. J. Isham. Quantum logic and the histories approach to quantum theory. J. Math. Phys., 35(5):2157–2185, May 1994.
28. D. Kaplan and R. Montague. A Paradox Regained. Notre Dame Journal of Formal Logic, 1:79–90, 1960.
29. L. H. Kauffman. Knots and Physics. World Scientific, Singapore, 1991. Second edition 1993.
30. S. C. Kleene. Introduction to Metamathematics. North-Holland, Amsterdam, 1952.
31. A. V. Kuznetsov and A. Y. Muravitsky. Magari algebras. Fourteenth All-Union Algebra Conference, Abstracts part 2: Rings, Algebraic Structures, pages 105–106, 1977. In Russian.
32. J. Lafitte. Réflexions sur la Science des Machines. Vrin, Paris, 1932.
33. M. H. Löb. Solution of a problem of Leon Henkin. Journal of Symbolic Logic, 20:115–118, 1955.
34. S. MacKenna. Extracts from the Explanatory Matter in the First Edition, in Plotinus: The Enneads, pages xxvi–xli. Penguin Books, London, 1991.
35. R. Magari. Representation and duality theory for diagonalizable algebras. Studia Logica, XXXIV(4):305–313, 1975.
36. N. Malcolm. Dreaming. Routledge & Kegan Paul, London, 1959.
37. N. Malcolm. The conceivability of mechanism. Philosophical Review, 77:45–72, 1968.
38. B. Marchal. Informatique théorique et philosophie de l’esprit. In Actes du 3ème colloque international de l’ARC, pages 193–227, Toulouse, 1988.
39. B. Marchal. Mechanism and personal identity. In M. De Glas and D. Gabbay, editors, Proceedings of WOCFAI 91, pages 335–345, Paris, 1991. Angkor.
40. B. Marchal. Amoeba, planaria, and dreaming machines. In P. Bourgine and F. J. Varela, editors, Artificial Life: towards a practice of autonomous systems, ECAL 91, pages 429–440. MIT Press, 1992.
41. B. Marchal. Conscience et Mécanisme. Technical Report TR/IRIDIA/94, ULB, 1994.
42. B. Marchal. Calculabilité, Physique et Cognition. PhD thesis, Université de Lille, Département d’informatique, Lille, France, 1998.
43. B. Marchal. Computation, Consciousness and the Quantum. Teorie & Modelli, 6(1):29–43, 2001.
44. B. Marchal. The Origin of Physical Laws and Sensations. In 4th International System Administration and Network Engineering Conference, SANE 2004, Amsterdam, 2004.
45. B. Marchal. Theoretical Computer Science and the Natural Sciences. Physics of Life Reviews, 2–4:251–289, 2005.
46. E. Martin and D. N. Osherson. Elements of Scientific Inquiry. The MIT Press, Cambridge, 1998.
47. Y. V. Matiyasevich. Hilbert’s Tenth Problem. The MIT Press, Cambridge, 1993.
48. J. C. C. McKinsey and A. Tarski. Some theorems about the sentential calculi of Lewis and Heyting. Journal of Symbolic Logic, 13:1–15, 1948.
49. P. Mittelstaedt. Quantum Logic. D. Reidel Publishing Company, Dordrecht, Holland, 1978.
50. J. Myhill. Some philosophical implications of mathematical logic. The Review of Metaphysics, VI(2), 1952.
51. T. Nagel. The View from Nowhere. Oxford University Press, Oxford, 1986.
52. D. J. O’Meara. Pythagoras Revived. Clarendon Press, Oxford, 1989.
53. R. Penrose. The Emperor’s New Mind. Oxford University Press, Oxford, 1989.
54. R. Penrose. Shadows of the Mind. Oxford University Press, Oxford, 1994.
55. E. Post. Absolutely unsolvable problems and relatively undecidable propositions: Account of an anticipation. In Davis [17], pages 338–433. 1922.
56. E. Post. Recursively enumerable sets. In Davis [17], pages 338–433. 1944.
57. J.

Plotinus’ Theory of Matter

273

39. B. Marchal. Mechanism and personal identity. In M. De Glas and D. Gabbay, editors, Proceedings of WOCFAI 91, pages 335–345, Paris, 1991. Angkor. 40. B. Marchal. Amoeba, planaria, and dreaming machines. In P. Bourgine and F. J. Varela, editors, Artificial Life, towards a practice of autonomous systems, ECAL 91, pages 429–440. MIT Press, 1992. 41. B. Marchal. Conscience et Mécanisme. Technical Report TR/IRIDIA/94, ULB, 1994. 42. B. Marchal. Calculabilité, Physique et Cognition. PhD thesis, Université de Lille, Département d’informatique, Lille, France, 1998. 43. B. Marchal. Computation, Consciousness and the Quantum. Teorie & Modelli, 6(1):29–43, 2001. 44. B. Marchal. The Origin of Physical Laws and Sensations. In 4th International System Administration and Network Engineering Conference, SANE 2004, Amsterdam, 2004. 45. B. Marchal. Theoretical Computer Science and the Natural Sciences. Physics of Life Reviews, 2–4:251–289, 2005. 46. E. Martin and Osherson D. N. Elements of Scientific Inquiry. The MIT Press, Cambridge, 1998. 47. Y. V. Matiyasevich. Hilbert Tenth Problem. The MIT Press, Cambridge, 1993. 48. J. C. McKinsey and A. Tarski. Some theorems about the sentential calculi of Lewis and Heyting. Journal of Symbolic Logic, 13:1–15, 1948. 49. P. Mittelstaedt. Quantum Logic. D. Reidel Publishing Company, Dordrecht, Holland, 1978. 50. J. Myhill. Some philosophical implications of mathematical logic. The review of Metaphysics, VI(2), 1952. 51. T. Nagel. The View from Nowhere. Oxford University Press, Oxford, 1986. 52. D.J. O’Meara. Pythagoras Revived. Clarendon Press, Oxford, 1989. 53. R. Penrose. The Emperor’s New Mind. Oxford University Press, Oxford, 1989. 54. R. Penrose. Shadows of the Mind. Oxford University Press, Oxford, 1994. 55. E. Post. Absolutely unsolvable problems and relatively undecidable propositions : Account of an anticipation. In Davis [17], pages 338–433. 1922. 56. E. Post. Recursively enumerable set. In Davis [17], pages 338–433. 1944. 57. J. 
P. Rawling and S. A. Selesnick. Orthologic and Quantum Logic: Models and Computational Elements. Journal of the ACM, 47(4):721–751, 2000. 58. M. Rédei. Quantum Logic in Algebraic Approach. Kluwer Academic Publishers, Dordrecht, 1998. 59. C. H. Smith. Applications of classical recursion theory to computer science. In F. R. Drake and S. S. Wainer, editors, Recursion Theory: its generalisation and applications, pages 236–247. Cambridge University Press, 1980. 60. R. Smullyan. Gödel Incompleteness Theorem. Oxford University Press, New York, Oxford, 1992. 61. R. M. Solovay. Provability Interpretation of Modal Logic. Israel Journal of Mathematics, 25:287–304, 1976. 62. R. H. Thomason. A note on syntactical treatment of modality. Synthese, 44:391–395, 1980. 63. J. Trouillard. L’Un et l’âme selon Proclos. Les Belles Lettres, Paris, 1972. 64. B. C. van Fraassen. The labyrinth of quantum logic. In R. Cohen and M. Wartosky, editors, Boston Studies of Philosophy of Sciences, volume 13, pages 224–254. Reidel, Dordrecht, 1974. 65. W. P. van Stigt. Brouwer’s Intuitionism, volume 2 of Studies in the history and philosophy of Mathematics. North Holland, Amsterdam, 1990. 66. E. Vandenbussche. Axiomatization of the logics Z, Z*, Z1, Z1*. Unpublished Manuscript, 2005. 67. A. Visser. Aspects of Diagonalization and Provability. PhD thesis, University of Utrecht, Department of Philosophy, The Nederland, 1985. 68. F. W. Wagner, editor. Neoplatonism and Nature, Studies in Plotinus’ Enneads, volume 8 of Studies in Neoplatonism: Ancient and Modern. State University of New-York Press, New York, 2002. 69. R. T. Wallis. Neoplatonism. Gerald Duckworth and CO.Ltd, London, 1972. Second Edition 1995 by Gerald Duckworth and CO.Ltd (London) and Hacket Publishing Company (Indianapolis). 70. J. C. Webb. Mechanism, Mentalism and Metamathematics: An essay on Finitism. D. Reidel Publishing Company, Dordrecht, Holland, 1980. 71. J. C. Webb. Gödel’s Theorems and Church’s Thesis: a Prologue to Mechanism. 
In R. S. Cohen and M. W. Wartofsky, editors, Language, logic, and method, pages 309–353. D.Reidel Publishing Company, Dordrecht, Holland, 1983. 72. J. A. Wheeler. Law without Law. In P. Medawar and Shelley J., editors, Structure in Science and Art, pages 132–154. Elsevier North-Holland, Amsterdam, 1980.

On a Relationship between Non-Deterministic Communication Complexity and Instance Complexity

Armando B. Matos, Andreia C. Teixeira, and André C. Souto

DCC-FC & LIACC*, Universidade do Porto, Rua do Campo Alegre 1021/1055, 4169-007 Porto, Portugal
[email protected] [email protected] [email protected]

Abstract. We study the relationship between the non-deterministic communication complexity of uniform functions [8, 9, 4] and instance complexity [7] (see the definitions in 2.1 and 2.2 respectively). For that purpose, the witness of the non-deterministic communication protocol executed by Alice and Bob is interpreted by Alice as a program p that, for t sufficiently large, “corresponds exactly” (see Definition 4) to the instance complexity ic^t(y : Y_1(x)); in the previous expression x and y are the parts of the input known by Alice and Bob respectively, and Y_1(x) is the set of all values of y such that f(x, y) = 1. The main results of this paper are

    max_{|x|=|y|=n} { ic^{t(n)}_yes(y : Y_1(x)) } = N^1(f)   and   max_{|x|=|y|=n} { ic^{t(n)}(y : Y_1(x)) } = N(f),

where ic_yes(a : S) is a variant of instance complexity (see Definition 5), and the non-deterministic communication complexities N^1(f) and N(f) are defined in [4], Chap. 2. We also present in Sec. 5 a simple inequality relating individual communication complexity with instance complexity.

1 Introduction

Communication complexity and instance complexity seem at first to be totally unrelated concepts. In a typical setup for communication complexity the two parties, Alice and Bob, have unbounded computational power and one wants to find how many bits they need to exchange in order to compute the value of a given function of their inputs, f : X × Y → {0, 1}; on the other hand, the instance complexity ic^t(x : A) is the length of the shortest program that runs in time t, answers correctly the question “x ∈ A?” and does not “lie” about the set A (the program may answer “⊥”, meaning “don’t know”). Communication complexity is about communication cost, while instance complexity is related to computational complexity. In this paper we establish a relationship between these two concepts. Let x and y be the inputs of Alice and Bob respectively, n = |x| = |y|, and let Y_1(x) be the set of all y such that f(x, y) = 1. Theorem 2 states that, apart from a constant and for t sufficiently large, max_{|x|=|y|=n} { ic^{t(n)}_yes(y : Y_1(x)) }, where ic_yes is a “one-sided” version of instance complexity (Definition 5), equals the non-deterministic communication complexity N^1(f) of the uniform function f; similarly the maximum value of ic^t(y : Y_1(x)) equals the non-deterministic communication complexity N(f), see Theorem 3. The main ingredient for the proof of this result is a protocol in which Alice uses the non-deterministic word p as a program that eventually corresponds to ic^t(y : Y_1(x)); Alice runs p(y) for all y ∈ Y for a maximum of t(|y|) steps. We should notice two facts that may help the understanding of the rest of this paper: (i) Neither Alice nor Bob alone (without communication and without the help of the oracle)

* Partially supported by funds granted to LIACC through the Programa de Financiamento Plurianual, FCT and Programa POSI.


can compute ic^t(y : Y_1(x)); the reason is that Alice only knows x and Bob only knows y. (ii) ic^{t(n)}(y : Y_1(x)), like N(f), can be much smaller than n. We mention two previous works where communication complexity has been analyzed in a non-standard way: the paper [2] on individual communication complexity, in which Kolmogorov complexity is used as the main analysis tool, and [5], where “distinguishers” are used to obtain bounds on communication complexity. This paper is organized as follows. The next section contains some background on communication complexity and instance complexity. We study one-sided protocols in Sec. 3 and two-sided protocols in Sec. 4. These two sections contain the main results of this paper, namely Theorems 2 and 3. Sec. 5 contains some comments on the relationship between individual communication complexity and instance complexity. Finally, in Sec. 6 some future lines of research are suggested.

2 Preliminaries

The set of natural numbers (including 0) is denoted by ℕ. An alphabet Σ is a nonempty finite set whose members are called letters. The alphabet used in this paper is {0, 1}. A word is a sequence of 0 or more letters; words are denoted by x, y and w, possibly overlined. The length and the i-th letter of the word x are denoted by |x| and x_i respectively.

2.1 Communication Complexity

We define several forms of non-deterministic communication complexity; for more details see [4]. Let f : {0, 1}^n × {0, 1}^n → {0, 1} be a Boolean function. Two players, Alice and Bob, want to compute f(x, y); Alice only knows x while Bob only knows y. A (“two-sided”) non-deterministic protocol P for f has output P(w, x, y) ∈ {0, 1, ⊥}, where ⊥ means “don’t know” and w is the guess; the protocol P satisfies the following conditions:

    [f(x, y) = 1] ⇒ [∃w : P(w, x, y) = 1]    (1)
    [f(x, y) = 0] ⇒ [∀w : P(w, x, y) ≠ 1]    (2)
    [f(x, y) = 0] ⇒ [∃w : P(w, x, y) = 0]    (3)
    [f(x, y) = 1] ⇒ [∀w : P(w, x, y) ≠ 0]    (4)

For z ∈ {0, 1}, a “one-sided” protocol P^z has output either z or ⊥ and satisfies

    [f(x, y) = z] ⇒ [∃w : P^z(w, x, y) = z]   (5)
    [f(x, y) ≠ z] ⇒ [∀w : P^z(w, x, y) = ⊥]   (6)

It is easy to build a non-deterministic protocol for f using the one-sided protocols P^0 and P^1. We should emphasize that in any protocol, Alice and Bob must be convinced of the output of the protocol, in the sense that “false guesses” must be detected and rejected (output ⊥); this requirement corresponds to the “∀ · · ·” predicates above. In other words, Alice and Bob do not trust the oracle. Otherwise the problem would be trivial. Some of the variants of non-deterministic communication complexity are as follows.

Definition 1 (non-deterministic communication complexities). Standard and individual (non-deterministic) communication complexities are denoted by N and N respectively.
– Individual communication complexity of protocol P with output set {1, ⊥}: N^1_P(f, x, y) = min_w {|c(w)| : P(w, x, y) = 1}, where w is the guess and c(w) (the “conversation”) is the sequence of bits exchanged between Alice and Bob when the guess is w. Notice that N^1_P(f, x, y) is only defined if f(x, y) = 1. Notice also that the behavior of the protocol P on the other inputs (x, y) is irrelevant.


– Individual communication complexity with output set {1, ⊥}: N^1(f, x, y) = min_P {N^1_P(f, x, y)}, where the protocols P considered for minimization are protocols with output set {1, ⊥} for the function f.
– Communication complexity of protocol P with output set {1, ⊥}: N^1_P(f) = max_{x,y} {N^1_P(f, x, y)}.
– Communication complexity of function f with output set {1, ⊥}: N^1(f) = min_P {N^1_P(f)}.
The complexities N^0_P(f, x, y), N^0_P(f), and N^0(f) are defined in a similar way. Define N_P(f, x, y) = N^0_P(f, x, y) if f(x, y) = 0, and N_P(f, x, y) = N^1_P(f, x, y) if f(x, y) = 1; N_P(f) = log(2^{N^0_P(f)} + 2^{N^1_P(f)}); N(f) = min_P {N_P(f)}. A witness is a guess that causes the protocol to output a value different from ⊥. ¤
In the literature it is possible to find the following definition of the individual (non-deterministic) communication complexity associated with protocol P, see for instance [1]: N^1_P(f, x, y) = min_w {|w| + |c(w)| : P(w, x, y) = 1}; comparing with Definition 1, we see that the corresponding values differ by at most a factor of 2.
The definition N_P(f) = log(2^{N^0_P(f)} + 2^{N^1_P(f)}) is from [4]; we have max{N^0_P(f), N^1_P(f)} < N_P(f) ≤ max{N^0_P(f), N^1_P(f)} + 1, therefore N_P(f) ≈ max{N^0_P(f), N^1_P(f)}.
The following result from [4] shows that for every function there is a simple optimal non-deterministic protocol.
Theorem 1. For every Boolean function f there is an optimal one-sided non-deterministic protocol P for f, that is, a protocol P such that N^1_P(f) = N^1(f), of the following form, where the witness w, 1 ≤ w ≤ m, is the index of the first rectangle R_w = A × B containing (x, y) in the first minimum 1-cover: (1) Alice guesses w and checks if x ∈ A. (2) Alice sends w to Bob. (3) Bob checks if y ∈ B.
Define the sets X_0(y) = {x : f(x, y) = 0}, X_1(y) = {x : f(x, y) = 1}, Y_0(x) = {y : f(x, y) = 0} and Y_1(x) = {y : f(x, y) = 1}. Notice that Alice knows Y_0(x) and Y_1(x) while Bob knows X_0(y) and X_1(y).
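To make the cover-based protocol of Theorem 1 concrete, here is a small simulation sketch (ours, not from the paper) for the NEQ predicate on n = 3 bits. It uses the standard 1-cover of NEQ by rectangles indexed by a position i and a bit b; the witness is simply the rectangle index.

```python
from itertools import product

n = 3
X = Y = ["".join(bits) for bits in product("01", repeat=n)]

def f(x, y):                       # the NEQ predicate: f(x, y) = 1 iff x != y
    return 1 if x != y else 0

# A 1-cover for NEQ: rectangle R_(i,b) = {x : x_i = b} x {y : y_i != b}.
# Every (x, y) with x != y differs in some position i, so it lies in R_(i, x_i).
rects = [({x for x in X if x[i] == b}, {y for y in Y if y[i] != b})
         for i in range(n) for b in "01"]

# Sanity check: each rectangle is 1-monochromatic.
assert all(f(x, y) == 1 for A, B in rects for x in A for y in B)

def protocol(w, x, y):
    """One-sided protocol P^1: the guess w is the index of a rectangle.
    Alice checks x in A, sends w, Bob checks y in B; None plays the role of ⊥."""
    A, B = rects[w]
    return 1 if (x in A and y in B) else None

# Conditions (5) and (6): a witness exists iff f(x, y) = 1.
for x in X:
    for y in Y:
        accepted = any(protocol(w, x, y) == 1 for w in range(len(rects)))
        assert accepted == (f(x, y) == 1)
```

The witness w ranges over 2n rectangles, so the communication is about log(2n) bits, in line with the fact that the non-deterministic communication complexity of NEQ is log n + O(1).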
The set Y_1 is often mentioned in this paper. Theorems 2 and 3 apply only to “uniform” functions.
Definition 2. A function is uniform if it is computed by a fixed algorithm (independent of the length of the input). ¤
Every function that can be described by an algorithm is uniform; for instance equality, conjunction and parity are uniform functions. An example of a function which with almost certainty is not uniform is the random function, defined by choosing f(x, y) = 0 or f(x, y) = 1, each with probability 1/2.
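As a minimal illustration of Definition 2 (with inputs encoded as bit-strings, a convention we choose for convenience), each of the cited examples is computed by one fixed program that works for every input length:

```python
def eq(x, y):        # equality: a single algorithm for all input lengths n
    return 1 if x == y else 0

def conj(x, y):      # conjunction: 1 iff every bit of x and y is 1
    return 1 if set(x) <= {"1"} and set(y) <= {"1"} else 0

def parity(x, y):    # parity of the total number of 1s in (x, y)
    return (x + y).count("1") % 2

assert eq("0110", "0110") == 1 and eq("01", "10") == 0
assert conj("11", "11") == 1 and conj("11", "10") == 0
assert parity("1", "1") == 0 and parity("10", "00") == 1
```

A random function, by contrast, would need a separate table of values for each input length, so no fixed program of this kind could compute it.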

2.2 Instance Complexity

We define several forms of instance complexity; for a more complete presentation see [7]. It is assumed that programs always terminate, and output either 0, 1 or ⊥ (“don’t know”).
Definition 3. A program p is consistent with a set A if x ∈ A whenever p(x) = 1 and x ∉ A whenever p(x) = 0. ¤
Definition 4 (instance complexity). Let t : ℕ → ℕ be a function, A a set and x an element. Consider the following conditions: (C1) for all y, p(y) runs in time not exceeding t(|y|); (C2) for all y, p(y) outputs 0, 1 or ⊥; (C3) p is consistent with A; and (C4) p(x) ≠ ⊥. The t-bounded instance complexity of x relative to the set A is

    ic^t(x : A) = min{|p| : p satisfies C1, C2, C3, and C4}
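The consistency requirement of Definition 3 can be checked mechanically on a finite domain. The following sketch (our own toy encoding, with None playing the role of ⊥) tests a “cautious” program and a “lying” one against a set:

```python
def consistent(p, A, domain):
    """p is consistent with A if p never 'lies': p(x) = 1 only when x is in A,
    and p(x) = 0 only when x is not in A (answering ⊥/None is always allowed)."""
    return all((p(x) != 1 or x in A) and (p(x) != 0 or x not in A)
               for x in domain)

evens = {x for x in range(20) if x % 2 == 0}

p1 = lambda x: 1 if x in (2, 4) else None    # cautious: only says "yes" on 2 and 4
p2 = lambda x: 1 if x < 10 else None         # lies: answers 1 on odd numbers too

assert consistent(p1, evens, range(20)) is True
assert consistent(p2, evens, range(20)) is False
```

Note that p1 is consistent even though it answers ⊥ on most members of the set; consistency forbids wrong answers, not ignorance.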


A program p corresponds to ic^t(x : A) if it satisfies conditions C1, C2, C3, and C4; if moreover |p| = ic^t(x : A) we say that p corresponds exactly to ic^t(x : A). ¤
Relaxing the condition “p(x) ≠ ⊥” we get two weaker forms of instance complexity:
Definition 5 (inside instance complexity). Consider the following conditions: (C1) for all y, p(y) runs in time not exceeding t(|y|); (C2) for all y, p(y) outputs either 1 or ⊥; (C3) p is consistent with A; and (C4) x ∈ A ⇒ p(x) = 1. The t-bounded inside instance complexity of x relative to the set A is

    ic^t_yes(x : A) = min{|p| : p satisfies C1, C2, C3, and C4}

A program p corresponds to ic^t_yes(x : A) if it satisfies conditions C1, C2, C3, and C4; if moreover |p| = ic^t_yes(x : A) we say that p corresponds exactly to ic^t_yes(x : A). ¤
Definition 6 (outside instance complexity). Consider the following conditions: (C1) for all y, p(y) runs in time not exceeding t(|y|); (C2) for all y, p(y) outputs either 0 or ⊥; (C3) p is consistent with A; and (C4) x ∉ A ⇒ p(x) = 0. The t-bounded outside instance complexity of x relative to the set A is

    ic^t_no(x : A) = min{|p| : p satisfies C1, C2, C3, and C4}

A program p corresponds to ic^t_no(x : A) if it satisfies conditions C1, C2, C3, and C4; if moreover |p| = ic^t_no(x : A) we say that p corresponds exactly to ic^t_no(x : A). ¤
Notice that if x ∉ A then ic^t_yes(x : A) is a constant (independent of x), because the program p(x) ≡ ⊥ has fixed length and is consistent with every set; similarly, if x ∈ A then ic^t_no(x : A) is a constant. Notice also that for every element x, set A and function t we have ic^t_yes(x : A) ≤ ic^t(x : A) and ic^t_no(x : A) ≤ ic^t(x : A). On the other hand, from a program p corresponding to ic^{t1}_yes(x : A) and a program p′ corresponding to ic^{t2}_no(x : A) we can define a program r as follows: r(x) = 1 if p(x) = 1, r(x) = 0 if p′(x) = 0, and r(x) = ⊥ otherwise, concluding that

    ic^{f(t1,t2)}(x : A) ≤ ic^{t1}_yes(x : A) + ic^{t2}_no(x : A) + O(log(min{ic^{t1}_yes(x : A), ic^{t2}_no(x : A)})),

where the function f represents the time overhead needed to simulate p(x) for t1 steps followed by the simulation of p′(x) for t2 steps; the logarithmic term comes from the need to delimit p from p′ in the concatenation pp′.
Notation. To emphasize that t is a function of n, we write ic^{t(n)}(y : A(x)), ic^{t(n)}_yes(y : A(x)) and ic^{t(n)}_no(y : A(x)).
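The construction of r from p and p′ described above can be sketched as follows (a toy example of ours; None stands for ⊥):

```python
def combine(p_yes, p_no):
    """Build r from an 'inside' program p and an 'outside' program p':
    r answers 1 where p does, 0 where p' does, and ⊥ (None) otherwise."""
    def r(x):
        if p_yes(x) == 1:
            return 1
        if p_no(x) == 0:
            return 0
        return None
    return r

A = {x for x in range(16) if x % 2 == 0}       # toy set: the even numbers
p_yes = lambda x: 1 if x % 2 == 0 else None    # consistent with A, decides members
p_no  = lambda x: 0 if x % 2 == 1 else None    # consistent with A, decides non-members

r = combine(p_yes, p_no)
assert all(r(x) == (1 if x in A else 0) for x in range(16))
```

In this toy case r never answers ⊥ because p decides every member and p′ every non-member; in general r answers exactly where one of the two does, and its length is roughly |p| + |p′| plus the logarithmic delimiting overhead mentioned above.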

3 One-sided Protocols

As an illustration we first consider in sub-section 3.1 a somewhat simplified analysis of the function x ≠ y (also called “NEQ”), and show how to use programs corresponding to instance complexity as guesses of (optimal) non-deterministic protocols. This usage is later generalized to any uniform function in sub-section 3.2.

3.1 Inequality: an Optimal “ic_yes-protocol”

Consider the predicate NEQ and suppose that x ≠ y; then for some i, 1 ≤ i ≤ n, we have x_i ≠ y_i. A possible program p_i corresponding to ic^t_yes(y : Y_1(x)) is: p_i(y) = 1 if y_i ≠ x_i, and p_i(y) = ⊥ if y_i = x_i. The reader may compute the set Y_1′ = {y : p_i(y) = 1} and show that Y_1′ ⊂ Y_1(x). If p(y) = 1 and if |p| is minimum, this program corresponds exactly to ic^t_yes(y : Y_1(x)).
Consider now the following protocol P_t for NEQ, where t is a time bound sufficiently large (see more details in sub-section 3.2). Alice receives a word p as a guess; p may eventually be the program p_i above. Then she runs p(y) for every y ∈ Y until the program halts or until t(|y|) steps have elapsed. If p(y) does not halt in time t(|y|), the word p is not a valid witness and the protocol halts. Otherwise Alice defines the set Y_1′ = {y : p(y) = 1}. If Y_1′ ⊆ Y_1(x), i.e., if p is consistent with Y_1(x), she sends p to Bob; otherwise she outputs ⊥ and halts. Bob tests if p(y) = 1; if yes, he outputs 1, otherwise he outputs ⊥.
Correctness conditions: (1) If x ≠ y, there is a witness p that corresponds to ic^t_yes(y : Y_1(x)). We have x_i ≠ y_i for some i, 1 ≤ i ≤ n. Then, if p happens to be the program p_i above, the protocol P outputs 1, so p corresponds to ic^t_yes(y : Y_1(x)); that is, we must have Y_1′ consistent with Y_1(x) (verified by Alice) and p(y) = 1 (verified by Bob). (2) If a guess is wrong, the output is ⊥. If the guess is wrong, then either some p(y) runs for more than t(|y|) steps, or p is not consistent with Y_1(x), or p(y) = ⊥; if t(n) is sufficiently large, the output is ⊥ in all these cases. (3) If x = y, no guess p can cause output 1. This follows directly from the definition of the protocol.
Complexity: The length of p_i need not exceed log n + O(1), and max_{1≤i≤n} {|p_i|} is log n + O(1). Thus the complexity of the protocol P is log n + O(1). But the non-deterministic communication complexity of NEQ is also log n + O(1) (see [4]), thus the protocol is optimal.
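The witness programs p_i can be written out directly; each one only needs the position i and Alice's bit x_i, which is why their length stays within log n + O(1). A sketch follows (encoding the programs as Python closures is our choice, not the paper's):

```python
from itertools import product

n = 4
x = "0110"                                   # Alice's input (an example)

def make_p(i, xi):
    """Witness program p_i for NEQ: answers 1 if y_i differs from x_i, else ⊥."""
    return lambda y: 1 if y[i] != xi else None

Y = ["".join(b) for b in product("01", repeat=n)]
Y1 = {y for y in Y if y != x}                # Y_1(x) = {y : NEQ(x, y) = 1}

for i in range(n):
    p = make_p(i, x[i])
    Y1_prime = {y for y in Y if p(y) == 1}
    assert Y1_prime <= Y1                    # p_i is consistent with Y_1(x)
    assert Y1_prime < Y1                     # and Y_1' is a proper subset

# For every y != x, some p_i certifies the inequality.
assert all(any(make_p(i, x[i])(y) == 1 for i in range(n)) for y in Y1)
```

Each p_i is determined by the pair (i, x_i), i.e., by log n + 1 bits, matching the complexity analysis above.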

3.2 “ic_yes-protocols” are Optimal

Consider now the one-sided protocol of Figure 1. In the general case, the function f, which is known by Alice and Bob, is arbitrarily complex; therefore the description of f cannot be included in an “instance complexity program” p unless lim_{n→∞} |p| = ∞ (see also the comments after the proof of Theorem 2). However, a much simpler situation arises if we consider only uniform functions.
Theorem 2 (ic_yes-protocols are optimal). Let f be a uniform function. There is a computable function t(n) such that

    N^1(f) = max_{|x|=|y|=n} { ic^{t(n)}_yes(y : Y_1(x)) } + O(1)    (7)

Proof. Let p be the non-deterministic word given to Alice by the oracle; the protocol P_{t(n)} is described in Figure 1, where t(n) is an appropriate time bound (see below). Notice that the protocol specifies that Alice should interpret p as a program and execute p for a maximum time t(n). The program p, being an arbitrary word, may behave in many different ways; in particular, if f(x, y) = 1, the behavior described in Figure 2 will cause P_{t(n)} to output 1. If i is chosen so that (x, y) ∈ R_i (if f(x, y) = 1 there is at least one such i, otherwise there is none), then p is consistent with Y_1(x) and p(y) = 1. Then |p| ≥ ic^{t(n)}_yes(y : Y_1(x)) for t(n) sufficiently large. Moreover, if p is not “correct”, that fact can be detected by Alice or by Bob; thus, conditions (5) and (6) (see page 275) are verified. As f is assumed to be uniform, the length of a program which is accepted as a witness need not exceed log m + O(1).


Alice:
    Receive program p(y) (as a possible witness)
    Test if, for every y ∈ Y, p(y) halts in time not exceeding t(n) with output 1 or ⊥
        If not, output ⊥ and halt
    Compute the set B = {y : p(y) = 1}
    Find the set of smallest 1-covers
    Select the first (in lexicographic order) such cover ⟨R_1, R_2, . . . , R_m⟩
    Select a rectangle R_i = A × B from that cover, where B ⊆ Y is the set computed above
        (As the cover is minimum, there can be at most one such rectangle. If there is none, output ⊥ and halt)
    Comment: at this point we know that p is consistent with Y_1(x)
    Test if x ∈ A
        If not, output ⊥ and halt
    Send p to Bob
Bob:
    Verify if p(y) = 1
        If yes, output 1 and halt
    Output ⊥ and halt

Fig. 1. A family of one-sided non-deterministic protocols P_{t(n)}. The guess is based on a program p that corresponds to ic^t_yes(y : Y_1(x)).

Program p, input y:
    From d(f) and i:
        Find the set S_1 of smallest 1-covers
        Select the first (in lexicographic order) cover ⟨R_1, R_2, . . . , R_m⟩ ∈ S_1
        Select rectangle R_i = A × B in that cover
    With input y, output
        p(y) = 1 if y ∈ B
        p(y) = ⊥ otherwise

Fig. 2. A possible behavior of the program p which may cause the protocol P_{t(n)} (see Figure 1) to output 1. A string p with this behavior can be specified in length |d(f)| + log m. The existence of this program, which has length log m + O(1), where m is the size of the minimum covers, justifies the step between equation (9) and inequality (10).

How much time t(n) must Alice run p(y) (for each y) so that there is at least one witness for every pair (x, y) with f(x, y) = 1? As f is uniform, it is possible to obtain an upper bound t(n) in a constructive way by detailing and analyzing the algorithm that the witness p should implement, see Figure 2. Therefore we may suppose that t(n) is a well defined function. Suppose now that f is uniform, and that f(x, y) = 1. If the protocol accepts (x, y) with guess p, we have |p| ≤ log m + O(1) and max_{|x|=|y|=n} {|p|} ≤ log m + O(1). Thus

    N^1(f) = log C^1(f) + O(1)                                      (8)
           = log m + O(1)                                           (9)
           ≥ max_{|x|=|y|=n} {|p|} + O(1)                           (10)
           ≥ max_{|x|=|y|=n} { ic^{t(n)}_yes(y : Y_1(x)) } + O(1)   (11)

On the other hand, there exists a non-deterministic protocol with complexity max_{|x|=|y|=n} { ic^{t(n)}_yes(y : Y_1(x)) } + O(1); this is the protocol of Figure 3 when t(n) is sufficiently large. Notice that program p can be any program running in time t(n) which is consistent with Y_1(x) and such that p(y) = 1 (and, if f(x, y) = 1, there is at least one


such program, as we have seen above); thus it can be the shortest such program, |p| = ic^{t(n)}_yes(y : Y_1(x)). Taking the maximum over all x and y with |x| = |y| = n (see Definition 1), we get N^1(f) ≤ max_{|x|=|y|=n} { ic^{t(n)}_yes(y : Y_1(x)) } + O(1), because N^1(f) is the smallest complexity among all the protocols for f. Combining this result with inequality (11) we get N^1(f) = max_{|x|=|y|=n} { ic^{t(n)}_yes(y : Y_1(x)) } + O(1). ¤

A Note on the Uniformity Condition. At first it may not be obvious why the validity of equality (7) of Theorem 2 depends on the uniformity of f. Let us argue that (7) may be false for non-uniform functions, using Kolmogorov complexity as a tool. Denote by C(x) the (plain) Kolmogorov complexity of x, which is defined as C(x) = min{|p| : U(p) = x}, where U is some fixed universal Turing machine, see [6]. Consider a monochromatic cover of a non-uniform function such that (i) the number m of rectangles in the cover is very small and (ii) the horizontal side B of the first rectangle in the cover has a Kolmogorov-random length, C(|B|) ≈ n. The length |B| can be obtained from p, thus C(|B|) ≤ C(p) + O(1), which implies C(p) ≥ n + O(1) ≫ log m; thus the step (9) → (10) in the proof is not valid.

Alice:
    Receive program p(y) (as a possible witness)
    Test if, for every y, p(y) halts in at most t(n) steps
        If not, output ⊥ and halt
    Test if {y : p(y) = 1} ⊆ Y_1(x) (p is consistent with Y_1(x))
        If not, output ⊥ and halt
    Send p to Bob
Bob:
    Compute r = p(y) and test if r = 1
        If not, output ⊥ and halt
    Output 1
Alice:
    Output the message received

Fig. 3. A family of one-sided non-deterministic protocols P′_{t(n)}. The guess may be any program p that corresponds to ic^t_yes(y : Y_1(x)); that is, p must satisfy only {y : p(y) = 1} ⊆ Y_1(x) and p(y) = 1.
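The protocol of Figure 3 can be simulated on a small instance. The following sketch (ours) uses NEQ as the function f, represents programs as total Python functions (so the time-bound test is omitted), and writes None for ⊥:

```python
from itertools import product

n = 3
inputs = ["".join(b) for b in product("01", repeat=n)]

def f(x, y):                                  # NEQ again, as a running example
    return 1 if x != y else 0

def protocol(p, x, y):
    """Figure-3 style: Alice checks consistency of p with Y_1(x), Bob checks p(y) = 1."""
    Y1 = {v for v in inputs if f(x, v) == 1}
    if not {v for v in inputs if p(v) == 1} <= Y1:
        return None                           # Alice rejects an inconsistent guess
    return 1 if p(y) == 1 else None           # Bob's check

# Guesses: the programs p_(i,b)(y) = 1 iff y_i != b, for each position i and bit b.
guesses = [lambda y, i=i, b=b: (1 if y[i] != b else None)
           for i in range(n) for b in "01"]

for x in inputs:
    for y in inputs:
        accepted = any(protocol(p, x, y) == 1 for p in guesses)
        assert accepted == (f(x, y) == 1)     # conditions (5) and (6) hold
```

When x = y, every guess fails: a guess p_(i,b) with b = x_i is consistent but has p(y) = ⊥, while one with b ≠ x_i is caught by Alice's consistency test.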

4 Two-sided Protocols

Now we consider the two-sided protocols for non-deterministic communication complexity. If t(n) is sufficiently large, then there are optimal protocols whose guesses correspond exactly to ic^t(y : Y_1(x)).
Theorem 3. Let f be a uniform function. There is a computable function t(n) such that

    N(f) = max_{|x|=|y|=n} { ic^{t(n)}(y : Y_1(x)) } + O(1)

See the Appendix for comments on the proof of this result.

5 About Individual Communication Complexity

The one-sided individual communication complexity satisfies

    N^1(f, x, y) ≥ ic^t_yes(y : Y_1(x)) + O(1)

The complexity N^1(f, x, y) is obtained from a minimization over all protocols, which must of course “work correctly” for every pair of inputs and not only for the particular pair (x, y), while no such restriction exists in the definition of instance complexity. The individual communication complexity may, in a few rare cases (if i has a very short description), be much smaller than log m. Finally we present a result relating the individual non-deterministic communication complexity with the instance complexity.
Theorem 4 (individual upper bound). For every function f and values x and y there is a t_0 ∈ ℕ such that the individual non-deterministic communication complexity N(f, x, y) satisfies

    ∀t ≥ t_0 : N(f, x, y) = ic^t(y : Y_1(x)) + O(1) ≤ N(f) + O(1)

6 Conclusions and Future Work

We have established that, for uniform functions f and for t(n) sufficiently large, the maximum value of ic^{t(n)}(y : Y_1(x)) (where Y_1(x) = {y : f(x, y) = 1}) equals the non-deterministic communication complexity N(f). The work done here can be continued along several lines of research; in particular, it would be interesting to study more deeply the relationship between individual communication complexity and instance complexity, and to search for the existence and properties of protocols that, besides being optimal (in the worst case), minimize the communication for all pairs (x, y). It would also be interesting to relate the time bound of instance complexity with the complexity class of the function f. In more general terms, we think that it is important to study in depth the relationship between measures of communication cost and measures of computational complexity.

References
1. Sanjeev Arora, Boaz Barak, Computational Complexity: A Modern Approach, Princeton University, 2006, url: http://www.cs.princeton.edu/theory/complexity/communicatechap.pdf
2. Harry Buhrman, Hartmut Klauck, Nikolai Vereshchagin and Paul Vitányi, Individual communication complexity, STACS 2004: 21st Annual Symposium on Theoretical Aspects of Computer Science, Montpellier, France, March 25-27, 2004.
3. Ilan Kremer, Noam Nisan, Dana Ron, On randomized one-round communication complexity, Proc. of 27th STOC, pp 596-605, 1995, url: citeseer.ist.psu.edu/kremer95randomized.html
4. Eyal Kushilevitz, Noam Nisan, Communication Complexity, Cambridge University Press, New York, 1996.
5. Sophie Laplante, John Rogers, Indistinguishability, Technical report TR-96-26, 1996, url: citeseer.ist.psu.edu/laplante98indistinguishability.html
6. Ming Li and Paul Vitányi, An Introduction to Kolmogorov Complexity and its Applications, Springer-Verlag Graduate Texts in Computer Science Series, second edition, 1997.
7. Pekka Orponen, Ker-I Ko, Uwe Schöning, Osamu Watanabe, Instance Complexity, Journal of the ACM, 41:1, pp 96-121, 1994, url: citeseer.ist.psu.edu/orponen94instance.html
8. Andrew Chi-Chih Yao, Some complexity questions related to distributive computing, Proceedings of the 11th Annual ACM Symposium on Theory of Computing, Atlanta, pp 209-213, 1979.
9. Andrew Chi-Chih Yao, The entropic limitations on VLSI computations, Proceedings of the 13th Annual ACM Symposium on Theory of Computing, Milwaukee, pp 308-311, 1981.


Appendix A. About the Proof of Theorem 3
The proof of Theorem 3 is similar to the proof of Theorem 2; we make only a few observations. The reader should compare Figures 1 and 2 with Figures 4 and 5 respectively. The main difference in the proof is that we now have to consider a minimum cover of 0-rectangles and a minimum cover of 1-rectangles. Denote by m = C^0(f) and m′ = C^1(f) the sizes of those covers; as the function f is assumed to be uniform, the witness (program) p has a description with length log(m + m′) + O(1). It is not difficult to verify the correctness of conditions (1) to (4), see page 275.

Alice:
    Receive program p(y) (as a possible witness)
    Test if, for every y ∈ Y, p(y) halts in time not exceeding t(n) with output 0, 1 or ⊥
    Compute the set B = {y : p(y) ≠ ⊥}
    Test if B is monochromatic and not empty
    Find the set S_0 of smallest 0-covers and the set S_1 of smallest 1-covers
    Select the first (in lexicographic order) sequence s = ⟨R_1, . . . , R_m, R_{m+1}, . . . , R_{m+m′}⟩
        where ⟨R_1, . . . , R_m⟩ ∈ S_0 and ⟨R_{m+1}, . . . , R_{m+m′}⟩ ∈ S_1
    Select a rectangle R_i = A × B from s
        Comment: there is at most one such rectangle
    Test if x ∈ A
    Send p to Bob
Bob:
    Compute r = p(y)
    Output r

Fig. 4. A family of two-sided non-deterministic protocols P_{t(n)}. The guess is based on a program p that corresponds to ic^t(y : Y_1(x)). Compare with Figure 1. For simplicity we assume that whenever a test fails, the protocol outputs ⊥ and halts.

Program p, input y:
    From d(f) and i:
        Find the set S_0 of smallest 0-covers and the set S_1 of smallest 1-covers
        Select the first (in lexicographic order) sequence s = ⟨R_1, . . . , R_m, R_{m+1}, . . . , R_{m+m′}⟩
            where ⟨R_1, . . . , R_m⟩ ∈ S_0 and ⟨R_{m+1}, . . . , R_{m+m′}⟩ ∈ S_1
        Select the i-th rectangle R_i = A × B from s
    With input y, output:
        p(y) = z if y ∈ B and rectangle A × B has color z ∈ {0, 1}
        p(y) = ⊥ otherwise

Fig. 5. A possible behavior of the program p which may cause the protocol P_{t(n)} of Figure 4 to output a value different from ⊥. A string p with this behavior can be specified in length |d(f)| + log(m + m′).

Parallelism in DNA and Membrane Computing

Benedek Nagy^{1,2} and Remco Loos^1

^1 Research Group on Mathematical Linguistics, Rovira i Virgili University, Pça Imperial Tàrraco 1, 43005 Tarragona, Spain
^2 Faculty of Informatics (Computer Science and Information Technology), University of Debrecen, Egyetem tér 1, 4032 Debrecen, Hungary
[email protected] [email protected]

Abstract. In this paper we consider DNA and membrane computing, both as language generators and as problem solving devices. The basic motivation behind these models of natural computing is using parallelism to make hard problems tractable. In this paper we analyze the concept of parallelism, and we will show that parallelism has very different meanings in these models. We introduce the terms 'or-parallelism' and 'and-parallelism' for these two basic types of parallelism.

1 Introduction

Over the last decade, molecular computing has been a very active field of research. The great promise of performing computations at a molecular level is that the small size of the computational units potentially allows for massive parallelism in the computations. Thus, computations that are intractable in sequential modes of computation can be performed (at least in theory) in polynomial or even linear time. In this paper, we investigate the way parallelism is used in different models of molecular computation. We are interested in the role parallelism plays in the theoretical models, that is, in the language-generating devices, as well as in the way parallelism is employed to solve computationally hard (typically NP-complete) problems. In the current paper, we focus on two branches of molecular computing: DNA computing and membrane computing. In the future, this work could be extended to other models of molecular computation, such as forbidding-enforcing systems, as well as other bio-inspired models of computation, like cellular automata and neural networks.

1.1 DNA-computing

The field of DNA computing was instigated by Leonard Adleman's 1994 paper [1], in which he reports a molecular solution of an instance of an NP-complete problem, namely the Hamiltonian path problem. Since then, much work has been done in this area, covering both experimental work and the formulation of formal and computational models. These models typically represent DNA strands as strings and model biochemical operations by string rewriting rules. The field of DNA computing comprises a variety of ways to consider molecular computing, which range from purely theoretical computational models to more practical 'molecular algorithms' to actual experimental implementations of molecular computations. The theoretical models include different types of systems such as splicing systems, sticker systems and deletion-insertion systems. Details about these systems can be found in [6]. Here we do not consider Watson-Crick automata, which are not parallel devices but are rather based on the inherent power of Watson-Crick complementarity.


1.2 Membrane Computing

Membrane computing is an area of molecular computing initiated by Gheorghe Paun [3, 5]. A membrane system (also called a P system) is a computing model inspired by the way the structure of a living cell processes chemical compounds in its compartmental structure. A membrane structure defines regions where objects evolve according to given rules. From this basic structure, many different computational devices can be defined, according to the objects used, the types of rules one allows and the way the generated language is defined. The objects can be described by symbols or by strings of symbols (in the former case their multiplicity matters, that is, we work with multisets of objects placed in the regions of the membrane structure; in the latter case we can work with languages of strings or, again, with multisets of strings). By using the rules in a nondeterministic, maximally parallel manner, one gets transitions between system configurations. A sequence of transitions is a computation. With a halting computation we can associate a result, in the form of the objects present in a given membrane in the halting configuration, or expelled from the system during the computation. Various ways of controlling the transfer of objects from one region to another and of applying the rules have been considered, as well as the use of so-called active membranes, i.e. the possibility to dissolve, divide or create membranes. Gheorghe Paun's book [4] is a good introduction to the most important types of membrane systems. We will try to avoid considering specific models. Instead we focus on the nature of the parallelism present in them, which we will see to be common to most models in the considered area.

2 Parallelism in Language Generation

In this part we analyze the nature of the parallelism in the formal computational models of both areas. We claim that despite the different levels of abstraction and the different models, essentially the same type of parallelism underlies all systems of DNA computing. In experimental DNA computing, the working assumption is that all molecules are present in such huge quantities that their number can be considered infinite. All formal models considered share this assumption: we start from a (generally finite) initial language L, with all words w ∈ L present in arbitrarily many copies. Similarly, in experiments a series of biochemical operations is applied sequentially, but each biochemical operation applied affects all molecules present. This is reflected in the theoretical models, where each word is rewritten sequentially, but this sequential rewriting is applied to all words in parallel. In this way, one computation gives all possible solutions. In the context of language generating systems, we can say that one 'run' of the system gives the entire generated language. This is an important difference from most known models of computation, such as Turing machines, Chomsky grammars or, as we will see, P systems, where one run of the system accepts or generates just one word of the language. Now let us see how a membrane system can generate languages. First, as the most usual case, the so-called multiset languages are considered. The membrane system in Figure 1 starts with one copy each of the objects a and b in membrane 1. In any subsequent configuration (except the halting one) there is also exactly one copy of a and b in membrane 1. If either a or b is processed by the rule sending symbols into membrane 2, the other symbol must be processed by the same type of rule, because of the maximal parallelism. In each such parallel step two copies of a and three copies of b appear in membrane 2.
No rule is available in membrane 2, therefore all copies inside remain there without any change. The process continues until the cooperative rule sending both a and b out of membrane 1 is used; the computation halts with this step. A word of the generated language is then in membrane 2 (we consider it the output membrane). This membrane contains the objects a^(2n) b^(3n), where n depends on the length of the process. So, the generated multiset language, as a Parikh set, is (2n, 3n).


The system consists of an output membrane 2 nested inside the skin membrane 1. Membrane 1 initially contains the objects a and b and carries the rules

  a → a (a, in2)(a, in2)
  b → b (b, in2)(b, in2)(b, in2)
  ab → (a, out)(b, out)

while membrane 2 initially contains no objects and has no rules.

Fig. 1. A cooperating membrane system generating the multiset language a^(2n) b^(3n) in membrane 2
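The evolution just described can be sketched as a tiny multiset simulation (illustrative Python only, assuming the system performs n maximally parallel steps before the cooperative halting rule fires):

```python
def simulate(n):
    """Sketch of the system of Fig. 1: membrane 1 always holds one a and one b;
    each maximally parallel step sends two a's and three b's into membrane 2,
    until the cooperative rule ab -> (a,out)(b,out) halts the computation."""
    membrane2 = {"a": 0, "b": 0}
    for _ in range(n):
        membrane2["a"] += 2   # a -> a (a,in2)(a,in2)
        membrane2["b"] += 3   # b -> b (b,in2)(b,in2)(b,in2)
    return membrane2          # the output membrane holds a^(2n) b^(3n)

print(simulate(4))  # {'a': 8, 'b': 12}
```

The nondeterminism is reduced here to the single choice of n, the step at which the halting rule is applied; every run yields one word (2n, 3n) of the Parikh set.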

As we have seen, in membrane systems the parallelism is inside the computation of a (multiset) word. The result of a computation process is one particular word of the language, and because of the non-determinism it can be any of them. To generate the whole language we need to restart the computation many times (usually infinitely many times). So, the power of parallelism is used inside the computation of a word, which means a fast and effective computation for each word. Parallelism can be controlled through cooperative rules, catalysts, etc. Sometimes the objects sent outside of the system build up the result word (traces). The parallelism has the same role here: the whole system constructs only a very particular part (a word, or some words having the same Parikh vector) of the language in a run. The basic idea of parallelism is the same whether we consider membrane systems using catalysts, evolution rules, priorities, cooperative rules, symport, antiport, electrical charges, dissolution, or the creation and/or division of membranes. The only difference is that using active membranes one can dynamically change the structure of the system as well. With more membranes one can more easily organize the derivation process, because the membranes can have various rule sets. The creation and division of membranes also allow independent computations to be performed in parallel.

3 Parallelism in Problem Solving

First we analyse how DNA computing attacks hard problems, by examining a typical example of a DNA algorithm for problem solving, taken from [2]. The algorithm we consider solves the satisfiability problem for disjunctive clauses (SAT). Suppose the boolean variables of the formula are p1 to pn and the number of clauses is m. Suppose moreover we have a molecule with 2n sites on which we can 'write', i.e. change the site in such a way that it can later be recognised or 'read' as being written. An unwritten site is interpreted as 1 (true) and a written one as 0 (false). We start with a single molecule which encodes 2n 1's. We interpret this as the values of p1, ¬p1, p2, ¬p2, . . . , pn, ¬pn. Now, the algorithm for solving SAT is the following:

1. For each variable y, divide the solution into two parts. In one part, write the site for y. In the other part, write the site for ¬y. This yields all consistent assignments of variables.
2. For each clause, divide the content of the test tube. If for instance the clause is (pi ∨ ¬pj), we divide it into two parts. In one part, remove all molecules which have ¬pi = 1; in the other, those which have pj = 1. Thus only molecules which satisfy this clause remain.
3. Check if a molecule remains; if so, the answer is 'yes', else 'no'.

We see that all the solutions are generated and checked in parallel, and in that way we can simultaneously explore all options, allowing us to solve the instance of SAT in O(n + m) biochemical operations, i.e. in linear time, assuming each operation takes constant time. The idea and the concept of parallelism are the same as in language generation. Now let us analyze how membrane systems can be used for effective problem solving. We briefly describe how membrane creation can be used to solve SAT in linear time. First we have



an initial membrane with only one object. Applying the only applicable rule for this object, we introduce two new objects corresponding to the possible values of the first variable, and a technical object to continue the process. Then the new objects of the logical variable create new membranes (and copy some symbols to the new membranes). Now for each new membrane two new objects are introduced corresponding to the next variable, these new objects create new membranes again, etc. Finally the membrane structure forms a complete n-level binary tree. Each path from the initial membrane to a leaf membrane represents a possible truth assignment. Now, each membrane at the n-th level computes objects corresponding to the satisfied clauses of the SAT formula. (This can be done easily by comparing the literals of the clauses with the truth assignment of the membrane.) Using a cooperative rule, a special symbol is sent out if all clauses are satisfied in a membrane. In the next step the membranes of the previous level forward this symbol. Therefore this special symbol moves up all n levels, finally leaves the system, and terminates the process with the answer 'satisfiable'. More technical details can be found in [4]. In this process the power of parallelism builds up a complete tree level by level in linear time. In each membrane at the deepest level there are rules for each clause, therefore the evaluation of the clauses can proceed in parallel.
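The DNA algorithm for SAT described at the start of this section can be simulated with a Python set standing in for the test tube (an illustrative sketch; the encoding of clauses as lists of (variable, wanted-value) pairs is our own):

```python
from itertools import product

def dna_sat(n, clauses):
    """Sketch of the DNA SAT algorithm above, with a set as the test tube.
    Step 1 (writing) yields all 2^n assignments at once; step 2 removes,
    clause by clause, the molecules violating it; step 3 checks the tube."""
    tube = set(product([False, True], repeat=n))    # all consistent assignments
    for clause in clauses:                          # clause: [(var, wanted), ...]
        tube = {m for m in tube if any(m[v] == wanted for v, wanted in clause)}
    return bool(tube)                               # does any molecule remain?

# (p0 or not p1) and (not p0 or p1): satisfiable
print(dna_sat(2, [[(0, True), (1, False)], [(0, False), (1, True)]]))  # True
# p0 and not p0: unsatisfiable
print(dna_sat(1, [[(0, True)], [(0, False)]]))  # False
```

In the simulation the filtering loop is sequential, of course; the point of the molecular implementation is precisely that each filtering step acts on all 2^n molecules at once, giving the O(n + m) operation count.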

4 The Two Basic Notions of Parallelism

Abstracting from the specific systems, we can identify two essentially different notions of parallelism, which we classify here. In 'and-parallelism' the computation needs several branches, which provide subresults. Typically the rewriting happens in a parallel way, but there is a dependence between the parallel computations; for instance, one parallel computation generates one word. It affects both computational power and complexity. The non-deterministic, massively parallel way of applying the rules in a membrane system fits this notion exactly. 'Or-parallelism' (which we could also call 'Chinese army parallelism') is the following: the parallel branches independently try to solve the problem, and any of them can produce the solution. There are independent computations and they are performed in parallel; an example is the parallelism used to generate all words of a language simultaneously. This type of parallelism affects only complexity. We assume that the available space can be considered arbitrarily large. The parallelism in DNA computing, where each word is rewritten sequentially but all words are rewritten in parallel, corresponds to this type. The use of active membranes in membrane computing also allows this kind of parallelism. Note that this notion is closely related to non-determinism in the usual sense: this concept of parallelism allows us to consider all possible runs of a non-deterministic algorithm at the same time. In Table 1 we summarize the differences between these concepts.
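The distinction can be illustrated with an everyday concurrency sketch (our own toy example, unrelated to the molecular models themselves): and-parallelism combines the subresults of all branches, while or-parallelism takes whichever independent attempt succeeds first.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def and_parallel(subtasks):
    """and-parallelism: every branch contributes a subresult;
    the final result combines ('and's) all of them."""
    with ThreadPoolExecutor() as pool:
        return [f.result() for f in [pool.submit(t) for t in subtasks]]

def or_parallel(attempts):
    """or-parallelism: independent attempts run at the same time;
    the first one to finish yields the result."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(t) for t in attempts]
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()

print(and_parallel([lambda: 2, lambda: 3]))  # [2, 3]: all subresults needed
print(or_parallel([lambda: "guess-1", lambda: "guess-2"]))  # whichever finishes first
```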

5 Conclusions

In this paper we analyzed two fields of natural computing, DNA computing and membrane computing, looking at the parallelism present in them. Even though parallelism is the main motivation for these fields and an important property of both the computational systems and the problem-solving algorithms, it has never really been studied in itself. Looking at the way the systems in these areas work in terms of parallelism, we made two important observations. Firstly, in spite of the diversity of computational systems and even levels of description (experimental implementations, problem-solving algorithms, language-generating devices), we can still make general statements about the parallelism present in these fields, because the underlying notion of parallelism is essentially the same in all models. Secondly, we noted that the notion of parallelism used in DNA computing is very different from the parallelism present in all membrane systems. Whereas systems in DNA computing have parallel independent computations, in membrane systems the parallelism is applied in a dependent way, with all parallel computations taking place inside the rewriting of a single configuration. However, in membrane computing there exists a subclass of membrane systems called membrane systems with active membranes, which in addition to the parallelism present in all membrane systems allow for another type of parallelism, the same type as in DNA computing. It is just this additional kind of parallelism that allows efficient solutions to computationally hard problems. In addition we described the concept of parallelism in computations in a more general way. There are two basic notions. In the so-called 'and-parallelism' the results of several parallel branches are needed to obtain the result of the computation. We use parallelism to obtain an effective computation; the parallel branches work on the same 'state' or configuration of the system. In 'or-parallelism' (analogous to the so-called Chinese army algorithm) the branches are independent and any of them can produce the final result. We use parallelism because we do not know which branch will be successful. This concept can be imagined as a non-deterministic machine running in all possible ways at the same time.

and-parallelism                                                                 | or-parallelism
divide the problem into independent subproblems                                 | brute force: several attempts at the same time
parallelism inside the production of the solution                               | any attempt may produce a/the solution
the solution is provided in a parallel way                                      | the solution is provided in a sequential way if an attempt is successful
the solution is obtained by 'and' of the (sub)results of the parallel branches  | the solution is obtained by 'or' of the results of the parallel branches
the construction of the solution is faster than sequentially                    | the construction of the solution goes in the same time as sequentially if one finds the right guess

Table 1. Comparison of the two main concepts of parallelism

Acknowledgements The work of the first author is supported by the programme Öveges of the National Office for Research and Technology and the Agency for Research Fund Management and Research Exploitation, and by grant OTKA T049409 of the National Foundation of Scientific Research of Hungary. The work of the second author is supported by Research Grant BES-2004-6316 of the Spanish Ministry of Education and Science.

References

1. L.M. Adleman, Molecular Computation of Solutions to Combinatorial Problems, Science 266, 1021-1024 (1994)
2. T. Head, X. Chen, M. Yamamura and S. Gal, Aqueous computing: a survey with an invitation to participate, J. Computer Sci. & Tech. 17, 672-681 (2002)
3. Gh. Paun, Computing with Membranes, Journal of Computer and System Sciences 61(1), 108-143 (2000); also Turku Center for Computer Science TUCS Report No. 208 (1998)
4. Gh. Paun, Membrane Computing: An Introduction, Springer-Verlag, Berlin (2002)
5. Gh. Paun and G. Rozenberg, A guide to membrane computing, Theoretical Computer Science 287, 73-100 (2002)
6. Gh. Paun, G. Rozenberg and A. Salomaa, DNA Computing - New Computing Paradigms, Springer-Verlag, Berlin (1998)

On the Complexity of Matching Non-injective General Episodes

Elżbieta Nowicka(1) and Marcin Zawada(2)

(1) Wroclaw University of Technology, Chair of Computer Systems and Networks, [email protected]
(2) Wroclaw University of Technology, Institute of Mathematics and Computer Science, [email protected]

Abstract. We investigate the complexity of the episode-matching problem, in which an event sequence is matched against partially-ordered sets of events. Typically, an episode can be seen as an interesting regularity (i.e. a pattern) that needs to be found in a collection of data. Solutions to this problem have many important applications, such as WWW server log file analysis. Although polynomial time algorithms for matching serial and parallel episodes have been proposed, to date no formal complexity result has been established for the general, more interesting but also more complex episodes. We prove that matching general episodes is NP-complete, even if the size of the alphabet is 2. However, we also formally identify a class of episodes for which a polynomial time solution exists and give an on-line matching algorithm.

1 Introduction

Episodes defined as partially ordered sets of events are intensively used in the areas of data mining and pattern matching to find useful information in large collections of computerized data. The concept of episodes was introduced in [9] and is based on the observation that some events (i.e. data items) that frequently occur together can be grouped into higher-level structures that provide useful information. Episode mining and matching problem solutions have become of interest in many recent applications, e.g. sale trend analysis, DNA sequence analysis, WWW server log file analysis, alarm correlation and computer attack signature detection. There are three main classes of episodes considered in the literature (according to the nature of the order on the underlying set): parallel (i.e. no order), serial (i.e. total order) and general (i.e. any partial order). Further, episodes can also be viewed as injective (if each event type occurs only once in an episode) or non-injective (if some event types occur more than once). Simple forms of episodes, i.e. serial and parallel ones, used in early episode mining tools [9, 10, 12], turn out to be often insufficient for providing useful information, as they produce a large number of patterns that can be irrelevant or redundant [8]. Such reasons motivated further study and the extension of previous work to deal with general episodes. To date, some solutions have been proposed in the episode mining area [4-7], but most work is done on the assumption of the injectivity of the labeling, as in the closure systems for partial orders proposed in [6], or is considered at a heuristic level [7]. Even less research has been devoted to the problem of matching general episodes. Previous work on episode matching, which includes [1-3, 13], presents various on-line and off-line algorithmic solutions, but they are mostly limited to variants of matching simple episodes.
And though the possibility of modifying some algorithms to handle general episode matching has been suggested in [3], no efficiency result has been given for such episodes.



In this paper we study the problem of matching general episodes from the complexity point of view. Our main result is the proof that matching non-injective general episodes is NP-complete. We also present an additional result, showing that we can formally determine a class of such episodes for which there exists a polynomial time solution, and we propose an on-line algorithm to address such cases. The structure of this paper is as follows. Section 2 introduces the basic definitions we use. Section 3 presents the general episode matching problem. In Section 4 we prove that the general episode matching problem is NP-complete. Section 5 introduces a class of general episodes for which polynomial-time matching algorithms exist. This result motivates Section 6, in which we describe our matching algorithm and prove its correctness and time complexity. Some proofs are shortened due to space limitations. Finally, Section 7 draws conclusions and proposes future directions.

2 Basic definitions

Let Σ be a finite set of events, which we will call the alphabet. Given a set Σ, an event sequence H = h1 h2 . . . hn, where hi ∈ Σ for all 1 ≤ i ≤ n, is an ordered sequence of events; the elements of H are ordered with respect to event timestamps. A general episode is a triple A = (A, ≺, λ), where A is a set of nodes, the relation ≺ is a strict partial order (i.e. an irreflexive and transitive binary relation) on A, and λ : A → Σ is a labeling which assigns nodes of A to events. In this context a ≺ b is interpreted as: the event λ(a) precedes the event λ(b) in time. An episode, also called a labeled partial order, can be represented as a directed acyclic graph (DAG) with a labeling function λ. A general episode A = (A, ≺, λ) is, with respect to ≺,
– serial, if ≺ satisfies (∀a, b ∈ A)(a ≠ b ⇒ a ≺ b ∨ b ≺ a),
– parallel, if ≺ satisfies (∀a, b ∈ A)(a ≠ b ⇒ a ⊀ b).
Moreover, with respect to λ, an episode A is injective if λ is injective, i.e. (∀a, b ∈ A)(a ≠ b ⇒ λ(a) ≠ λ(b)), and non-injective otherwise. Notice that since we let λ be non-injective, some events can have multiple occurrences in an episode. A few examples of episodes are presented in Fig. 1.


Fig. 1. Examples of episodes. AS = (AS, ≺S, λS), a serial injective episode, where AS = {a1, a2}, ≺S = {(a1, a2)}, λS = {(a1, A), (a2, B)}; AP = (AP, ≺P, λP), a parallel injective episode, where AP = {a1, a2}, ≺P is empty, λP = {(a1, A), (a2, B)}; AG = (AG, ≺G, λG), a general non-injective episode, where AG = {a1, a2, a3}, ≺G = {(a1, a3), (a2, a3)}, λG = {(a1, A), (a2, B), (a3, B)}

For an episode A = (A, ≺, λ), some important definitions follow. Two distinct elements a, b ∈ A are comparable, if (a ≺ b) ∨ (b ≺ a) and incomparable, if (a ⊀ b) ∧ (b ⊀ a). An element a ∈ A is called minimal in A, if (¬∃b ∈ A)(b ≺ a). For any set A, |A| denotes the cardinality of A.


3 Problem Formulation

We consider the following general episode matching problem, which is an adaptation of the episode mining problem initially introduced in [9]. We say that an episode A = (A, ≺, λ) matches an event sequence H = h1 h2 . . . hn if there exists a function f : A → {1, 2, . . . , n} from nodes of A into indexes of H such that

(∀a, b ∈ A)(a ≠ b ⇒ f(a) ≠ f(b)),   (1)
(∀a, b ∈ A)(a ≺ b ⇒ f(a) < f(b)),   (2)
(∀a ∈ A)(h_f(a) = λ(a)).   (3)

For example, the event sequence BAB matches the episode AG of Fig. 1, while BBA does not.
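Conditions (1)-(3) can be checked directly by exhaustive search; the following sketch (our own illustration, exponential in |A| and thus only for tiny instances) tries every injective assignment of episode nodes to positions of H:

```python
from itertools import permutations

def matches(nodes, prec, label, H):
    """Brute-force test of conditions (1)-(3): try every injective
    assignment f of episode nodes to positions of the event sequence H."""
    for positions in permutations(range(len(H)), len(nodes)):
        f = dict(zip(nodes, positions))                  # injective: condition (1)
        if all(f[a] < f[b] for (a, b) in prec) and \
           all(H[f[a]] == label[a] for a in nodes):      # conditions (2) and (3)
            return True
    return False

# Episode A_G of Fig. 1: a1 and a2 both precede a3; labels A, B, B.
nodes = ["a1", "a2", "a3"]
prec = [("a1", "a3"), ("a2", "a3")]
label = {"a1": "A", "a2": "B", "a3": "B"}
print(matches(nodes, prec, label, "BAB"))  # True
print(matches(nodes, prec, label, "BBA"))  # False
```

The NP-completeness result of the next section shows that, for general non-injective episodes, essentially no algorithm avoids this kind of exponential search unless P = NP.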

4 Episode Matching Problem is NP-complete

In this section we prove that the general episode matching problem is NP-complete.

Theorem 1. The general episode matching problem is NP-complete.

Proof. We first show that the episode matching problem is in NP. Suppose we are given an episode A and an event sequence H. The certificate we choose is the matching function f. The verification algorithm checks conditions (1), (2) and (3), which can be done straightforwardly in polynomial time. We prove that the episode matching problem in general is NP-hard by showing that the k-clique problem (a standard NP-complete problem) is polynomial-time reducible to our problem. The reduction algorithm takes as input an instance ⟨G, k⟩ of the k-clique problem for an undirected graph G = (V, E). Let us construct an episode A and an event sequence H such that the graph G contains a clique of size k if and only if there exists a match of the episode A to the event sequence H. An episode A = (A, ≺, λ) over the alphabet Σ = {α, β} is constructed from a graph G = (V, E) as follows (see also Fig. 2):

A = V ∪ E,   (4)

≺ = {(v, e) : v ∈ V ∧ e ∈ E ∧ (∃v′ ∈ V)((v, v′) = e)},   (5)
λ(a) = α if a ∈ V, and λ(a) = β if a ∈ E, for a ∈ A.   (6)


Let us consider the following event sequence H, consisting of k copies of α, then (k choose 2) copies of β, then |V| − k copies of α, and finally |E| − (k choose 2) copies of β:

H = α···α β···β α···α β···β.   (7)


The above reduction can easily be computed in polynomial time. We now show that the transformation of a graph G into an episode A is a reduction, by showing that if there exists a clique of size k in a graph G, then there exists a match f of the episode A to the event sequence H that satisfies conditions (1), (2) and (3). We construct the function f using the graph G. At first we define consecutive values for the vertices V′ of the clique:

f(v′1) = 1, for v′1 ∈ V′,
f(v′2) = 2, for v′2 ∈ V′ \ {v′1},
⋮
f(v′k) = k, for v′k ∈ V′ \ {v′1, . . . , v′(k−1)},

Fig. 2. Reducing the clique problem to the problem of general episode matching. (a) An undirected graph G = (V, E) with a clique V′ = {v2, v3, v4}. (b) The episode A = (A, ≺, λ) produced by the reduction algorithm. For V = {v1, v2, v3, v4, v5}, E = {e1, e2, e3, e4, e5} and Σ = {α, β}, we have A = {v1, v2, v3, v4, v5, e1, e2, e3, e4, e5}, ≺ = {(v1, e1), (v2, e1), (v2, e2), (v2, e4), (v3, e2), (v3, e3), (v4, e3), (v4, e4), (v4, e5), (v5, e5)}, λ(vi) = α and λ(ei) = β, for i = 1, 2, 3, 4, 5

and for the edges E′ of the clique:

f(e′1) = k + 1, for e′1 ∈ E′,
f(e′2) = k + 2, for e′2 ∈ E′ \ {e′1},
⋮
f(e′(k choose 2)) = k + (k choose 2), for e′(k choose 2) ∈ E′ \ {e′1, . . . , e′((k choose 2)−1)}.

Then we also need to define values for the remaining vertices V \ V′:

f(v1) = k + (k choose 2) + 1, for v1 ∈ V \ V′,
f(v2) = k + (k choose 2) + 2, for v2 ∈ (V \ V′) \ {v1},
⋮
f(v|V\V′|) = k + (k choose 2) + |V \ V′|, for v|V\V′| ∈ (V \ V′) \ {v1, . . . , v(|V\V′|−1)},

and for the remaining edges E \ E′:

f(e1) = k + (k choose 2) + |V \ V′| + 1, for e1 ∈ E \ E′,
f(e2) = k + (k choose 2) + |V \ V′| + 2, for e2 ∈ (E \ E′) \ {e1},
⋮
f(e|E\E′|) = k + (k choose 2) + |V \ V′| + |E \ E′|, for e|E\E′| ∈ (E \ E′) \ {e1, . . . , e(|E\E′|−1)}.

Now we show that, for f defined as above, conditions (1), (2) and (3) are satisfied. Condition (1) is satisfied because, when we construct the function f, we use consecutive positions of the event sequence H; hence the function f has to be injective. To check condition (2), let a, b ∈ A be such that a ≺ b and b ∈ E. Let us consider the only two possible cases: a ∈ V′



and a ∈ V \ V′. In the first case, if a ∈ V′, then (∀e ∈ E)(f(a) < f(e)), which follows from the above definition of the function f; hence f(a) < f(b). In the second case, if a ∈ V \ V′, then (∀e ∈ E \ E′)(f(a) < f(e)), which follows from the above definition of the function f. We claim that if

(∀e ∈ E′)(a ⊀ e),   (8)

then f(a) < f(b). So we need to show that condition (8) holds. Assume, by contradiction, that (∃e ∈ E′)(a ≺ e) for some a ∈ V \ V′. That means that there exists an edge of the clique one of whose endpoints is in V \ V′, which is a contradiction. Condition (3) is satisfied if and only if (∀v ∈ V)(h_f(v) = λ(v)) and (∀e ∈ E)(h_f(e) = λ(e)), which holds by the construction of f and condition (6). Conversely, suppose that f is a match. We must show that there is a clique of size k in the graph G. We construct a graph G″ = (V″, E″) as follows:

V″ = {f⁻¹(1), f⁻¹(2), . . . , f⁻¹(k)},
E″ = {f⁻¹(k + 1), f⁻¹(k + 2), . . . , f⁻¹(k + (k choose 2))}.

This means that there is some k-element set V″ ⊆ V of vertices of G such that there are (k choose 2) edges in E″ ⊆ E of G with both endpoints in V″. Thus G″ forms a clique of size k in the graph G. □

Notice that the considered problem is NP-complete even if |Σ| = 2.
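The construction of A and H from an instance (G, k) in the proof above can be sketched as follows (an illustrative Python transcription; the concrete encoding of vertices as strings and edges as sorted tuples is our own):

```python
def reduce_clique_to_episode(V, E, k):
    """Sketch of the reduction of Theorem 1: from a k-clique instance (G, k)
    build the episode (A, prec, label) over {α, β} and the event sequence H."""
    edges = [tuple(sorted(e)) for e in E]
    A = sorted(V) + edges                       # nodes: vertices, then edges
    prec = {(v, e) for e in edges for v in e}   # each endpoint precedes its edge
    label = {a: ("β" if a in edges else "α") for a in A}
    kc2 = k * (k - 1) // 2                      # (k choose 2)
    H = "α" * k + "β" * kc2 + "α" * (len(V) - k) + "β" * (len(edges) - kc2)
    return A, prec, label, H

# The graph of Fig. 2: clique {v2, v3, v4}, k = 3.
V = {"v1", "v2", "v3", "v4", "v5"}
E = [("v1", "v2"), ("v2", "v3"), ("v3", "v4"), ("v2", "v4"), ("v4", "v5")]
A, prec, label, H = reduce_clique_to_episode(V, E, 3)
print(H)  # 3 α's, 3 β's, then 2 α's and 2 β's, as in eq. (7)
```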

5 A class of Collision-free Episodes

In this section we introduce a new class of episodes that we call collision-free. We prove in the next section that for this class of episodes there exists a polynomial-time algorithm that finds a function f satisfying conditions (1), (2) and (3). First we introduce the necessary definitions. Two episodes A = (A, ≺, λ) and A′ = (A′, ≺′, λ′) over the alphabet Σ are isomorphic, denoted A ≅ A′, if and only if there is a bijection f : A → A′ such that (∀a, b ∈ A)(a ≺ b ⇔ f(a) ≺′ f(b)) and (∀a ∈ A)(λ(a) = λ′(f(a))). Let B ⊂ A. Then by the cutting of A on the set B we define A|B = (AB, ≺B, λB), where AB = ⋃_{x∈B} ({a ∈ A : x ≺ a} ∪ {x}), ≺B = {(a, b) ∈ ≺ : a ∈ AB ∧ b ∈ AB}, and λB = λ|AB. Let K = (K, ≺K). Then by A \ K = (A \ K, ≺ \ ≺K, λ) we define the subtraction of A and K.

Definition 2. Let A = (A, ≺, λ) be an episode. Let Nx = {b ∈ A : x ≺ b ∧ ¬(∃c ∈ A)(x ≺ c ∧ c ≺ b)} be the set of elements that are immediate successors of x ∈ A. We say that the episode A is collision-free if the following condition holds:

(∀B ⊂ A)((∀a, b ∈ B)(a ≠ b ∧ a ⊀ b ∧ b ⊀ a ∧ λ(a) = λ(b)) ⇒ (∀a, b ∈ B)(a ≠ b → A|B \ Ka ≅ A|B \ Kb)),   (9)

where Kx = ({x}, ⋃_{y∈Nx} {(x, y)}). To name the whole class of episodes that satisfy (9) we use CF. If an episode A is collision-free we write A ∈ CF, otherwise we write A ∉ CF. In particular, notice that condition (9) allows identical labels to repeat along chains of A, without any other constraints. Obviously, condition (9) holds for the whole class of injective episodes, i.e. for episodes in which every two nodes have different labels; hence every injective episode is collision-free. Practically, in most occurrences of the episode matching and mining problems the episode is relatively small and the event sequence is very long. For such episodes checking condition (9) can be done easily and quickly. Moreover, since in episode matching the episode is often reused, checking this condition, i.e. preprocessing the episode, would be done only once.



Algorithm CF-MATCH(A = (A, ≺, λ), H = h1 h2 . . . hn)
var i : integer init 1
    DONE : set init ∅
    Amin : set init ∅
    f : set init ∅
 1. Amin := min(A)
 2. while (i ≤ n) ∧ (DONE ≠ A) do
 3.   if (∃a ∈ Amin)(hi = λ(a)) then
 4.     r ∈ {a ∈ Amin : hi = λ(a)}
 5.     DONE := DONE ∪ {r}
 6.     Amin := min{A \ DONE}
 7.     f := f ∪ {(r, i)}
 8.   end
 9.   i := i + 1
10. end
11. if DONE ≠ A then return ∅
12. return f

Fig. 3. Pseudo-code of the CF-MATCH algorithm
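The pseudo-code of Fig. 3 can be rendered as the following Python sketch. This is our illustration, not the authors' code; we assume the episode is given as a node set, a strict-order relation `prec` of pairs (a, b) with a ≺ b, and a label map.

```python
# Sketch of CF-MATCH (Fig. 3). The episode representation (nodes, prec,
# label) and all names here are our assumptions for illustration.
def cf_match(nodes, prec, label, H):
    """Return an injective match f (node -> 1-based event index), or None."""
    done = set()
    f = {}

    def minimal(remaining):
        # elements of `remaining` with no ≺-predecessor inside `remaining`
        return {a for a in remaining
                if not any((b, a) in prec for b in remaining)}

    a_min = minimal(nodes - done)
    for i, h in enumerate(H, start=1):
        if done == nodes:                       # all of A matched
            break
        candidates = [a for a in a_min if label[a] == h]
        if candidates:
            r = candidates[0]   # any choice works if the episode is in CF
            done.add(r)
            a_min = minimal(nodes - done)
            f[r] = i
    return f if done == nodes else None
```

For the episode a ≺ b with λ(a) = x, λ(b) = y and the event sequence zxy, the sketch matches a to position 2 and b to position 3.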

6 Our Matching Algorithm

In this section we prove that for the class of collision-free episodes there indeed exists a polynomial time and space solution. Namely, for this class we give an algorithm whose time complexity is O(|H| + |A|²) and whose space complexity is O(|Σ| + |A|²). Our algorithm works on-line, i.e. each consecutive event of the sequence H is read only once.

6.1 Description of the Algorithm

Let A = (A, ≺, λ) and A ∈ CF. Let H = h1 h2 . . . hn be an event sequence. A pseudo-code of the algorithm CF-MATCH is given in Fig. 3. Let DONE denote the set of all already matched elements of A, and let Amin denote the set of all minimal elements of the set A \ DONE. The main idea of the algorithm is as follows. The algorithm starts by setting DONE to the empty set and Amin to the set of all minimal elements of A. In the main loop of the algorithm the current event of the sequence H, say hi, is scanned. If a match occurs, i.e. if hi = λ(a) for some a ∈ Amin, then one such element of Amin is marked as DONE and the set Amin is recalculated. The whole matching procedure is then repeated for the next event of H and the new Amin. Scanning the event sequence H continues until no more elements are left in A or the end of the event sequence H is reached. If A \ DONE becomes empty before the event sequence ends, a matching is reported, which means that the episode (A, ≺, λ) has been found in the event sequence H.

6.2 Correctness and Complexity of the Algorithm

We now prove the correctness of the algorithm CF-MATCH and analyze its running time and space complexity.

Theorem 3. (Soundness) Let A = (A, ≺, λ) and A ∈ CF. Let H = h1 h2 . . . hn be an event sequence. If algorithm CF-MATCH finds a match f, then f satisfies conditions (1), (2) and (3).


Elżbieta Nowicka and Marcin Zawada

Proof. Our algorithm can be represented as follows. Let r0 = ∅ and

    ri ∈ {a ∈ min(A \ ⋃_{j=0}^{i−1} rj) : hi = λ(a)},        (10)

for i = 1, 2, . . . , n, where ri denotes the element matched at iteration i, i.e. an element whose label is the same as the currently scanned event hi. Also note that the elements of the set {a ∈ min(A \ ⋃_{j=0}^{i−1} rj) : hi = λ(a)} are incomparable. Let i1 be the first iteration such that ri1 ≠ ∅ and, in general, let ik be the k-th iteration such that rik ≠ ∅. Hence the function f is chosen by our algorithm in the following way:

    f(rik) = ik,  for k = 1, 2, . . . , |A|.        (11)

We will show that (11) satisfies conditions (1), (2) and (3). Condition (3) holds straightforwardly. To show that condition (2) holds, assume that a, b ∈ A and a ≺ b. We have to show that f(a) < f(b). Because A = ⋃_{k=1}^{n} rk if a match has been found, there exist p, q such that a = rp and b = rq. It is easy to see that for our function (11) the condition f(a) < f(b) holds if and only if p < q. Since we choose only one element r at a time, p ≠ q. Suppose that p > q, i.e. b = rq has been matched before a = rp. In that case b = rq ∈ min(A \ ⋃_{k=1}^{q−1} rk) and a ∈ A \ ⋃_{k=1}^{q−1} rk, thus a ⊀ b, which contradicts the assumption that a ≺ b. Condition (1) holds because at each iteration only one element r is matched, so the function f has to be injective.

Now we only need to show that the way in which we choose an element ri ∈ Bi, where Bi = {a ∈ min(A \ ⋃_{j=0}^{i−1} rj) : hi = λ(a)}, does not matter in case |Bi| > 1. Assume that we have a run r0, . . . , rn of our algorithm in which it finds a match. Let i ∈ {1, . . . , n} be the first index such that ri ∈ Bi and |Bi| > 1. It is easy to see that if |Bi| > 1, then the elements of Bi are incomparable. Suppose that we now choose ri′ ∈ Bi such that ri′ ≠ ri. We need to show that

    Bi \ ({ri}, ⋃_{z∈N(ri)} {(ri, z)}) ≅ Bi \ ({ri′}, ⋃_{z∈N(ri′)} {(ri′, z)}),        (12)

where Bi = (Bi, {(a, b) ∈ ≺ : a, b ∈ Bi}). By the assumption that the episode is collision-free, we obtain that Bi = A|Bi, since the elements of Bi are incomparable and have identical labels. Hence, by (9), condition (12) holds. ⊓⊔

Theorem 4. (Completeness) Let A = (A, ≺, λ) and A ∈ CF. Let H = h1 h2 . . . hn be an event sequence. If there exists at least one matching that satisfies conditions (1), (2) and (3), then algorithm CF-MATCH finds one of them.

Proof. (Sketch) Fix the episode A = (A, ≺, λ). First we assume that the following stronger condition holds for A:

    (∀a, b ∈ A)(a ≠ b ∧ λ(a) = λ(b) ⇒ a ≺ b ∨ b ≺ a).        (13)

We can easily see that condition (13) implies (9). If condition (13) holds, then we can represent our algorithm as follows:

    B0 = ∅,  Bi = {a ∈ min(A \ ⋃_{j=0}^{i−1} Bj) : λ(a) = hi},

for i = 1, . . . , |H|, where |Bi| ≤ 1. Let F(H) be the set of all functions f for which conditions (1), (2) and (3) hold for the event sequence H. On the set F(H) we introduce the lexicographical ordering as follows. Let f, f′ ∈ F(H), then f