PDP Models Can't Learn Relational Representations

Guillermo Puebla & Leonidas A. A. Doumas

Department of Psychology, PPLS, University of Edinburgh, Scotland, UK.

Story Gestalt Model



Taking the activation of the gestalt layer at the previous time step and combining it with the input sentence at the current time step allows the model to form a "gestalt" representation of the story "so far".
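A minimal sketch of this update step is given below. The layer sizes, the sigmoid nonlinearity, and the use of PyTorch are our own illustrative assumptions; the original model's exact layer sizes and training details are not reproduced here.

```python
import torch
import torch.nn as nn

class GestaltStep(nn.Module):
    """One step of the story gestalt update: the previous gestalt and the
    current input sentence pass through an intermediate layer before the
    new gestalt is produced (a "deep transition")."""
    def __init__(self, sent_size, hidden_size, gestalt_size):
        super().__init__()
        self.combine = nn.Linear(sent_size + gestalt_size, hidden_size)
        self.to_gestalt = nn.Linear(hidden_size, gestalt_size)

    def forward(self, sentence, prev_gestalt):
        h = torch.sigmoid(self.combine(torch.cat([sentence, prev_gestalt], dim=-1)))
        return torch.sigmoid(self.to_gestalt(h))

# Process a story sentence by sentence, carrying the gestalt forward.
step = GestaltStep(sent_size=120, hidden_size=100, gestalt_size=100)
story_sentences = [torch.rand(1, 120) for _ in range(3)]   # placeholder propositions
gestalt = torch.zeros(1, 100)
for sentence in story_sentences:
    gestalt = step(sentence, gestalt)
```

The poster's note about using Adam rather than plain SGD would correspond here simply to constructing the optimizer as `torch.optim.Adam(step.parameters())`.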





Story example:
• agent = Andrew, predicate = decided-to-go, destination = beach
• agent = Andrew, predicate = drove, patient = Mercedes, destination = beach
• agent = Andrew, predicate = returned, destination = home

• We use Adam for training instead of plain SGD (more efficient).

What the model can do (see the inference, revision, and pronoun-resolution examples below):

The model fails to capture the role structure of texts with untrained concepts and of texts that break the statistical regularities of the corpus:



Statistical regularity test:

Before testing the relational processing capabilities of the model, we replicated St. John's (1992) original results to ensure that our implementation of the model was correct.


7 possible roles (agent, predicate, patient or theme, recipient or destination, location, manner, attribute), of which only the "predicate" role is mandatory.
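As a concrete, purely illustrative sketch of such an input scheme, a proposition can be encoded as one filler slot per role. The vocabulary, the one-hot coding, and the slot names below are our assumptions, not the original encoding.

```python
import numpy as np

# 'patient' stands in for patient/theme, 'recipient' for recipient/destination.
ROLES = ["agent", "predicate", "patient", "recipient", "location", "manner", "attribute"]
CONCEPTS = ["Andrew", "Lois", "decided-to-go", "drove", "returned",
            "Mercedes", "beach", "home", "<none>"]          # illustrative vocabulary

def encode_proposition(prop):
    """Concatenate one one-hot filler vector per role; unfilled roles get <none>."""
    assert "predicate" in prop                               # only 'predicate' is mandatory
    slots = []
    for role in ROLES:
        filler = prop.get(role, "<none>")
        one_hot = np.zeros(len(CONCEPTS))
        one_hot[CONCEPTS.index(filler)] = 1.0
        slots.append(one_hot)
    return np.concatenate(slots)                             # length = 7 * len(CONCEPTS)

# One sentence from the story example:
p = encode_proposition({"agent": "Andrew", "predicate": "drove",
                        "patient": "Mercedes", "recipient": "beach"})
```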




Replication of previous results

The model processes a story by taking as input a sequence of sentences, one by one, and forming a representation of the story presented so far in the gestalt layer.






However, the model can't go beyond its training data set (it can only use seen pairings).



The real problem is not how to emulate relational processing without using symbols, but how to implement symbolic operations in a neural-like architecture.



Future work:



• Andrew liked ___.
• Andrew went-to the restaurant-0.
• Andrew ordered ___.

• We plan to extend these simulations to fully distributed representations of concepts in the input layer of the model.
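As a rough sketch of that direction (the embedding size and the use of a learned embedding table are our assumptions, not the poster's implementation), concept fillers could be given distributed codes instead of dedicated one-hot units:

```python
import torch
import torch.nn as nn

# Hypothetical distributed input coding: each concept gets a learned dense vector
# rather than a dedicated one-hot unit.
concepts = ["Andrew", "Lois", "pancakes", "salad", "banana", "peaches", "<none>"]
embed = nn.Embedding(num_embeddings=len(concepts), embedding_dim=16)

def concept_vector(name):
    return embed(torch.tensor(concepts.index(name)))   # a 16-d distributed code

filler = concept_vector("Andrew")
```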

[Figure: mean violation vs. mean regular, for 0 contexts and 8 contexts.]

• Inference: if Clement goes to a restaurant, he won't tip.
• Revision: if Clement ordered expensive wine, though, he will.
• Pronoun resolution: 'he' refers to Clement if there is no other male in the story.


Generalization

Much of the positive evaluation that the Story Gestalt model has gained comes from its generalization capabilities.



St. John (1992) argued that the model is able to process untrained texts with the help of a highly combinatorial corpus.



Script example:
• ___ liked ___.
• ___ went-to the restaurant-0.
• ___ ordered ___.



Script restrictions:
• 'Andrew' likes ['pancakes', 'salad'].
• 'Lois' likes ['banana', 'peaches'].



Applying St. John's procedure involves providing cases in which Andrew likes all kinds of food in places other than the restaurant (1-8 here, only for Lois).
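The following is a rough sketch of this kind of corpus manipulation. The script templates, context labels, and story counts are our own illustrative stand-ins, not the corpus actually used in the poster.

```python
import itertools
import random

# Illustrative corpus fragments; the real training corpus and its scripts
# come from St. John (1992) and are only approximated here.
CHARACTERS = {"Andrew": ["pancakes", "salad"], "Lois": ["banana", "peaches"]}
ALL_FOODS = ["pancakes", "salad", "banana", "peaches"]
OTHER_CONTEXTS = ["kitchen-1", "kitchen-2", "picnic-1", "picnic-2",
                  "party-1", "party-2", "cafe-1", "cafe-2"]   # hypothetical labels

def restaurant_story(agent, food):
    return [{"agent": agent, "predicate": "liked", "patient": food},
            {"agent": agent, "predicate": "went-to", "location": "restaurant-0"},
            {"agent": agent, "predicate": "ordered", "patient": food}]

def training_corpus(n_extra_contexts):
    """Restaurant stories respect the script restrictions; the extra contexts
    pair each character with every food, which is what is varied (0 vs. 8)."""
    stories = [restaurant_story(a, f) for a, foods in CHARACTERS.items() for f in foods]
    for context in OTHER_CONTEXTS[:n_extra_contexts]:
        for agent, food in itertools.product(CHARACTERS, ALL_FOODS):
            stories.append([{"agent": agent, "predicate": "liked",
                             "patient": food, "location": context}])
    random.shuffle(stories)
    return stories

corpus_0 = training_corpus(0)   # no combinatorial support
corpus_8 = training_corpus(8)   # highly combinatorial corpus
```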




• It has been argued that the Story Gestalt model forms an overall representation of the story presented so far (in the gestalt layer). Machine learning research has shown that "deep transition" RNNs are powerful time-series learning machines (Pascanu et al., 2013).
• Other related RNN architectures that don't have a straightforward interpretation as forming a "gestalt" representation could show the same behavior as the Story Gestalt model (e.g., LSTM, GRU).
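For concreteness, a GRU cell could be dropped into the same sentence-by-sentence regime sketched earlier, with no layer that is naturally read as a "gestalt". Sizes are illustrative; this is not the architecture used in the poster.

```python
import torch
import torch.nn as nn

# A standard gated recurrent cell carries the story state instead of a gestalt layer.
cell = nn.GRUCell(input_size=120, hidden_size=100)

state = torch.zeros(1, 100)
story_sentences = [torch.rand(1, 120) for _ in range(3)]   # placeholder propositions
for sentence in story_sentences:
    state = cell(sentence, state)                          # same sentence-by-sentence regime
```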



References

Hummel, J. E., & Holyoak, K. J. (2003). A symbolic-connectionist theory of relational inference and generalization. Psychological Review, 110, 220–264.

Marcus, G. F. (1998). Rethinking eliminative connectionism. Cognitive Psychology, 37, 243–282.

Pascanu, R., Gulcehre, C., Cho, K., & Bengio, Y. (2013). How to construct deep recurrent neural networks. arXiv:1312.6026.

Rogers, T. T., & McClelland, J. L. (2008). Précis of semantic cognition: A parallel distributed processing approach. Behavioral and Brain Sciences, 31(6), 689.

Rogers, T. T., & McClelland, J. L. (2014). Parallel distributed processing at 25: Further explorations in the microstructure of cognition. Cognitive Science, 38(6), 1024–1077.

St. John, M. F. (1992). The story gestalt: A model of knowledge-intensive processes in text comprehension. Cognitive Science, 16, 271–306.

St. John, M. F., & McClelland, J. L. (1990). Learning and applying contextual constraints in sentence comprehension. Artificial Intelligence, 46, 217–257.

New concepts test:

[Figure: mean violation vs. mean regular, 0 contexts and 8 contexts; condition labels include new patient, new agents, no agent, and cross-script patient.]

Introduction

The ability of PDP models to capture abstract relational knowledge has been put into question repeatedly (e.g., Hummel & Holyoak, 2003; Marcus, 1998).

Recently, Rogers and McClelland (2008, 2014) have argued that 'gestalt' PDP models (St. John & McClelland, 1990; St. John, 1992) can learn to bind abstract roles to objects and use these representations to make inferences based on the relational structure of the input.

Take for example the Story Gestalt model (St. John, 1992). If this model is presented with a story about a restaurant where Albert plays the role of the agent ordering food, the model predicts that Albert will eat the food and pay the bill instead of bringing the bill or receiving the payment.

Here, we show that even though the model is able to draw complex inferences based on the statistical regularities of its training corpus, it fails to capture the relational structure of input that breaks these statistical regularities.

The Story Gestalt model is able to draw complex inferences based on the statistical regularities of its training corpus. At first glance, it looks like the model is making inferences based on abstract roles.

Discussion

Effect of applying a highly combinatorial corpus:






The core architectural feature of the model is the use of a "deep transition" RNN.
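As a sketch of what such a transition computes (the symbols and the choice of sigmoid nonlinearity are illustrative assumptions, not taken from the original implementation):

$$h_t = \sigma(W_s s_t + W_g g_{t-1} + b_h), \qquad g_t = \sigma(W_h h_t + b_g)$$

Here $s_t$ is the current input sentence and $g_{t-1}$ is the gestalt from the previous time step; the intermediate hidden layer $h_t$ between $g_{t-1}$ and $g_t$ is what makes the transition "deep".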



