Parallel Processing Capability Versus Efficiency of Representation in Neural Networks

Sebastian Musslick1 ([email protected]), Biswadip Dey2 ([email protected]), Kayhan Özcimder2 ([email protected]), Md. Mostofa Ali Patwary3 ([email protected]), Theodore L. Willke3 ([email protected]), and Jonathan D. Cohen1 ([email protected])

1 Princeton Neuroscience Institute, Princeton University
2 Department of Mechanical and Aerospace Engineering, Princeton University
3 Parallel Computing Lab, Intel Corporation

Introduction

A key feature of neural networks is their ability to support the simultaneous interaction among large numbers of processes in the learning and processing of representations. However, how the richness of such interactions trades off against the ability of a network to simultaneously carry out multiple independent processes, a salient limitation in many domains of human cognition, remains largely unexplored. In this work we address this question using graph-analytic tools in combination with neural network simulations. We describe initial findings that we hope will help lay the groundwork for a rigorous exploration of the factors that affect the tension between efficiency of learning, generality of representation, and parallel processing capability (PPC) in neural networks.
Methods

For the purpose of our graph-analytic explorations, we consider a bipartite graph of task representations as a single-layer, feed-forward network in which edges between input and output nodes constitute a process (referred to as a task). Each task corresponds to a unique mapping from an input representation to an output representation (simplified as input and output nodes, respectively). Based on the assumption that cross-talk between processes arises from shared representations, we derive an interference graph in which nodes correspond to tasks and edges represent interference between pairs of tasks. Finding the maximum independent set of this graph corresponds to finding the PPC of the network, i.e., the largest set of tasks that can be performed in parallel without interference. We investigate how network size (the number of input/output nodes) and the degree of process overlap (the out-degree of input nodes) affect the PPC across different networks.

We also describe how the interference graph can be extracted from task representations encoded in neural networks or from neuroimaging data, and use this to compare predictions about multitasking performance for specific task combinations with actual multitasking performance in trained two-layer neural networks. Finally, we analyze how multitasking capability trades off against the degree to which shared representations develop as a function of feature overlap across tasks. To further investigate the tradeoff between efficiency of representation and multitasking capability, we assessed learning speed, generalization performance, and multitasking performance for networks in which weight vectors for similar tasks are either initially correlated or de-correlated (inducing a bias towards shared or separate task representations).
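A minimal sketch of the graph-analytic procedure is given below (Python with networkx). The function names are ours, and the pairwise interference rule (two tasks interfere whenever they share an input or an output node) is an illustrative assumption; the criterion used to build the interference graph in the actual simulations may be more general.

```python
# Hypothetical sketch: compute PPC as the maximum independent set of an
# interference graph. Tasks are the edges of a random single-layer,
# bipartite network; the interference rule (two tasks interfere iff they
# share an input or an output node) is an illustrative assumption.
from itertools import combinations
import random

import networkx as nx


def random_task_network(n_nodes, out_degree, seed=0):
    """Each of n_nodes input nodes maps to out_degree distinct output
    nodes; every input-output edge is one task."""
    rng = random.Random(seed)
    return [(i, o)
            for i in range(n_nodes)
            for o in rng.sample(range(n_nodes), out_degree)]


def interference_graph(tasks):
    """Nodes are tasks; edges link pairs of tasks that share a node."""
    g = nx.Graph()
    g.add_nodes_from(tasks)
    for (i1, o1), (i2, o2) in combinations(tasks, 2):
        if i1 == i2 or o1 == o2:
            g.add_edge((i1, o1), (i2, o2))
    return g


def ppc(tasks):
    """PPC = size of the maximum independent set, computed exactly as a
    maximum clique of the complement graph (adequate for small networks)."""
    complement = nx.complement(interference_graph(tasks))
    independent_set, _ = nx.max_weight_clique(complement, weight=None)
    return len(independent_set)


print(ppc(random_task_network(n_nodes=6, out_degree=2)))
```

Sweeping n_nodes and out_degree in such a sketch is one way to explore how PPC varies with network size and process overlap (cf. Fig. 1A).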
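The weight-prior manipulation can be sketched in the same spirit. The construction below (a shared random template blended into each task's initial weight vector, with a blend parameter rho) is purely our assumption for illustration; the text does not specify how correlated initial weights were produced, and this sketch treats all tasks as mutually similar for simplicity.

```python
# Hypothetical sketch: initialize per-task weight vectors as either
# correlated (a bias towards shared representations) or de-correlated
# (a bias towards separate representations). The blending construction
# and the parameter `rho` are illustrative assumptions.
import numpy as np


def initial_task_weights(n_tasks, n_features, correlated, rho=0.8, seed=0):
    """Return one initial weight vector per task (one row per task)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_tasks, n_features))
    if not correlated:
        return noise  # independent rows: de-correlated initial weights
    # Blend a shared template into every row; the expected correlation
    # between any two rows is then rho**2.
    template = rng.standard_normal(n_features)
    return rho * template + np.sqrt(1.0 - rho**2) * noise
```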
Results and Discussion

In line with earlier numerical work (1), we found that the average PPC of single-layer networks drops precipitously as a function of process overlap, and scales highly sublinearly with network size (Fig. 1A). Our simulations show that the extracted interference graph of a trained neural network can predict how well the network performs a given set of tasks simultaneously (Fig. 1B). Under the assumption of task-set inertia, parallel-processing limitations extend to networks in which multitasking is implemented sequentially, by switching between tasks as rapidly as possible: tasks predicted to be interfering were subject to greater switch costs that constrained serial multitasking performance (Fig. 1C). In accordance with observations that shared task representations are likely to develop in environments with high correlations between task-relevant features, we observe that the multitasking capability of trained networks drops in such environments. Finally, we observe that weight priors on task similarity improve learning speed and generalization performance but impose strong constraints on PPC. Our simulation results identify a tradeoff between learning efficiency and multitasking capability, suggesting that the multitasking limitations of a neural architecture may arise as the result of a bias towards shared representations, reflecting the value of learning efficiency.
Figure 1: Graph-analytic and neural network simulation results. (A) Graph analysis of PPC: parallel processing capability as a function of process overlap, for different network sizes. (B) Predicting multitasking performance: network performance (MSE) as a function of the total number of tasks performed, for interfering versus non-interfering task sets. (C) Task switching: network performance (MSE) over time steps after a task switch, for interfering versus non-interfering tasks.
References

1. S. F. Feng, M. Schwemmer, S. J. Gershman, J. D. Cohen, Cognitive, Affective, & Behavioral Neuroscience 14, 129 (2014).