INVESTIGATION OF AUTOMATED TASK LEARNING, DECOMPOSITION AND SCHEDULING
DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING
COLLEGE OF ENGINEERING AND TECHNOLOGY
OLD DOMINION UNIVERSITY
NORFOLK, VIRGINIA 23529

By

David L. Livingston, Principal Investigator
Gursel Serpen and Chandrashekar L. Masti, Graduate Research Assistants

Final Report
For the period ended February 28, 1990

Prepared for the
National Aeronautics and Space Administration
Langley Research Center
Hampton, Virginia 23665

Under
Research Grant NAG-1-962
Donald Soloway, Technical Monitor
ISD-Automation Technology Branch

Submitted by the
Old Dominion University Research Foundation
P.O. Box 6369
Norfolk, Virginia 23508-0369

July 1990

Table of Contents

1 Introduction and Organization
  1.1 Introduction
  1.2 Organization

2 Graph Search
  2.1 Introduction
  2.2 Problem Definition
  2.3 The Boltzmann Machine Implementation
  2.4 Energy Function
  2.5 Example Application
  2.6 Conclusions

3 Finding Partitions
  3.1 Introduction
  3.2 Constraint Satisfaction Neural Networks
  3.3 Partitions and Decomposition
  3.4 Theoretical Development
  3.5 Derivation of the Energy Function
  3.6 Algebraic Basis of the Derivation
  3.7 The Transitivity Constraint
  3.8 The Substitution Property Constraint
  3.9 Ground State Characteristic of the Energy Function
  3.10 Interconnection Strengths and the Activation Rule
  3.11 Boltzmann Machine with Classical Simulated Annealing
  3.12 Conclusions
  3.13 Future Research

4 Adaptive Constraint Satisfaction
  4.1 Introduction
  4.2 Hopfield Networks and the Adaptation Algorithm
  4.3 An Adaptive Constraint Satisfaction Network
  4.4 Simulation Results
  4.5 Conclusions
  4.6 Future Research

References

INVESTIGATION OF AUTOMATED TASK SCHEDULING AND DECOMPOSITION

by

David L. Livingston (1), Gursel Serpen (2) and Chandrashekar L. Masti (3)

1 Introduction and Organization

1.1 Introduction

This document reports on the details and results of research conducted on the application of neural networks to task planning and decomposition. Task planning and decomposition are operations that humans generally perform well and usually in a reasonably efficient manner. Without human interaction, automatic planners and decomposers do not perform with human-like performance, much due to the intractable nature of such problems as planning and decomposition. The promise that neural networks have shown in generating acceptable solutions to intractable problems was the primary reasoning behind attempting the study reported herein: the use of neural network heuristics for the planning and decomposition problems under consideration.

----------
(1) Assistant Professor, Department of Electrical and Computer Engineering, Old Dominion University, Norfolk, VA 23529.
(2) Graduate Research Assistant, Department of Electrical and Computer Engineering, Old Dominion University, Norfolk, VA 23529.
(3) Graduate Research Assistant, Department of Electrical and Computer Engineering, Old Dominion University, Norfolk, VA 23529.

The basis for the work is the use of state machines to model tasks. State machine models provide a useful means for examining the structure of tasks since many formal techniques have been developed for their analysis and synthesis. It has been our approach to integrate the strong algebraic foundations of state machines with the heretofore trial-and-error approach to neural network synthesis.

1.2 Organization

The research we have performed can be broken down into three broad categories. The first category, reported in section 2, examines the use of a type of neural network, called constraint satisfaction networks, to plan a task by finding the transfer sequence of a state machine representing the task. The necessary background, theory and an example are presented in this section.

The second category, section 3, deals with constraint satisfaction networks used in the search for structures called s.p. partitions, which are essential in the state machine decomposition process. As in section 2, background, theory and an example are included.

The final category, section 4, deals with a technique that was developed to overcome some of the shortcomings of the constraint satisfaction networks studied in the previous sections. We call this new technique an adaptive constraint satisfaction network and report on some of the primary results achieved at this point.

2 Graph Search

2.1 Introduction

The main action of task planning is searching in the presence of constraints, i.e., the constrained search of a state space [1]. Conducting real-time searches of a very large state space in a serial fashion is not effectively realized; there is thus a need for methods that can perform the search in a parallel and distributed manner. Neural networks are examples of implementations of algorithms that can perform constraint satisfaction searches in parallel. Two well-known examples of neural network paradigms which perform such searches are the Hopfield network [2], [3], [4], [5] and the Boltzmann machine [6], [7].

Both paradigms perform searches which are defined by a performance function and the associated conditions for the update of the neurons in the network. In the case of a Hopfield network, which uses a deterministic update method, the neurons are updated asynchronously and in random order. In practice, a sequential update schedule is employed to prevent the network from getting trapped in limit cycles. The local minimum into which the network settles is determined by the interconnection parameters, the initial conditions and the order in which the neurons are updated.

In the case of the Boltzmann machine, which uses a stochastic update technique, each local minimum of the performance function may be visited with a certain probability irrespective of the network's initial state, given a sufficiently long annealing schedule. Once the constraints are defined, the character of the search space topology is solely determined by the interconnection parameters. The tradeoff for the insensitivity to initial conditions is the time introduced by the annealing schedule.

The types of task planning problems that are considered in this study are purely constraint satisfaction problems; that is, there are no costs involved. In these types of problems, referred to as syntactic constraint satisfaction problems, local minima of the performance function generally do not correspond to solutions as in the case of optimization problems. Hence it is necessary to find the global minimum of the performance function to find a solution which does not violate any of the constraints. Since the Boltzmann machine convergence process is independent of network initial conditions, it has a better chance of ending up in the global minimum as compared to the Hopfield network. Therefore the Boltzmann machine is preferable over the Hopfield network for syntactic constraint satisfaction problems if the time degradation introduced by the annealing schedule can be tolerated.

2.2 Problem Definition

Our purpose for the research reported herein is to demonstrate the use of constraint satisfaction networks to perform the search for the shortest, viable path between defined initial and final states in a state space. The resulting path represents a plan for executing the task over which the state space is defined. In our analogy, we are searching for the transfer sequence of a state machine for a given initial and final state pair.

A directed graph (digraph) is a functional definition for a state machine [8] and is used as an abstract model for the transfer sequence search problem. Vertices of the graph represent states and directed edges stand for transitions between associated states. By using a proper representation, a constraint satisfaction network can be constructed such that the final state the network settles into represents the shortest path through the digraph.

The shortest path between two vertices of a given directed graph can be defined as a subgraph which meets the following constraints:

1) the subgraph representing a path is both irreflexive and asymmetric,
2) each vertex except the source and target vertices must have in-degrees and out-degrees of exactly 1,
3) the source vertex has an in-degree of 0 and an out-degree of 1,
4) the target vertex has an in-degree of 1 and an out-degree of 0, and
5) the length of the path is equal to that power of the adjacency matrix which has the first nonzero entry in the row and column locations defined by the source and target vertices respectively.
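To make the constraints concrete, the following is a minimal sketch (not taken from the report) that tests constraints 1 through 4 on a candidate subgraph represented as a 0/1 adjacency matrix; the length condition of constraint 5 is omitted for brevity, and the function name and NumPy usage are illustrative assumptions.

    import numpy as np

    def is_path_subgraph(A, source, target):
        """Check constraints 1-4 above for a candidate subgraph given by the
        0/1 adjacency matrix A (A[i, j] = 1 means an edge from vertex i to j)."""
        A = np.asarray(A)
        n = A.shape[0]
        # Constraint 1: irreflexive (no self-loops) and asymmetric.
        if np.any(np.diag(A) == 1) or np.any((A == 1) & (A.T == 1)):
            return False
        out_deg = A.sum(axis=1)
        in_deg = A.sum(axis=0)
        for v in range(n):
            if v == source:
                # Constraint 3: source has in-degree 0 and out-degree 1.
                if in_deg[v] != 0 or out_deg[v] != 1:
                    return False
            elif v == target:
                # Constraint 4: target has in-degree 1 and out-degree 0.
                if in_deg[v] != 1 or out_deg[v] != 0:
                    return False
            elif in_deg[v] != 0 or out_deg[v] != 0:
                # Constraint 2: a vertex that is on the path (touches an edge)
                # must have in-degree and out-degree of exactly 1.
                if in_deg[v] != 1 or out_deg[v] != 1:
                    return False
        return True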

2.3 The Boltzmann Machine Implementation

From a topological viewpoint a Boltzmann machine can be visualized as a symmetric irreflexive directed graph where the graph nodes and weighted edges represent the computation nodes (neurons) and interconnections respectively. Each neuron output is binary valued. The activation function is defined as

    P_i(s_i = 1) = 1 / (1 + exp(-net_i / T))

where P_i is the probability that s_i, the activation of neuron i, is equal to 1, T is a time-varying computational parameter analogous to temperature, and net_i is the input sum to unit i. The term net_i for a typical second-order machine is defined by

    net_i = sum_j w_ij s_j + b_i

where w_ij is the connection weight between s_i and s_j, and b_i is a bias for s_i. Neurons in a Boltzmann machine are updated asynchronously at discrete time steps.

A Boltzmann machine is able to escape local minima through the use of "thermally" induced noise [9]. The parameter T is analogous to temperature: it is assigned a large initial value and is gradually decreased until it reaches a predetermined value. The machine effectively starts in a very noisy mode and the noise is reduced in a manner analogous to annealing. A sufficient condition for converging to a global minimum with probability approaching one is to use an annealing schedule of the form

    T(k) = T_0 / log(1 + k)

where T_0 is a sufficiently high initial temperature and k is the discrete time-step. Since this schedule only guarantees convergence as k approaches infinity, in practice an upper bound on the convergence probability, necessarily less than one, is established by the time-limitations of the application under consideration.

The probability of the machine converging to a state S_n, where a state is defined to be the vector of all neuron values in the network, is given by

    P(S_n) = exp(-E(S_n)/T) / sum_i exp(-E(S_i)/T)

where E(S_n) is the energy associated with the state S_n and the index i implies a sum over the 2^N states of an N-neuron network [6]. Thus the Boltzmann machine will seek out and settle into those states corresponding to a low value of the energy function with high probability. If the energy function is defined such that the minimum energy values correspond to the states which meet the constraints of the problem, then the network will tend to settle into a state representing a solution with high probability.

The network topology employed to search for the shortest path in a digraph is an N x N array of neurons representing the adjacency matrix of a given directed graph. Each neuron of the network stands for an entry of the adjacency matrix and thus for an edge of the directed graph.
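As an illustration of the update and annealing process described above, the following is a minimal sketch, not taken from the report: it assumes a symmetric NumPy weight matrix w with zero diagonal, a bias vector b and an optional clamp mask, and applies the logistic activation probability with the logarithmic annealing schedule T(k) = T_0 / log(1 + k). Names and default parameter values are illustrative.

    import numpy as np

    def anneal_boltzmann(w, b, steps=200, T0=5.0, Tmin=0.01, clamp=None, rng=None):
        """Asynchronous stochastic update of a second-order Boltzmann machine
        with the logarithmic annealing schedule T(k) = T0 / log(1 + k).
        clamp[i] = 0 forces neuron i to stay at 0 (clamped hypotheses)."""
        rng = np.random.default_rng() if rng is None else rng
        n = len(b)
        s = rng.integers(0, 2, size=n).astype(float)   # random initial state
        if clamp is not None:
            s[clamp == 0] = 0.0
        for k in range(1, steps + 1):
            T = max(T0 / np.log(1.0 + k), Tmin)
            for i in rng.permutation(n):                # asynchronous, random order
                if clamp is not None and clamp[i] == 0:
                    continue
                net = w[i] @ s + b[i]
                p_on = 1.0 / (1.0 + np.exp(-net / T))   # P(s_i = 1)
                s[i] = 1.0 if rng.random() < p_on else 0.0
        return s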

2.4 Energy Function

The general form of the quadratic performance function a Boltzmann machine minimizes is

    E(S) = -1/2 sum_i sum_j w_ij s_i s_j + sum_i b_i s_i

where w_ij is the weight between neurons s_i and s_j and b_i is a bias. The weight can be defined by

    w_ij = sum_n K_n delta_nij

where K_n is in R+ if the hypotheses represented by s_i and s_j are mutually supporting under constraint n and K_n is in R- if they are mutually conflicting. The term delta_nij is equal to 1 if the two hypotheses represented by the nodes s_i and s_j are related under the constraint n and is 0 otherwise.

Given the topological constraints a path specification has to satisfy, the next step required to define the path search problem for constraint satisfaction is to translate the constraints into the topology of the neural network.

An irreflexive graph has all diagonal entries of its adjacency matrix equal to zero, which under the graph-theoretic representation translates into clamping all neuron outputs along the main diagonal of the Boltzmann machine to 0 (constraint #1). Hence the term delta_1ij = 1 and K_1 in R- if and only if the row index of node s_i equals the column index of node s_i; otherwise delta_1ij = 0. Figure 2.1 illustrates the implementation of this constraint.

Asymmetry of the path with respect to the main diagonal of the adjacency matrix requires that only one of the two entries located at symmetric positions be equal to 1 (constraint #2). The term delta_2ij = 1 and K_2 in R- if and only if the row index of node s_i equals the column index of node s_j and the column index of node s_i equals the row index of node s_j; otherwise delta_2ij = 0.

A digraph node with an in-degree of 1 requires only a single 1 to exist in the corresponding column of the adjacency matrix; similarly, a node with an out-degree of 1 requires only a single 1 in the corresponding row (constraint #3). The term delta_3ij = 1 and K_3 in R- if and only if the row index of node s_i equals the row index of node s_j or the column index of node s_i equals the column index of node s_j; otherwise delta_3ij = 0. Constraints #2 and #3 are illustrated in figure 2.2.

Figure 2.1. Constraint #1.

Figure 2.2. Constraints #2 and #3.

A path specification imposes the condition that any node that is not a source or target and is included in the path must have an in-degree and out-degree of 1 (constraint #4). In terms of the adjacency matrix, if there exists a 1 in a particular row/column, then there must exist a 1 in the column/row that has the same node label as the corresponding row/column. The term delta_4ij = 1 and K_4 in R+ if and only if the row index of node s_i equals the column index of node s_j or the column index of node s_i equals the row index of node s_j; otherwise delta_4ij = 0. This constraint results in an excitatory connection, as shown in figure 2.3.

2.5 Example Application

In order to test the proposed network, a digraph with a specially defined adjacency matrix was employed. The digraph had all of its lower triangular adjacency matrix entries equal to 1; e.g., letting a_ij be the entry in the i-th row and j-th column of A, the adjacency matrix of the digraph, a_ij = 1 if and only if j <= i + 1, as shown in Table 2.1 for an example digraph with 10 vertices. The entries of the adjacency matrix which are equal to 0 are equivalently clamped to 0 in the Boltzmann machine network.

The path specification mapped onto this network yields a path of length N-1 between vertices V0 and V9; this is the largest possible path for an N-vertex digraph, implying a difficult state space to search. Another feature of this digraph is that its many possible circuits correspond to local minima of the quadratic performance function. Since it is known that a path is irreflexive, we can clamp to 0 all those neurons which represent the hypothesis that the path has a self-loop

Figure 2.3. Constraint #4.

         V0  V1  V2  V3  V4  V5  V6  V7  V8  V9
    V0    1   1   0   0   0   0   0   0   0   0
    V1    1   1   1   0   0   0   0   0   0   0
    V2    1   1   1   1   0   0   0   0   0   0
    V3    1   1   1   1   1   0   0   0   0   0
    V4    1   1   1   1   1   1   0   0   0   0
    V5    1   1   1   1   1   1   1   0   0   0
    V6    1   1   1   1   1   1   1   1   0   0
    V7    1   1   1   1   1   1   1   1   1   0
    V8    1   1   1   1   1   1   1   1   1   1
    V9    1   1   1   1   1   1   1   1   1   1

    Table 2.1. The adjacency matrix of a specially defined digraph.

for vertex Vi. In order to map the in-degree of 0 for a source vertex, the neurons belonging to the column labelled by that vertex are clamped to 0 and similarly, the neurons of the row labelled by the target vertex are clamped to 0 to realize the requirement that the out-degree of the target vertex is equal to 0.

Assume that we are looking for a path starting with vertex V0 and ending with vertex V9. Clearly, there exists only one path which includes the adjacency matrix entries above the main diagonal, as depicted by table 2.2. This path represents the solution to the planning problem. The neuron outputs which take part in computations were randomly initialized. The initial starting temperature was selected such that at least 80% of the neuron output probabilities belonged to the interval of reals given by [0.4, 0.6]. The set of parameters listed in table 2.3 was determined by trial and error and resulted in convergence to the state vector which represented the solution illustrated in table 2.2 in over 90% of all trial runs.
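The following is a minimal sketch, not from the report, of how the pairwise constraint terms above could be assembled into the weights w_ij = sum_n K_n delta_nij for the N x N neuron grid, together with the clamping of the main diagonal, the source column and the target row; constraint #1 is realized here by clamping rather than by a pairwise weight, the default gains loosely follow table 2.3, and the indexing conventions and function names are assumptions.

    import numpy as np

    def build_weights(N, K2=-2.0, K3=-2.0, K4=1.0):
        """Pairwise weights for the shortest-path network: neuron (r, c)
        stands for adjacency-matrix entry a_rc, flattened to index r * N + c."""
        M = N * N
        W = np.zeros((M, M))
        for r1 in range(N):
            for c1 in range(N):
                i = r1 * N + c1
                for r2 in range(N):
                    for c2 in range(N):
                        j = r2 * N + c2
                        if i == j:
                            continue
                        w = 0.0
                        # Constraint #2: the two symmetric entries (r, c), (c, r) conflict.
                        if r1 == c2 and c1 == r2:
                            w += K2
                        # Constraint #3: at most a single 1 per row and per column.
                        if r1 == r2 or c1 == c2:
                            w += K3
                        # Constraint #4: a 1 in a row/column supports a 1 in the
                        # column/row with the same node label (path continuation).
                        if c1 == r2 or r1 == c2:
                            w += K4
                        W[i, j] = w
        return W

    def clamp_mask(N, source, target):
        """0 = clamp neuron to 0: main diagonal, source column, target row."""
        mask = np.ones((N, N))
        np.fill_diagonal(mask, 0)      # constraint #1 (irreflexive path)
        mask[:, source] = 0            # source vertex has in-degree 0
        mask[target, :] = 0            # target vertex has out-degree 0
        return mask.reshape(N * N)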

2.6 Conclusions

We have demonstrated the use of a second-order Boltzmann machine to search for the shortest path in a directed graph. One important difficulty is the lack of a proper methodology for mapping a given problem into the constraint satisfaction problem domain. It seems that discrete mathematics combined with abstract algebra may provide a rich source of tools to help with the mapping problem. Another issue is the need for a heuristic approach, combined with a trial and error search, for the determination of the correct set of values for the gain parameters of the energy function. Chapter 4 details the incorporation of an adaptive component into the constraint satisfaction search algorithm so that the neural network can learn while searching. This eliminates the need for the trial and error search that is required to determine the values of the gain parameters.

         V0  V1  V2  V3  V4  V5  V6  V7  V8  V9
    V0    0   1   0   0   0   0   0   0   0   0
    V1    0   0   1   0   0   0   0   0   0   0
    V2    0   0   0   1   0   0   0   0   0   0
    V3    0   0   0   0   1   0   0   0   0   0
    V4    0   0   0   0   0   1   0   0   0   0
    V5    0   0   0   0   0   0   1   0   0   0
    V6    0   0   0   0   0   0   0   1   0   0
    V7    0   0   0   0   0   0   0   0   1   0
    V8    0   0   0   0   0   0   0   0   0   1
    V9    0   0   0   0   0   0   0   0   0   0

    Table 2.2. The path between V0 and V9.

    Parameter   K1   K2   K3   K4
    Value       -1   -2   -2    1    1

    Table 2.3. Gain parameter values.

3 Finding Partitions

3.1 Introduction

A significant portion of the effort involved in generating a decomposition of a task consists of analyzing the structure of the state machine which models the task and determining the decompositions it admits. The problem of finding the decompositions of sequential machines is essentially governed by the elements of the machine's state transition structure and has been approached from an algebraic perspective pioneered by Hartmanis and Stearns [10], which relies on partition algebra over the state set of the machine. The partitions generated by a given machine structure determine the existing decompositions. Currently existing techniques for obtaining these partitions essentially rely on performing an exhaustive search, a process which is known to exhibit combinatorially explosive growth with problem input size and which consequently imposes intemperate demands on computation resources.

The recent resurgence of interest in artificial neural nets has drawn keen attention from researchers who have observed that these models demonstrate the possibility of addressing computationally "hard" problems, including such intractable problems as the travelling salesman problem [2], the N-queens problem [11] and graph partitioning [12]. The potential capabilities of neural nets in addressing intractable problems offer a new approach to the decomposition problem and provide a serious motivation for furthering the exploration into the development of a neural network technique for generating partitions.

3.2 Constraint Satisfaction Neural Networks

Artificial neural net models are also known as parallel distributed processing (PDP) or connectionist models [1]. The fundamental assumption for these models is that information processing takes place through the interactions between a large number of simple processing elements called units or neurons, where each can send either excitatory or inhibitory signals to other units in the system.

In the application of neural nets to constraint satisfaction, the individual neurons are used to represent hypotheses, and the activations of the neurons are analogous to the validity associated with the different hypotheses. The constraints known to exist between the hypotheses are represented by weighted interconnections between the neurons. When performing constraint satisfaction in this fashion, neural net models demonstrate the capability for optimization.

The computation is performed by an iterative relaxation search in which a proposed network state is progressively improved by reducing a well defined "objective" or "energy" function. The energy function measures the extent to which the current state of the network violates the stipulated constraints. Each network starts with a randomly chosen initial state, and the activation rule used for updating the activity levels of the neurons is so defined that the network energy shows a general decline with every iteration. The eventual low-energy (minimal) states to which the network converges can be interpreted as the required "good" or valid solutions to the constraint satisfaction problem.

3.3 Partitions and Decomposition

A partition pi on a set S is a division of S into disjoint subsets S_k, such that S_i intersect S_j is empty for all i not equal to j, and S = union of the S_k. Each S_k is called a block of pi. It is usual to represent a partition by writing overlines atop the disjoint subsets of set members, the blocks being separated by semicolons. An example of a partition is:

    pi = { 0, 1, 3, 5 ; 2, 4 }

The theory of state machine decomposition is based on partition algebra over the state set of the machine. Mathematically, a finite state machine M is characterized by a three-tuple M = <S, I, {delta}>, where S is the state set of the machine, I is its input set and {delta} is a set of state transition functions (or "next-state" mappings).

With a state machine characterized by a three-tuple as defined above, we present the formal definition of the most significant special structure property exhibited by partitions over the state set S = {s_1, s_2, s_3, ..., s_n} of a machine M, the property of substitution. A partition pi on the state set S of a state machine M is said to satisfy the substitution property (s.p.) if, for all s_i, s_j in S which are in the same block of pi and any input i_k in I, the states delta(i_k, s_i) and delta(i_k, s_j) are also in a common block of pi.

For any n-state machine M, there always exist two trivial s.p. partitions, denoted by pi(0) and pi(I), where

    pi(0) = { s_1 ; s_2 ; s_3 ; ... ; s_n }   and   pi(I) = { s_1, s_2, s_3, ..., s_n }.
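As a concrete illustration of the substitution property defined above, the following sketch (not from the report) checks whether a candidate partition has the s.p. for a machine given as a next-state table; the data layout and function name are assumptions.

    def has_substitution_property(delta, partition):
        """delta[input][state] -> next state; partition is a list of blocks,
        e.g. [[0, 2, 4], [1, 9, 17], ...], covering every state exactly once."""
        block_of = {s: b for b, block in enumerate(partition) for s in block}
        for block in partition:
            for ik in range(len(delta)):
                # The images of all states of a block under input ik must fall
                # into a common block for the partition to have the s.p.
                images = {block_of[delta[ik][s]] for s in block}
                if len(images) > 1:
                    return False
        return True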

Decomposition theory guarantees that if the product of any two s.p. partitions for a state machine M equals the trivial partition pi(0), then M can be directly decomposed into two independent machines M1 and M2 operating in parallel.

3.4 Theoretical Development

In developing a neural network approach to the partition search problem, the primary issue requiring attention is that of a proper representation. Partitions can be represented by means of relation matrices encoding the equivalence relations between states, which suggests an N x N representation for a machine with N states. We therefore propose the use of a system of N x N neurons, organized to compute in parallel, to serve as the representation: a neuron at row-i, column-j of the grid establishes a one-to-one correspondence with the relation between states i and j of the state machine, and the neuron is required to be in the "on" state whenever those states are related. The problem is thus decomposed into two parts: incorporating the representation into the network, and then deriving the energy function for the network. The derivation of the energy function is based on the identification of the constraints for the problem, a determination of the order of the terms involved and the derivation of the appropriate interconnections.

3.5 Derivation of the Energy Function

By definition, partitions satisfying the substitution property define congruence relations over the state set. Congruence relations are classical equivalence relations with the added requirement that they imply image equivalence: whenever two states are established as equivalent, their corresponding next-state images are implicitly equivalent as well.

Thus, s.p. partitions must possess the equivalence relation properties of reflexivity, symmetry and transitivity, as well as the (implication) property of image equivalence. The reflexive and symmetric relations can be implicitly encoded in the representation scheme by employing the following method:

1. Diagonal neurons of the N x N grid are clamped in the "ON" state before the computation starts, since this is representative of the reflexive relation of a state with itself. The need to enforce the reflexive constraint in the search process has therefore been obviated.

2. Symmetry is satisfied by employing only the upper triangular portion of the N x N neuronal grid and ensuring that the off-diagonal neurons in the lower triangular portion maintain exact correspondence with it.

With the above given, the problem domain has been noticeably reduced; the remaining properties which it becomes necessary to enforce are transitivity and the state-image congruence, i.e., the substitution property. Thus, the energy function of the network must essentially consist of two parts. The first part encodes the transitivity constraint, and the second part encodes the state-image congruence of the state machine under its stimulus (input) set, mapping the complete state transition behavior of the machine to the network dynamics. Thus, if we denote the energy function of the network by E, we may write

    E = k1*E1 + k2*E2,

where E1 is the energy term due to the transitivity constraint and E2 is the energy term due to the

state-image congruence (s.p. property or next-state function) constraint. The parameters k1 and k2 are known as gain terms or weights and assign the relative importance of each constraint.

3.6 Algebraic Basis of the Derivation

The neurons of the neural net are binary: they have only one of two possible states, the "ON" state or the "OFF" state. We may therefore regard the neuron activation variables as binary (Boolean) variables and use techniques from Boolean algebra and algebraic functional dependencies for a systematic method of derivation of the constraint terms of the network energy function. This is the basis for our method of derivation.

3.7 The Transitivity Constraint

A relation R is transitive iff a R b and b R c imply a R c, for all a, b, c in the (state) set S over which R is defined. In other words, the neuron representing the relation between states "a" and "c" must ultimately converge to the "ON" state whenever the neurons representing the state pairs "ab" and "bc" are in their "ON" states. Therefore the activation of the third neuron in the N x N topological representation must be constrained by the activation states of the other two neurons of the triple; this functional dependence among triples of neurons is responsible for the transitivity constraint term being of third order.

The above discussion leads to the conclusion that the transitivity constraint cannot be enforced in the network by a function that is of the more conventionally used quadratic order. We thus state a theorem which mathematically establishes that the transitivity constraint must be a third-order function of the neuron activation states.

Theorem

Third-order interconnections are required for an N x N topological neural net to verify the relation of transitivity in proposed partitions.

Proof

Defining Boolean matrix multiplication over n-th order square matrices A and B by

    c_ij = OR_{k=1..n} ( a_ik AND b_kj ),                                   (1)

where OR and AND represent bit-ORing and bit-ANDing operations respectively, we see that the matrix C = AB contains a "1" in the row-column position indexed ij whenever a "1" exists in row-i, column-k of matrix A and in row-k, column-j of matrix B for some k in [1, n]. For a relation R which is reflexive and symmetric, its relation matrix M_R will encode transitivity [13] iff:

    M_R x M_R = M_R.                                                        (2)

M_R is a Boolean matrix because its elements are only 1's or 0's. Thus, using (1), the elements

    m~_ij of M^2 = M x M                                                    (3)

can be written

    m~_ij = OR_{k=1..n} ( m_ik AND m_kj ).                                  (4)

Equation (2) essentially means that for all k in [1, n], with m_ij in M_R,

    m~_ij XOR m_ij = 0,   i.e.,   (m~_ij AND m_ij') OR (m_ij AND m~_ij') = 0.   (5)

Since the variables involved are binary, we have the following three results that map Boolean logic operations into the domain of integer arithmetic. The logical complement of a Boolean (binary) variable x is

    x' = (1 - x).                                                           (i)

The expression x AND y equates the logical "AND" operation to the operation of multiplication in the integer-number domain:

    x AND y = x * y = xy.                                                   (ii)

Finally, the equivalent expression for the logical "OR" operation is derived using the above two results (i) and (ii) with the second theorem of De Morgan, (x' AND y')' = (x OR y), yielding

    x OR y = (x' AND y')' = ((1 - x)(1 - y))' = 1 - (1 - y - x + xy) = (x + y - xy).   (iii)

Thus, equation (5) may be rewritten as

    m~_ij (1 - m_ij) OR (1 - m~_ij) m_ij = 0,                               (6)

which when simplified using the results (i), (ii) and (iii) transforms to

    m~_ij + m_ij - 2 m_ij m~_ij = 0.                                        (7)

Based on all the above results developed, equation (4) may be written in the form

    m~_ij = sum_{k=1..n} m_ik m_kj.                                         (8)

Upon substituting in equation (7), we finally obtain

    sum_{k=1..n} m_ik m_kj + m_ij - 2 m_ij sum_{k=1..n} m_ik m_kj = 0.

This is an equation of third order in the elements m_ij of M.

We now derive the transitivity constraint term using the result established by the above theorem and the laws of Boolean algebra. Since the activation states of the neurons in the system are binary variables, we enumerate all possible combinations of activation values of triplets of neurons (V_ij, V_jk, V_ik) in the form of a classical truth table. This is shown in table 3.1.

The transitivity constraint term (E1) assigns a penalty of (positive) unity to the network energy function (E) in situations where the required neuron combinations violate the definition of transitivity. Conformation to the definition of the relation of transitivity obviates the penalty. To derive the functional dependency of the constraint term E1 on the neuron triple (V_ij, V_jk, V_ik), we draw the Karnaugh map shown in figure 3.1. Using the classical sum-of-products form, we may write E1 for the constraint as

    E1 = (V_ij AND V_jk AND V_ik') OR (V_ij' AND V_jk AND V_ik) OR (V_ij AND V_jk' AND V_ik),

which upon using the rules of Boolean algebra and the results

    x AND y = xy,   x OR y = (x + y - xy),   x' = (1 - x),

may be written, for the triple and all i, j, k, in the form

    E1 = V_ij V_jk + V_jk V_ik + V_ij V_ik - 3 V_ij V_jk V_ik.

Since the representation used is the N x N network topology, with the indices i, j, k spanning only the upper triangular portion of the square grid of N x N neurons in the system (reference index starting at 0), the last equation may be compacted to the following overall form:

    E1 = sum_{i=0..n-3} sum_{j=i+1..n-2} sum_{k=j+1..n-1} [ V_ij V_jk + V_jk V_ik + V_ij V_ik - 3 V_ij V_jk V_ik ].
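The compacted form of E1 is a straightforward triple sum over the upper-triangular neuron grid. The following is a minimal sketch, not from the report, that evaluates it for a 0/1 activation grid V stored as a NumPy array; the names are illustrative.

    import numpy as np

    def transitivity_energy(V):
        """E1 over all upper-triangular triples i < j < k of the N x N grid V."""
        n = V.shape[0]
        E1 = 0.0
        for i in range(n - 2):
            for j in range(i + 1, n - 1):
                for k in range(j + 1, n):
                    vij, vjk, vik = V[i, j], V[j, k], V[i, k]
                    # Penalty of 1 whenever exactly two of the three pair
                    # neurons are ON, i.e., transitivity is violated.
                    E1 += vij * vjk + vjk * vik + vij * vik - 3 * vij * vjk * vik
        return E1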

    Neuron V_ij   Neuron V_jk   Neuron V_ik   Transitivity violated?   Constraint value E1 (1 or 0)
         0             0             0                 No                        0
         0             0             1                 No                        0
         0             1             0                 No                        0
         0             1             1                 Yes                       1
         1             0             0                 No                        0
         1             0             1                 Yes                       1
         1             1             0                 Yes                       1
         1             1             1                 No                        0

    Table 3.1. Transitivity constraint truth table, for all V_ij, V_jk, V_ik in the N x N neuron grid.

29

Vi i Vik

V _,_

00

(!)

3.1.

Kamaugh

I0

0

0

0

0

01

0

Figure

II

map of transitivity

truth values.

The equation for the transitivity constraint, with its indices spanning the upper triangular portion of the N x N topology of neurons, ensures that E1 generates a +1 contribution to the network energy for every triple of neurons (V_ij, V_jk, V_ik) that violates transitivity, and evaluates to a value of zero energy for every triple of neurons satisfying transitivity. We note that the constraint term E1 is of third order, as required.

3.8 The Substitution Property Constraint

This constraint is responsible for motivating the network towards finding the family of solutions satisfying the substitution property. The state-transition behavior of the state machine is defined by a state table enumerating, for each present state and input, the next state. The neural net will need to scan the input set and determine the extent to which the current solutions proposed by the net violate the requirement of preserving state-image congruence; to enforce this property, this information must be incorporated into the generated constraint energy term.

We again interpret the neuron activation variables as binary variables and generate a truth table for pairs of neurons: a (present) state pair neuron V_ij and its image pair neuron for a specific input "k", denoted V_{delta_k(i) delta_k(j)}. The subscript "delta_k(i)" denotes the next state of the current state "i" as determined by the pre-specified input "k". Since only the upper triangular portion of the N x N grid takes part in the computation, this is useful in achieving a reduction in computational overheads.

A truth table generated by considering pairs of neurons in general terms is shown in table 3.2. The s.p. constraint term (E2) assigns a penalty of negative unity to the network energy for all cases where the required neuron combinations violate the definition of the substitution property. Conformation to the property of substitution obviates the penalty.

The entries in the truth table for the constraint term E2 show that its functional dependency on the states of the corresponding neurons V_ij and V_{delta_k(i) delta_k(j)} has the same "0110" form as the classical two-variable "EXCLUSIVE-OR (XOR)" logical function. Thus, from the truth values and the sum-of-products form employed previously, the E2 term for all i, j, k may be written as

    E2 = V_ij + V_{delta_k(i) delta_k(j)} - 2 V_ij V_{delta_k(i) delta_k(j)}.

As in the case of the transitivity constraint, since the indices span only the upper triangular half of the N x N neuron grid, the above equation may be compacted to the concise overall form:

    E2 = sum_{i=0..n-2} sum_{j=i+1..n-1} sum_{k=1..k_max} [ V_{delta_k(i) delta_k(j)} + V_ij - 2 V_ij V_{delta_k(i) delta_k(j)} ].

The above equation is quadratic; it is computationally efficient to the extent of offering the benefit of scanning only a total of N(N-1)/2 neurons from the N x N grid.

Combining the two constraint energy terms E1 and E2 derived so far into one expression, the overall energy function E for the network may now be written as

    E = k1*E1 + k2*E2.
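A minimal sketch, not from the report, of evaluating the substitution-property term E2 for a 0/1 grid V, given a next-state table delta with one row per input; the data layout, the treatment of equal images via the clamped diagonal, and the names are assumptions. The total network energy then combines this value with the transitivity term as E = k1*E1 + k2*E2.

    def sp_energy(V, delta):
        """E2: XOR-style penalty between each upper-triangular pair neuron (i, j)
        and its image pair neuron (delta_k(i), delta_k(j)) for every input k."""
        n = V.shape[0]
        E2 = 0.0
        for i in range(n - 1):
            for j in range(i + 1, n):
                for next_state in delta:                     # one row per input k
                    a, b = next_state[i], next_state[j]
                    lo, hi = min(a, b), max(a, b)            # stay in the upper triangle
                    v_img = 1.0 if lo == hi else V[lo, hi]   # equal images: clamped-ON diagonal
                    E2 += V[i, j] + v_img - 2.0 * V[i, j] * v_img
        return E2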

    Neuron V_ij   Neuron V_{delta_k(i) delta_k(j)}   Violation of Substitution Property?   Value E2
         0                      0                                   No                        0
         0                      1                                   Yes                       1
         1                      0                                   Yes                       1
         1                      1                                   No                        0

    Table 3.2. Substitution property constraint term E2 truth table.

for

33 The gain parameters enforcing

transitivity

k 1 and k 2 are set relatively

shares

This is not only beneficial characteristic

State

The energy

function.

in eliminating

points

Characteristic function

by

classical

global

of the gains,

so that the constraint

the substitution

property.

but is also necessary

due to the

A further minima,

Function has degenerate represent

before

of the network

we have

use of this characteristic

developed

problems

compared

their

zero energies

In other words,

obviates

detection

for merit

associated

with solutions

The

degenerate

the comparison

by their associated of local

exercise

zero-energy

minima

with them.

energy

of the class of the

"optimality".

identified

is the easy

states.

a value of zero to the network

been

accepting

states are now uniquely

these states will not have

ground

satisfaction/optimization

and others, have always

techniques

characteristic

problem

to constraint

problem

due to the fact that solution values.

"tuning"

of the Energy

for the s.p. partition

TSP, graph partitioning

ground-state

with the one influencing

(E) for the network

Neural net solutions

generated

importance

to each other

of s.p. partitions.

3.9 Ground

the solution

equal

equal

since

Thus,

unlike

the energy

\

function energy global

more tangibly

reflects

is zero, the proposed solution,

net needs

the merit of suggested

solution

else the computation

to be improved.

by the network needs

solutions

by the network:

in its current state of activity

to be continued

and the present

If the partition is the required

interpretation

of the

34 3.10

Interconnection The

Strengths

classical

and the Activation

form of the energy

function

Rule (E) for a third-order

network

[14]

is written

as

1

E(e>where

i'j_k

j*k*i

W_3),j, is the weight

(V_V_VO, I,V2)_contains for each

and equating

constraint

rule

this expression

i*j

j_i

the interconnection

between

strengths

pairs of neurons

The

neural

energy

by the

number

of times

neuron.

understood

landscape

and

in the network Machine

A third-order and simulated

This

decision changes

to have

effected

in the

,

i

between

(Vyj)

triples

of neurons

and W(°, represents

cease

an optimization then

reflect with

Boltzmann

that lowers is made

performed

the bias

the intended of the energy

machine

to solve for s.p. partitions.

on

the

state

the global

and energy

by each network

Hopfield-like

function.

The

the difference

it is in the

and

globally.

descent

matrices.

of the network

neuron

gradient

computed

classical

when

function

weight

takes turns in evaluating

"ON"

the

energy

to the three

is based

locally

to affect

the solution

Simulated

net

it is in the

(E) for the network

the entries

Each neuron

when

of the neuron

till further

thereupon

Boltzmann

of neurons

of the network

equation

we obtain

net mechanism.

state of activity

assumed

with the derived

of like terms,

for activation

satisfaction

in the global

3.11

storing

the strengths

the coefficients

The

neurons

k_j_i

matrix

i

EEw' )i,v;vJ

neuron. Comparing

state.

I

E E E w'3'. vivJvk

iterated The

in a 2 N(N_ final,

stable

"OFF" is then enough

network dimension state

of the

by the net.

Annealing network

employing

Amongst

several

simulated algorithms

annealing

is

was developed

that exist for incorporating

35 the techniqueof by Geman Their

simulated

and Geman

rule

into parallel

[9] mathematically

is called

simulations

annealing

classical

for Boltzmann

simulated

machines.

connectionist

nets, the logarithmic

guarantees

asymptotic

annealing

(CSA).

An example

convergence We

problem

used

rule derived

to global

this

is now

illustrated.

in this example.

A state

minima,

technique

in our
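As a sketch of the neuron update used by such a third-order stochastic network, the following illustrative code (not from the report) computes, for each neuron, the energy added by turning it ON from the third-, second- and first-order coefficient arrays it participates in, and accepts the ON state with the logistic probability at the CSA temperature T(k) = T0 / log(1 + k); the dense coefficient layout, names and defaults are assumptions.

    import numpy as np

    def csa_sweep(V, W3, W2, W1, k, T0=5.0):
        """One asynchronous sweep of a third-order stochastic network under the
        CSA schedule T(k) = T0 / log(1 + k), for time-step k >= 1.
        V is a flat 0/1 state vector; coefficient entries with a repeated
        index (e.g. W3[i, i, :], W2[i, i]) are assumed to be zero."""
        T = T0 / np.log(1.0 + k)
        rng = np.random.default_rng()
        for i in rng.permutation(len(V)):
            # Energy added by turning neuron i ON, with the other neurons fixed.
            dE = np.einsum('jk,j,k->', W3[i], V, V) + W2[i] @ V + W1[i]
            p_on = 1.0 / (1.0 + np.exp(dE / T))   # lower added energy => higher P(ON)
            V[i] = 1.0 if rng.random() < p_on else 0.0
        return V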

Example

An example problem is now illustrated. A state machine with 18 states and three inputs is used in this example, so a state space of dimension 18 x 18 is addressed by the network. The state table for this machine is given in table 3.3. Classical simulated annealing was used to address the issue of local minima.

Figures 3.2 - 3.5 provide an illustration of the 18 x 18 net as it evolved from a random initial starting state to the (required) final solution state and generated a valid network solution:

    pi = { 0, 2, 4 ; 1, 9, 17 ; 3, 5 ; 6, 8, 10 ; 7, 11 ; 12, 14, 16 ; 13, 15 }

Figure 3.2 shows a snap-shot of the net in a random starting state. Two intermediate states of the net are illustrated by figures 3.3 and 3.4. The valid global solution state to which the network converged in four discrete time-steps is shown in figure 3.5.

The net always converged to valid s.p. partitions in fifteen independent runs, and a valid solution was obtained in as little as two discrete time-steps in the fastest case; the success rate of the net as observed is impressive. The various s.p. partitions obtained and the time-steps needed for solution are tabulated in table 3.4. Further relevant statistics are indicated there.

Present

state

input

It

input

I2

input

0

0

2

4

1

1

9

17

2

0

2

4

3

3

3

3

4

0

2

4

5

5

5

5

6

6

8

10

7

7

7

7

8

6

8

10

9

1

9

17

10

6

8

I0

11

11

II

11

12

12

14

16

13

13

13

13

14

12

14

16

15

15

15

15

16

12

14

16

17

i

Table

3.3.

18-state,

17

3-input

state

machine

table.

Figure 3.2. A random starting state of the 18 x 18 network.

Figure 3.3. State of the net at the second time-step.

Figure 3.4. Time lapse snap-shot of the net.

Figure 3.5. Final solution state of the 18 x 18 net.

    Trial number   S.p. partition   Time steps needed
         1              pi1                2
         2              pi2                2
         3              pi3                5
         4              pi4                2
         5              pi5                5
         6              pi6                3
         7              pi7                3
         8              pi8                2
         9              pi9                7
        10              pi10               2
        11              pi11               3
        12              pi12               4
        13              pi13               2
        14              pi14               5
        15              pi15               3

    Table 3.4. Results of the network performance for solution in 15 trial runs.

3.12 Conclusions

Present approaches to the machine decomposition problem have relied on a sequential and exhaustive search technique for generating partitions satisfying the substitution property. This is known to become intractable with growth in problem size and therefore imposes intemperate demands on computation resources. In this research we have developed an intrinsically parallel approach to addressing the intractability involved in the partition search problem by the use of an artificial neural network model that has potential for offering a viable alternative to existing sequential search methods.

The problem of decomposition has been cast into the framework of constraint satisfaction, and a neural net model was developed to solve constraint satisfaction problems through optimization of a mathematically derived objective function over the problem space. The formulation strategies governing the derivation of objective functions for constraint satisfaction neural nets have so far been based on heuristics. This research presents a formalization of the method by exploiting useful theories from Boolean and relational algebras.

We have verified that the issue of decompositions belongs to a class of problems that are beyond the scope of solvability for second-order networks. A theorem has been stated and proved establishing that third-order correlations must be extracted by a neural net to generate substitution property partitions. A third-order deterministic network such as a Hopfield net has been shown to fail in solving the s.p. partition search problem due to the lack of an adequate technique for escaping from any local minima of its associated objective function. Impressive results have been obtained by using a stochastic approach such as a third-order Boltzmann machine net with simulated annealing.

A spectrum of problem sizes was addressed using a software simulation of the network. In many cases excellent convergence performance was obtained, with the network identifying ground states in under five discrete time-steps. Intrinsic limitations of simulating an essentially parallel technique on a serial computing machine have currently precluded addressing problem domains of higher dimension. However, the engineering of the scheme derived here, as well as the performance observed, suggests that the third-order Boltzmann machine does not show any substantial degradation with scaling of the problem, as determined with respect to the state space as well as the input dimensions.

3.13 Future Research

Real world problems in decomposition can be expected to have state spaces of large dimension in arbitrary proportion to the input dimension. A suggested technique for addressing the issue of scaling with respect to the input dimension is to exploit the properties of lattices of sub-groupoids [15]. The strategy is to develop a separate network for each input (a constraint) of the input set, with every individual network then addressing the problem of identifying valid solution partitions with respect to its respective input as the networks descend toward their individual global minima in essentially parallel fashion. A communication link between the individual networks would then incorporate the required intersection characteristic, the point of intersection of the individual solutions representing the common, globally acceptable solution for the overall input set. A little insight suggests that such an approach would have its potential unconventionally favoring a hardware

implementation, since the larger the communication bandwidth simulated, the greater the number of iterative loops a single-CPU von Neumann machine would be required to execute, rendering it an increasingly inadequate choice for the purpose. Our attempts at software simulations of such a "decomposed problem" approach have so far determined corroborative results for some known problems, emphasizing our conviction favoring neural networks as well as the hardware limitations involved.

The majority of optimization/constraint satisfaction problems addressed till date using neural nets have been of orders no higher than quadratic. The number of interconnections between the neurons in a network multiplies with the order of the net, such that higher-order nets have been known to require specially designed VLSI hardware [16]. With the advent of high-speed, multi-level VLSI devices, we speculate that the foundations being laid for a neural approach to the decomposition problem will one day obviate sequential computation methods altogether.

4 Adaptive Constraint Satisfaction

4.1 Introduction

The original constraint satisfaction network proposed by Hopfield [2], [3], [4], [5] has several drawbacks, one of which is that the network is guaranteed to converge only to the local minima of the associated Liapunov function [17] which are reachable from the given initial position of the state vector. This property is useful if the Hopfield network is used as an associative memory [18], but it is not a desirable feature for the case where the network is used as a constraint satisfaction network, since the local minima simply do not correspond to solutions in most applications.

To examine why a Hopfield network does not perform well for constraint satisfaction problems, we will analyze the process of solving a constraint satisfaction problem with a Hopfield network analytically. The initial task is to find a suitable representation in which each node is associated with a hypothesis, and then to define the weight matrix. Unfortunately, the definition of the weight matrix is complete only up to determining the actual magnitudes of the gain parameters. To observe this argument, consider wij, the weight between nodes si and sj, which represents an element of the weight matrix. The weight can be constructed as

    w_ij = sum_n K_n delta_nij

where K_n is in R+ if the hypotheses represented by nodes si and sj are mutually supporting and K_n is in R- if they are mutually conflicting. The term delta_nij is equal to 1 if the two hypotheses represented by the nodes si and sj are related under the constraint n and is 0 otherwise. As one can easily see from this definition of the weight matrix entry wij, all constraints are defined only up to the magnitudes of the gain parameters, and there is no formal method to determine the gain parameter magnitudes [19], [20].

The typical effort is to randomly initialize the gain parameter magnitudes, observe the solution state vector to which the network converges, and identify the unsatisfied constraints associated with that state vector. Once the violated constraints are identified, the magnitudes of the gain parameters associated with the violated constraints are increased. It is often the case that this process has to be repeated many times, and there is no guarantee that an acceptable set of gain parameters can be found such that the network will always converge to a global minimum of the performance function regardless of the initial conditions. Another important issue to note is that no convincing evidence exists in the literature that an optimal set of gain parameter magnitudes exists.

The goal of this research is to design a closed-loop system in which one functional block performs the search, using a classical Hopfield network, and another functional block observes the performance of the search network and then adapts the weights of the Hopfield network. The main action of the adaptive block is to redefine the weight matrix based on the information extracted from the state vector to which the Hopfield network converges. The redefinition of the

weight matrix should be performed in such a way that the search network is less likely to break the constraints which it violated previously when the network relaxes to a new state. It is

that the adaptation

are not solutions

for the

block

to initiate

can decide

search

for a good There

network

the

converges:

node

which

the gain phase

outputs

at a randomly is solely

chosen

determined

amount. is violated

constraint

is not large

vector

regardless

state vector

state vector

by the random

parameters

hand,

cause

the

the nodes

follows

the relative there

network

function order

is started minima

are updated. function

is increased

the argument

importance

an "if a

level of that

is still no guarantee

to converge

and

that the

to a solution

state

conditions.

the proposed

order.

representing

a Hopfield

set of local

magnitude

to

system.

network

of the performance

parameter

to be increased,"

will always

and set of gain

update

of the gain

thus it needs

order

gain parameter

adaptive

and the random

Hopfield

and the random

loop

of the energy

to one of the admissible

the associated

magnitude

shape

which

performed

to which

node outputs

a local minimum

this adjustment

procedure

the state vector

the exact

minima

so that the

this closed

If a classical

function

represents

is violated,

enough,

finalize

it will converge

discussion

by employing

determine

[21].

the local

trial and error

state of the network

by the Liapunov

of the initial

On the other initial

state,

then the weight

set of gain

which which

the initial

Although

constraint

variables

to detect

under

The

is eliminated

to be updated

initial

at least one constraint

arbitrary

adjusted

space,

problem

process.

parameters

are chosen

If the converged hence

the adaptation

important

has the means

satisfaction

set of gain parameters

are three

in N-dimensional

constraint

block

adaptive

parameters

If the local

search

system

and then converges

minimum

starts

with randomly

to the local

is not a solution

then

minimum

the algorithm

specified implied adapts

48 the gain parameter the local

magnitudes

minimum.

is restarted

The

the

conjectured

at this point

4.2 Hopfield Given

where bias

dependency

a significant

function

relaxation

and this goes on until

eliminating

play

based

role

a vector

of discrete

if

network

that the random of iterations

neurons

order

needed

a solution

set of gain parameters

is found,

on its initial

representing

thus effectively

conditions.

the node outputs

updated

It is also might

still

to find a solution.

4.2 Hopfield Networks and the Adaptation Algorithm

Given a vector of discrete neurons representing the node outputs by the state vector S, Hopfield [2] has derived an "energy" or Liapunov function for the discrete case:

E = -(1/2) Σ_i Σ_j w_ij s_i s_j - Σ_i b_i s_i

where w_ij is the weight between neurons s_i and s_j, and b_i is a bias provided to neuron s_i. Assuming the following activation rule:

s_i = 1 if Σ_j w_ij s_j + b_i > 0,
s_i = 0 otherwise,
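As an illustration only (not code from the report), a minimal sketch of this discrete, asynchronous relaxation and of the energy evaluation is given below; the function names and the random update order are assumptions.

```python
import numpy as np

def energy(w, b, s):
    """Discrete Hopfield energy  E = -1/2 * s^T w s - b^T s."""
    return -0.5 * s @ w @ s - b @ s

def relax(w, b, s, max_sweeps=100, rng=None):
    """Asynchronously update neurons with the threshold rule until no
    neuron changes state (a local minimum of the energy)."""
    rng = rng or np.random.default_rng()
    s = s.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(s)):          # random update order
            new_si = 1 if w[i] @ s + b[i] > 0 else 0
            if new_si != s[i]:
                s[i] = new_si
                changed = True
        if not changed:
            break
    return s

# Tiny example: two mutually inhibiting neurons with positive biases.
w = np.array([[0.0, -2.0], [-2.0, 0.0]])
b = np.array([1.0, 1.0])
s0 = np.array([1, 1])
s_final = relax(w, b, s0)
print(s_final, energy(w, b, s_final))   # energy never increases during relaxation
```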

Under this rule the neurons will change states such that the energy will always decrease or remain the same. The main philosophy of the adaptation algorithm is to observe the state vector to which the Hopfield network converges, given its initial conditions, the node update order, and the gain parameters. An error signal is generated from that state vector and is used to adapt the gain parameters, thus restructuring the energy surface implied by the Liapunov function in N-dimensional space. The adaptive algorithm is very similar in structure to Widrow's adaptive linear element (adaline), and indeed is a modified version of the adaline proposed in [22].
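Before giving the modified rule, it may help to recall the textbook adaline (least-mean-squares) step that it is patterned after; the numbers below are made up for illustration. Note that the standard step drives the output error toward zero, whereas the modified rule defined next forms its error from constraint-energy violations and deliberately uses it to grow the gains of violated constraints, which is presumably why the adaline convergence theorem no longer applies.

```python
import numpy as np

# One textbook LMS step on made-up numbers:
#   y = W(k) X(k)^T,   e = d(k) - y(k),   W(k+1) = W(k) + mu * e * X(k)
x = np.array([1.0, 0.5, -1.0])   # input vector X(k)
w = np.array([0.2, -0.4, 0.1])   # weight vector W(k)
d = 0.0                          # desired output d(k)
mu = 0.1                         # learning rate

y = w @ x
e = d - y
w_next = w + mu * e * x

# For a sufficiently small mu the output error shrinks after the update.
print(abs(d - y), abs(d - w_next @ x))   # -> 0.1  0.0775
```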

The convergence theorem for the adaline, however, does not apply in the case of the modification. The functional definition of the adaptive algorithm follows. Let X(k) = [x_0(k), ..., x_{L-1}(k)] be a 1 x L input vector, W(k) = [w_0(k), ..., w_{L-1}(k)] be a 1 x L weight vector, d(k) be the desired output, which is set to 0, y(k) be the output, and e(k) be the error, all at time step k. The error used to adapt the weights is computed as follows:

y(k) = W(k) X(k)^T,
e(k) = y(k) - d(k).

Given this error term, the next step is to define the adaptive element weight vector update rule:

W(k+1) = W(k) + μ e(k) X(k),

where μ is the learning rate parameter.

In the case of the constraint satisfaction network, the entries of X(k) are derived from the energies of the individual constraints in the converged state vector, saturated at their reference values E_ref so that an entry x_a(k) is nonzero only if constraint a is violated. For a constraint implemented with a second-order term,

E_a(k) = E_ref if Σ_i Σ_j δ_aij s_i s_j <= E_ref, and E_a(k) = Σ_i Σ_j δ_aij s_i s_j otherwise.

In the case of the implementation of an excitation constraint with a first-order term,

E_a(k) = E_ref if Σ_i s_i > E_ref, and E_a(k) = Σ_i s_i otherwise.

An inhibition constraint implementation with a first-order term employs the following:

E_a(k) = E_ref if Σ_i s_i < E_ref, and E_a(k) = Σ_i s_i otherwise.
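The following is a minimal sketch, under the definitions above, of how the violation entries x_a(k) and the gain update might be computed for second-order, inhibition-type constraints; the helper names, the clamp-then-difference form of the violation measure, and the example constraint masks are all assumptions made for illustration.

```python
import numpy as np

def constraint_energies(s, deltas):
    """E_a = sum_ij delta_aij * s_i * s_j for each second-order constraint a."""
    return np.array([s @ d @ s for d in deltas])

def violations(e_actual, e_ref):
    """Saturate each energy at its reference value so that only violations
    contribute; the entry is 0 exactly when the constraint is satisfied."""
    return np.abs(np.maximum(e_actual, e_ref) - e_ref)   # inhibition-type clamp

def adapt_gains(gains, x, mu, d=0.0):
    """One adaline-style step:  y = G x^T,  e = y - d,  G <- G + mu * e * x."""
    y = gains @ x
    e = y - d
    return gains + mu * e * x

# Illustrative 3-node example with two mutual-exclusion constraints.
deltas = [np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], float),   # nodes 0,1 exclusive
          np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], float)]   # nodes 0,2 exclusive
e_ref = np.zeros(2)
gains = np.array([1.0, 1.0])
s = np.array([1.0, 1.0, 0.0])        # violates the first constraint only

x = violations(constraint_energies(s, deltas), e_ref)
print(x)                              # -> [2. 0.]
print(adapt_gains(gains, x, mu=0.5))  # only the violated constraint's gain grows
```

With d(k) = 0 the error is positive whenever any constraint is violated, so only the gains of violated constraints (the only nonzero entries of x) are changed, which is the behavior established by the observation below.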

The weight vector entries are the magnitudes of the gain parameters associated with each constraint:

W(k) = [G_1(k) G_2(k) ... G_L(k)].

An argument which may serve to clarify the presentation of the adaptation algorithm follows.

Observation: The proposed update rule increases the magnitude of the gain parameter associated with a constraint which the state vector violates when the network converges, thus making it less likely for the network to violate the same constraint the next time it converges.

Proof: Consider a state vector which violates constraint a; then x_a(k) > 0. The adaptation of the gain parameter magnitude under constraint a is given by

G_a(k+1) = G_a(k) + μ e(k) x_a(k).

We also have that e(k) > 0 and μ > 0; thus G_a(k+1) > G_a(k). Q.E.D.

4.3 An Adaptive Constraint Satisfaction Network

Once the constraint satisfaction network converges to a state vector S at time instant k, two variables are passed to the adaptation block: the state vector S(k) and the current weight vector W(k). Given these variables and the reference energy values for the constraints, the adaptation block computes the error term. If the error is not zero, the current state vector represents a local minimum of the performance function; in this case, the adaptation block updates the gain parameters which are associated with the constraints violated by that state vector. If the error is equal to zero, the current state vector is a global minimum, and hence a solution, and the cycle terminates.

Using the new set of gain parameters, a new weight matrix is computed for the constraint satisfaction network, and the current state vector is used as the network initial state vector. A new relaxation period is then initiated, the network is allowed to relax to a state vector, and the cycle is repeated until a solution is found. The high-level functional block diagram of the closed-loop system is shown in Figure 4.2, and a flow-chart description of the closed-loop adaptive constraint satisfaction system is presented in Figure 4.3.

[Figure 4.2. Closed loop system: the constraint satisfaction network produces the state vector S(k); the adaptation block combines S(k), the current weights W(k), and the reference energy values to produce the new weights W(k+1); the initial conditions feed the constraint satisfaction network.]

[Figure 4.3. Flow-chart of the adaptation algorithm (start, Hopfield network relaxes, adapt and repeat until a solution is found, stop).]

4.4 Simulation Results

The goal of this application is to demonstrate the use of adaptive constraint satisfaction networks to search in a state space and thus to test the performance of the proposed algorithm. A directed graph is used as an abstract model for the state space of the search problem, where vertices of the graph represent the states and directed edges
stand for the transitions between the states [8]. This problem effectively generates five constraints for the neural network search algorithm to satisfy and hence constitutes a considerable challenge.

The shortest path between two vertices of a given directed graph can be defined as a sub-graph which meets the following constraints:

a. the sub-graph representing a path is both irreflexive and asymmetric,
b. each vertex except the source and target vertices must have in-degrees and out-degrees of exactly 1,
c. the source vertex has an in-degree of 0 and an out-degree of 1,
d. the target vertex has an in-degree of 1 and an out-degree of 0, and
e. the length of the path is equal to that power of the adjacency matrix which has the first non-zero entry in the row and column locations defined by the source and target vertices respectively.

Given the graph-theoretic topological constraints that a path specification has to satisfy, the next step is to map the constraints to the topology of the neural network. Since the object of the path search is an adjacency matrix, the neurons of the network are arranged to define the corresponding adjacency matrix, one neuron per entry. Since it is known that a path is irreflexive, the neurons located along the main diagonal of the matrix, which represent the self-loop hypothesis for each vertex V_i, are clamped to 0. Similarly, the out-degree of the target vertex is equal to 0, so the neurons belonging to the row labelled by the target vertex are clamped to 0; the in-degree of the source vertex is likewise 0, so the neurons in the column labelled by the source vertex are clamped to 0. Note that one also needs to implement the constraints of an out-degree of 1 for the source vertex and an in-degree of 1 for the target vertex, but
these are covered as a special instance of the degree constraints formulated next, so no separate formulation is required. A state of the network which meets all of the constraints then represents exactly one path in the digraph at a time.

Asymmetry of a graph (constraint #1) requires that only one of the two entries of the adjacency matrix located at positions symmetric with respect to the main diagonal can be equal to 1. This simply implements a special instance of the "1 out of N nodes can be ON at a certain time" rule; hence the type of the interaction between the two interacting nodes is inhibition, with K_1 ∈ R-. The term δ1_ij = 1 if and only if the row index of node s_i equals the column index of node s_j and the column index of node s_i equals the row index of node s_j; otherwise δ1_ij = 0. The reference value of the energy term associated with this constraint is 0.

A node with an out-degree of 1 implies that there is only a single 1 in the associated row of the adjacency matrix (constraint #2). Since only one of the N nodes in a row can be equal to 1, the type of interaction is again inhibition, with K_2 ∈ R-. The term δ2_ij = 1 if and only if the row index of node s_i equals the row index of node s_j; otherwise δ2_ij = 0. The reference value of the energy term for this constraint is 0.

Similarly, an in-degree of 1 requires only a single 1 to exist in the associated column of the adjacency matrix (constraint #3). Again the interaction is inhibition, with K_3 ∈ R-, accounting for the fact that only one of the N nodes in a column can be ON. The term δ3_ij = 1 if and only if the column index of node s_i equals the column index of node s_j; otherwise δ3_ij = 0. The reference value of the energy term for this constraint is also 0.
A path specification imposes the condition that any node that is not a source or target and is included in the path must have an in-degree and an out-degree of 1 (constraint #4). In terms of the adjacency matrix, if there exists a 1 in a particular row/column, then there must exist a 1 in the column/row that has the same node label as the corresponding row/column. The term δ4_ij = 1 if and only if the row index of node s_i equals the column index of node s_j or the column index of node s_i equals the row index of node s_j; otherwise δ4_ij = 0. Whenever δ4_ij = 1, the two nodes involved represent mutually supporting hypotheses; hence the interaction for this constraint is excitatory, with K_4 ∈ R+. The reference value of the associated energy term is simply equal to twice the shortest path length minus two.

In order to impose the condition that the solution must be the shortest possible path between the source and target nodes (constraint #5), a bias term for each network neuron is employed. The shortest path length between the source and target vertices is computed from the powers of the adjacency matrix, as in constraint e above.

In order to test the proposed network, a digraph with a specially defined adjacency matrix was employed. It was assumed that the adjacency matrix of the digraph had all of its lower triangular entries (the entries on and below the main diagonal of the matrix) and the entries immediately above the main diagonal equal to 1; that is, if a_ij denotes an entry of A, the adjacency matrix of the digraph, then a_ij = 1 if and only if j <= i + 1. The adjacency matrix for an example digraph with 10 vertices is shown in Table 4.1. Note that the entries of A which are equal to 0 are equivalently clamped to 0 in the network topology. Thus this digraph yields a path of length N-1 between vertices V_0 and V_{N-1}, which is the longest possible path for an N-vertex digraph. Another feature of this

        V0 V1 V2 V3 V4 V5 V6 V7 V8 V9
  V0 |   1  1  0  0  0  0  0  0  0  0
  V1 |   1  1  1  0  0  0  0  0  0  0
  V2 |   1  1  1  1  0  0  0  0  0  0
  V3 |   1  1  1  1  1  0  0  0  0  0
  V4 |   1  1  1  1  1  1  0  0  0  0
  V5 |   1  1  1  1  1  1  1  0  0  0
  V6 |   1  1  1  1  1  1  1  1  0  0
  V7 |   1  1  1  1  1  1  1  1  1  0
  V8 |   1  1  1  1  1  1  1  1  1  1
  V9 |   1  1  1  1  1  1  1  1  1  1

Table 4.1. Adjacency matrix.
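As an aside, the following sketch (illustrative code, not from the report) builds the Table 4.1 adjacency matrix from the rule a_ij = 1 iff j <= i + 1, checks the unique above-diagonal path V0 -> V1 -> ... -> V9 against the degree constraints b-d, and recovers the shortest path length from the powers of the adjacency matrix as in constraint e; the function names are assumptions.

```python
import numpy as np

def test_digraph(n=10):
    """Adjacency matrix of the test digraph: a_ij = 1 iff j <= i + 1."""
    a = np.zeros((n, n), dtype=int)
    for i in range(n):
        a[i, : min(i + 2, n)] = 1
    return a

def path_degrees_ok(p, source, target):
    """Check constraints b-d on a candidate path matrix p."""
    out_deg, in_deg = p.sum(axis=1), p.sum(axis=0)
    interior = [v for v in range(len(p)) if v not in (source, target)]
    return (out_deg[source] == 1 and in_deg[source] == 0 and
            out_deg[target] == 0 and in_deg[target] == 1 and
            all(in_deg[v] == out_deg[v] and out_deg[v] <= 1 for v in interior))

def shortest_path_length(a, source, target):
    """Smallest power k of A with a nonzero (source, target) entry."""
    ak = np.eye(len(a), dtype=int)
    for k in range(1, len(a)):
        ak = ak @ a
        if ak[source, target] > 0:
            return k
    return None

a = test_digraph(10)
# The only path using entries strictly above the main diagonal: V0 -> V1 -> ... -> V9.
path = np.zeros_like(a)
for v in range(9):
    path[v, v + 1] = 1
print(path_degrees_ok(path, source=0, target=9))   # -> True
print(shortest_path_length(a, 0, 9))               # -> 9, i.e. length N-1
```

The matrix-power check mirrors constraint e: the smallest power of A whose (source, target) entry is nonzero gives the shortest path length used as the reference for constraint #4 and for the bias terms.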

digraph is that there are circuits of all possible lengths, implying a difficult state space to search, since all those circuits may be mapped to local minima of the quadratic performance function. Assume that we are looking for a path starting with vertex V0 and ending with vertex V9. Clearly, there exists only one such path, namely the one which includes the adjacency matrix entries above the main diagonal.

A number of experiments were done to test the performance of the proposed algorithm. The measure of comparison is the number of iterations it takes for the network to converge to the solution state vector.

Experiment 1

The goal of this experiment was to evaluate the performance of the proposed algorithm compared to that of the classical constraint satisfaction network. Two different versions of the proposed closed-loop algorithm and a classical Hopfield network were employed, and the 10-node digraph described above was chosen as an instance of the search problem. In each trial the network was randomly initialized and a relaxation process was initiated; since no formal method is currently available, a randomly chosen set of gain parameters was used. In the case of the classical Hopfield network, the gain parameters and the state vector were re-initialized each time the network converged to a local minimum with that set of parameters, and the iterations were accumulated until the solution state vector was observed; this was done to obtain a countable number of successful trials and to observe the effects of the random initialization.

The frequency distribution of the number of iterations needed for convergence for the classical Hopfield network is shown in Figure 4.4. It is clearly observable that only about 10% of the trials converged within 1300 iterations; most of the trials converged somewhere in the interval of 1300 to 3800 iterations, and the remainder required 3800 to 6000 iterations. For the adaptive Hopfield network without state vector re-initialization, shown in Figure 4.5, 99% of the trials converged in the interval of 1 to 1100 iterations, with a
significant percentage, approximately 96%, belonging to the 1 to 500 interval.

[Figure 4.4. Classical Hopfield, gain and state vector reinitialized: frequency of trials versus iterations to convergence; 10-node digraph, 100 trials, horizontal step size 100 iterations.]

[Figure 4.5. Adaptive Hopfield, state vector not reinitialized: frequency of trials versus iterations to convergence; 10-node digraph, 100 trials, horizontal step size 100 iterations, learning rate scaler 0.1.]

The frequency distribution associated with the adaptive Hopfield network with state vector re-initialization, shown in Figure 4.6, has a lower peak and is more spread out than that of the adaptive Hopfield network without state vector re-initialization. Only 20% of all trials converged within 100 iterations for this algorithm, which is considerably lower than the nearly 45% of the trials converging within 100 iterations realized by the version without state vector re-initialization.

One noticeable difference between the frequency distributions of the two adaptive Hopfield algorithms is that the distribution associated with the adaptive Hopfield network with state vector re-initialization is much more spread out and skewed to the right. The distribution associated with the adaptive Hopfield network without state vector re-initialization has the most desirable properties of the three: it is the narrowest, it is located to the far left of the others, and it clearly favors the interval of 1 to 500 iterations, with only 4 trials needing more than 500 and up to 1100 iterations to converge. The fact that this distribution is the least spread and is located to the left of the others is related to the claim that a better estimate of the average number of iterations for convergence can be established: the location of the distribution implies that the mean value of the process is smallest, and hence the mean number of iterations for
convergence is smallest. The right-skewed distribution is an indication of the fact that most trials actually converge in fewer iterations than the median value.

The second observation is that both versions of the adaptive algorithm outperformed the classical Hopfield network. The difference in performance between the two adaptive Hopfield networks, with and without state vector re-initialization, points to the importance of employing the current state vector as the initial condition for the next relaxation period, which is an essential aspect of the adaptive algorithm.

[Figure 4.6. Adaptive Hopfield, state vector re-initialized: frequency of trials versus iterations to convergence; 10-node digraph, 100 trials, horizontal step size 100 iterations, learning rate scaler 0.1.]

Experiment 2

This experiment was attempted to observe the effects of the learning rate parameter on the frequency distribution of the successful trials. The learning rate parameter was varied from 0.1 to 0.9 in steps of 0.1, and 100 trial runs were conducted for each value of the learning rate on the 10-node digraph. The results are shown in Figures 4.7 through 4.15.

[Figures 4.7 through 4.15. ACSN performance for learning rate scalers 0.1 through 0.9: frequency of trials versus iterations to convergence; 10-node digraph, 100 trials, horizontal step size 25 iterations.]

One observation is that the distribution is the least spread for learning rate values in the interval [0.4, 0.6] and spreads out as the value of the learning rate is varied toward either end of the interval [0.1, 0.9]. Thus, the statistical mean estimate of the distributions is more accurate for learning rate values of 0.5 ± 0.1. The frequency distributions for learning rate values in the interval [0.3, 0.6] have peaks located toward the left end of the horizontal axis, and hence more trials converged within a small number of iterations for those learning rate values. Indeed, 78% of 100 trials converged within 100 iterations for a learning rate of 0.5, and 99% of all trials converged within 275 iterations for the same learning rate.

Note that for learning rate values of 0.2, 0.7, 0.8 and 0.9 the distributions have lower peaks and spread out considerably compared to the distributions associated with learning rates in the interval [0.4, 0.6]. Although the distribution for a learning rate value of 0.1 has a high peak which is located to the far left, the width of the peak is notably less than that of the distributions for the interval [0.4, 0.6]. Based on these features of the frequency distributions of the successful trials, one can claim that the optimal value of the learning rate belongs to the interval [0.4, 0.6]. Hence, for the path search problem, which is defined by 5 constraints, on a 10-node digraph, the optimal learning rate value was in the interval [0.4, 0.6].

Experiment 3

This test was conducted to observe the effects of varying network size on network performance. Networks corresponding to digraphs of 5, 10, 15 and 20 nodes were considered, and the learning rate was set to 0.5 for all network sizes. A total of 100 trials was run for each of the 5-, 10- and 15-node digraphs. A full experiment for the 20-node digraph was initially attempted, but the simulations required more computational effort than the computing resources currently available could provide, so only a small number of trials was run for this network size. The frequency distributions for the 5-, 10- and 15-node digraphs are shown in Figures 4.16 through 4.18.

[Figures 4.16, 4.17 and 4.18. Adaptive Hopfield, effect of network size for the 5-, 10- and 15-node digraphs: frequency of trials versus iterations to convergence; 100 trials, horizontal step size 100 iterations, learning rate scaler 0.5.]
In the case of 5 nodes, all trials converged within 100 iterations. For the 10-node digraph, approximately 80% of the trials converged to a solution state within 300 iterations and more than 90% within 1000 iterations, with an average number of iterations to convergence on the order of 500. For the 15-node digraph it took on the order of 2000 iterations on average to converge, with a small percentage of the trials needing up to 4000 iterations. Only five trials were run for the 20-node digraph; they resulted in 3806, 5711, 5741, 7156 and 10001 iterations to convergence, with an average of approximately 6482 iterations.

The variation of the average number of iterations necessary for convergence with respect to network size is shown in Figure 4.19. Although the data is not enough to infer the exact behavior of the function for network sizes above 15 nodes, the 20-node results indicate that the function behaves in an approximately predictable manner in that region, if one follows the assumption that an extension of the function past 15 nodes is valid.
4.5 Conclusions

The proposed adaptive constraint satisfaction network is the first known closed-loop algorithm of its type. An earlier algorithm developed by Ackley [23] has features similar to the proposed algorithm but addresses a somewhat different type of constraint satisfaction/optimization problem. Two main differences are that, in Ackley's case, the function to be optimized is not known by the adaptation block, and the signal provided to the adaptation block by that function is a scalar in the interval of [-1, +1].

[Figure 4.19. Adaptive Hopfield, size vs. performance: average iterations to convergence (thousands) versus digraph size (nodes).]

The results of experiment 1 show that the proposed adaptive algorithm is generally superior to the classical Hopfield network with randomly initialized gain parameters and state vectors for moderate size problems; its performance for larger sized networks remains to be seen. The performance of the closed-loop adaptive network, as shown by the frequency distributions of the successful trials, is very promising. The effects of the learning rate parameter were also observed, and it is noted that there exists an optimal learning rate interval for which the adaptation algorithm is most successful.

4.6 Future Research

A complete mathematical analysis of the closed-loop adaptation algorithm will be attempted. This includes a description of the effects of restructuring the energy function in N-dimensional space by adapting the weights after each relaxation, the effects of the random node update order, and the effects of the initial state vector on the performance of the closed-loop algorithm. The convergence of the closed-loop system for a wider range of learning rate values and problem sizes will also be analyzed.

It is also of interest to view the proposed algorithm from a reinforcement learning viewpoint. At one extreme of the spectrum is reinforcement learning, which employs a one-bit piece of information during each adaptation cycle, and at the other extreme is backpropagation [1], which uses much more detailed data and thus one would expect it to converge much faster. The proposed adaptation algorithm is located somewhere in the middle of this spectrum. A reinforcement learning algorithm would be compared to the proposed algorithm, and the adaptation algorithm itself might be optimized to identify and use some sort of quantity whose expected value provides more information to the adaptation block.

The testing of the proposed algorithm needs to be done with a complete set of problems, since the initial study only included the path search problem. The set of problems should be selected such that the performance of the algorithm with respect to the number of constraints, which determines the size of the problem and of the network, and with respect to the learning rate parameter can be clearly observed.

References

[1] D. E. Rumelhart and J. L. McClelland, Parallel Distributed Processing. Cambridge, MA: The MIT Press, 1986.

[2] J. J. Hopfield and D. W. Tank, "Neural Computations of Decisions in Optimization Problems," Biological Cybernetics, vol. 52, pp. 141-152, 1985.

[3] J. J. Hopfield, "Neural Networks and Physical Systems with Emergent Collective Computational Properties," Proc. Nat. Acad. Sci. USA, vol. 79, pp. 2554-2558, 1982.

[4] J. J. Hopfield and D. W. Tank, "Computing with Neural Circuits: A Model," Science, vol. 233, pp. 625-632, 1986.

[5] J. J. Hopfield, "Neurons with Graded Response have Collective Computational Properties like those of Two-State Neurons," Proc. Nat. Acad. Sci. USA, vol. 81, pp. 3088-3092, 1984.

[6] G. E. Hinton, T. J. Sejnowski, and D. H. Ackley, "A Learning Algorithm for Boltzmann Machines," Cognitive Science, vol. 9, pp. 147-169, 1985.

[7] P. K. Mazaika, "A Mathematical Model of the Boltzmann Machine," Proc. of ICNN, vol. III, pp. 157-163, 1987.

[8] G. A. Tagliarini and E. W. Page, "Solving Constraint Satisfaction Problems with Neural Networks," Proceedings of the IEEE First International Conference on Neural Networks, vol. III, pp. 741-747, June 1987.

[9] J. Ramanujam and P. Sadayappan, "Optimization by Neural Networks," Proceedings of IJCNN-88, vol. II, pp. 325-332, 1988.

[10] J. P. Tremblay and R. Monahar, Discrete Mathematical Structures with Applications to Computer Science. New York: McGraw-Hill Inc., 1975.

[11] A. Ginzburg, Algebraic Theory of Automata. New York: Academic Press (ACM Monograph Series), 1968.

[12] S. Geman and D. Geman, "Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-6, pp. 721-741, 1984.

[13] J. Hartmanis and R. E. Stearns, Algebraic Structure Theory of Sequential Machines. Englewood Cliffs, New Jersey: Prentice-Hall Inc., 1966.

[14] T. J. Sejnowski, "Higher Order Boltzmann Machines," AIP Conference Proceedings 151: Neural Networks for Computing, Snowbird, Utah, 1986.

[15] C. L. Masti and D. L. Livingston, "Neural Networks for Addressing the Decomposition Problem in Task Planning," Proceedings of the International Joint Conference on Neural Networks, IJCNN-90-WASH-DC, 1990.

[16] Y. C. Lee, G. Doolen, H. H. Chen, G. Z. Sun, T. Maxwell, H. Y. Lee, and C. L. Giles, "Machine Learning Using a Higher Order Correlation Network," Physica 22D, pp. 276-306, North-Holland, Amsterdam, 1986.

[17] M. W. Hirsch, "Convergence in Neural Nets," Proceedings of the IEEE First International Conference on Neural Networks, vol. II, pp. 115-125, 1987.

[18] A. Krowitz, L. Rendell, and B. Hohensee, "The State Space of the Hopfield Network," Proc. IJCNN '89, vol. III, pp. 339-346, 1989.

[19] S. U. Hegde, J. L. Sweet, and W. B. Levy, "Determination of Parameters in a Hopfield/Tank Computational Network," Proc. IJCNN '88, vol. II, pp. 291-298, 1988.

[20] G. W. Davis and A. Ansari, "Sensitivity Analysis of Memory Clusters in the Hopfield Neural Net," Proc. IJCNN '89, pp. 325-328, 1989.

[21] J. Bruck and J. W. Goodman, "A Generalized Convergence Theorem for Neural Networks and Its Applications in Combinatorial Optimization," Proc. IJCNN '89, vol. III, pp. 649-656, 1989.

[22] B. Widrow and S. D. Stearns, Adaptive Signal Processing. Englewood Cliffs, N.J.: Prentice-Hall Inc., 1985.

[23] D. H. Ackley, A Connectionist Machine for Genetic Hillclimbing. Boston: Kluwer Academic Publishers, 1987.