Linguistics 431/631: Connectionist language modeling

Ben Bergen

# Meeting 2: Neurons and neurodes

August 24, 2006

## Neurons

Structure of the neuron

• Most of the input to the neuron is collected through the dendrites.
• The neuron also has a cell body, which is responsible for maintaining the functioning of the cell.
• The axon stretches out to other neurons, forming interface points (synapses) at their dendrites or cell bodies.

Each neuron can be seen as an information processing unit

• Electrochemical charge is the common currency.
• Charge is input into the neuron across synapses from other neurons.
• The neuron processes this charge and arrives at its own charge.
• If the neuron's charge goes beyond a particular threshold, it outputs charge to other neurons.

## Neurodes are like neurons

| Neurons | Neurodes |
| --- | --- |
| Firing rate | Activation |
| Synapses | Connections |
| Synapse efficiency | Connection strength or weight |
| Excitatory/inhibitory synapses | Positive/negative connections |

Neurodes (mostly) perform two functions:

• Sum their inputs
• Pass the sum through an activation function

Input summation is a very simple process. For a given node i:

• Take the activation of each node with a connection to i (or simply each input to i) and multiply it by the strength of its connection to i.
• Sum these products

For example, take a node i, which has inputs from two other nodes, g and h. The activations of g and h are 2 and −0.5, respectively. Their connection weights are −0.1 and −2, respectively. What is the sum of the inputs to node i? What would the activation be if the connection weights were 0.1 and 2?
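To make the summation step concrete, here is a minimal sketch in Python; the `net_input` helper name is my own, not from the text:

```python
# Net-input sketch for the worked example: node i receives input from
# nodes g and h, whose activations are 2 and -0.5.

def net_input(activations, weights):
    """Sum of each input activation times its connection weight."""
    return sum(a * w for a, w in zip(activations, weights))

acts = [2.0, -0.5]

# First case: weights -0.1 and -2.
print(net_input(acts, [-0.1, -2.0]))  # 2*(-0.1) + (-0.5)*(-2) = 0.8

# Second case: weights 0.1 and 2.
print(net_input(acts, [0.1, 2.0]))    # 2*0.1 + (-0.5)*2 = -0.8
```

With a linear activation function, these sums would also be node i's activations.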

The second step is the passing of this sum of products to the activation function. In principle, any function is possible, but in practice, only a small number are used.

• The simplest is a linear function, which just makes the activation of neurode i equal to the sum of products passed into it.
• But neurons mostly don't work this way: they have thresholds below which they do not become active and above which they become active to some degree. Most connectionist models therefore make use of an activation function demonstrating this behavior, a sigmoid function.

Sigmoid functions in connectionist models have the following properties:

• They force the activation of a node to fall between given values, in most cases 0 and 1.
• Within a particular range, they are very sensitive to net inputs; that is, a small change in net input yields a large change in activation.
• Globally, with large changes in net input, they display an on-off activation character.
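A minimal sketch of one common choice, the logistic sigmoid 1/(1 + e^(−net)); the text does not fix a particular formula, so this specific function is an assumption:

```python
import math

def sigmoid(net):
    """Squash any net input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))

# Activation is 0.5 at a net input of 0...
print(sigmoid(0.0))    # 0.5
# ...sensitive to small changes near 0...
print(sigmoid(1.0))    # ~0.73
print(sigmoid(-1.0))   # ~0.27
# ...and effectively on/off for large net inputs.
print(sigmoid(10.0))   # ~0.99995
print(sigmoid(-10.0))  # ~0.00005
```

Note that the output approaches but never exactly reaches 0 or 1, which matters for the exercises below.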

Some more exercises:

• What would the activation of our node i be if it had the sigmoid function shown below?

• What would the activation be with inputs 1 and 1 and a sigmoid neurode?

• How could you change the weights such that the activation resulting from inputs 1 and 1 was 1?
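As a sketch of the second exercise, assuming the positive weight pair 0.1 and 2 from the earlier example and a logistic sigmoid (both assumptions on my part):

```python
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

# Assumed weights: the positive pair from the earlier example.
weights = [0.1, 2.0]
inputs = [1.0, 1.0]

net = sum(a * w for a, w in zip(inputs, weights))  # 1*0.1 + 1*2 = 2.1
print(round(sigmoid(net), 3))  # ~0.891

# Scaling the weights up pushes the activation toward 1; it never reaches
# exactly 1, but it gets arbitrarily close.
big_weights = [10.0, 20.0]
net_big = sum(a * w for a, w in zip(inputs, big_weights))  # 30.0
print(round(sigmoid(net_big), 3))  # ~1.0
```

This illustrates why "was 1" can only be approximated with a sigmoid: larger weights drive the activation closer to 1 without ever reaching it.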

[Figure: a sigmoid activation function, plotting activation against net input]

Pattern matching

• Imagine that we want a network to perform a really simple function, like output = input.

• For example, imagine we want the following inputs to give rise to the corresponding outputs:

**IDENTITY**

| Input | Output |
| --- | --- |
| 0 | 0 |
| 1 | 1 |

• This is easy with a linear activation function, but how about a sigmoid? If it can be done, how? If not, why not?

Bias nodes

• Sometimes it's useful to give nodes default activations other than 0 (if linear) or .5 (if sigmoid).

• E.g., in the case above with a sigmoid node, node 1 needs to have a default activation of 0. What does the net input have to be for this activation to be produced by the activation function?

• We can solve this problem by implementing another node, a bias node, which always has an activation of 1 and which may be connected to any given node with some weight.

• What bias node weight would give the right net input to node 1 when the input is otherwise 0?

• What does the strength of the input to node 1 from i1 have to be to overcome the bias?
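A sketch of one possible identity network with a bias node; the particular weights (−10 for the bias, 20 for the input) are illustrative assumptions, not the only solution, and they only approximate 0 and 1 because a sigmoid never reaches those values exactly:

```python
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

# One input node and a bias node (always activation 1) feed a sigmoid
# output node. Weights below are illustrative choices.
BIAS_WEIGHT = -10.0   # drives the default activation close to 0
INPUT_WEIGHT = 20.0   # strong enough to overcome the bias when input is 1

def identity_net(x):
    net = x * INPUT_WEIGHT + 1.0 * BIAS_WEIGHT
    return sigmoid(net)

print(round(identity_net(0), 3))  # ~0.0: bias alone, sigmoid(-10)
print(round(identity_net(1), 3))  # ~1.0: input overcomes the bias, sigmoid(10)
```

The design choice is simply that the input weight must be large enough relative to the bias weight to flip the net input from strongly negative to strongly positive.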

Let's see how OR can be modeled, also using bias nodes.

• There are two inputs and one output; the basic idea is that if either input is 1, the output is 1. Otherwise the output is 0.

• The input-output function is as follows:

**OR**

| Input 1 | Input 2 | Output |
| --- | --- | --- |
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |

• Can this function be modeled with no bias, using either a linear or a sigmoid node? Why (not)?

• What should the weights of the three connections to node 1 be?
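One possible set of weights can be sketched as follows; the values (20, 20, −10) are my own illustrative assumption, and many other choices work, so long as either input alone can overcome the bias:

```python
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

# OR sketch: two input nodes and a bias node (always activation 1) feed
# one sigmoid output node. Weights are one illustrative solution.
W1, W2, W_BIAS = 20.0, 20.0, -10.0

def or_net(i1, i2):
    net = i1 * W1 + i2 * W2 + 1.0 * W_BIAS
    return sigmoid(net)

for i1, i2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(i1, i2, round(or_net(i1, i2), 3))
# (0, 0) yields ~0; every other input pattern yields ~1, matching the OR table.
```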