© Copyright JASSS


Neural Networks for Economic and Financial Modelling

Andrea Beltratti, Sergio Margarita and Pietro Terna
London: International Thomson Computer Press
Cloth: ISBN 1-85-032169-8


Reviewed by
Robert Marks
Australian Graduate School of Management, University of New South Wales, Sydney, NSW 2052, Australia.


I first came across artificial neural networks (ANNs) when a colleague directed an inquiry my way about ten years ago. As an economist in a business school, I have become used to fielding inquiries and suggestions for solving the world's problems. So when I heard of a new technique for predicting the credit-worthiness of prospective borrowers - a black-box software program which emulated aspects of the brain's ability to recognise patterns - I was not inclined, as an analytically trained economist (whose professional credibility was already under review given my dalliance with the computer simulation of evolution), to embrace this new approach wholeheartedly, despite the information that it was being used by a growing number of U.S. banks. I let it slip past me, although I have continued to use Genetic Algorithms (GAs) in the analysis of firms' behaviour in oligopolies.

I was probably mistaken. For several years before and after my rejection of the new technique - one of three software techniques (along with simulated annealing and GAs) recently inspired by the physical and biological world - Hal White and colleagues at the University of California San Diego had been working to provide a theoretical link between non-linear estimation and ANNs. Specifically, White (1992) showed that ANNs can be seen as non-linear models. As Beltratti, Margarita and Terna (hereafter BMT) put it, "The specific functional forms used in nonlinear models imply of course that in general the function that generates the data is different from the one implied by ANNs ..." (p. 8), so that the appropriate econometric theory for ANNs is that for misspecified non-linear models. They continue by pointing out that ANNs have the ability to approximate any continuous function and its derivatives arbitrarily well, which can mean approximating noisy data too well, unless the modeller is cautious. Even simple three-layer ANNs have been dubbed "universal approximators" (Hornik et al. 1989).

Yet I was not the first to turn away from ANNs. Based on work in the psychology of learning first published in 1949, early ANNs were developed by computer scientists in the 1960s. But this work was flawed - with only two layers (input - or receptor - nodes and output nodes) these early ANNs could not solve the classic case of a simple exclusive OR (XOR), a linearly inseparable problem. Development halted for fifteen years, until the elaboration of additional, "hidden" layers in the 1980s and White's interpretation of ANNs as mis-specified, non-linear estimators.
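The XOR limitation is easy to see concretely. The following sketch is my own illustration, not the book's: it hand-wires a small 2-2-1 network of threshold units in which one hidden node computes OR, the other computes AND, and the output fires for OR-but-not-AND, which is exactly XOR - something no single threshold unit acting directly on the inputs can do.

```python
import numpy as np

def step(z):
    # hard threshold unit, the kind early two-layer ANNs used
    return (z > 0).astype(float)

# Hand-chosen weights for a 2-2-1 network computing XOR:
# hidden unit 1 fires when either input is on (OR),
# hidden unit 2 fires only when both are on (AND),
# and the output fires for OR-but-not-AND, i.e. XOR.
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-0.5, -1.5])
w_out = np.array([1.0, -2.0])
b_out = -0.5

def xor_net(x):
    h = step(W_hidden @ x + b_hidden)
    return int(step(w_out @ h + b_out))

outputs = [xor_net(np.array(x, dtype=float))
           for x in [(0, 0), (0, 1), (1, 0), (1, 1)]]
# outputs == [0, 1, 1, 0]: the hidden layer makes XOR separable
```

Remove the hidden layer and no choice of weights reproduces this truth table, which is the flaw that stalled the field.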

BMT give an excellent summary description of how modern ANNs work: signals are received at the input layer, are transmitted to the hidden layer (or layers), which transforms them, usually using a smooth logistic approximation to a step or threshold function, and then transmits the transformed signal to an output layer, where the signals are transformed again, before being transmitted to the world. Links in the ANN mean that a single incoming signal can be transformed by many hidden nodes and by many output nodes; at each node all incoming signals are aggregated before transformation. Each of these links (from input node to hidden node or nodes, and from hidden node to output node or nodes) is characterised by a weight (positive for exciting, negative for inhibiting) and the network is characterised by the set of weights. Note that a fully connected ANN is one in which each input node is connected to all hidden nodes, and each of these in turn is connected to all output nodes.
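A minimal sketch of such a forward pass - my own illustration, not BMT's code, with arbitrary layer sizes and random weights - might look like:

```python
import numpy as np

def logistic(z):
    # smooth approximation to a step/threshold activation
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    # hidden layer: each node aggregates all incoming signals, then transforms
    h = logistic(W1 @ x + b1)
    # output layer: aggregate the hidden signals and transform once more
    return logistic(W2 @ h + b2)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)   # fully connected: 2 inputs -> 3 hidden
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)   # fully connected: 3 hidden -> 1 output
y = forward(np.array([0.5, -1.0]), W1, b1, W2, b2)
```

The weight matrices make the full connectivity explicit: every input node feeds every hidden node, and every hidden node feeds the output node.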

"Learning" occurs as the weights are changed, which also alters the network. With target output signals, a measure of the error of the ANN (often the sum of errors squared) can be calculated. BMT use matrix notation and a worked example to show how "back propagation" of this error measure can be used, in an iterated process, to improve the weights of each linkage, and so reduce the future errors. This is known as "supervised learning."

But not all problems have known targets - we may know that we want to model an agent as profit-maximising without knowing exactly what price the firm should charge, or how much it should offer for sale, given some market power. Unsupervised learning (or quasi-supervised learning) can be achieved using GAs - my continuing interest - since GAs are global optimisers, while back propagation is hill-climbing, or local optimisation.
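The distinction can be illustrated on a toy one-dimensional fitness landscape (my own sketch, not from the book): a hill-climber started near a local peak stays on that hill, while a simple GA with selection, mutation and random "immigrant" candidates keeps searching the whole line.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # multimodal "fitness": global peak at x = 0, smaller local peaks nearby
    return np.cos(3 * x) * np.exp(-0.1 * x ** 2)

def hill_climb(x, steps=200, eps=0.05):
    # local optimisation: accept a small move only if it improves fitness
    for _ in range(steps):
        for cand in (x - eps, x + eps):
            if f(cand) > f(x):
                x = cand
    return x

def ga(pop_size=40, gens=60, sigma=0.3):
    # global optimisation: elitist selection, mutation, and random immigrants
    pop = rng.uniform(-6, 6, pop_size)
    for _ in range(gens):
        parents = pop[np.argsort(f(pop))[-10:]]      # keep the fittest quarter
        mutants = rng.choice(parents, 20) + rng.normal(0, sigma, 20)
        immigrants = rng.uniform(-6, 6, 10)          # fresh random candidates
        pop = np.concatenate([parents, mutants, immigrants])
    return pop[np.argmax(f(pop))]

local = hill_climb(2.0)   # starts near a local peak and stays on that hill
best = ga()               # the population search locates the global peak near 0
```

Back propagation behaves like `hill_climb` in a high-dimensional weight space; the GA pays a computational price for its global reach.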

BMT's target readership is both economists interested in ANNs and non-economists (such as computer scientists and engineers) looking for new fields of application for their ANN skills. Their second chapter provides revision for the first group and a brief introduction for the second concerning "some important concepts of economic analysis that represent a good starting place for exploring possible applications of ANNs to economics and finance" (p. 47), specifically a neo-classical view of learning and interaction among economic agents. One conclusion of this chapter is that these models, for reasons of tractability and comparability, impose strong restrictions on the behaviour of economic agents, their computational abilities and structural knowledge. Since BMT want to use ANNs to relax some of these assumptions, they discuss learning, both at the individual and at the organisational level, and discuss the economy as a complex system, summarising the previous work of others.

After a review of rational expectations, a theory I see as allowing solution of some otherwise difficult problems, but a theory which BMT allow is wildly unrealistic for almost all real markets, they focus on the computational and information-gathering costs implicit in the neoclassical model, and argue that these, as well as learning, should be explicit in models of economic systems using agents - complex systems. Bounded rationality may be a better model of how human beings deal with the computational demands of information processing - at one extreme embodied in the use of rules of thumb. But which version (or versions) of bounded rationality? Abandoning perfect rationality means abandoning global optimisation using all available information as an ideal, leaving many possible models, all of them boundedly rational. BMT note that Simon's definition of bounded rationality is consistent with the recent emergence of the study of the economy as a complex system, the epicentre of which is the Santa Fe Institute; but the issue of how much computational ability to impute to economic agents remains.

BMT describe a research programme in which the economy is an evolving complex system with agents who continually learn and adapt, not only to secular events, but also to the learning and adaptation of other economic agents. Their book explores ANNs as one methodology for carrying out this research programme, with particular application to financial markets. In fact the book's title is something of a misnomer, since there are only really two non-financial models included.

The second of three parts of the book discusses computer experiments with artificial agents, paying specific attention to ANNs, as a preliminary to studying the application of ANNs to financial markets in the third part. As connectionist structures, ANNs are parallel, subsymbolic, self-organising, fault-tolerant and redundant, as BMT discuss.

What they are not is deductive, and so they cannot provide necessary conditions. This is not a critique of ANNs per se: all simulation, by its nature, can only exemplify sufficiency, not necessity. On the other hand, using simulation and numerical methods may allow the solution of problems not amenable to closed-form, deductive solutions (Judd 1998). I looked in vain for discussion of these points by BMT.

Given my initial concern over the "black box" nature of ANNs, I was interested to read a section of BMT in which they spend twenty pages examining various techniques for trying to impute behavioural rules ("rules of thumb"?) to the weights associated with the hidden nodes. Their motivations are (1) to allay fears of practitioners that they are buying a "pig in a poke", since signals to buy and sell in, say, a financial market demand a degree of faith in ANNs absent clear and simple rules, and (2) to provide theoreticians with interpretations of just how economic agents are being modelled in the ANN. (I note that GAs also suffer from a lack of transparently simple rules.) In doing this, they acknowledge Friedman's classic defence of "black box" models - "It's the performance, stupid!", as he might have put it today - but argue that for models of learning, the researcher needs to see the rules otherwise implicit in the black box.

I did not find any of these techniques particularly convincing in shedding light on the behavioural meaning of the weights, but the discussion about the effect of the number of nodes in the hidden layer - "... if we increase the number of hidden nodes, network outputs become strictly related to examples (with errors approximating zero) and derivatives become meaningful ... a compromise between complexity and clarity of derivative meanings ..." (p. 99) - was insightful, although newcomers would need more guidance on the appropriate number of hidden nodes for any problem, as I mention below.

BMT propose three types of structures: single-agent models, single-population models and multiple-population models. The latter two can be used when there is explicit interaction between agents, with identical agents and distinct (or asymmetric) agents respectively.

The literature on complex systems has stressed the emergence of phenomena of interest from the interaction of large numbers of agents. BMT devote a chapter to a technique - cross-target - which attributes a central role to learning mechanisms and which can be applied "... without introducing, either explicitly or implicitly, economic rules in order to influence or to characterise agents' behaviour." (p. 110) The only requirement is a limited computational ability - to take simple decisions and to compare guesses with results. BMT note that this type of artificial adaptive agent (AAA) appears to an observer to operate with goals and plans, symbolic entities in this case which are inventions of the observer. They note that observations and analysis of real-world agents' behaviour can suffer from the same bias, a reverse case of the Turing test.

Using the cross-target technique, BMT obtain agents that behave on the basis of the development of consistency among guesses about their actions and related effects: this kind of consistency is sufficient to obtain self-developed micro-mechanisms which are very simple, but sufficient to characterise realistic economic behaviour. A series of simple searching models (food foraging) are followed by two simple single-agent models, one of which considers the capability to react to price changes, while the other characterises portfolio decisions. The former is shown to derive a relationship which is a kind of demand curve but without optimisation. The latter derives an agent trying to avoid risk: selling all shares and holding only money in the face of a sinusoidally changing share price. This provides a short introduction to a stock-market model, using twenty agents in two populations, with imitation and some random noise. Simple micro-mechanisms are sufficient to develop complex behaviour, and the authors' cross-training structure is sufficient to develop these mechanisms autonomously.

The second half of the book comprises models of all three types: a single-agent model of a genetic-based price-taking trader, a single-population model of a financial market with behavioural ANNs, and three multiple-population models: one with two sorts of price-taking traders, one examining the coevolution of an on-screen stock market with dealer and trader agents, and one which models a banking system, with the banks represented as ANNs.

The simplest of these models is the first: a single-agent model of a price-taking trader. BMT discuss the general structure of single-agent models: the exogeneity of the environment (not affected by actions of the agent), the irrelevance of other agents if present (not affected and treated as a subset of the environment), the unidirectionality of the flow of information (from the environment to the agent), actions (which may or may not be based on a preceding forecast) and forecasts (explicit or implicit in the ANN).

This last point leads to a discussion of various structures of single-agent ANNs: a fully connected net with a forecast output node and an action output node; or a net with two subnets - the forecast output of one is an input node of the second, which results in a single output node, the action; or a fully connected net in which forecasts are implicit: a single output node for actions.

This last structure is the basis for their model of a trader facing a time series of Fiat stock prices from 1987 to 1989. Since the single output is the action {sell, wait, buy}, supervised learning is not available: there are no targets. Instead, the model-builder sets the general goal of wealth maximisation and in this case uses a GA to search for the best weights in the network links of the trader agent. BMT also show an exhaustive search for the optimal number of hidden nodes. As remarked above, the issue of choosing a good structure for the hidden layer is one that should be addressed explicitly by ANN modellers, even if only to derive a rough relationship between this number and the numbers of input and output nodes for a fully connected ANN, or an upper limit to this number. Simulations with white noise inputs and outputs might shed light on this.
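A heavily simplified sketch of this idea - mine, not the authors' implementation: the price series is a synthetic random walk standing in for the Fiat data, and the "network" is reduced to a single weighted sum over recent price changes - shows a GA searching weight vectors with final wealth as the fitness, with no target outputs required:

```python
import numpy as np

rng = np.random.default_rng(42)
# synthetic random-walk price series standing in for the Fiat data
prices = 100 + np.cumsum(rng.normal(0, 1, 200))

def decide(w, changes):
    # a minimal "network": weighted sum of the last five price changes,
    # squashed and thresholded into {sell: -1, wait: 0, buy: +1}
    s = np.tanh(w @ changes)
    return -1 if s < -0.3 else (1 if s > 0.3 else 0)

def wealth(w):
    # fitness: final wealth of a trader using weights w (no target outputs)
    cash, shares = 1000.0, 0
    for t in range(5, len(prices)):
        changes = np.diff(prices[t - 5:t + 1])
        action = decide(w, changes)
        if action == 1 and cash >= prices[t]:
            shares += 1; cash -= prices[t]
        elif action == -1 and shares > 0:
            shares -= 1; cash += prices[t]
    return cash + shares * prices[-1]

# genetic algorithm over the weight vectors: wealth itself is the fitness
pop = rng.normal(size=(30, 5))
for gen in range(20):
    fitness = np.array([wealth(w) for w in pop])
    parents = pop[np.argsort(fitness)[-10:]]           # select the fittest
    offspring = parents[rng.integers(0, 10, size=20)] + rng.normal(0, 0.1, (20, 5))
    pop = np.vstack([parents, offspring])              # elitism plus mutation
best = max(pop, key=wealth)
```

The GA needs only the general goal (wealth) and a way to score candidate weight vectors against the price series; no trader is ever told what the "right" action would have been.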

The best trader agent is validated in this model using a hold-out series of Fiat prices in 1990, and is shown to do almost as well (within 1%) as the best possible trader (with perfect foresight), a result made more impressive by a difference in the pattern of Fiat share prices in the latter period.

BMT suggest that interpretation of the economic meaning of the ANN after training may be helped by using standardised inputs and then comparing the outputs (here, the buy/sell/wait decision) with suggestions from experts or the actual decision maker. Although hardly rigorous, this technique provides some insights and reminds me of the calibrations that the Energy Modeling Forum at Stanford has performed on computer models of energy markets for the past twenty years. Borrowing the EMF's calibration methodology might be useful here. BMT argue that, in any case, if ANNs are complex and hard to interpret, this may simply reflect the complexity of historical traders' behaviour. Even if true, this would not give practitioners the confidence to use ANNs as decision rules with actual money.

When interacting agents are fundamentally similar in terms of strategies and goals, but differ with respect to some characteristics, then the model requires a population of heterogeneous agents. BMT present a financial market model in which agents are similar with respect to their goals and decision rules, even if otherwise heterogeneous. The ANNs are non-linear forecasting tools for wealth-maximising agents who want to forecast future values of relevant variables. This model's outputs are extensively analysed.

A second model allows the agents more flexibility: they do not follow pre-specified rules, but create their own behavioural rules as their experience of the structure of a financial market accumulates over time, on the basis of targets that need to be reached. This second model allows traders to decide not only price but now also quantity, a simple modification which has major consequences for model output. Again the model's results are extensively analysed.

The final models are multiple-population models: in the first there are three populations of traders - "smart", "dumb" and "naive" traders. The model allows their proportions to be endogenous, and so simulates the evolution of the market. Learning may occur by imitation, at least for the first two populations - the naive types don't learn. The model's outputs (shares, market prices, agents' wealth positions) are extensively analysed.

The second multiple-population model examines the coevolution of traders, dealers and market rules in an on-screen stock market. Marks et al. (1998) use coevolution in multiple populations of asymmetric sellers in an oligopoly, but they use GAs and the Casper Market Model, not (yet) ANNs. Coevolution implies that there may be symbiosis (from biology) or that the interaction is not zero-sum (from game theory). Analysis of the model's outputs shows clear relationships between the decisions of the traders and the dealers.

The final model in the book includes a banking system with stochastic firms and ANN banks: there is now production as well as exchange. Banks can learn, as a function of the business cycle, given the risk of firms defaulting on their loans in bankruptcy. Again the results are extensively explored.

The book concludes by revisiting its main themes. Some of these relate to the specific models and their results, and we shall not comment further on them, but others apply more generally to the use of ANNs in economic modelling and even more generally to computer simulation experiments.

This reviewer found BMT to be of great interest. I am inspired to attempt to fit ANNs to the mapping from market state to firm response in a repeated oligopoly which I have been studying with GAs. This would allow an environment for "breeding" brand agents as finite automata using GAs, a project that has been hindered by short time series of data previously. The fitted ANNs might even provide insights into the stimulus-response behaviour of the historical brand managers. A final point: some discussion of availability of ANN packages would have made the text even more attractive to the neophyte researcher in this area.

* References

HORNIK K., M. Stinchcombe and H. White 1989. Multilayer Feed-Forward Networks are Universal Approximators, Neural Networks, 2:359-366.

JUDD K. L. 1998. Numerical Methods in Economics, The M.I.T. Press, Cambridge, MA.

MARKS R. E., D. F. Midgley and L. G. Cooper 1998. Refining the Breeding of Hybrid Strategies, Working Paper 98-017, Australian Graduate School of Management, Sydney.

WHITE H. 1992. Artificial Neural Networks: Approximation and Learning Theory, Basil Blackwell, Oxford.


© Copyright Journal of Artificial Societies and Social Simulation, 1999