We're living in a period of rapid change: a changing climate, species extinctions, food webs reshuffled by species transported around the world by humans, landscapes changed by farming, forestry, mining, urbanisation, and more. Sometimes the impact of these changes is small; other times it has been unexpectedly large. There's a lot we just don't know.
An important strand of modelling and research comes from looking at historic patterns and finding simple models that behave similarly, then exploring their robustness and perhaps using them for planning and prediction, or at least for refining our intuition about what the consequences of our actions in the world might be.
One domain that matters now is the historical record of species extinctions. We know that there are bigger and smaller extinction events. For perfectly human reasons we expect big extinction events to have big causes, like the meteorite strike that contributed so dramatically to the extinction of the dinosaurs. But for some big extinction events we can't seem to find such smoking guns. And there are events like the meteorite strike that formed Chesapeake Bay 35 million years ago, comparable in size to the K-T boundary event, but with nothing like the same impact on species extinction.
What we see when we analyse the size of extinction events in the fossil record is that they follow a power law.
A lot has been written about power laws, and they have some important properties if you find a system that follows one. It doesn't make sense to analyse them in terms of averages and deviations as you would for a normal distribution; your intuition and training in statistics will mislead you when applied to things organised according to a power law. They are scale invariant: look at all the events from size 10 to size 100 and you'll get a histogram with the same shape as all the events from size 1 to size 10. There's a precise number, the power law exponent, that characterises the shape of that histogram. The predictions you can make fit a template like this: over the next period of time I expect N events in size range X and M events in size range Y, but I can't tell you what will happen tomorrow given the history of events.
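To make the scale-invariance claim concrete, here's a tiny sketch (class and method names are mine, and the exponent is a made-up illustration, not a fossil-record value). For an event-size density proportional to s^(-tau), every bin holds the same fraction of its decade, whichever decade you look at:

```java
// Expected number of events with size in [lo, hi) for a density C * s^(-tau).
// With C = 1 and tau > 1 this is the integral of s^(-tau) from lo to hi.
public class PowerLaw {
    public static double expectedCount(double tau, double lo, double hi) {
        return (Math.pow(lo, 1 - tau) - Math.pow(hi, 1 - tau)) / (tau - 1);
    }
    public static void main(String[] args) {
        double tau = 2.0;  // hypothetical exponent, purely for illustration
        // The bin [1,2) within the decade [1,10) vs the bin [10,20) within [10,100):
        double a = expectedCount(tau, 1, 2) / expectedCount(tau, 1, 10);
        double b = expectedCount(tau, 10, 20) / expectedCount(tau, 10, 100);
        // Scale invariance: each bin holds the same fraction of its own decade,
        // so the histograms of the two decades have the same shape.
        System.out.println(a + " vs " + b);
    }
}
```

The same ratio falls out for any pair of decades, which is exactly why the histogram "looks the same" at every scale.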
One intuition about extinction events is that if a species goes extinct then species closely connected to it as prey or predator are suddenly affected as well; they go extinct together, which we call co-extinction.
Bak and Sneppen have a simple model of webs of species and co-extinction: use a graph to model a web of species, each connected to two other species. Represent fitness as a number randomly assigned to each species. Iteratively replace the least fit species, and the two connected to it, with three new randomly fit species. An interesting effect shows up in the overall fitness: because you are eliminating the least fit, the new randomly fit species are initially likely to be fitter on average, but after a time that stabilises so that most species sit above the average random fitness. Then, in that phase, the random fitness of new species can put old, very fit species at risk, leading to a cascading loss of fitness across the whole network. Measuring the size of those cascades, for example through the distribution of species ages, apparently demonstrates a power law, though I haven't implemented the model and tested it for myself. That something so simple displays a power law at all is an important foundation. But we're told that its power law exponent doesn't match the fossil record, so in some important way the model is wrong.
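Since I haven't tested the published model, take this only as a rough sketch of the replacement rule as I understand it, on a ring of species (class and method names are mine):

```java
import java.util.Arrays;
import java.util.Random;

// A minimal sketch of the Bak-Sneppen co-extinction rule on a ring of species.
public class BakSneppen {
    // One step: find the least fit species, replace it and its two
    // neighbours with fresh random fitnesses.
    public static void step(double[] fitness, Random rng) {
        int min = 0;
        for (int i = 1; i < fitness.length; i++)
            if (fitness[i] < fitness[min]) min = i;
        int n = fitness.length;
        fitness[(min + n - 1) % n] = rng.nextDouble();
        fitness[min] = rng.nextDouble();
        fitness[(min + 1) % n] = rng.nextDouble();
    }
    public static void main(String[] args) {
        Random rng = new Random(42);  // seeded so a run can be replayed
        double[] web = new double[100];
        for (int i = 0; i < web.length; i++) web[i] = rng.nextDouble();
        for (int t = 0; t < 10_000; t++) step(web, rng);
        // After many steps most species sit well above the 0.5 average of a
        // fresh uniform draw -- the self-organised phase described above.
        long above = Arrays.stream(web).filter(f -> f > 0.5).count();
        System.out.println(above + " of 100 species above 0.5");
    }
}
```

Running this you can watch the self-organised phase appear: the population climbs above the average of a fresh random draw, exactly as described above.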
From foundations like that many more elaborate food webs and fitness models can be imagined and compared to details of the fossil record. Hopefully such models give us insight into which features of food webs make them resilient or what states of food webs might make them prone to particular patterns of extinctions.
Amaral and Meyer focus on an intuitively reasonable extra detail in their model: trophic layering. Trophic levels represent the idea that there are predators and prey in a food web, and that there are classes of species with similar trophic roles (carnivorous predators, photosynthetic food producers, strict herbivores, etc.). They introduce a few new ideas to the model: speciation events split a random species by introducing a new species at the same or an adjacent trophic level but with a different random assignment of predators and prey (always drawn from the layers above and below respectively). There is no numeric fitness, but there are random extinction events that remove species from the lowest trophic layer. If all of a species' prey go extinct then that species also goes extinct. Drops in the overall number of species are our extinction events.
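To make the co-extinction rule concrete, here's a hand-built toy web, three levels deep (the web shape, class, and method names are mine, not from the paper). A species dies when every one of its prey in the layer below has died, and the deaths propagate upward:

```java
import java.util.*;

// A hand-built sketch of the layered co-extinction rule: a species goes
// extinct when all of its prey in the layer below are gone.
public class Cascade {
    // prey.get(level).get(i) = indices of species in level-1 that species i
    // eats; level 0 is the basal layer and has no prey sets.
    public static int cascadeSize(List<List<Set<Integer>>> prey, int basalVictim) {
        int extinct = 1;                          // the random basal extinction
        Set<Integer> deadBelow = Set.of(basalVictim);
        for (int level = 1; level < prey.size(); level++) {
            Set<Integer> deadHere = new HashSet<>();
            List<Set<Integer>> layer = prey.get(level);
            for (int i = 0; i < layer.size(); i++)
                if (deadBelow.containsAll(layer.get(i)))  // every prey is gone
                    deadHere.add(i);
            extinct += deadHere.size();
            deadBelow = deadHere;                 // deaths propagate upward
        }
        return extinct;
    }

    public static List<List<Set<Integer>>> exampleWeb() {
        return List.of(
            List.<Set<Integer>>of(),                      // 3 basal species
            List.of(Set.of(0), Set.of(0, 1), Set.of(2)),  // middle layer
            List.of(Set.of(0)));                          // top predator eats middle #0
    }

    public static void main(String[] args) {
        System.out.println(cascadeSize(exampleWeb(), 0));  // 3: cascades to the top
        System.out.println(cascadeSize(exampleWeb(), 1));  // 1: every eater has other prey
    }
}
```

Killing basal species 0 starves middle species 0, which in turn starves the top predator: a cascade of size 3. Killing basal species 1 starves nobody, because its only eater has another prey to fall back on.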
Apparently this again leads to a model with extinction events whose sizes follow a power law. And, for seemingly reasonable numbers of trophic levels and seemingly reasonable numbers of predator and prey connections, the exponent of that power law is very like the one seen in extinction events in the fossil record.
Amaral and Meyer's model has appealed to me since I first read about it. It seems to me both concise and representative of things that intuitively matter a great deal in the real world.
So, I'm going to implement something like the Amaral and Meyer model. An important thing to realise is that it shouldn't have to be exactly their model: if their model is robust then most implementations should have the same sorts of properties.
There's a lot that I'm not in any position to verify: what power laws are typical of the fossil record, how many trophic levels are reasonable, how many prey and predators are reasonable. But I can check that it produces a series of extinction events with sizes following a power law distribution. And I can explore its robustness by seeing how the power law exponent changes with the number of trophic levels and the respective rates of speciation and extinction.
I'm also looking forward to seeing how a very strict functional implementation might work out. The first stage will produce an infinite list of food web states, each successive state representing a time step with the effect of just one speciation event or one extinction event (and its possible cascade). This gives us a chance to use functional data structures for efficient memory use, sharing the unchanged parts of the web between time steps. It also lets us separate the analysis into its own module operating over that list of states, a neat separation of concerns that is hard to get with an iterative implementation manipulating a single state data structure, and it makes analysing sets of adjacent states much easier. I'm really quite excited by the clarity this style of implementation could bring to this class of models.
The implementation will once again be in Java with Javaslang for the functional data structures and infinite streams. If it's interesting to anyone we could in future look at implementations in other languages or with other programming styles. I'm very curious to see whether the choice of programming tools and idioms influences the questions researchers ask (and I hope that a menu of implementation choices will one day make researchers feel even freer to ask new questions).
Finding Nash Equilibria, part 3
Don't trust me. I've shown you some pretty graphs and spun some stories, but you really shouldn't trust me. My code for this model is available and is fairly easy to run with open source tools available for almost any computer. You can find all the code for the blog on GitHub: https://github.com/scrawlings/CodedLens and I'll be making it even easier to use as I go along, as well as suggesting improvements and pointing out my weaknesses.
I've used Java to code this model. You'll need an up-to-date Java JDK installed as I've used some of the latest Java features and some state-of-the-art libraries. I may in future use some other languages but they will all build and run from the same build script.
You'll need the git source code management system installed, and you'll need to clone The Coded Lens GitHub repository (search a bit, or ask around, for the best way to do that on your system).
At the time of writing you'll need Gradle installed to build the code; the first thing to do is run the command: gradle build
The main routine, and the constants that set N, the window size, and the number of rounds, are in ConverganceVisualisation.java.
There are also some useful examples of working with different parts of the model written as tests.
I like to do my development in IntelliJ IDEA and the community edition will work just fine for our purposes; you should build the IntelliJ project files with the command: gradle idea
How the model works
There are two parts to the system: the model itself, which is run each round and learns from the outcome, and the series of differences in payout from which the charted metric is derived.
The model is implemented in an object-oriented style with mutation: each player object contains an array that holds the weightings for its strategy, and those weightings are changed after each round. There is also a mutating structure, a circular buffer, that holds the current window of payout differences so that the sum over the most recent differences can be recalculated efficiently for the score analysis.
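Here's a sketch of what that mutating window structure looks like (the class and method names are mine, not those in the repository): a circular buffer plus a running sum, so each round costs a constant amount of work instead of re-summing the whole window.

```java
// A mutating rolling window: a circular buffer with a maintained sum.
public class RollingWindow {
    private final double[] buffer;
    private int next = 0;
    private double sum = 0;

    public RollingWindow(int size) { buffer = new double[size]; }

    // Push a new value, return the sum of the most recent `size` values.
    public double add(double value) {
        sum -= buffer[next];       // drop the value falling out of the window
        buffer[next] = value;      // overwrite it with the newest value
        sum += value;
        next = (next + 1) % buffer.length;
        return sum;
    }
}
```

With a window of 3, feeding in 1, 2, 3, 4 gives running sums 1, 3, 6, 9: the 1 drops out exactly when the 4 arrives.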
The question you should be asking is why I am emphasising that these things use mutation...
By contrast, the actual playing of the series of games, which I've called a tournament, and the publication of the difference metric from the iteration of rounds, is implemented as an infinite stream (not the loop you may have expected). It is processed using a programming idiom called list processing.
This contrast of styles, mutation of a single long-lived object versus generating new objects with each round, illustrates one of the major differences in programming style much debated by programming professionals.
We transition to immutable structures and infinite lists with the results of each round, but the players themselves mutate at each round based on the outcome of playing that round. This means the play of each round is a serial activity, and it's not the worst outcome to handle it with mutating structures.
The results, being immutable and in an infinite list, open up the possibility of very lightweight distribution across many processing cores before the results are reassembled into the dataset to be rendered in our chart. Mapping the sum-of-squared-differences function over each step of output is computationally expensive, but because of this implementation it would be perfectly safe to process in parallel.
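A small sketch of why immutability makes that safe, using plain Java parallel streams rather than any particular distribution machinery (squareOfDifference here is a stand-in for the real metric, and the names are mine): because each input is read-only, the same map-and-reduce can run sequentially or in parallel with no locks and the same answer.

```java
import java.util.stream.IntStream;

// Mapping an expensive metric over immutable results: safe to parallelise.
public class ParallelMetric {
    public static double squareOfDifference(double a, double b) {
        double d = a - b;
        return d * d;
    }
    public static void main(String[] args) {
        double[] payouts = IntStream.range(0, 1000)
            .mapToDouble(Math::sin).toArray();        // stand-in payout series
        double sequential = IntStream.range(1, payouts.length)
            .mapToDouble(i -> squareOfDifference(payouts[i], payouts[i - 1])).sum();
        double parallel = IntStream.range(1, payouts.length).parallel()
            .mapToDouble(i -> squareOfDifference(payouts[i], payouts[i - 1])).sum();
        // No locks, no coordination, same result either way.
        System.out.println(sequential + " == " + parallel);
    }
}
```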
Functional programming with the Javaslang library
I want to rave about the Javaslang library! It provides a complementary set of facilities that together enable tidy implementations of many of the functional programming idioms you might want to play with. The core of Java was moving in this direction and disappointed many professional programmers by stopping short of a complete suite of functional programming tools. Techniques that you'll see touted as strengths of many new languages are now available in Java through Javaslang.
Actually, I've only used the stream and infinite-stream facilities and the tuple objects in this example. But even those two things made this implementation much easier.
The key is that in a functional world we don't allow mutation, so an iterative model like this is implemented as a list representing the series of successive states. Taken to its logical conclusion, this means we would be able to inspect the complete state at any point in the series. Critically, functional data structures can make holding that much data quite efficient, effectively only holding the delta from the previous state. Sadly, in my implementation, I've used mutating players and we lose that information.
As an example of an infinite stream, I've made a test that generates an infinite stream of Fibonacci numbers, from which it drops the ones it doesn't need and takes those it cares about. The Stream.cons(head, () -> rest) idiom is central to this technique (this is clearly a Lisp-style cons list). It looks like recursion, but instead of filling up the stack the pseudo-recursive call is wrapped in a lambda and deferred until the value is needed; the stream is only theoretically infinite, in practice it only generates the values asked for.
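You don't need Javaslang to see the idea. Here's a hand-rolled lazy cons cell in plain Java in the same spirit as Stream.cons(head, () -> rest) (the names are mine): the tail is a lambda, evaluated only on demand, so the "infinite" stream only ever builds the cells we actually walk.

```java
import java.util.function.Supplier;

// A hand-rolled lazy cons list: head now, tail deferred behind a lambda.
public class Lazy {
    public record Cons(long head, Supplier<Cons> tail) {}

    // Looks recursive, but the recursive call is wrapped in a lambda and
    // deferred, so no stack is consumed until a tail is actually demanded.
    public static Cons fibonacci(long a, long b) {
        return new Cons(a, () -> fibonacci(b, a + b));
    }

    // Walk n cells along the stream and return that head.
    public static long nth(Cons stream, int n) {
        while (n-- > 0) stream = stream.tail().get();
        return stream.head();
    }

    public static void main(String[] args) {
        Cons fib = fibonacci(0, 1);
        System.out.println(nth(fib, 10));  // prints 55
    }
}
```

Asking for the tenth element forces exactly eleven cells into existence; the rest of the infinite stream is never built.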
Danger! Be warned! You'll notice in my test, and in the tournament itself, that I'm not holding onto the reference to the infinite list. I'm not assigning it to some variable with a nice name, which in many situations would be considered good style. If I did, I'd be holding onto the head of the list, and the head points to the next item, and so on for every list item generated, eventually running out of memory. This is called holding onto the head and is a common gotcha in this style of programming. It also reveals a subtle truth about programming languages: processing lines in order, one after the other, implies holding state from one line to the next, and that mutations made in one line are available in subsequent lines. This seemingly obvious and necessary truth causes all sorts of grief, particularly when you want to start making use of all the cores on your computer, or when your models get so big that you need to distribute them across networks and cloud computing settings.
I'm going to be using these functional programming techniques a lot, and eventually I'll show you how they can actually be used to take advantage of parallelism in modern computing architectures to get better performance of larger and more complex models.
What next? Hanging onto the state of the players at each round for later analysis.
One frustration I had with my implementation is that by the time I was doing the analysis I only had the difference of payout in each round to work with. Because I'd mutated the players at each round, I couldn't look at them to see what patterns and changes to strategies might underlie the visible results.
I mutate two different structures. First, a rolling window of recent values, so that I don't have to repeat the summation work across the whole window each round. Basically, I subtract the value that's fallen out of the window, add the new value at the head of the window, and keep a rolling sum that way. This structure is a queue, and Javaslang comes to the rescue with a functional data structure for the job.
Functional data structures are immutable: if you use an operation to change one, what really happens is you get a new reference to a structure that shares the unchanged parts of the old structure but covers it with some shadowing elements that represent your changes. Importantly, if you don't hold onto the old references then the shadowed parts can be released. This helps with safe sharing in multithreaded systems and also helps us to be efficient with memory use.
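The sharing is easy to see in miniature with a persistent linked list (a toy of my own, not Javaslang's implementation): "changing" the list builds one new cell that points at the untouched old cells, so old and new versions share almost all of their memory.

```java
// Structural sharing in miniature: a persistent singly-linked list.
public class Persistent {
    public record Node(int value, Node next) {}

    // "Adding" to the front builds one new cell; the old list is untouched.
    public static Node prepend(int value, Node list) {
        return new Node(value, list);
    }

    public static void main(String[] args) {
        Node shared = prepend(2, prepend(1, null));  // the list [2, 1]
        Node a = prepend(10, shared);                // [10, 2, 1]
        Node b = prepend(20, shared);                // [20, 2, 1]
        // Two distinct lists, but they share the [2, 1] tail by reference.
        System.out.println(a.next() == b.next());    // prints true
    }
}
```

If `a` and `b` are both dropped, only their two head cells become garbage; drop just one and the shared tail stays alive for the other.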
The other mutating element is the one we really want to make immutable: our players. Players contain an array of weightings for strategy selection, and Javaslang has a functional array we could use instead. There are two mutators in our players, increase and reduce, that can be changed to return new players built from an updated functional version of the strategy array.
At which point, instead of just an infinite stream of Double as the scores output from the tournament, we could have a richer round-outcome object with updated players, chosen strategies, and scores.
Functional data structures give us a great tool for efficiently holding all the data generated by the model. More data makes future analysis easier. That's reason number two for why you want to learn these techniques for modelling, (reason number one being their good fit for modern parallelisation).
There is one other mutating variable in the model, but it's hidden: the source of the random numbers used to make the players' strategy choices in each round. That can be resolved by using an explicitly seeded instance of Random, which would allow us to regenerate the same series of results for deeper analysis if we wanted.
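The mechanism is simple: two Random instances constructed with the same seed replay exactly the same sequence, so a whole model run can be regenerated on demand (the helper below is mine, for illustration).

```java
import java.util.Arrays;
import java.util.Random;

// Seeded randomness: the same seed always yields the same draw sequence.
public class Reproducible {
    public static long[] draws(long seed, int n) {
        Random rng = new Random(seed);  // explicit seed instead of the default
        long[] out = new long[n];
        for (int i = 0; i < n; i++) out[i] = rng.nextLong();
        return out;
    }
    public static void main(String[] args) {
        // Two independent instances, same seed: identical sequences.
        System.out.println(Arrays.equals(draws(42, 5), draws(42, 5)));  // prints true
    }
}
```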
What next? Hidden variables and unexplored state space.
The learning strategy I've implemented seems pretty dependent on the range of possible weightings. I've used a model with N^2 tokens shared between N strategies. I found that other values could be very unstable and never converge (e.g. 100 tokens per strategy to start with). That suggests this learning strategy might not be very robust. What we really want is for the expected outcomes to turn up easily for all sorts of setups; that would mean our model was robust, and robust models are very good things indeed.
I've also used a window size for the analysis that is related to the number of strategies available, and I'm limited to one window because of my technique for calculating sums over it. There's an efficient technique for calculating the sum of any sub-series in an array of numbers: keep another list that holds the cumulative total since element zero at each step. In its naive implementation, though, it has problems with arithmetic overflow. If we knew we never wanted more than some maximum window size, we could make a smarter data structure that regenerated itself sufficiently for that maximum window when it overflowed. That would be an interesting programming exercise in its own right (and I imagine I'll have to do it sooner or later anyway...).
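The naive version of that cumulative-sum trick looks like this (names mine): build the prefix totals once, then the sum over any window is a single subtraction. The overflow risk is visible in the running total, which keeps growing even when the windowed sums stay small.

```java
// Prefix sums: any sub-series sum in O(1) after O(n) preparation.
public class PrefixSums {
    // p[i] = sum of values[0..i), so p has one extra leading zero.
    public static long[] prefix(long[] values) {
        long[] p = new long[values.length + 1];
        for (int i = 0; i < values.length; i++) p[i + 1] = p[i] + values[i];
        return p;
    }
    // Sum of values[from..to) for any window, in constant time.
    public static long rangeSum(long[] p, int from, int to) {
        return p[to] - p[from];
    }
    public static void main(String[] args) {
        long[] p = prefix(new long[] {3, 1, 4, 1, 5, 9, 2, 6});
        System.out.println(rangeSum(p, 2, 6));  // prints 19 (4 + 1 + 5 + 9)
    }
}
```

Any window, of any size, now costs the same one subtraction, which is what would free the analysis from a single fixed window.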
I start each player with evenly weighted strategy choices. It would, as a rule, be better to start with random weightings. Very often with dynamic systems like this we are trying to understand how they converge on particular states, and if you start from a very regular initial state you may have accidentally limited the system to a very narrow part of the model's possible state space. I don't think that's true in this case, but it's worth exploring, and it's good experimental practice.
Conclusions
Finding Nash equilibria is hard. Functional programming could open doors to neater model implementations.
We could spend a lot more energy on this model: refining its implementation to give us access to more data to analyse, or so we could take advantage of parallelism in analysing the data; exploring different learning strategies and initial states; and playing with more sophisticated metrics, visualisations, and types of analysis.
But there are many more models I want to try out, and each of them will give us opportunities to refine our vocabulary of implementation techniques and analysis techniques, so, let's move on...
Finding Nash Equilibria, part 2
Can a simple learning model help us understand Nash equilibria? Let's look at some results.
Given the setup described in the previous post, the following graphs show the results of many runs of the model for a given N. They show the variation in the difference in payout over small windows of rounds. If the variation in difference of payout over time is small, then the players' choices of strategies represent a stable equilibrium state.
This model reproduces a finding that's fairly well attested. Even though Nash proved that stable equilibria always exist in games like this, it turns out that they can be enormously difficult to find. This is particularly true when the games are being used as models of some natural system and there are realistic constraints on the equilibrium-seeking algorithm; the lesson is that just because something exists in theory, don't expect it to manifest in a natural system.
So, what is the equilibrium-seeking algorithm we are using here? It's a type of lottery. Imagine a bucket with tickets for each strategy: say five tickets for strategy 1, five for strategy 2, and so on. Each player makes their choice by drawing a ticket from the bucket and then returning it. If they win, they randomly pick another ticket and replace it with a new ticket for the strategy that just won. If they lose, they find a ticket for the losing strategy and randomly replace it with a ticket for another strategy. Nothing changes for a tie. This is a particularly simple-minded learning strategy: it has no knowledge of the other player's strategy, nor the actual payouts, nor any memory of the specific history of plays; all the memory is embodied in the probabilities, and the only feedback is win or lose. I set it up so that for a game of N strategies each player has N tickets for each strategy, N*N tickets in total. (I've called it button shuffling in some of my code and you'll see that mentioned on the graphs below.)
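The update rules above can be sketched like this (class and method names are mine, not the repository's, and it's a sketch of the rules as described rather than the actual implementation). The key invariant is that every update moves exactly one ticket, so the total of N*N tickets never changes; only the weighting shifts toward winners and away from losers.

```java
import java.util.Random;

// A sketch of the ticket-bucket ("button shuffling") learning update.
// counts[s] holds the number of tickets for strategy s.
public class TicketBucket {
    // Draw a ticket uniformly and report which strategy it belongs to.
    public static int draw(int[] counts, Random rng) {
        int total = 0;
        for (int c : counts) total += c;
        int ticket = rng.nextInt(total);
        for (int s = 0; ; s++) {
            ticket -= counts[s];
            if (ticket < 0) return s;
        }
    }
    // Win: a randomly picked ticket becomes a ticket for the winning strategy.
    public static void win(int[] counts, int winner, Random rng) {
        counts[draw(counts, rng)]--;
        counts[winner]++;
    }
    // Lose: one of the loser's tickets becomes a ticket for another strategy.
    public static void lose(int[] counts, int loser, Random rng) {
        if (counts[loser] == 0) return;   // no ticket left to forfeit
        int other;
        do { other = rng.nextInt(counts.length); } while (other == loser);
        counts[loser]--;
        counts[other]++;
    }
}
```

Starting from the even N tickets per strategy, repeated wins pile tickets onto successful strategies and repeated losses drain them, while the bucket's total stays fixed.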
What does simple convergence look like? (N=5)
For five strategies, N=5, figure 1 shows data points that represent the difference in payout over the previous 25 rounds.
Figure 1, N=5 example games
Notice that there's some really wide variation before around 300 rounds, but after that the series settle into narrow ranges of values. That's the sign of the equilibria we're looking for, (the payouts are between -1 and +1, and in fact we're seeing the average of the squared differences in payouts over the window, so positive values only).
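To be concrete about the metric being plotted, here's a minimal sketch, (the function name is mine, not from my actual code):

```python
def windowed_variation(payout_diffs, window):
    """Mean of squared per-round payout differences over each trailing window.

    payout_diffs: one (player1 - player2) payout difference per round.
    Returns one non-negative value per complete trailing window.
    """
    return [sum(d * d for d in payout_diffs[i - window:i]) / window
            for i in range(window, len(payout_diffs) + 1)]
```

Squaring means a series that alternates between winning and losing by the same margin still shows up as a stable non-zero band, which is exactly the probabilistic-equilibrium signature discussed below.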
The blue series settles into a narrow band of choices until it finds a single stable strategy and converges on zero. This is the sort of pattern we were expecting for prisoner's dilemma style situations where in every round both players are getting the same payout.
The yellow and red series settle into narrow non-zero bands that suggest they're each in a probabilistic equilibrium state, like rock-scissors-paper converging on 50:50 win:lose. The red series in particular is interesting; it seems to be tending toward a single stable non-zero value, which we'd expect to see for a simple probabilistic equilibrium. But that means in any round one or the other player is going to lose, so they'll be constantly tweaking their probabilities, meaning it never gets to sit smoothly on that equilibrium state, (the equilibrium is really about settling on a set of probabilities for choosing a particular strategy to play).
The green series is interesting for having a slightly noisy phase typical of a probabilistic equilibrium, but around round 3400 it experiences a transition to an even more stable probabilistic equilibrium. The fact that its equilibrium is so close to zero may be meaningful. In the next post I'll talk about how we might efficiently hold onto the player states and other information to give us more data, so we can look more deeply into its behaviour.
Quirky periodic noise and late discovery of optima, (N=8)
Figure 2 is a set of outcomes for some random games with 8 strategies, N=8, window size of 64 for the variation of differences metric.
Figure 2, example N=8 games
Again we see all the series settling into a stable state after around 3000 rounds, (much later than the 300 rounds of N=5).
The blue and the red series are nice clear examples of settling into the probabilistic equilibrium that we know should be possible. That's probably true of the yellow as well. Notice how the red actually finds a simple equilibrium just after 40000 rounds; remember, it's just shuffling 64 tickets between 8 strategies!
But what is happening with green? It's clearly in a very different state to the others we've looked at so far. Up until about 6000 rounds it seems to be tending toward a wide, high band of values for its equilibrium state, but then it drops to a lower but wider band of differences. Sometimes it finds a large variation in the payout differences over the window of rounds, but at other times it finds a much smaller variation. More than that, the high-variation state seems to hover around a consistent value, and the low-variation state looks like it might be stable as well. We may be looking at a two-attractor complex system, with some weird phasing between my window size and the length of time spent in each of the two competing equilibria.
This type of complex dynamic system, with multiple stable modes of operation that it can flip between, is a classic in the study of complex systems; if that's what series like this represent then it's not surprising. I'll show more models with this behaviour in future, but for now look up the logistic map, bifurcation diagrams, and Mitchell Feigenbaum.
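As a taster for that topic, the logistic map is a one-line iteration whose long-run behaviour flips between different stable modes as its parameter r grows, (this sketch is just an illustration of mode-flipping, not part of the game model):

```python
def logistic_orbit(r, x0=0.5, skip=500, keep=8):
    """Iterate x -> r*x*(1-x), discard transients, return the distinct
    values the orbit keeps visiting (rounded for comparison)."""
    x = x0
    for _ in range(skip):       # let transient behaviour die out
        x = r * x * (1 - x)
    seen = []
    for _ in range(keep):       # sample the settled behaviour
        x = r * x * (1 - x)
        seen.append(round(x, 6))
    return sorted(set(seen))
```

At r=2.5 the orbit settles on a single fixed point; at r=3.2 it flips forever between two values, and pushing r higher doubles the count again and again on the way to chaos.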
Stability is still there to be found, (N=10)
Figure 3 shows a set of outcomes for some random games with 10 strategies, N=10, window size of 100 for the variation of difference metric.
Figure 3, N=10 example games
I've included this set of results to illustrate the kind of obvious convergence on a narrow range of differences we expected to see for straightforward probabilistic equilibria. The only truly interesting thing here is that, for only two more strategies than the previous examples, it now takes around 8000 rounds to find those equilibria.
Equilibrium keeps getting harder to find, (N=12)
Figure 4 shows a set of outcomes for some random games with 12 strategies, N=12, window size of 144 for the variation of difference metric.
Figure 4, N=12 example games
Even the nice green series here doesn't find equilibrium until after about 16000 rounds. The growth in time to find stability is really steep.
The blue series seems to find an early noisy dynamic equilibrium, (results in a narrow band), but then really sharply, just after 50000 rounds, it settles into a much narrower band, much closer to the output we expect for a simple probabilistic equilibrium.
The red series is interesting here. It's clearly settled into some sort of dynamically stable regime, with results tending to either of two values on the edges of the band of outputs, and few in the middle of the band. And it's showing what to my eye look like phasing effects, probably from my windowing strategy. I think the yellow series might be doing the same but in a less obvious way.
What does it all mean?
First, simple systems can produce complex behaviour, which will be a repeated theme in this blog.
Second, equilibrium is not a simple concept, which means automatically recognising it or quantifying it is also hard.
Third, small growth in variables (increasing N in this case) can have a serious impact on the performance of models like this. Here, runs took much longer simply because the model needed more rounds before the interesting behaviours appeared.
Fourth, finding the interesting range for a variable is an art, not a science.
Fifth, we're going to need to look more closely to understand some of these data series.
Finding Nash Equilibria, part 1
John Nash is the patron saint of game theory; the importance of his mathematical discoveries and methods across the breadth of the sciences is hard to overestimate. His profound work is perhaps even more humanly significant given his personal struggles. Thanks to "A Beautiful Mind" he is one of a very few celebrity scientists and mathematicians. His passing this year was a great tragedy and I'd like to dedicate this first model in The Coded Lens to his memory.
Nash's game theory often describes a game where two players pick from a limited menu of strategies to play against each other. Depending on the pair of strategies played in the round each player gets a payout.
The typical structure of Nash's games is as follows:
- N strategies
- 2 players
- Both players are always free to play any strategy
- Playing a strategy against another strategy has a payout
- Payouts are the same for both players
- Play is simultaneous and the payout is oblivious to history
- Rational play is defined as maximising the individual payout, (or minimising loss)
There are many customised styles of game with different constraints that have been developed to test many different theories. Most people are familiar with two games that fit the constraints listed above: The prisoner's dilemma and rock-scissors-paper.
The prisoner's dilemma is a game with N=2 and the payouts structured so that if both players collaborate they maximise the total payout and share it equally. If one cheats, the cheater wins a bigger share but the net result is worse; if they both cheat then both lose some, and while the net result is a loss, they individually lose less than if they play nice and are the victim of cheating. Mutual cheating ends up being the only stable strategy that minimises a player's loss, and in a blind competition it is the rational behaviour. A great deal of moral theorising has been informed by this game.
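To make that concrete, here's a sketch with one commonly used set of illustrative payouts, (the specific numbers are mine, chosen to satisfy the structure above):

```python
# Strategies: 0 = collaborate, 1 = cheat.
# PD[a][b] is your payout for playing a when the other player plays b.
PD = [[3, 0],
      [5, 1]]

def best_response(opponent):
    """The payout-maximising reply to a known opponent strategy."""
    return max((0, 1), key=lambda a: PD[a][opponent])
```

Whatever the opponent does, cheating pays more, so (cheat, cheat) is the stable point even though mutual collaboration gives a higher total.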
Rock-scissors-paper is a game with N=3 and the payouts structured so that there is no simple strategy to minimise losses. In fact the best way, (barring tricks of human psychology), is to make your choices randomly, with equal probability of playing any strategy.
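A quick sketch shows why uniform random play is the stable choice there, (the table is just the standard win/lose/tie payouts):

```python
# RPS[s][t] is the payout for playing s against t:
# 0 = rock, 1 = paper, 2 = scissors; win = 1, lose = -1, tie = 0.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

def expected_payout(mixed, opponent):
    """Expected payout of a mixed strategy against a fixed opponent strategy."""
    return sum(p * RPS[s][opponent] for s, p in enumerate(mixed))
```

Against the uniform mix the expected payout of every reply is zero, so the opponent has nothing to exploit; that's the probabilistic equilibrium in miniature.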
Nash's work proves some interesting truths that are present in those two games. The first is that the best rational stable result may not be the best result if you could organise to collaborate outside of the game; forced blindness cuts us off from the best outcomes. The second is that the stable result may be a probabilistic concern over many plays of the game: you can ensure you win some of the time, but you'll also lose some of the time, and the net outcome over time will minimise your loss.
The existence of Nash equilibria is often wrongly taken to mean something like there being an obvious strategy producing an optimal outcome. For instance if markets are seen as Nash games, (a moderately reasonable abstraction), then misinterpreting Nash equilibria to mean that there is a simple optimal strategy choice would (wrongly) prop up the idea of stable markets.
Beyond that there's another problem. Finding the Nash equilibrium is hard. The more strategies, (larger N), the harder it is.
Hunting Nash Equilibrium
This model sets up Nash-type games for N strategies by randomly assigning payouts to different strategy pairs and then plays them "endlessly", with players changing the probability of their most recent strategy choice based on whether it won or lost, starting from equal probabilities. The probabilities become each player's memory of past play, but the actual payouts are never available to the players, nor is the opponent's choice.
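Here's a minimal self-contained sketch of how that setup fits together, (the names and details are my own reconstruction from the description, not the actual implementation, which will appear in a later post; I've used a ticket-bucket to represent the probabilities, and a single payout table that both players read from their own side):

```python
import random

def play(n_strategies, rounds, seed=0):
    """Random game plus two lottery learners; returns per-round payout differences."""
    rng = random.Random(seed)
    n = n_strategies
    # payout[a][b] is what you receive for playing a against b, in [-1, 1].
    payout = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    # Each player starts with n tickets per strategy: equal probabilities.
    tickets = [[s for s in range(n) for _ in range(n)],
               [s for s in range(n) for _ in range(n)]]
    diffs = []
    for _ in range(rounds):
        a, b = rng.choice(tickets[0]), rng.choice(tickets[1])
        pa, pb = payout[a][b], payout[b][a]
        diffs.append(pa - pb)
        if pa == pb:
            continue  # ties change nothing
        for t, played, won in ((tickets[0], a, pa > pb), (tickets[1], b, pb > pa)):
            if won:
                # Overwrite a random ticket with one for the winning strategy.
                t[rng.randrange(len(t))] = played
            else:
                # Reassign one ticket for the losing strategy to another strategy.
                losers = [i for i, s in enumerate(t) if s == played]
                if losers:
                    new = rng.randrange(n - 1)
                    if new >= played:
                        new += 1
                    t[rng.choice(losers)] = new
    return diffs
```

Feeding the returned differences through a trailing-window average of squares gives the curves shown in the results post.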
We're using feedback from play to modify subsequent play; that's a recipe for a complex dynamic system.
The metric we'll look at is the difference in payout in each play:
- in the simple equilibrium case where both players always get the same payout, the difference in payout should be 0 when the model has converged on the equilibrium;
- if we've settled into a simple probabilistic phase then we should see the difference in payout converge on a non-zero value;
- the hardest case to analyse has multiple phases which may trap any optima-seeking heuristic until some interesting perturbation pushes them into a new comparatively stable state;
- if it hasn't converged anywhere the difference in payouts should be chaotically scattered.
In the next couple of posts I'll look at some of the results from running this model and talk about its implementation.
In the outputs we'll see all the outcomes listed above: convergence on zero, convergence on simple probabilistic equilibria, chaos, and phase shifts. We'll also see how the experimental set-up makes some of those states noisy. We'll see that settling into an equilibrium state takes much longer for small increases to N. And we'll see that some of the interesting outcomes are hard to quantify.
One of my goals with this blog is to demonstrate how some modern programming techniques can be applied to these problems. I'm mostly going to build bespoke models rather than using particular modelling packages, in part because I'm interested in the breadth of modelling approaches that might be used in different domains, so presenting them on an equal footing, built from the ground up, is fair. I also think programming from scratch might best show how simple some of the most important algorithmic models have been. Also, I'm a generalist programmer so it comes easily to me.
I'll also make some suggestions for people who want to play more with this model. One of my hopes is to make a resource that students in the sciences can play with to explore how computers and algorithmic models can be a part of their research practice.
Monday, December 14, 2015
Welcome to The Coded Lens
For centuries the microscope and the telescope have been iconic tools of scientific investigation. Now we have the rise of algorithmic modelling, a new type of lens through which we can see and better understand the world: The Coded Lens.
My training is in computing science, my day job is commercial software development, but I have enjoyed reading widely in the sciences. For me, one of the most interesting developments over recent decades has been the use of algorithmic models as tools in scientific investigations.
An important part of science is the possibility of independently verifying the results of scientific experiments, and now we have suitably powerful laboratory equipment on our desks. I've occasionally implemented models I've read about that seemed interesting, (or at least simple), and this new blog is part of my project to pull some of those implementations together in coherent form: what has been learned through them, how I found out about them, tricks and tips gleaned from implementing them myself, and reimplementing some of them for public display. And also to encourage me to go further, do more, and bite off some more challenging models.
One of the unsung heroes of modern culture is popular science writing. Many great ideas are brought to public attention through the genre, and the best of our public discourse often involves non-specialists who nonetheless have a solid understanding of a broad range of concerns because of it. (Conversely, some of the worst of public discourse is soured by people who don't bother to read some of these easily available and profoundly enlightening books.) Many of the models I'll write about are found in this literature and I'll refer to it frequently.
Hopefully I'll be able to give you some new tools to better understand our world: books to read, code to run (and to tweak yourself), even some pretty pictures. And I hope also that I can encourage you not to take the results of the scientific endeavour just at face value. Test the tests, understand more deeply than ever before, and from there demand even greater things from our scientists, who are true heroes of our time.
Code will be maintained here: https://github.com/scrawlings/CodedLens.git