Wednesday, December 30, 2015

Finding Nash Equilibria, part 3

Don't trust me. I've shown you some pretty graphs and spun some stories but you really shouldn't trust me. My code for this model is available and is fairly easy to run with open source tools that are available for almost any computer. You can find all the code for the blog on GitHub: https://github.com/scrawlings/CodedLens and I'll be making it even easier to use as I go along as well as suggesting improvements and pointing out my weaknesses. 

I've used Java to code this model. You'll need an up-to-date Java JDK installed as I've used some of the latest Java features and some state-of-the-art libraries. I may in future use some other languages but they will all build and run from the same build script.

You'll need the git source code management system installed, and you'll need to clone The Coded Lens GitHub repository (search a bit, or ask around, for the best way to do that on your system).

At the time of writing you'll need to have Gradle installed to build the code. The first thing you'll need to do is run the command: gradle build

The main routine and some constants to set N, the window size, and the number of rounds are in ConverganceVisualisation.java

There are also some useful examples of working with different parts of the model written as tests.

I like to do my development in IntelliJ IDEA and the community edition will work just fine for our purposes; you should build the IntelliJ project files with the command: gradle idea

How the model works

There are two parts to the system: the model itself which is run each round and learns from the outcome, and the series of differences in payout from which the charted metric is derived.

The model is implemented in an object oriented style with mutation: each player object contains an array that holds the weightings for its strategy, and those weightings are changed after each round. There is also a mutating structure, a circular buffer, that holds the current window of differences in payout, to help efficiently recalculate the sum over the window of most recent differences for the score analysis.

The question you should be asking is why I am emphasising that these things use mutation...

By contrast, the actual playing of the series of games, which I've called a tournament, and the publication of the difference metric from the iteration of rounds, is implemented as an infinite stream (not the loop you may have expected). It is processed using a programming idiom called list processing.

This contrast of styles, between mutating a single long-lived object and generating new objects with each round, illustrates one of the major differences in programming style much debated by programming professionals.

We transition to immutable structures and infinite lists with the results of each round, but the players themselves are mutating at each round based on the outcome of playing that round. This means the play of each round is a serial activity and it's not the worst outcome to handle it with mutating structures.

The results being immutable and in an infinite list opens up the possibility of very lightweight distribution across many processing cores before the results are reassembled into the dataset to be rendered in our chart. The mapping of the sum-of-squares-of-differences function to each step of output is computationally expensive, but because of this implementation it would be perfectly safe to process in parallel.
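To make that concrete, here is a minimal sketch of why immutability makes parallelism safe, using Java's built-in parallel streams rather than the actual tournament code. The expensiveMetric function here is a hypothetical stand-in for the per-step analysis, not anything from the repository:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelMapSketch {
    // Hypothetical stand-in for an expensive per-step metric.
    static double expensiveMetric(int step) {
        double x = step * 0.5;
        return x * x;
    }

    public static void main(String[] args) {
        // Because each input value is immutable and the function has no
        // side effects, mapping in parallel needs no locks and risks no
        // races; the results are reassembled in order at the end.
        List<Double> results = IntStream.range(0, 1000)
                .parallel()
                .mapToObj(ParallelMapSketch::expensiveMetric)
                .collect(Collectors.toList());
        System.out.println(results.size()); // 1000 results, in order
    }
}
```

The same shape applies to our metric: each round's immutable output can be handed to any core, because no core can change what another reads.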

Functional programming with the Javaslang library

I want to rave about the Javaslang library! It provides a complementary set of facilities that together enable tidy implementations of many of the functional programming idioms you might want to play with. The core of Java was moving in this direction and disappointed many professional programmers by stopping short of a complete suite of functional programming tools. Techniques that you'll see touted as strengths of many new languages are now available in Java through Javaslang.

Actually I've only used the streams and infinite streams facilities and the tuple objects in this example. But even those two things made this implementation much easier.

The key is that in a functional world we don't allow mutation, so an iterative model like this is implemented as a list representing the series of successive states. Taken to its logical conclusion, this means we would be able to inspect the complete state at any point in the series. Critically, functional data structures can make holding that much data quite efficient, effectively holding only the delta from the previous state. Sadly, in my implementation, I've used mutating players, so we lose that information.

As an example of an infinite stream I've made a test that generates an infinite stream of Fibonacci numbers, from which it drops the ones it doesn't need and takes those it cares about. The Stream.cons(head, () -> rest) idiom is central to this technique (this is clearly a Lisp-style cons list). It looks like recursion, but instead of filling up the stack the pseudo-recursive call is wrapped in a lambda and deferred until the value is needed; the stream is only theoretically infinite, and in practice it only generates the values asked for.
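To show the mechanics without pulling in the library, here is a hand-rolled sketch of the same idiom in plain Java. This is not Javaslang's implementation, just the bare cons-with-a-deferred-tail idea applied to Fibonacci:

```java
import java.util.function.Supplier;

// A minimal lazy cons list, sketching the idea behind Javaslang's
// Stream.cons(head, () -> rest). Each tail is a lambda, so nothing
// beyond the current head exists until it is requested.
public class LazyStream {
    final long head;
    final Supplier<LazyStream> tail; // deferred pseudo-recursive call

    LazyStream(long head, Supplier<LazyStream> tail) {
        this.head = head;
        this.tail = tail;
    }

    // The "infinite" Fibonacci stream: fib(0, 1) yields 0, 1, 1, 2, 3, ...
    static LazyStream fib(long a, long b) {
        return new LazyStream(a, () -> fib(b, a + b));
    }

    // Fetch the nth element by forcing exactly n tails.
    static long nth(LazyStream s, int n) {
        while (n-- > 0) s = s.tail.get();
        return s.head;
    }

    public static void main(String[] args) {
        System.out.println(nth(fib(0, 1), 10)); // prints 55
    }
}
```

Note that nth walks the stream iteratively, so even large indices don't fill the stack, and only the eleven values up to index 10 are ever generated.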

Danger! Be warned! You'll notice in my test and in the tournament itself that I'm not holding onto a reference to the infinite list. I'm not assigning it to some variable with a nice name, which in many situations would be considered good style. If I did that I'd be holding onto the head of the list, and the head of the list points to the next item, and so on for every list item generated, eventually running out of memory. This is called holding onto the head and is a common gotcha in this style of programming. It also reveals a subtle truth about programming languages: processing lines in order, one after the other, implies that state is held from one line to the next and that mutations made in one line are available in subsequent lines. This seemingly obvious and necessary truth causes all sorts of grief, particularly when you want to start making use of all the cores on your computer, or when your models get so big that you need to distribute them across networks and cloud computing settings.

I'm going to be using these functional programming techniques a lot, and eventually I'll show you how they can actually be used to take advantage of parallelism in modern computing architectures to get better performance of larger and more complex models.

What next? Hanging onto the state of the players at each round for later analysis.

One frustration I had with my implementation is that by the time I was doing analysis of data I only had the difference of payout in each round to work with. Because I'd mutated the players at each round I couldn't look at them to see what patterns and changes to strategies might underlie my visible results.

I mutate two different structures. First, a rolling window of recent values, so that I don't have to repeat the summation work across the whole window each round. Basically, I subtract the value that's fallen out of the window and add the new value that is the new head of the window, keeping a rolling sum that way. This structure is a queue, and Javaslang comes to the rescue with a functional data structure for the job.
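The bookkeeping is simple enough to sketch. This is a plain-Java version using a mutable ArrayDeque, not the Javaslang queue from the actual implementation, but the subtract-the-leaver, add-the-arrival logic is the same:

```java
import java.util.ArrayDeque;

// A rolling-window sum: each update subtracts the value leaving the
// window and adds the value entering it, so updates are O(1) no matter
// how wide the window is.
public class RollingSum {
    private final int windowSize;
    private final ArrayDeque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    public RollingSum(int windowSize) {
        this.windowSize = windowSize;
    }

    // Push a new value and return the sum over the current window.
    public double add(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            sum -= window.removeFirst(); // the value falling out
        }
        return sum;
    }

    public static void main(String[] args) {
        RollingSum rs = new RollingSum(3);
        rs.add(1); rs.add(2); rs.add(3);
        System.out.println(rs.add(4)); // window is [2, 3, 4]: prints 9.0
    }
}
```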

Functional data structures are immutable: if you use an operation to change one, what really happens is you get a new reference to a structure that shares the unchanged parts of the old structure but covers it with some shadowing elements that represent your changes. Importantly, if you don't hold onto the old references then the shadowed parts can be released. This helps with safe sharing in multithreaded systems and also helps us to be efficient with memory use.

The other mutating element is the one we really want to make immutable: our players. Players contain an array of weightings for strategy selection, and Javaslang has a functional array which we could use instead. There are two mutators in our players, increase and reduce, that could be changed to return new players built with new references from updating a functional version of the strategy array.
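Here is a hypothetical sketch of what that refactoring could look like. It is not the code in the repository (which mutates), and for self-containedness it copies a plain array rather than using Javaslang's functional array, but the shape of the change is the same: increase and reduce return new players instead of touching the old one.

```java
// A hypothetical immutable Player: mutators return a fresh Player with
// an updated copy of the weightings, leaving the original untouched.
public final class Player {
    private final double[] weightings;

    public Player(double[] weightings) {
        this.weightings = weightings.clone(); // defensive copy
    }

    // Returns a NEW player; the receiver is unchanged.
    public Player increase(int strategy, double amount) {
        double[] next = weightings.clone();
        next[strategy] += amount;
        return new Player(next);
    }

    public Player reduce(int strategy, double amount) {
        return increase(strategy, -amount);
    }

    public double weighting(int strategy) {
        return weightings[strategy];
    }
}
```

With Javaslang's functional array the clone would be replaced by structural sharing, so each new player would hold only the delta from its predecessor.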

At which point, instead of just an infinite stream of Double as the scores output from the tournament, we could have a richer round-outcome object with updated players, chosen strategies, and scores.

Functional data structures give us a great tool for efficiently holding all the data generated by the model. More data makes future analysis easier. That's reason number two for why you want to learn these techniques for modelling, (reason number one being their good fit for modern parallelisation).

There is one other mutating variable in the model, but it's hidden. That's the source of the random numbers used to make the players' strategy choices in each round. That could be resolved by using an explicitly seeded instance of Random, which would allow us to regenerate the same series of results for deeper analysis if we wanted.
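A quick illustration of why seeding gives us reproducibility (the method name here is made up for the example):

```java
import java.util.Random;

// Two Random instances built from the same seed produce identical
// sequences, so a whole tournament could be replayed exactly.
public class SeededRuns {
    // Hypothetical helper: sum the first three draws from a seeded source.
    static long firstThreeSum(long seed) {
        Random r = new Random(seed);
        return r.nextInt(100) + r.nextInt(100) + r.nextInt(100);
    }

    public static void main(String[] args) {
        // Same seed, same results, run after run.
        System.out.println(firstThreeSum(42L) == firstThreeSum(42L)); // true
    }
}
```

Passing a seeded Random into the model explicitly, rather than hiding it inside the players, would also make this last piece of mutable state visible at the boundary.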

What next? Hidden variables and unexplored state space.

The learning strategy I've implemented seems pretty dependent on the range of possible weightings. I've used a model with N^2 tokens shared between N strategies. I found that other values could be very unstable and never converge, (e.g. 100 tokens per strategy to start with). That suggests this learning strategy might not be very robust. What we really want is for the expected outcomes to turn up easily for all sorts of setups, that would mean our model was robust, and robust models are very good things indeed.

I've also used a window size for the analysis that is related to the number of strategies available, and I'm limited to one window because of my technique for calculating sums over the window. There's an efficient technique for calculating the sum of any sub-series in an array of numbers that involves keeping another list that holds the total sum since element zero at each step, but in its naive implementation it has problems with arithmetic overflow. If we knew we never wanted more than some maximum window size we could make a smarter data structure that regenerated itself sufficiently for the required maximum window size when it overflowed. This would be an interesting programming exercise in its own right, (and I imagine I'll have to do it sooner or later anyway...).
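That running-total technique is usually called a prefix-sum array, and a minimal sketch shows both its power and the overflow caveat. Using long rather than int only postpones the overflow problem rather than removing it:

```java
// Prefix sums: prefix[k] holds the total of the first k values, so the
// sum of any sub-series values[i..j-1] is prefix[j] - prefix[i], in
// O(1) after O(n) preprocessing. This supports ANY window size, unlike
// the single rolling window in the current implementation.
public class PrefixSums {
    static long[] prefixes(int[] values) {
        long[] prefix = new long[values.length + 1];
        for (int i = 0; i < values.length; i++) {
            prefix[i + 1] = prefix[i] + values[i]; // overflow risk lives here
        }
        return prefix;
    }

    static long rangeSum(long[] prefix, int i, int j) {
        return prefix[j] - prefix[i];
    }

    public static void main(String[] args) {
        long[] p = prefixes(new int[]{1, 2, 3, 4, 5});
        System.out.println(rangeSum(p, 1, 4)); // 2 + 3 + 4 = 9
    }
}
```

The smarter structure suggested above would notice when the running total approached overflow and rebase itself, keeping only enough history for the maximum window size it promised to support.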

I start each player with evenly weighted strategy choices. It would, as a rule, be better to start with random weightings. Very often with dynamic systems like this we are trying to understand how they converge on particular states, and if you start from a very regular initial state you may have accidentally limited the system to a very narrow part of the model's possible state space. I don't think that's true in this case, but it's worth exploring, and it's good experimental practice.

Conclusions

Finding Nash equilibria is hard. Functional programming could open doors to neater model implementations.

We could spend a lot more energy on this model: refining its implementation to give us access to more data to analyse, or so we could take advantage of parallelism in analysing the data; exploring different learning strategies and initial states; and playing with more sophisticated metrics, visualisations, and types of analysis.

But there are many more models I want to try out, and each of them will give us opportunities to refine our vocabulary of implementation techniques and analysis techniques, so, let's move on...
