Now we can get an intuitive understanding of the learning rate parameter α
and the softmax parameter β by changing their values, and seeing how the
simulated subject behaves.
Just set the parameters to different values, save, and run.
Try this for a range of different parameter values.
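To get a feel for what to expect before running the full simulation, here is a minimal stand-alone sketch. It is not the tutorial's own code. It assumes the standard softmax choice rule and a simple prediction-error (delta-rule) update, and the stimulus values, parameter ranges, and the always-rewarded trials are made up for illustration.

```matlab
% Hypothetical illustration; variable names and numbers are assumptions.

% Effect of the softmax parameter beta on choice probabilities
v = [0.8 0.2];                 % assumed current values of the two stimuli
betas = [0 1 5 20];            % softmax parameter values to compare
for beta = betas
    p = exp(beta*v) ./ sum(exp(beta*v));   % softmax choice probabilities
    fprintf('beta = %4.1f: p(choose stimulus 1) = %.3f\n', beta, p(1));
end

% Effect of the learning rate alpha on how fast a value is updated
alphas = [0.1 0.5 0.9];        % learning rates to compare
nTrials = 10;
for alpha = alphas
    value = 0.5;               % initial value
    for t = 1:nTrials
        reward = 1;            % assumption: every trial is rewarded
        value = value + alpha*(reward - value);   % prediction-error update
    end
    fprintf('alpha = %.1f: value after %d rewarded trials = %.3f\n', ...
        alpha, nTrials, value);
end
```

Compare the printed numbers with what you see when you change α and β in the tutorial scripts.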
We have looked at the effect of the parameters, but one assumption that we haven't talked about yet
is the initial value of the stimuli:

On the very first trial, what do we expect to get when we pick a particular stimulus?
So far, we have assumed that the subject doesn't have an initial preference, so the initial value is 0.5
for each stimulus.
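As a quick check of what "no initial preference" means, the sketch below computes first-trial choice probabilities from equal initial values. It assumes the standard softmax rule; the β values are arbitrary examples, and the code is independent of the tutorial files.

```matlab
% Hypothetical illustration; assumes the standard softmax choice rule.
v0 = .5*ones(1,2);             % equal initial values, as assumed so far
betas = [1 5 20];              % arbitrary example softmax parameters
pFirst = zeros(size(betas));   % p(choose stimulus 1) on the first trial
for k = 1:numel(betas)
    p = exp(betas(k)*v0) ./ sum(exp(betas(k)*v0));
    pFirst(k) = p(1);
    fprintf('beta = %4.1f: p(choose stimulus 1) on trial 1 = %.3f\n', ...
        betas(k), pFirst(k));
end
```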
But perhaps you are Dutch and have a particular faith in orange.
Or your next-door neighbour told you that the blue slot machine is really good.
In that case, you might start out believing that one stimulus is better than the other.
If you would like to play with this option, open the file RLtutorial_simulate.m and change the values of v0:
v0 = .5*ones(1,2); % initial value
You can even set different initial values for each stimulus, for example:
v0 = [0.8 0.5]; % initial value
Then save, and run RLtutorial_main.m again. What do you conclude about the effect of the initial values?
Don't forget to change the values back to 0.5 when you are done!
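If you want to experiment without touching the tutorial files, the following stand-alone sketch runs the same kind of comparison. It is only an approximation of what RLtutorial_simulate.m might do: the reward probabilities, parameter values, and number of trials are assumptions, and it uses a softmax choice rule with a prediction-error update of the chosen stimulus only.

```matlab
% Minimal sketch (hypothetical, not the tutorial's RLtutorial_simulate.m).
rng(1);                          % fix the random seed for reproducibility
alpha = 0.3; beta = 5;           % assumed learning rate and softmax parameter
pReward = [0.8 0.2];             % assumed reward probability per stimulus
nTrials = 100;
v0sets = [0.5 0.5; 0.8 0.5];     % each row is one choice of initial values
pChoose1 = zeros(size(v0sets,1), nTrials);
for s = 1:size(v0sets,1)
    v = v0sets(s,:);
    for t = 1:nTrials
        p = exp(beta*v) ./ sum(exp(beta*v));   % softmax choice probabilities
        pChoose1(s,t) = p(1);
        choice = 1 + (rand > p(1));            % 1 with probability p(1), else 2
        reward = rand < pReward(choice);       % binary reward
        % update the value of the chosen stimulus only
        v(choice) = v(choice) + alpha*(reward - v(choice));
    end
end
fprintf('mean p(choose 1), trials 1-10:   %.3f vs %.3f\n', ...
    mean(pChoose1(1,1:10)), mean(pChoose1(2,1:10)));
fprintf('mean p(choose 1), trials 91-100: %.3f vs %.3f\n', ...
    mean(pChoose1(1,91:100)), mean(pChoose1(2,91:100)));
```

Each row of v0sets is simulated separately, so you can add further rows to try other initial values.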