Prisoner’s Dilemma
Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of communicating with the other. The prosecutors lack sufficient evidence to convict the pair on the principal charge, but they have enough to convict both on a lesser charge. Simultaneously, the prosecutors offer each prisoner a bargain. Each prisoner is given the opportunity either to betray the other by testifying that the other committed the crime, or to cooperate with the other by remaining silent. The offer is:
If A and B each betray the other, each of them serves 2 years in prison.
If A betrays B but B remains silent, A will be set free and B will serve 3 years in prison (and vice versa).
If A and B both remain silent, both will serve only 1 year in prison (on the lesser charge).
If two players play the prisoner’s dilemma more than once in succession, remember their opponent’s previous actions, and change their strategy accordingly, the game is called the iterated prisoner’s dilemma. The iterated game is fundamental to some theories of human cooperation and trust. On the assumption that the game can model transactions between two people that require trust, cooperative behavior in populations may be modeled by a multi-player, iterated version of the game, which I will explore in today’s Anecdote (below).
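The offer above can be written down as a small lookup table. The following is a minimal sketch, not part of the original piece: the names `SENTENCES`, `sentence`, and `best_reply` are illustrative assumptions, and "C" and "D" stand for staying silent (cooperating) and betraying (defecting). It also checks the usual single-shot reasoning, namely that betrayal gives a prisoner the shorter sentence whatever the other prisoner does.

```python
# Prison sentences in years, indexed by (A's move, B's move).
# "C" = cooperate (remain silent), "D" = defect (betray).
# The numbers are taken from the offer described in the text.
SENTENCES = {
    ("C", "C"): (1, 1),  # both silent: 1 year each on the lesser charge
    ("C", "D"): (3, 0),  # A silent, B betrays: A serves 3, B goes free
    ("D", "C"): (0, 3),  # A betrays, B silent: A goes free, B serves 3
    ("D", "D"): (2, 2),  # both betray: 2 years each
}


def sentence(move_a, move_b):
    """Return (years for A, years for B) for a single round."""
    return SENTENCES[(move_a, move_b)]


def best_reply(opponent_move):
    """Return the move that minimizes A's own sentence against a fixed move by B."""
    return min("CD", key=lambda move: sentence(move, opponent_move)[0])


if __name__ == "__main__":
    for other in "CD":
        print(f"If B plays {other}, A's best reply is {best_reply(other)}")
```

Under these numbers the best reply is to betray in both cases, even though mutual silence (1 year each) leaves both prisoners better off than mutual betrayal (2 years each).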
In the iterated prisoner’s dilemma, the classic game is played repeatedly between the same prisoners, who each round have the opportunity to penalize the other for previous decisions. If the number of rounds is known to the players, then (by backward induction) two classically rational players will betray each other in every round, for the same reasons as in the single-shot variant. But in a game of infinite or unknown length there is no fixed optimum strategy, and prisoner’s dilemma tournaments have been held to pit algorithms against one another in an attempt to determine which strategies perform best.
In such tournaments, when these encounters are repeated over a long period of time with many players, each with different strategies, greedy strategies tend to do very poorly in the long run while more altruistic strategies do better, as judged purely by self-interest. The best-known winning strategy is “tit for tat”. The strategy is simply to cooperate on the first iteration of the game; after that, the player does what his or her opponent did on the previous move. Depending on the situation, a slightly better strategy can be “tit for tat with forgiveness”: when the opponent defects, the player sometimes cooperates on the next move anyway. This allows for occasional recovery from getting trapped in a cycle of defections.
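The two strategies just described can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than any tournament's actual code: the function names, the `(own_history, opp_history)` signature, and the `forgive_prob` parameter are all hypothetical, and the sentences reuse the numbers from the offer above (fewer years is better).

```python
import random

# Years served, indexed by (A's move, B's move); "C" = cooperate, "D" = defect.
SENTENCES = {
    ("C", "C"): (1, 1),
    ("C", "D"): (3, 0),
    ("D", "C"): (0, 3),
    ("D", "D"): (2, 2),
}


def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opp_history[-1] if opp_history else "C"


def make_forgiving_tit_for_tat(forgive_prob, rng=None):
    """Build a tit-for-tat variant that forgives a defection with probability forgive_prob."""
    rng = rng or random.Random()

    def strategy(own_history, opp_history):
        if not opp_history or opp_history[-1] == "C":
            return "C"
        # The opponent just defected: sometimes cooperate anyway.
        return "C" if rng.random() < forgive_prob else "D"

    return strategy


def always_defect(own_history, opp_history):
    """A purely greedy strategy, used here only as an opponent."""
    return "D"


def play_match(strategy_a, strategy_b, rounds):
    """Play a fixed number of rounds; return total years served by A and by B."""
    hist_a, hist_b = [], []
    years_a = years_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        served_a, served_b = SENTENCES[(move_a, move_b)]
        years_a += served_a
        years_b += served_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return years_a, years_b


if __name__ == "__main__":
    print(play_match(tit_for_tat, tit_for_tat, 10))
    print(play_match(tit_for_tat, always_defect, 10))
```

Over 10 rounds, two tit-for-tat players serve 10 years each, while tit for tat against a constant defector serves 21 years to the defector's 18: it is exploited once, in the first round, and then matches the defection. How often the forgiving variant should forgive is left open in the text, so `forgive_prob` is a free parameter here.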
Almost all top-scoring strategies are “optimistic”, meaning they never defect before their opponent does; a purely selfish strategy will therefore avoid “cheating” on its opponent, for purely self-interested reasons. However, a successful strategy must not be a blind optimist. It must sometimes retaliate. An example of a non-retaliating strategy is “always cooperate”. This is a very bad choice, as “nasty” strategies ruthlessly exploit such players. Successful strategies must also be “forgiving”: though they retaliate, they fall back to cooperating once the opponent stops defecting. This stops long runs of revenge and counter-revenge. The last important quality is being “non-envious”, that is, not striving to score more than the opponent.
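These qualities can be checked on short matches. The sketch below is an illustration only: the helper names (`defects_first`, `defect_once`, `play_match`) are assumptions made for this example, the sentences come from the offer above, and a handful of hand-picked opponents is not a tournament.

```python
# Years served, indexed by (A's move, B's move); "C" = cooperate, "D" = defect.
SENTENCES = {
    ("C", "C"): (1, 1),
    ("C", "D"): (3, 0),
    ("D", "C"): (0, 3),
    ("D", "D"): (2, 2),
}


def tit_for_tat(own_history, opp_history):
    return opp_history[-1] if opp_history else "C"


def always_cooperate(own_history, opp_history):
    return "C"  # the "blind optimist": never retaliates


def always_defect(own_history, opp_history):
    return "D"  # a "nasty" strategy


def defect_once(own_history, opp_history):
    # Hypothetical opponent: defects in round 3 only, cooperates otherwise.
    return "D" if len(own_history) == 2 else "C"


def play_match(strategy_a, strategy_b, rounds):
    """Return (years for A, years for B, A's moves, B's moves)."""
    hist_a, hist_b = [], []
    years_a = years_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        served_a, served_b = SENTENCES[(move_a, move_b)]
        years_a += served_a
        years_b += served_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return years_a, years_b, hist_a, hist_b


def defects_first(own_moves, opp_moves):
    """True if this side defects before the opponent has ever defected."""
    for own, opp in zip(own_moves, opp_moves):
        if own == "D":
            return True
        if opp == "D":
            return False
    return False


if __name__ == "__main__":
    # Blind optimism is exploited: 30 years against 0 over 10 rounds.
    print(play_match(always_cooperate, always_defect, 10)[:2])
    # Tit for tat retaliates once in round 4, then forgives.
    years_a, years_b, moves_a, moves_b = play_match(tit_for_tat, defect_once, 6)
    print("".join(moves_a), "".join(moves_b), years_a, years_b)
```

In the second match tit for tat plays CCCDCC against CCDCCC and both sides end on 7 years: one retaliation, then a return to cooperation. In none of these matches does tit for tat serve fewer years than its opponent, which is the “non-envious” property in miniature.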
* * *
Anecdote
"Global trade is a never-ending prisoner’s dilemma," said the CIO, explaining that we’re collectively better off through cooperation but nevertheless periodically seek short-term advantage by exploiting the blindly optimistic.
"The most successful prisoner’s dilemma strategy is ‘tit for tat’ in which both players cooperate initially." But once one player chooses non-cooperation, the opponent retaliates with non-cooperation in the subsequent round, making both worse off. To interrupt that destructive cycle, one player will periodically choose to cooperate despite their opponent’s non-cooperation, signaling a desire to return to the optimal state where both cooperate and everyone is better off.
"The US and China are not cooperating. And Trump will obviously not do a deal ahead of November elections." Every deal requires compromise, and Trump won’t allow his critics to criticize any concessions ahead of mid-terms, so no deal will be done.
"What makes this round particularly interesting is that in prisoner’s dilemma, both non-cooperators are punished. But today, only China is penalized. Trump is rewarded. With each escalation, his popularity rises as does the S&P 500." The massive tax stimulus and budget deficit have temporarily insulated America from non-cooperation’s consequences.
"This creates a highly unstable equilibrium. It puts China in an impossible negotiating position."
To restore balance, Beijing must impose a material cost on the US for its non-cooperation. "China’s revealed preference to impose pain on America is through a dramatic renminbi devaluation." China’s currency has now fallen 6% in a straight line over 4wks, which hurt emerging markets but hasn’t yet punished the S&P 500.
"Will Trump continue to turn up the heat unless his popularity falters and the S&P falls hard? Yes. And won’t China continue to attempt to devalue its currency as far as it must to restore its negotiating position? Yes."
"And is there anything else in global markets to focus on right now? No."
Submitted by Eric Peters, CIO of One River Asset Management