代码由我原创,有问题可以留言。本次分享会讲解怎么求解囚徒问题,囚徒问题本质上来讲是线性规划问题。
如上是北太天元的版本求解结果,代码可以扫码获取。(北太代码由老师原创)
下面这个是杉数求解器代码:
from coptpy import *
env=Envr()
# 创建一个新的模型
model = env.createModel("two_prisoners_dilemma")
# 创建变量
x = model.addVars(2, vtype=COPT.BINARY, nameprefix="x")
# 设置囚徒问题的奖励矩阵
# CC 对应 (3, 3)
# CD 对应 (0, 10)
# DC 对应 (10, 0)
# DD 对应 (9, 9)
# 设置目标函数为囚徒A的奖励
model.setObjective(3 * x[0] + 10 * (1 - x[0]), COPT.MAXIMIZE)
# 添加约束条件,确保囚徒B的策略与囚徒A的策略相匹配
model.addConstr(x[1] == 1 - x[0], "constraint")
# 求解模型
model.solve()
# 打印结果
if model.status == COPT.OPTIMAL:
strategy_A = "Confess" if x[0].x > 0.5 else "Deny"
strategy_B = "Confess" if x[1].x > 0.5 else "Deny"
print(f"Optimal strategy for Prisoner A: {strategy_A}")
print(f"Optimal strategy for Prisoner B: {strategy_B}")
print(f"Objective value (Prisoner A's reward): {model.objVal}")
else:
print("No optimal solution found")
这个是杉数求解器代码,我们看看思路。
这个地方我们求解的思路和北太不一样,我们的思路是寻找利益最大化,然后反向求解这个问题。
现在我们来看看求解结果:
Cardinal Optimizer v7.1.3. Build date Apr 29 2024
Copyright Cardinal Operations 2024. All Rights Reserved
Model fingerprint: 1d5dbc9
Using Cardinal Optimizer v7.1.3 on Windows
Hardware has 4 cores and 8 threads. Using instruction set X86_AVX512_E1 (14)
Maximizing a MIP problem
The original problem has:
1 rows, 2 columns and 2 non-zero elements
2 binaries
Starting the MIP solver with 8 threads and 32 tasks
Presolving the problem
Problem was solved during presolve
Best solution : 10.000000000
Best bound : 10.000000000
Best gap : 0.0000%
Solve time : 0.02
Solve node : 0
MIP status : solved
Solution status : integer optimal (relative gap limit 0.0001)
Violations : absolute relative
bounds : 0 0
rows : 0 0
integrality : 0
Optimal strategy for Prisoner A: Deny
Optimal strategy for Prisoner B: Confess
Objective value (Prisoner A's reward): 10.0
这个是求解结果,我们反向思考这个问题如果A抗拒,对方利益就最大化了,所以应该选择坦白,B同理可得。