ding.league.algorithm

pfsp(win_rates, weighting)

Overview

Prioritized Fictitious Self-Play (PFSP) algorithm. Applies a weighting function to win_rates to obtain a priority for each opponent, then normalizes the priorities into selection probabilities.

Arguments:
- win_rates (np.ndarray): a NumPy array of win rates between one player and N opponents, shape (N,)
- weighting (str): PFSP weighting function type, either 'squared' or 'variance'; refer to weighting_func in the source code below

Returns:
- probs (np.ndarray): a NumPy array holding the probability with which each opponent is selected, shape (N,)
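To make the weighting step concrete, here is a minimal sketch that restates the logic locally so it runs without DI-engine installed. The name `pfsp_sketch` is hypothetical; the two weighting functions and the all-zero fallback are copied from the source listing on this page. With the package installed, the equivalent call would presumably go through `ding.league.algorithm.pfsp`.

```python
import numpy as np


def pfsp_sketch(win_rates: np.ndarray, weighting: str) -> np.ndarray:
    """Local stand-in mirroring the documented pfsp behavior (hypothetical name)."""
    weighting_func = {
        'squared': lambda x: (1 - x) ** 2,  # favors opponents we rarely beat
        'variance': lambda x: x * (1 - x),  # favors opponents near a 50% win rate
    }
    if weighting not in weighting_func:
        raise KeyError("invalid weighting arg: {} in pfsp".format(weighting))
    # All-zero win rates fall back to uniform selection, as in the source.
    if win_rates.sum() < 1e-8:
        return np.full_like(win_rates, 1.0 / len(win_rates))
    priority = weighting_func[weighting](win_rates)
    # Normalize the priorities so they sum to 1.
    return priority / priority.sum()


win_rates = np.array([0.2, 0.5, 0.8])
squared_probs = pfsp_sketch(win_rates, 'squared')
variance_probs = pfsp_sketch(win_rates, 'variance')
print(squared_probs)
print(variance_probs)
```

In this example, 'squared' puts most of the probability on the opponent with the 0.2 win rate, while 'variance' peaks at the opponent with the 0.5 win rate.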

uniform(win_rates)

Overview

Uniform opponent selection algorithm. Select an opponent uniformly, regardless of historical win rates.

Arguments:
- win_rates (np.ndarray): a NumPy array of win rates between one player and N opponents, shape (N,)

Returns:
- probs (np.ndarray): a NumPy array of uniform probabilities, shape (N,)
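The sketch below shows what uniform selection looks like in practice and how the returned probabilities might be used to draw an opponent. `uniform_sketch` is a hypothetical local stand-in for the documented function, and the sampling step with `numpy.random.Generator.choice` is an assumption about usage, not something this page prescribes.

```python
import numpy as np


def uniform_sketch(win_rates: np.ndarray) -> np.ndarray:
    """Local stand-in for the documented uniform function (hypothetical name)."""
    # np.full_like reuses the shape and dtype of win_rates, so a float array is assumed.
    return np.full_like(win_rates, 1.0 / len(win_rates))


win_rates = np.array([0.9, 0.1, 0.5, 0.3])
probs = uniform_sketch(win_rates)
print(probs)  # every entry is 0.25, whatever the win rates are

# Draw one opponent index according to the probabilities.
rng = np.random.default_rng(0)
opponent_idx = rng.choice(len(probs), p=probs)
```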

Full Source Code

../ding/league/algorithm.py

import numpy as np


def pfsp(win_rates: np.ndarray, weighting: str) -> np.ndarray:
    """
    Overview:
        Prioritized Fictitious Self-Play algorithm.
        Process win_rates with a weighting function to get priority, then calculate the selection probability of each.
    Arguments:
        - win_rates (:obj:`np.ndarray`): a numpy ndarray of win rates between one player and N opponents, shape(N)
        - weighting (:obj:`str`): pfsp weighting function type, refer to ``weighting_func`` below
    Returns:
        - probs (:obj:`np.ndarray`): a numpy ndarray of probability at which one element is selected, shape(N)
    """
    weighting_func = {
        'squared': lambda x: (1 - x) ** 2,
        'variance': lambda x: x * (1 - x),
    }
    if weighting in weighting_func.keys():
        fn = weighting_func[weighting]
    else:
        raise KeyError("invalid weighting arg: {} in pfsp".format(weighting))

    assert isinstance(win_rates, np.ndarray)
    assert win_rates.shape[0] >= 1, win_rates.shape
    # all zero win rates case, return uniform selection prob
    if win_rates.sum() < 1e-8:
        return np.full_like(win_rates, 1.0 / len(win_rates))
    fn_win_rates = fn(win_rates)
    probs = fn_win_rates / fn_win_rates.sum()
    return probs


def uniform(win_rates: np.ndarray) -> np.ndarray:
    """
    Overview:
        Uniform opponent selection algorithm. Select an opponent uniformly, regardless of historical win rates.
    Arguments:
        - win_rates (:obj:`np.ndarray`): a numpy ndarray of win rates between one player and N opponents, shape(N)
    Returns:
        - probs (:obj:`np.ndarray`): a numpy ndarray of uniform probability, shape(N)
    """
    return np.full_like(win_rates, 1.0 / len(win_rates))
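Reading the listing above, the uniform fallback in pfsp triggers only when the win rates themselves sum to nearly zero. If the weighted priorities sum to zero instead (for example, every win rate equals 1.0 under 'squared'), the final division appears to be 0/0, which NumPy evaluates to NaN. A caller who wants to avoid that could normalize defensively. The helper `normalize_with_fallback` below is hypothetical and not part of the library.

```python
import numpy as np


def normalize_with_fallback(priority: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Hypothetical helper: normalize priorities, or return uniform if they sum to ~0."""
    total = priority.sum()
    if total < eps:
        # Nothing to prioritize, so treat every opponent equally.
        return np.full(priority.shape, 1.0 / len(priority))
    return priority / total


# Every opponent is beaten 100% of the time, so (1 - x) ** 2 is zero everywhere.
win_rates = np.array([1.0, 1.0, 1.0])
priority = (1 - win_rates) ** 2
probs = normalize_with_fallback(priority)
print(probs)
```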