ding.worker.collector.metric_serial_evaluator
IMetric
Bases: ABC
gt(metric1, metric2)
abstractmethod
Overview
Whether metric1 is greater than or equal to metric2 (>=).
.. note:: If metric2 is None, return True.
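A minimal sketch of the gt contract described above. MetricBase and AccuracyMetric are hypothetical names introduced for illustration; MetricBase stands in for IMetric and declares only the documented gt method, so the snippet runs without DI-engine installed.

```python
from abc import ABC, abstractmethod
from typing import Optional


class MetricBase(ABC):
    """Hypothetical stand-in for IMetric, reduced to the documented gt method."""

    @abstractmethod
    def gt(self, metric1: float, metric2: Optional[float]) -> bool:
        raise NotImplementedError


class AccuracyMetric(MetricBase):
    """Illustrative metric where a higher value is better."""

    def gt(self, metric1: float, metric2: Optional[float]) -> bool:
        # Nothing to compare against yet, so the new metric wins by definition.
        if metric2 is None:
            return True
        # "Greater than" is inclusive here, matching the (>=) in the overview.
        return metric1 >= metric2


metric = AccuracyMetric()
first = metric.gt(0.8, None)   # no previous best
tie = metric.gt(0.8, 0.8)      # equal values count as "greater"
worse = metric.gt(0.7, 0.8)    # strictly lower value
```

Because the comparison is inclusive, a tie with the previous best is treated as a new best.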
MetricSerialEvaluator
Bases: ISerialEvaluator
Overview
Metric serial evaluator class. The policy is evaluated by an objective metric (env).
Interfaces: __init__, reset, reset_policy, reset_env, close, should_eval, eval
Property: env, policy
__init__(cfg, env=None, policy=None, tb_logger=None, exp_name='default_experiment', instance_name='evaluator')
Overview
Init method. Load config and use self._cfg setting to build common serial evaluator components,
e.g. logger helper, timer.
Arguments:
- cfg (:obj:EasyDict): Configuration EasyDict.
reset_env(_env=None)
Overview
Reset the evaluator's environment. In some cases, the evaluator needs to use the same policy in different environments; reset_env makes this possible. If _env is not None, replace the old environment in the evaluator with the new one.
Arguments:
- _env (:obj:Optional[Tuple[DataLoader, IMetric]]): Tuple of the DataLoader and IMetric instances.
reset_policy(_policy=None)
Overview
Reset the evaluator's policy. In some cases, the evaluator needs to work in the same environment but use a different policy; reset_policy makes this possible. If _policy is None, reset the old policy. If _policy is not None, replace the old policy in the evaluator with the newly passed-in policy.
Arguments:
- _policy (:obj:Optional[namedtuple]): The API namedtuple of the eval_mode policy.
reset(_policy=None, _env=None)
Overview
Reset the evaluator's policy and environment, so that the new policy and environment are used for evaluation. If _env is not None, replace the old environment in the evaluator with the new one. If _policy is None, reset the old policy. If _policy is not None, replace the old policy in the evaluator with the newly passed-in policy.
Arguments:
- _policy (:obj:Optional[namedtuple]): The API namedtuple of the eval_mode policy.
- _env (:obj:Optional[Tuple[DataLoader, IMetric]]): Tuple of the DataLoader and IMetric instances.
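The reset rules above can be sketched with a small stand-in class. CountingPolicy and ResettableEvaluator are hypothetical and only mirror the documented behaviour; they are not the real MetricSerialEvaluator. The sketch also assumes that a newly passed-in policy is reset after being installed, which the text above does not state explicitly.

```python
from typing import Any, Optional


class CountingPolicy:
    """Hypothetical policy stub that only counts how often reset() is called."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.resets = 0

    def reset(self) -> None:
        self.resets += 1


class ResettableEvaluator:
    """Hypothetical stand-in that mirrors only the documented reset rules."""

    def __init__(self, env: Any, policy: CountingPolicy) -> None:
        self._env = env
        self._policy = policy

    def reset_env(self, _env: Optional[Any] = None) -> None:
        # Replace the environment only when a new one is passed in.
        if _env is not None:
            self._env = _env

    def reset_policy(self, _policy: Optional[CountingPolicy] = None) -> None:
        # Swap in the new policy if given, otherwise keep the old one.
        if _policy is not None:
            self._policy = _policy
        # Assumption: the active policy is reset in both cases.
        self._policy.reset()

    def reset(self, _policy: Optional[CountingPolicy] = None, _env: Optional[Any] = None) -> None:
        self.reset_env(_env)
        self.reset_policy(_policy)


old_policy = CountingPolicy("old")
new_policy = CountingPolicy("new")
evaluator = ResettableEvaluator(env="loader_a", policy=old_policy)

evaluator.reset_policy()              # no argument: the old policy is reset
evaluator.reset_env("loader_b")       # environment replaced, policy untouched
evaluator.reset(_policy=new_policy)   # policy replaced, environment kept
```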
close()
Overview
Close the evaluator. If end_flag is False, close the environment, then flush and close the tb_logger.
__del__()
Overview
Execute the close command and close the evaluator. __del__ is called automatically to destroy the evaluator instance when the evaluator finishes its work.
should_eval(train_iter)
Overview
Determine whether evaluation should start. Return True if the number of training iterations has reached the point at which the evaluator is due to run.
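One way such a check could work is an interval test against the last evaluated iteration. The sketch below is illustrative only: EvalScheduler, eval_freq and _last_eval_iter are assumptions introduced here, and the page above does not confirm how the real evaluator decides.

```python
class EvalScheduler:
    """Hypothetical helper: evaluate at most once every eval_freq iterations."""

    def __init__(self, eval_freq: int) -> None:
        self._eval_freq = eval_freq      # assumed evaluation interval
        self._last_eval_iter = -1        # no evaluation has happened yet

    def should_eval(self, train_iter: int) -> bool:
        # Never evaluate the same iteration twice.
        if train_iter == self._last_eval_iter:
            return False
        # Too soon since the last evaluation, except for the very first iteration.
        if train_iter - self._last_eval_iter < self._eval_freq and train_iter != 0:
            return False
        self._last_eval_iter = train_iter
        return True


scheduler = EvalScheduler(eval_freq=100)
decisions = [scheduler.should_eval(i) for i in (0, 50, 100, 100, 150, 200)]
# decisions == [True, False, True, False, False, True]
```

In this sketch iteration 0 always triggers an evaluation, which gives a baseline before any training has happened.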
eval(save_ckpt_fn=None, train_iter=-1, envstep=-1)
Overview
Evaluate the policy and store the best policy, i.e. the one that reaches the highest historical reward.
Arguments:
- save_ckpt_fn (:obj:Callable): Function that saves a checkpoint; it is triggered when a new best reward is reached.
- train_iter (:obj:int): Current training iteration.
- envstep (:obj:int): Current env interaction step.
Returns:
- stop_flag (:obj:bool): Whether this training program can be ended.
- eval_metric (:obj:float): Current evaluation metric result.
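The call pattern of eval can be illustrated with a toy stand-in. ToyEvaluator, metric_fn, stop_value and the string passed to save_ckpt_fn are all hypothetical; the sketch only reproduces the documented contract, namely that save_ckpt_fn fires on a new best result and that the call returns (stop_flag, eval_metric).

```python
from typing import Callable, List, Optional, Tuple


class ToyEvaluator:
    """Hypothetical stand-in that mimics the documented eval() contract."""

    def __init__(self, metric_fn: Callable[[], float], stop_value: float) -> None:
        self._metric_fn = metric_fn      # assumed source of the evaluation result
        self._stop_value = stop_value    # assumed threshold for ending training
        self._best: Optional[float] = None

    def eval(
        self,
        save_ckpt_fn: Optional[Callable[[str], None]] = None,
        train_iter: int = -1,
        envstep: int = -1,
    ) -> Tuple[bool, float]:
        eval_metric = self._metric_fn()
        # A new best result triggers the checkpoint callback.
        if self._best is None or eval_metric >= self._best:
            self._best = eval_metric
            if save_ckpt_fn is not None:
                save_ckpt_fn(f"best_iter_{train_iter}")
        # Assumption: training may stop once the metric reaches stop_value.
        stop_flag = eval_metric >= self._stop_value
        return stop_flag, eval_metric


saved: List[str] = []
scores = iter([0.6, 0.5, 0.9])
evaluator = ToyEvaluator(metric_fn=lambda: next(scores), stop_value=0.85)
history = [evaluator.eval(save_ckpt_fn=saved.append, train_iter=i) for i in (0, 100, 200)]
```

Here the second evaluation (0.5) is worse than the first, so no checkpoint is saved for it, and only the third evaluation sets stop_flag.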
Full Source Code
../ding/worker/collector/metric_serial_evaluator.py