_evaluator
Module for evaluating the performance of a solver.
This module provides functions to evaluate the performance of a solver on a dataset. It includes functions to evaluate the search efficiency and temporal performance of a solver.
get_dataset
def get_dataset(filename="deepcubea-dataset--cube3.json")
Get a dataset from a file or download it if it doesn't exist.
Arguments:
filename
str - The filename of the dataset.
Returns:
dict
- The dataset.
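The load-or-download behaviour described above can be illustrated with a minimal caching sketch. This is not the module's implementation: load_or_fetch, fake_fetch, and the "scrambles" key are hypothetical names used only for illustration, and a stand-in callable replaces the real download.

```python
import json
import os
import tempfile


def load_or_fetch(path, fetch):
    """Return the JSON dataset at `path`, calling `fetch()` to create it if missing.

    `fetch` is a hypothetical zero-argument callable that returns a dict;
    in a real setting it would download the dataset.
    """
    if not os.path.exists(path):
        data = fetch()
        with open(path, "w") as f:
            json.dump(data, f)
    with open(path) as f:
        return json.load(f)


# Demo with a stand-in fetcher instead of a network download.
calls = []


def fake_fetch():
    calls.append(1)  # record each "download"
    return {"scrambles": [[0, 1, 2]]}  # hypothetical dataset layout


demo_path = os.path.join(tempfile.mkdtemp(), "demo-dataset.json")
first = load_or_fetch(demo_path, fake_fetch)   # file missing: fetch and cache
second = load_or_fetch(demo_path, fake_fetch)  # file present: read from disk
```

The second call reads the cached file, so the fetcher runs only once.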
evaluate_search_efficiency
def evaluate_search_efficiency(solver, num_samples=1000, beam_width=2**10 if device.type == "cpu" else 2**13, verbose=False)
Evaluate the model's search efficiency.
This function evaluates the model's performance by solving a set of scrambles with the given beam width and averaging the results over the samples.
Arguments:
solver
- The solver to evaluate.
num_samples
int - The number of samples to use for evaluation.
beam_width
int - The beam width to use for evaluation.
verbose
bool - Whether to display a progress bar.
Returns:
dict
- The mean results.
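As a rough illustration of what "mean results" could look like, the sketch below averages per-scramble metrics over a set of scrambles. The keys of the returned dict ("length", "nodes", "time"), the mean_results helper, and the toy_solve stand-in are all assumptions made for this example; the actual function's keys and solver interface are not documented here.

```python
import time
from statistics import mean


def mean_results(solve, scrambles):
    """Solve each scramble and average the per-scramble metrics.

    `solve` is a hypothetical callable returning (solution, num_nodes).
    """
    lengths, nodes, times = [], [], []
    for scramble in scrambles:
        start = time.perf_counter()
        solution, num_nodes = solve(scramble)
        times.append(time.perf_counter() - start)
        lengths.append(len(solution))
        nodes.append(num_nodes)
    return {"length": mean(lengths), "nodes": mean(nodes), "time": mean(times)}


def toy_solve(scramble):
    # Stand-in solver: "solves" by reversing the scramble and
    # reports a made-up node count proportional to its length.
    return scramble[::-1], 10 * len(scramble)


summary = mean_results(toy_solve, [["U", "R"], ["F", "R", "U", "L"]])
```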
evaluate_temporal_performance
def evaluate_temporal_performance(solver, num_samples=1000, t_standard=1.0, beam_width_space=2 ** np.arange(6 if device.type == "cpu" else 10, 16 + 1), verbose=False)
Evaluate the model's temporal performance.
This function evaluates the model's performance by solving a set of scrambles using different beam widths. It then fits a predictor to model the relationship between solution length and time, and predicts the solution length at the standard time t_standard (t=1 second by default).
Arguments:
solver
- The solver to evaluate.
num_samples
int - The number of samples to use for evaluation.
t_standard
float - The standard time, in seconds, at which the solution length is predicted.
beam_width_space
array - The beam widths to use for evaluation.
verbose
bool - Whether to display a progress bar.
Returns:
float
- The predicted solution length at t=t_standard (t=1 by default).
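The fit-then-predict step can be sketched as follows. The form of the predictor is not specified above, so this example assumes a log-linear model (length = a * log(time) + b) fitted by least squares with np.polyfit; the helper predict_length_at and the synthetic measurements are hypothetical and stand in for timings collected across the beam widths.

```python
import numpy as np


def predict_length_at(times, lengths, t_standard=1.0):
    """Fit length ~ a * log(time) + b by least squares; evaluate at t_standard.

    The log-linear form is an assumption made for this sketch.
    """
    slope, intercept = np.polyfit(np.log(times), lengths, deg=1)
    return float(slope * np.log(t_standard) + intercept)


# Default CPU beam widths from the signature: 64, 128, ..., 65536.
beam_width_space = 2 ** np.arange(6, 16 + 1)

# Synthetic (time, length) measurements that follow the assumed model exactly.
times = np.array([0.1, 0.3, 1.0, 3.0, 10.0])
lengths = 24.0 - 1.5 * np.log(times)

predicted = predict_length_at(times, lengths, t_standard=1.0)
```

Because log(1) = 0, the prediction at t_standard=1.0 reduces to the fitted intercept, which is one reason a one-second reference point is convenient under this assumed model.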