_evaluator
Module for evaluating the performance of a solver.
This module provides functions to evaluate the performance of a solver on a dataset. It includes functions to evaluate the search efficiency and temporal performance of a solver.
NOTE: The functionality in this module requires optional dependencies. To use them,
please install AlphaCube with the 'eval' extra: pip install 'alphacube[eval]'
get_dataset
def get_dataset(
filename="deepcubea-dataset--cube3.json",
cache_dir=os.path.expanduser("~/.cache/alphacube")
)
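Load a dataset from a file in the cache directory, downloading it first if the file doesn't exist.
Arguments:
filename
str - The filename of the dataset.
cache_dir
str - The directory in which the dataset file is cached.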
Returns:
dict
- The dataset.
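The load-or-download behavior described above can be sketched as follows. This is a minimal illustration and not AlphaCube's implementation: `load_or_fetch`, the `fetch` callable, and the demo data are hypothetical names introduced here, and the real download source is not shown.

```python
import json
import os
import tempfile


def load_or_fetch(filename, cache_dir, fetch):
    """Return the JSON dataset at cache_dir/filename, fetching it on a cache miss.

    `fetch` is a hypothetical callable that returns the dataset as a dict;
    it stands in for whatever download step the real function performs.
    """
    path = os.path.join(cache_dir, filename)
    if not os.path.exists(path):
        # Cache miss: create the directory and store the fetched dataset.
        os.makedirs(cache_dir, exist_ok=True)
        with open(path, "w") as f:
            json.dump(fetch(), f)
    with open(path) as f:
        return json.load(f)


# Demo with a temporary cache directory and a stand-in "download".
calls = []


def fake_fetch():
    calls.append(1)  # record how often the "download" runs
    return {"scrambles": [["U", "R"]]}


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_fetch("demo.json", tmp, fake_fetch)
    second = load_or_fetch("demo.json", tmp, fake_fetch)  # served from the cache
```

The second call finds the file already on disk, so the stand-in download runs only once.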
evaluate_search_efficiency
def evaluate_search_efficiency(
solver,
num_samples=1000,
beam_width=2**10 if device.type == "cpu" else 2**13,
verbose=False
)
Evaluate the model's search efficiency. (Also available as solver.benchmark.)
This function solves a set of scrambles and reports on key performance metrics, providing a snapshot of the solver's efficiency under specific conditions.
Arguments:
solver
- The solver instance to evaluate.
num_samples
int - The number of scrambles to solve for the evaluation.
beam_width
int - The beam width to use for the search.
verbose
bool - Whether to display a progress bar.
Returns:
dict
- A dictionary containing the mean results for solve time (t), solution length (lmd), and nodes expanded (nodes).
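The shape of the returned dictionary can be illustrated with a small aggregation sketch. It assumes each solve yields a solution and a node count, and it averages the per-scramble values under the keys t, lmd, and nodes. `summarize`, `solve_one`, and `toy_solver` are hypothetical names, not part of AlphaCube, and the toy solver does not actually solve anything.

```python
import time
from statistics import mean


def summarize(solve_one, scrambles):
    """Run solve_one on each scramble and average the per-solve metrics.

    `solve_one` is a hypothetical callable returning
    (solution_moves, nodes_expanded) for a single scramble.
    """
    records = []
    for scramble in scrambles:
        start = time.perf_counter()
        solution, nodes = solve_one(scramble)
        elapsed = time.perf_counter() - start
        records.append({"t": elapsed, "lmd": len(solution), "nodes": nodes})
    # Mean of each metric across all scrambles.
    return {key: mean(r[key] for r in records) for key in ("t", "lmd", "nodes")}


def toy_solver(scramble):
    """Stand-in solver: returns the reversed move list and one node per move."""
    return list(reversed(scramble)), len(scramble)


result = summarize(toy_solver, [["U", "R"], ["F", "D", "L", "B"]])
```

With the two toy scrambles above, the mean solution length and the mean node count are both 3, while the mean time depends on the machine.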
evaluate_temporal_performance
def evaluate_temporal_performance(
solver,
num_samples=1000,
t_standard=1.0,
beam_width_space=2 ** np.arange(6 if device.type == "cpu" else 10, 16 + 1),
verbose=False
)
Evaluate the model's downstream temporal performance.
This function solves a set of scrambles at each of several beam widths. It then fits a predictor to model the relationship between solution length and solve time, and predicts the solution length at t=t_standard (t=1 by default).
Arguments:
solver
- The solver to evaluate.
num_samples
int - The number of samples to use for evaluation.
t_standard
float - The standard time at which the solution length is predicted.
beam_width_space
array - The beam widths to use for evaluation.
verbose
bool - Whether to display a progress bar.
Returns:
float
- The predicted solution length at t=1.
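The fit-then-predict step can be sketched as below. The form of the predictor is not specified here, so this example assumes, purely for illustration, that solution length is linear in the logarithm of solve time. The measurements are synthetic, and `predict_length` is a hypothetical helper rather than an AlphaCube function.

```python
import numpy as np

# Beam widths matching the CPU default in the signature above: 2**6 through 2**16.
beam_widths = 2 ** np.arange(6, 16 + 1)

# Synthetic (made-up) measurements: wider beams take longer but find shorter solutions.
t = 1e-3 * beam_widths          # mean solve time per beam width
lmd = 24.0 - 0.5 * np.log(t)    # mean solution length per beam width


def predict_length(t, lmd, t_standard=1.0):
    """Fit lmd ~ a * log(t) + b by least squares and evaluate it at t_standard.

    The log-linear model is an assumption made for this sketch.
    """
    a, b = np.polyfit(np.log(t), lmd, deg=1)
    return float(a * np.log(t_standard) + b)


predicted = predict_length(t, lmd)
```

Because the synthetic data follow the assumed model exactly, the prediction at t_standard=1.0 recovers the intercept of 24.0 up to floating-point error. Real measurements would be noisy, and a different model might fit them better.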