_evaluator

Module for evaluating the performance of a solver.

This module provides functions to evaluate a solver on a dataset, covering both its search efficiency and its temporal performance.

NOTE: The functionality in this module requires optional dependencies. To use them, please install AlphaCube with the 'eval' extra: pip install 'alphacube[eval]'

`get_dataset`

def get_dataset(
    filename="deepcubea-dataset--cube3.json",
    cache_dir=os.path.expanduser("~/.cache/alphacube")
)

Load a dataset from a file, downloading it first if the file doesn't exist.

Arguments:

filename str - The filename of the dataset.
cache_dir str - The directory in which the dataset file is cached.

Returns:

dict - The dataset.
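
The "load from a file, otherwise download" behavior described above can be pictured with a minimal sketch. This is not AlphaCube's implementation: `load_or_fetch_json` and the `fetch` callable are hypothetical names, and the download step is replaced by a stand-in function so the example runs offline.

```python
import json
import os
import tempfile


def load_or_fetch_json(filename, cache_dir, fetch):
    """Return the parsed JSON at cache_dir/filename, calling fetch() on a cache miss.

    `fetch` is a hypothetical callable that returns the raw JSON text; a real
    implementation would download the file instead.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, filename)
    if not os.path.exists(path):
        # Cache miss: obtain the payload once and store it for later calls.
        with open(path, "w", encoding="utf-8") as f:
            f.write(fetch())
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Demo with a stand-in fetcher that records how often it is invoked.
calls = []


def fake_fetch():
    calls.append(1)
    return json.dumps({"example": [1, 2, 3]})  # placeholder content, not the real dataset


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_fetch_json("demo.json", tmp, fake_fetch)
    second = load_or_fetch_json("demo.json", tmp, fake_fetch)  # served from the cached file
```

The second call reads the file written by the first, so the fetcher runs only once.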

`evaluate_search_efficiency`

def evaluate_search_efficiency(
    solver,
    num_samples=1000,
    beam_width=2**10 if device.type == "cpu" else 2**13,
    verbose=False
)

Evaluate the model's search efficiency. (Also available as solver.benchmark)

This function solves a set of scrambles and reports on key performance metrics, providing a snapshot of the solver's efficiency under specific conditions.

Arguments:

solver - The solver instance to evaluate.
num_samples int - The number of scrambles to solve for the evaluation.
beam_width int - The beam width to use for the search.
verbose bool - Whether to display a progress bar.

Returns:

dict - A dictionary containing the mean results for solve time (t), solution length (lmd), and nodes expanded (nodes).
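
As an illustration of the shape of the returned dictionary, the sketch below averages per-scramble measurements into the three keys named above. It does not call AlphaCube: `summarize_runs`, `solve`, and `toy_solve` are hypothetical stand-ins, and the real function may collect its measurements differently.

```python
import time
from statistics import mean


def summarize_runs(solve, scrambles):
    """Average solve time, solution length, and node count over the scrambles.

    `solve` is a hypothetical callable returning (solution_moves, nodes_expanded).
    """
    records = []
    for scramble in scrambles:
        start = time.perf_counter()
        moves, nodes = solve(scramble)
        elapsed = time.perf_counter() - start
        records.append({"t": elapsed, "lmd": len(moves), "nodes": nodes})
    # Reduce the per-scramble records to one mean per metric.
    return {key: mean(r[key] for r in records) for key in ("t", "lmd", "nodes")}


def toy_solve(scramble):
    # Stand-in solver: reverses the scramble and reports 10 nodes per move.
    return list(reversed(scramble)), 10 * len(scramble)


summary = summarize_runs(toy_solve, [["U", "R"], ["F", "U", "R", "L"]])
```

Here `summary` has the keys `t`, `lmd`, and `nodes`, each holding a mean over the two toy scrambles.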

`evaluate_temporal_performance`

def evaluate_temporal_performance(
    solver,
    num_samples=1000,
    t_standard=1.0,
    beam_width_space=2 ** np.arange(6 if device.type == "cpu" else 10, 16 + 1),
    verbose=False
)

Evaluate the model's temporal performance.

This function evaluates the model's performance by solving a set of scrambles using different beam widths. It then fits a predictor to model the relationship between solution length and time, and predicts the solution length at t = t_standard (1.0 by default).

Arguments:

solver - The solver to evaluate.
num_samples int - The number of samples to use for evaluation.
t_standard float - The reference time at which the solution length is predicted.
beam_width_space array - The beam widths to use for evaluation.
verbose bool - Whether to display a progress bar.

Returns:

float - The predicted solution length at t = t_standard.
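
The "fit, then predict at a standard time" idea can be sketched as follows. The predictor that AlphaCube actually fits is not specified here, so the log-linear model below is an assumption chosen for illustration, `predict_length_at` is a hypothetical name, and the measurements are made up.

```python
import math


def predict_length_at(times, lengths, t_standard=1.0):
    """Fit length = a + b * log(time) by least squares and evaluate it at t_standard.

    The log-linear form is an assumption for this sketch, not the library's model.
    """
    xs = [math.log(t) for t in times]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(lengths) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, lengths))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept + slope * math.log(t_standard)


# Made-up measurements: one (mean solve time, mean solution length) pair per beam width.
# Wider beams take longer but find shorter solutions.
mean_times = [0.1, 0.2, 0.4, 0.8, 1.6]
mean_lengths = [26.0, 25.0, 24.0, 23.0, 22.0]

predicted = predict_length_at(mean_times, mean_lengths, t_standard=1.0)
```

Because t = 1.0 falls between the measured times 0.8 and 1.6, the prediction lands between the corresponding lengths. Reducing a whole sweep of beam widths to a single number in this way makes solvers comparable at equal time budgets.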
