_evaluator

Module for evaluating the performance of a solver.

This module provides functions to evaluate the performance of a solver on a dataset. It includes functions to evaluate the search efficiency and temporal performance of a solver.

NOTE: The functionality in this module requires optional dependencies. To use them, please install AlphaCube with the 'eval' extra: pip install 'alphacube[eval]'

get_dataset

def get_dataset(
    filename="deepcubea-dataset--cube3.json",
    cache_dir=os.path.expanduser("~/.cache/alphacube")
)

Load a dataset from the cache directory, downloading it first if the file does not exist there.

Arguments:

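  • filename str - The filename of the dataset.
  • cache_dir str - The directory in which the dataset file is cached (defaults to ~/.cache/alphacube).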

Returns:

  • dict - The dataset.
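The load-or-download behaviour described above can be sketched as follows. This is a minimal illustration, not AlphaCube's implementation: load_or_fetch_json, the fetch callable, and the "scrambles" key in the demo data are all hypothetical names chosen for the example, and the fetcher is faked so that nothing is downloaded.

```python
import json
import os
import tempfile


def load_or_fetch_json(filename, cache_dir, fetch):
    """Return the parsed JSON at cache_dir/filename, calling fetch() only on a cache miss.

    `fetch` is a hypothetical zero-argument callable that returns the file's text;
    in a real setting it would perform the download.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, filename)
    if not os.path.exists(path):
        # Cache miss: obtain the content and store it for later calls.
        with open(path, "w", encoding="utf-8") as f:
            f.write(fetch())
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Demo with a fake fetcher and placeholder data (not the real dataset schema).
calls = []


def fake_fetch():
    calls.append(1)
    return json.dumps({"scrambles": [[0, 1, 2]]})


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_fetch_json("demo.json", tmp, fake_fetch)
    second = load_or_fetch_json("demo.json", tmp, fake_fetch)  # served from the cache
```

The second call reads the cached file, so the fetcher runs only once.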

evaluate_search_efficiency

def evaluate_search_efficiency(
    solver,
    num_samples=1000,
    beam_width=2**10 if device.type == "cpu" else 2**13,
    verbose=False
)

Evaluate the model's search efficiency. (Also available as solver.benchmark)

This function solves a set of scrambles and reports on key performance metrics, providing a snapshot of the solver's efficiency under specific conditions.

Arguments:

  • solver - The solver instance to evaluate.
  • num_samples int - The number of scrambles to solve for the evaluation.
  • beam_width int - The beam width to use for the search.
  • verbose bool - Whether to display a progress bar.

Returns:

  • dict - A dictionary containing the mean results for solve time (t), solution length (lmd), and nodes expanded (nodes).
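To make the shape of the returned dictionary concrete, here is a rough sketch of how per-scramble measurements could be averaged into the t, lmd, and nodes keys. It assumes a hypothetical solve callable that returns a solution and a node count; summarize_runs and toy_solve are illustrative stand-ins, not part of AlphaCube.

```python
import time
from statistics import mean


def summarize_runs(solve, scrambles, beam_width):
    """Solve each scramble and return the mean of t, lmd, and nodes.

    `solve` is a hypothetical callable returning (solution_moves, nodes_expanded).
    """
    records = []
    for scramble in scrambles:
        start = time.perf_counter()
        solution, nodes = solve(scramble, beam_width)
        elapsed = time.perf_counter() - start
        records.append({"t": elapsed, "lmd": len(solution), "nodes": nodes})
    # Average each metric over all solved scrambles.
    return {key: mean(r[key] for r in records) for key in ("t", "lmd", "nodes")}


def toy_solve(scramble, beam_width):
    # Stand-in "solver": returns the reversed scramble (not a real cube solver)
    # and a made-up node count.
    return list(reversed(scramble)), beam_width * len(scramble)


result = summarize_runs(toy_solve, [["R", "U"], ["F", "R", "U", "L"]], beam_width=4)
```

With the toy inputs, the mean solution length is 3 and the mean node count is 12; the mean time depends on the machine.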

evaluate_temporal_performance

def evaluate_temporal_performance(
    solver,
    num_samples=1000,
    t_standard=1.0,
    beam_width_space=2 ** np.arange(6 if device.type == "cpu" else 10, 16 + 1),
    verbose=False
)

Evaluate the model's temporal performance, i.e. the solution length it is predicted to reach at a standard solve time.

This function solves a set of scrambles at each of several beam widths. It then fits a predictor to model the relationship between solution length and solve time, and uses it to predict the solution length at t = t_standard (1.0 by default).

Arguments:

  • solver - The solver to evaluate.
  • num_samples int - The number of samples to use for evaluation.
  • t_standard float - The standard time at which the solution length is predicted.
  • beam_width_space array - The beam widths to use for evaluation.
  • verbose bool - Whether to display a progress bar.
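For reference, the default beam_width_space in the signature is a run of powers of two. The snippet below expands it in plain Python instead of NumPy; the variable names are illustrative only.

```python
# Exponents run from 6 (on CPU) or 10 (otherwise) up to and including 16.
cpu_beam_widths = [2 ** k for k in range(6, 16 + 1)]
other_beam_widths = [2 ** k for k in range(10, 16 + 1)]

print(cpu_beam_widths)    # 11 values, from 64 to 65536
print(other_beam_widths)  # 7 values, from 1024 to 65536
```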

Returns:

  • float - The predicted solution length at t = t_standard.
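The fit-then-predict step can be illustrated with a small sketch. The form of the predictor is not specified here, so this example assumes an ordinary least-squares straight line purely for illustration; fit_line and the data points are made up and are not taken from AlphaCube.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up (mean solve time, mean solution length) pairs, one per beam width.
times = [0.5, 1.5, 2.5]
lengths = [25.0, 23.0, 21.0]

slope, intercept = fit_line(times, lengths)

# Read the fitted line off at the standard time.
t_standard = 1.0
predicted_length = slope * t_standard + intercept
print(predicted_length)  # 24.0 for this toy data
```

Wider beams typically take longer and find shorter solutions, so evaluating the fit at one fixed time gives a single number that can be compared across solvers or hardware.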