To evaluate your own model on the benchmark, we provide two notebooks showing how to do so with TensorFlow or PyTorch.

Alternatively, you can call the API directly:

from harmonization.common import load_clickme_val
from harmonization.evaluation import evaluate_clickme

clickme_dataset = load_clickme_val(batch_size = 128)

scores = evaluate_clickme(model = model, # tensorflow or pytorch model
                          clickme_val_dataset = clickme_dataset)

If you are using a PyTorch model, you need to specify an explainer function (see the PyTorch notebook).
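
To give a rough idea of what an explainer function does, here is a minimal sketch of a gradient-based saliency explainer in PyTorch. The name `gradient_saliency`, its `(model, images, labels)` signature, the integer-label format, and the `(N, H, W)` output shape are all assumptions made for illustration; the signature that `evaluate_clickme` actually expects is the one shown in the PyTorch notebook and may differ.

```python
import torch
import torch.nn as nn


def gradient_saliency(model, images, labels):
    """Hypothetical explainer: absolute input gradient of the labelled logit.

    images: float tensor of shape (N, C, H, W)
    labels: long tensor of shape (N,) holding class indices (assumed format)
    returns: tensor of shape (N, H, W), one saliency map per image
    """
    model.eval()
    # Work on a detached copy so the caller's tensor is left untouched.
    images = images.clone().detach().requires_grad_(True)
    logits = model(images)  # (N, num_classes)
    # Pick the logit of each image's label and sum them into a scalar.
    selected = logits.gather(1, labels.view(-1, 1)).sum()
    # Gradient of that scalar with respect to the input pixels.
    (grads,) = torch.autograd.grad(selected, images)
    # Collapse the channel axis by taking the largest absolute gradient.
    return grads.abs().amax(dim=1)


# Toy usage with a stand-in model (not a benchmark model).
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 5))
toy_images = torch.rand(4, 3, 8, 8)
toy_labels = torch.tensor([0, 1, 2, 3])
toy_maps = gradient_saliency(toy_model, toy_images, toy_labels)
```

Plain input gradients are only one possible choice of attribution method; any function that maps a batch of images and labels to per-pixel importance maps plays the same role.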