Scoring attributions using QuAC
At this point, we will have generated a set of candidate regions using various heuristic methods.
Now, we will obtain the mask, counterfactual, and score for our final explanation using the run_evaluation.py
script.
It has the following arguments, all of which are optional:
dataset
: Which of the datasets to run the translation on. By default this is the "test" dataset; if that does not exist, it reverts to the "validation" dataset.

attrs
: Where the attributions are. You should only set this if you used a custom output argument in the script above. By default, this is in the experiment root directory under attributions.

input_fake
: Where the generated images are. You should only set this if you used a custom output argument in the image generation step. By default, this is in the experiment root directory under generated_images.

names
: A selection of attribution methods to run evaluation on. This is a useful argument if you want to run evaluation on each method in parallel, e.g. on a cluster (see the example command after this list). By default, we sequentially run evaluation on all the methods in the attrs directory.
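For example, to score only the discriminative_ig attributions on the validation set, the call could look like the line below. The exact flag spelling is an assumption based on the argument names above, so confirm it with the help command further down.

python run_evaluation.py --dataset validation --names discriminative_ig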
To run using the defaults, simply run:
python run_evaluation.py
You can use the following command to get help with formatting your arguments:
python run_evaluation.py -h
Output
Here is the output organization that you should expect at this point.
<solver.root_dir>/
├── checkpoints/
├── generated_images/
├── attributions/
├── counterfactuals/
│   ├── discriminative_deeplift/
│   │   └── class_A/class_B/... # image files
│   └── discriminative_ig/
│       └── class_A/class_B/...
├── masks/
│   ├── discriminative_deeplift/
│   │   └── class_A/class_B/... # numpy files
│   └── discriminative_ig/
│       └── class_A/class_B/...
└── reports/
    ├── discriminative_deeplift/
    │   └── default.json
    └── discriminative_ig/
        └── default.json
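If you want to inspect these outputs programmatically, the short sketch below shows one way to do it. It assumes only what the tree above shows: reports are JSON files and masks are NumPy files organized by class pair. The root path, the chosen method name, and the .npy extension are placeholders and assumptions, not guarantees of the tool.

import json
from pathlib import Path

import numpy as np

# Placeholders: point root at your own solver.root_dir and pick any
# method directory that exists under reports/.
root = Path("path/to/solver.root_dir")
method = "discriminative_ig"

# Load the evaluation report for one attribution method.
with open(root / "reports" / method / "default.json") as f:
    report = json.load(f)
print(type(report))  # inspect the structure before relying on specific keys

# Load one of the saved masks. Masks are numpy files organized by class
# pair (class_A/class_B/...); the .npy extension is an assumption.
mask_files = sorted((root / "masks" / method).rglob("*.npy"))
if mask_files:
    mask = np.load(mask_files[0])
    print(mask_files[0], mask.shape)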