probe
Data Loader
- class diagnnose.probe.data_loader.DataLoader(corpus: Corpus, activations_dir: Optional[str] = None, model: Optional[LanguageModel] = None, activation_names: Optional[List[Tuple[int, str]]] = None, train_selection_func: Optional[Callable[[int, Example], bool]] = None, test_activations_dir: Optional[str] = None, test_corpus: Optional[Corpus] = None, test_selection_func: Optional[Callable[[int, Example], bool]] = None, control_task: Optional[Callable[[int, Example], Union[str, int]]] = None, train_test_ratio: Optional[float] = None, create_new_activations: bool = False)
Bases: object
Reads in pickled activations that have been extracted, and creates a train/test split of activations and labels.
The train/test split can be created in multiple ways:
1. Using the test activations from a different corpus.
2. Using the activations from the same corpus for both, with the train/test split defined by 2 separate selection_funcs.
3. Based on a random 90/10 split.
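The second and third options can be illustrated with a minimal, library-independent sketch. It is not the DataLoader implementation: `split_indices`, the `seed` argument, and the fallback used when only a test selection function is given are assumptions made for illustration. Only the `(int, item) -> bool` shape of the selection functions is taken from the signature above.

```python
import random
from typing import Any, Callable, List, Optional, Sequence, Tuple

# Same shape as the selection functions in the signature above:
# an integer plus the corpus item, returning True to keep the item.
SelectFunc = Callable[[int, Any], bool]


def split_indices(
    items: Sequence[Any],
    train_selection_func: Optional[SelectFunc] = None,
    test_selection_func: Optional[SelectFunc] = None,
    train_test_ratio: Optional[float] = None,
    seed: int = 0,  # hypothetical: only here to make the sketch reproducible
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) for a list of corpus items."""
    indices = list(range(len(items)))

    if test_selection_func is not None:
        # Option 2: the split is defined by selection functions.
        test = [i for i in indices if test_selection_func(i, items[i])]
        if train_selection_func is None:
            # Assumption of this sketch: train on everything not selected for testing.
            held_out = set(test)
            train = [i for i in indices if i not in held_out]
        else:
            train = [i for i in indices if train_selection_func(i, items[i])]
        return train, test

    # Option 3: random split, which needs an explicit ratio.
    if train_test_ratio is None:
        raise ValueError("train_test_ratio is required without a test_selection_func")
    random.Random(seed).shuffle(indices)
    cut = round(len(indices) * train_test_ratio)
    return sorted(indices[:cut]), sorted(indices[cut:])
```

With `train_test_ratio=0.9` and ten items, the random branch yields nine training indices and one test index. Option 1 needs no splitting logic, because the test data comes from a separate corpus.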
- Parameters
corpus (Corpus) – Corpus containing the labels for each sentence.
activations_dir (str, optional) – Directory containing the extracted activations. If not provided, new activations will be extracted. If create_new_activations is set to True, the newly extracted activations will be stored in this directory.
test_activations_dir (str, optional) – Directory containing the extracted test activations. If not provided, the train activation set will be split and partially used as test set.
test_corpus (Corpus, optional) – Corpus containing the test labels for each sentence. Must be provided if test_activations_dir is provided.
train_selection_func (SelectFunc, optional) – Selection function that determines whether a corpus item should be taken into account for training. If not provided all extracted activations will be used and split into a random train/test split.
test_selection_func (SelectFunc, optional) – Selection function that determines whether a corpus item should be taken into account for testing. If not provided all extracted activations will be used and split into a random train/test split.
control_task (ControlTask, optional) – Control task function of Hewitt and Liang (2019), mapping a corpus item to a random label.
train_test_ratio (float, optional) – Ratio of the train/test split. If separate test activations are provided this split won’t be used. Defaults to None, but must be provided if no test_selection_func is passed.
create_new_activations (bool, optional) – Toggle to create new activations based on corpus and model. Overwrites existing activations that might be present in activations_dir.
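A control task function, as described for the control_task parameter, could look roughly like the sketch below. It is an illustration under stated assumptions, not the library's own helper: `make_control_task`, its `key` argument, and the `word` attribute on the item are hypothetical names. The only properties taken from the description above are the `(int, item) -> label` shape and the mapping to a random label. The sketch keeps the label fixed per key, so the same item type always receives the same random label.

```python
import random
from types import SimpleNamespace
from typing import Any, Callable, Dict, Hashable


def make_control_task(
    n_labels: int,
    key: Callable[[int, Any], Hashable],  # hypothetical: what identifies an item "type"
    seed: int = 0,
) -> Callable[[int, Any], int]:
    """Build a function that assigns a fixed random label to each distinct key."""
    rng = random.Random(seed)
    label_by_key: Dict[Hashable, int] = {}

    def control_task(idx: int, item: Any) -> int:
        k = key(idx, item)
        if k not in label_by_key:
            # First time this key is seen: draw its label once and remember it.
            label_by_key[k] = rng.randrange(n_labels)
        return label_by_key[k]

    return control_task


# Usage with stand-in items; `word` is a made-up attribute for this example.
task = make_control_task(n_labels=5, key=lambda idx, item: item.word)
first = task(0, SimpleNamespace(word="the"))
again = task(7, SimpleNamespace(word="the"))  # same key, so the same label as `first`
```

Because the labels are unrelated to the real task, a probe that scores well on them is more likely memorizing than reading off structure from the activations.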
- load(activation_name: Tuple[int, str]) -> DataDict
Creates a train/test data split of the activations.
- Parameters
activation_name (ActivationName) – (layer, name) tuple indicating the activations to be read in
- Returns
data_dict – Dictionary containing train and test activations, and their corresponding labels and, optionally, control labels.
- Return type
DataDict
DC Trainer
- class diagnnose.probe.dc_trainer.DCTrainer(data_loader: DataLoader, save_dir: str, lr: float = 0.01, max_epochs: int = 10, rank: Optional[int] = None, lambda1: float = 0.0, verbose: int = 0)
Bases: object
Trains Diagnostic Classifiers (DC) on extracted activation data.
For each activation that is part of the provided activation_names argument, a separate classifier will be trained.
- Parameters
data_loader (DataLoader) – DataLoader that contains the activations and labels on which the DCs will be trained. A DataLoader can contain activations for multiple layers and gates of a model, for which separate DCs will be trained and evaluated.
save_dir (str) – Directory to which trained models will be saved.
lr (float, optional) – Learning rate of the linear classifier that is used during training. Defaults to 0.01.
max_epochs (int, optional) – Maximum number of training epochs used for cross-validation. Defaults to 10.
rank (int, optional) – Matrix rank of the linear classifier. Defaults to the full rank if not provided.
lambda1 (float, optional) – Coefficient for L1 regularization that can be increased to induce sparsity in the diagnostic classifier. Defaults to 0.0, indicating no L1 regularization.
verbose (int, optional) – Set to any positive number for verbosity. Defaults to 0.
- classifier
Current classifier that is being trained.
- Type
Classifier
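The effect of the lambda1 parameter can be sketched in plain Python. The parameter list only says that lambda1 is the coefficient for L1 regularization. That the penalty is the coefficient times the sum of absolute weights, added to the training loss, is the standard formulation and is assumed here rather than taken from the library. `l1_penalty` and `regularized_loss` are hypothetical helper names.

```python
from typing import List

Matrix = List[List[float]]


def l1_penalty(weights: Matrix, lambda1: float) -> float:
    """lambda1 times the sum of absolute weight values (standard L1 penalty)."""
    return lambda1 * sum(abs(w) for row in weights for w in row)


def regularized_loss(base_loss: float, weights: Matrix, lambda1: float = 0.0) -> float:
    """Training loss plus the L1 term; lambda1 = 0.0 leaves the loss unchanged."""
    return base_loss + l1_penalty(weights, lambda1)


weights = [[1.0, -2.0], [0.0, 3.0]]
unregularized = regularized_loss(0.5, weights)             # no penalty by default
penalized = regularized_loss(0.5, weights, lambda1=0.1)    # adds 0.1 * (1 + 2 + 0 + 3)
```

A larger lambda1 makes every non-zero weight more costly, which pushes the optimizer towards classifiers that rely on fewer activation dimensions.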
Logreg
- class diagnnose.probe.logreg.L1NeuralNetClassifier(*args: Any, **kwargs: Any)
Bases: NeuralNetClassifier
- class diagnnose.probe.logreg.LogRegModule(ninp: int, nout: int, rank: Optional[int] = None)
Bases: Module
- forward(inp: Tensor, create_softmax=True)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
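The ninp, nout and rank arguments of LogRegModule suggest a linear classifier whose weight matrix can be restricted to a lower rank. The sketch below shows one common way to do that, by factorizing the weights into two smaller matrices. It is written in plain Python rather than as a PyTorch module, omits bias terms, and is an assumption about the general technique, not a reproduction of the library's code. In particular, whether create_softmax yields probabilities, as assumed here, or some other normalized output is not stated in this reference.

```python
import math
from typing import List, Optional

Matrix = List[List[float]]


def matvec(matrix: Matrix, vector: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) with a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def low_rank_forward(
    inp: List[float], down: Matrix, up: Matrix, create_softmax: bool = True
) -> List[float]:
    """Project ninp -> rank -> nout, optionally followed by a softmax."""
    hidden = matvec(down, inp)   # `down` has shape rank x ninp
    logits = matvec(up, hidden)  # `up` has shape nout x rank
    return softmax(logits) if create_softmax else logits


def n_weights(ninp: int, nout: int, rank: Optional[int] = None) -> int:
    """Weight count of the full-rank versus the factorized classifier."""
    return ninp * nout if rank is None else rank * (ninp + nout)


# A rank-1 classifier over 3 input features and 2 classes.
down = [[1.0, 0.0, 1.0]]
up = [[1.0], [-1.0]]
logits = low_rank_forward([1.0, 2.0, 3.0], down, up, create_softmax=False)
probs = low_rank_forward([1.0, 2.0, 3.0], down, up)
```

For 100 inputs and 10 classes, `n_weights(100, 10)` is 1000, while `n_weights(100, 10, rank=2)` is 220, which shows how a rank restriction limits the capacity of the probe.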