training_structures package

Submodules

training_structures.MCTN_Level2 module

Implements training pipeline for 2 Level MCTN.

training_structures.MCTN_Level2.single_test(model, testdata, max_seq_len=20)

Get accuracy for a single model and dataloader.

Parameters:
  • model (nn.Module) – MCTN2 Model

  • testdata (torch.utils.data.DataLoader) – Test Dataloader

  • max_seq_len (int, optional) – Maximum sequence length. Defaults to 20.

Returns:

Accuracy of the model on the test data.

training_structures.MCTN_Level2.test(model, test_dataloaders_all, dataset, method_name='My method', is_packed=False, criterion=CrossEntropyLoss(), task='classification', auprc=False, input_to_float=True, no_robust=True)

Test MCTN_Level2 Module on a set of test dataloaders.

Parameters:
  • model (nn.Module) – MCTN2 Module

  • test_dataloaders_all (list) – List of dataloaders

  • dataset (Dataset) – Dataset Name

  • method_name (str, optional) – Name of method. Defaults to ‘My method’.

  • is_packed (bool, optional) – (unused). Defaults to False.

  • criterion (nn.Module, optional) – (unused). Defaults to nn.CrossEntropyLoss().

  • task (str, optional) – (unused). Defaults to “classification”.

  • auprc (bool, optional) – (unused). Defaults to False.

  • input_to_float (bool, optional) – (unused). Defaults to True.

  • no_robust (bool, optional) – If True, skip the robustness transformations. Defaults to True.

training_structures.MCTN_Level2.train(traindata, validdata, encoder0, decoder0, encoder1, decoder1, reg_encoder, head, criterion_t0=MSELoss(), criterion_c=MSELoss(), criterion_t1=MSELoss(), criterion_r=L1Loss(), max_seq_len=20, mu_t0=0.01, mu_c=0.01, mu_t1=0.01, dropout_p=0.1, early_stop=False, patience_num=15, lr=0.0001, weight_decay=0.01, op_type=<class 'torch.optim.adamw.AdamW'>, epoch=100, model_save='best_mctn.pt', testdata=None)

Train a 2-level MCTN Instance

Parameters:
  • traindata (torch.utils.data.DataLoader) – Training data loader

  • validdata (torch.utils.data.DataLoader) – Validation data loader

  • encoder0 (nn.Module) – Encoder for first Seq2Seq Module

  • decoder0 (nn.Module) – Decoder for first Seq2Seq Module

  • encoder1 (nn.Module) – Encoder for second Seq2Seq Module

  • decoder1 (nn.Module) – Decoder for second Seq2Seq Module

  • reg_encoder (nn.Module) – Regularization encoder.

  • head (nn.Module) – Actual classifier.

  • criterion_t0 (nn.Module, optional) – Loss function for t0. Defaults to nn.MSELoss().

  • criterion_c (nn.Module, optional) – Loss function for c. Defaults to nn.MSELoss().

  • criterion_t1 (nn.Module, optional) – Loss function for t1. Defaults to nn.MSELoss().

  • criterion_r (nn.Module, optional) – Loss function for r. Defaults to nn.L1Loss().

  • max_seq_len (int, optional) – Maximum sequence length. Defaults to 20.

  • mu_t0 (float, optional) – mu_t0. Defaults to 0.01.

  • mu_c (float, optional) – mu_c. Defaults to 0.01.

  • mu_t1 (float, optional) – mu_t1. Defaults to 0.01.

  • dropout_p (float, optional) – Dropout Probability. Defaults to 0.1.

  • early_stop (bool, optional) – Whether to apply early stopping or not. Defaults to False.

  • patience_num (int, optional) – Patience Number for early stopping. Defaults to 15.

  • lr (float, optional) – Learning rate. Defaults to 1e-4.

  • weight_decay (float, optional) – Weight decay coefficient. Defaults to 0.01.

  • op_type (torch.optim.Optimizer, optional) – Optimizer class. Defaults to torch.optim.AdamW.

  • epoch (int, optional) – Number of epochs. Defaults to 100.

  • model_save (str, optional) – Path to save best model. Defaults to ‘best_mctn.pt’.

  • testdata (torch.utils.data.DataLoader, optional) – Data Loader for test data. Defaults to None.
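
The early_stop and patience_num parameters describe patience-based early stopping. The sketch below shows one common form of that rule in plain Python. It assumes that lower validation loss is better and that the counter resets on every improvement, neither of which is confirmed by this page. The function name epochs_run and the example loss values are hypothetical.

```python
def epochs_run(valid_losses, early_stop=False, patience_num=15):
    """Return how many epochs run before training stops.

    valid_losses: per-epoch validation losses (hypothetical input).
    Assumes lower is better and that patience counts consecutive
    epochs without a new best value.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best:
            best = loss        # new best: a checkpoint would be saved here
            bad_epochs = 0
        else:
            bad_epochs += 1
        if early_stop and bad_epochs >= patience_num:
            return epoch       # stop early
    return len(valid_losses)   # ran to completion


losses = [1.0, 0.8, 0.9, 0.85, 0.95, 0.7]
stopped_at = epochs_run(losses, early_stop=True, patience_num=3)
full_run = epochs_run(losses, early_stop=False)
```

With a patience of 3, the run stops after the third consecutive epoch that fails to beat the best loss, so the later improvement at the final epoch is never reached.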

training_structures.Supervised_Learning module

Implements supervised learning training procedures.

class training_structures.Supervised_Learning.MMDL(encoders, fusion, head, has_padding=False)

Bases: Module

Implements MMDL classifier.

__init__(encoders, fusion, head, has_padding=False)

Instantiate MMDL Module

Parameters:
  • encoders (List) – List of nn.Module encoders, one per modality.

  • fusion (nn.Module) – Fusion module

  • head (nn.Module) – Classifier module

  • has_padding (bool, optional) – Whether input has padding or not. Defaults to False.

forward(inputs)

Apply MMDL to Layer Input.

Parameters:

inputs (torch.Tensor) – Layer Input

Returns:

Layer Output

Return type:

torch.Tensor
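
The data flow implied by the constructor arguments (one encoder per modality, then fusion, then head) can be sketched without PyTorch. The example below uses plain Python callables as stand-ins for nn.Module instances. The name mmdl_forward and the toy encoders are hypothetical, and the real forward may handle padding and tensor shapes differently.

```python
def mmdl_forward(encoders, fusion, head, inputs):
    """Encode each modality, fuse the encodings, then classify."""
    # One encoder per modality, applied in modality order.
    encoded = [encoder(x) for encoder, x in zip(encoders, inputs)]
    fused = fusion(encoded)   # fusion receives the list of encodings
    return head(fused)


# Toy stand-ins (hypothetical): each "modality" is a list of floats.
encoders = [lambda x: [2 * v for v in x],   # modality 0: scale by 2
            lambda x: [v + 1 for v in x]]   # modality 1: shift by 1
fusion = lambda parts: [v for part in parts for v in part]  # concatenate
head = sum                                   # collapse to one score

output = mmdl_forward(encoders, fusion, head, [[1.0, 2.0], [3.0]])
```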

training: bool

training_structures.Supervised_Learning.deal_with_objective(objective, pred, truth, args)

Alter inputs depending on objective function, to deal with different objective arguments.
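
One way to picture this dispatch, given that train accepts either a standard two-argument loss or a custom objective taking prediction, ground truth, and an argument dictionary, is to call the objective with the argument list it expects. The sketch below is pure Python with stand-in losses. The names call_objective and takes_args and both toy losses are hypothetical, and the sketch does not reproduce the library's actual type checks or tensor reshaping.

```python
def call_objective(objective, pred, truth, args=None, takes_args=False):
    """Call objective with two or three arguments.

    takes_args is a hypothetical flag standing in for whatever test
    the library uses to recognise a custom objective.
    """
    if takes_args:
        return objective(pred, truth, args)
    return objective(pred, truth)


def squared_error(pred, truth):
    # Stand-in for a standard two-argument loss.
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred)


def weighted_error(pred, truth, args):
    # Stand-in for a custom objective that reads an argument dictionary.
    return args["weight"] * squared_error(pred, truth)


plain = call_objective(squared_error, [1.0, 3.0], [1.0, 1.0])
custom = call_objective(weighted_error, [1.0, 3.0], [1.0, 1.0],
                        args={"weight": 0.5}, takes_args=True)
```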

training_structures.Supervised_Learning.single_test(model, test_dataloader, is_packed=False, criterion=CrossEntropyLoss(), task='classification', auprc=False, input_to_float=True)

Run single test for model.

Parameters:
  • model (nn.Module) – Model to test

  • test_dataloader (torch.utils.data.DataLoader) – Test dataloader

  • is_packed (bool, optional) – Whether the input data is packed or not. Defaults to False.

  • criterion (nn.Module, optional) – Loss function. Defaults to nn.CrossEntropyLoss().

  • task (str, optional) – Task to evaluate. Choose between “classification”, “multiclass”, “regression”, “posneg-classification”. Defaults to “classification”.

  • auprc (bool, optional) – Whether to get AUPRC scores or not. Defaults to False.

  • input_to_float (bool, optional) – Whether to convert inputs to float before processing. Defaults to True.

training_structures.Supervised_Learning.test(model, test_dataloaders_all, dataset='default', method_name='My method', is_packed=False, criterion=CrossEntropyLoss(), task='classification', auprc=False, input_to_float=True, no_robust=False)

Handle getting test results for a simple supervised training loop.

Parameters:
  • model – saved checkpoint filename from train

  • test_dataloaders_all – test data

  • dataset – name of the dataset; must be set when testing effective robustness

  • criterion – only needed for regression; pass MSELoss in that case

training_structures.Supervised_Learning.train(encoders, fusion, head, train_dataloader, valid_dataloader, total_epochs, additional_optimizing_modules=[], is_packed=False, early_stop=False, task='classification', optimtype=<class 'torch.optim.rmsprop.RMSprop'>, lr=0.001, weight_decay=0.0, objective=CrossEntropyLoss(), auprc=False, save='best.pt', validtime=False, objective_args_dict=None, input_to_float=True, clip_val=8, track_complexity=True)

Handle running a simple supervised training loop.

Parameters:
  • encoders – list of modules, unimodal encoders for each input modality in the order of the modality input data.

  • fusion – fusion module, takes in outputs of encoders in a list and outputs fused representation

  • head – classification or prediction head, takes in output of fusion module and outputs the classification or prediction results that will be sent to the objective function for loss calculation

  • total_epochs – maximum number of epochs to train

  • additional_optimizing_modules – list of modules, include all modules that you want to be optimized by the optimizer other than those in encoders, fusion, head (for example, decoders in MVAE)

  • is_packed – whether the input modalities are packed in one list or not (default is False, which means we expect input of [tensor(20xmodal1_size),(20xmodal2_size),(20xlabel_size)] for batch size 20 and 2 input modalities)

  • early_stop – whether to stop early if valid performance does not improve over 7 epochs

  • task – type of task, currently supports “classification”, “regression”, “multilabel”

  • optimtype – type of optimizer to use

  • lr – learning rate

  • weight_decay – weight decay of optimizer

  • objective – objective function, which is either one of CrossEntropyLoss, MSELoss or BCEWithLogitsLoss or a custom objective function that takes in three arguments: prediction, ground truth, and an argument dictionary.

  • auprc – whether to compute auprc score or not

  • save – the name of the saved file for the model with current best validation performance

  • validtime – whether to show valid time in seconds or not

  • objective_args_dict – the argument dictionary to be passed into objective function. If not None, at every batch the dict’s “reps”, “fused”, “inputs”, “training” fields will be updated to the batch’s encoder outputs, fusion module output, input tensors, and boolean of whether this is training or validation, respectively.

  • input_to_float – whether to convert input to float type or not

  • clip_val – grad clipping limit

  • track_complexity – whether to track training complexity or not
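
With is_packed left at False, each batch is described above as a list holding one tensor per modality followed by the labels. The sketch below splits such a batch, using nested lists in place of tensors. The helper split_batch is hypothetical, and the batch size of 3 is chosen only to keep the example short.

```python
def split_batch(batch):
    """Split [modal1, modal2, ..., labels] into (inputs, labels)."""
    *inputs, labels = batch
    return inputs, labels


# Batch of 3 samples, 2 modalities (feature sizes 2 and 1), then labels.
batch = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],   # modality 1: 3 x 2
    [[1.0], [2.0], [3.0]],                   # modality 2: 3 x 1
    [0, 1, 0],                               # labels: one per sample
]
inputs, labels = split_batch(batch)
```

The order of inputs matches the order of the encoders list, so encoders[i] would receive inputs[i].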

training_structures.gradient_blend module

Implements training structures for gradient blending.

training_structures.gradient_blend.calcAUPRC(pts)

Calculate AUPRC score given true labels and predicted probabilities.

Parameters:

pts (list) – List of (true, predicted prob) for each sample in batch.

Returns:

AUPRC score

Return type:

float
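
For intuition, the area under the precision-recall curve is often summarized as average precision: the mean of the precision values taken at the rank of each positive sample. The dependency-free sketch below computes that quantity from (true, predicted prob) pairs. The name average_precision is hypothetical, the sketch assumes binary labels and distinct scores, and the library function may compute the score differently.

```python
def average_precision(pts):
    """Average precision from a list of (true_label, predicted_prob)."""
    # Rank samples from most to least confident.
    ranked = sorted(pts, key=lambda p: p[1], reverse=True)
    n_pos = sum(1 for label, _ in ranked if label == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive sample")
    hits = 0
    total = 0.0
    for rank, (label, _) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank   # precision at this positive's rank
    return total / n_pos


pts = [(1, 0.9), (0, 0.8), (1, 0.7), (0, 0.1)]
score = average_precision(pts)
```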

class training_structures.gradient_blend.completeModule(encoders, fuse, head)

Bases: Module

Implements and combines sub-modules into a full classifier.

__init__(encoders, fuse, head)

Instantiate completeModule instance.

Parameters:
  • encoders (list) – List of nn.Module encoders

  • fuse (nn.Module) – Fusion module

  • head (nn.Module) – Classifier module

forward(x)

Apply classifier to input.

Parameters:

x (list[torch.Tensor]) – List of input tensors

Returns:

Classifier output

Return type:

torch.Tensor

training: bool

training_structures.gradient_blend.gb_estimate(unimodal_models, multimodal_classification_head, fuse, unimodal_classification_heads, train_dataloader, gb_epoch, batch_size, v_dataloader, lr, weight_decay=0.0, optimtype=<class 'torch.optim.sgd.SGD'>)

Compute estimate of gradient-blending score.

Parameters:
  • unimodal_models (list) – List of encoder modules

  • multimodal_classification_head (nn.Module) – Classifier given fusion instance

  • fuse (nn.Module) – Fusion module

  • unimodal_classification_heads (list) – List of unimodal classifiers

  • train_dataloader (torch.utils.data.DataLoader) – Training data loader

  • gb_epoch (int) – Number of epochs for gradient-blending

  • batch_size (int) – Batch size

  • v_dataloader (torch.utils.data.DataLoader) – Validation dataloader

  • lr (float) – Learning Rate

  • weight_decay (float, optional) – Weight decay parameter. Defaults to 0.0.

  • optimtype (torch.optim.Optimizer, optional) – Optimizer class. Defaults to torch.optim.SGD.

Returns:

Normalized weights between unimodal and multimodal models

Return type:

float
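
The returned weights are described as normalized. The sketch below shows that final step under the assumption that it means scaling non-negative per-model scores so they sum to 1. The name normalize_weights and the raw scores are hypothetical, and how the library derives the raw scores is not shown here.

```python
def normalize_weights(raw_scores):
    """Scale scores so they sum to 1, preserving their ratios."""
    total = sum(raw_scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return [s / total for s in raw_scores]


# Hypothetical raw scores: two unimodal models, then the multimodal model.
raw = [2.0, 1.0, 1.0]
weights = normalize_weights(raw)
```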

training_structures.gradient_blend.getloss(model, head, data, monum, batch_size)

Get loss for model given classification head.

Parameters:
  • model (nn.Module) – Module to evaluate

  • head (nn.Module) – Classification head.

  • data (torch.utils.data.DataLoader) – Dataloader to evaluate on.

  • monum (int) – Unimodal model index.

  • batch_size (int) – (unused) Batch Size

Returns:

Average loss per sample.

Return type:

float

training_structures.gradient_blend.getmloss(models, head, fuse, data, batch_size)

Calculate multimodal loss.

Parameters:
  • models (list) – List of encoder models

  • head (nn.Module) – Classifier module

  • fuse (nn.Module) – Fusion module

  • data (torch.utils.data.DataLoader) – Data loader to calculate loss on.

  • batch_size (int) – Batch size of dataloader

Returns:

Average loss

Return type:

float

training_structures.gradient_blend.multimodalcompute(models, train_x)

Compute encoded representation for each modality in train_x using encoders in models.

Parameters:
  • models (list) – List of encoder instances

  • train_x (List) – List of Input Tensors

Returns:

List of encoded tensors

Return type:

List

training_structures.gradient_blend.multimodalcondense(models, fuse, train_x)

Compute fusion encoded output.

Parameters:
  • models (List) – List of nn.Modules for each encoder

  • fuse (nn.Module) – Fusion instance

  • train_x (List) – List of Input Tensors

Returns:

Fused output

Return type:

torch.Tensor

training_structures.gradient_blend.single_test(model, test_dataloader, auprc=False, classification=True)

Run single test with model and test data loader.

Parameters:
  • model (nn.Module) – Model to evaluate.

  • test_dataloader (torch.utils.data.DataLoader) – Test data loader

  • auprc (bool, optional) – Whether to return AUPRC scores or not. Defaults to False.

  • classification (bool, optional) – Whether to return classification accuracy or not. Defaults to True.

Returns:

Dictionary of (metric, value) pairs

Return type:

dict

training_structures.gradient_blend.test(model, test_dataloaders_all, dataset, method_name='My method', auprc=False, classification=True, no_robust=False)

Test module, reporting results to stdout.

Parameters:
  • model (nn.Module) – Model to test

  • test_dataloaders_all (list[torch.utils.data.DataLoader]) – List of data loaders to test on.

  • dataset (string) – Dataset name

  • method_name (str, optional) – Method name. Defaults to ‘My method’.

  • auprc (bool, optional) – Whether to use AUPRC scores or not. Defaults to False.

  • classification (bool, optional) – Whether the task is classification or not. Defaults to True.

  • no_robust (bool, optional) – If True, skip the robustness variations of the input. Defaults to False.

training_structures.gradient_blend.train(unimodal_models, multimodal_classification_head, unimodal_classification_heads, fuse, train_dataloader, valid_dataloader, num_epoch, lr, gb_epoch=20, v_rate=0.08, weight_decay=0.0, optimtype=<class 'torch.optim.sgd.SGD'>, finetune_epoch=25, classification=True, AUPRC=False, savedir='best.pt', track_complexity=True)

Train model using gradient_blending.

Parameters:
  • unimodal_models (list) – List of modules, unimodal encoders for each input modality in the order of the modality input data.

  • multimodal_classification_head (nn.Module) – Classification head that takes in fused output of unimodal models of all modalities

  • unimodal_classification_heads (list[nn.Module]) – List of classification heads that each takes in output of one unimodal model (must be in the same modality order as unimodal_models)

  • fuse (nn.Module) – Fusion module that takes in a list of outputs from unimodal_models and generate a fused representation

  • train_dataloader (torch.utils.data.DataLoader) – Training data loader

  • valid_dataloader (torch.utils.data.DataLoader) – Validation data loader

  • num_epoch (int) – Number of epochs to train this model on.

  • lr (float) – Learning rate.

  • gb_epoch (int, optional) – Number of epochs between re-evaluation of weights of gradient blend. Defaults to 20.

  • v_rate (float, optional) – Portion of training set used as validation for gradient blend weight estimation. Defaults to 0.08.

  • weight_decay (float, optional) – Weight decay of optimizer. Defaults to 0.0.

  • optimtype (torch.optim.Optimizer, optional) – Type of optimizer to use. Defaults to torch.optim.SGD.

  • finetune_epoch (int, optional) – Number of epochs to finetune the classification head. Defaults to 25.

  • classification (bool, optional) – Whether the task is a classification task. Defaults to True.

  • AUPRC (bool, optional) – Whether to compute auprc score or not. Defaults to False.

  • savedir (str, optional) – The name of the saved file for the model with current best validation performance. Defaults to ‘best.pt’.

  • track_complexity (bool, optional) – Whether to track complexity or not. Defaults to True.
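
The v_rate parameter sets aside a portion of the training set for estimating blend weights. The sketch below shows one way to make such a split over sample indices. The helper holdout_split is hypothetical, and taking the held-out samples from the end of the index range (rather than at random) is an assumption made to keep the result deterministic.

```python
def holdout_split(num_samples, v_rate=0.08):
    """Return (train_indices, valid_indices) for a v_rate holdout."""
    n_valid = round(num_samples * v_rate)
    n_train = num_samples - n_valid
    indices = list(range(num_samples))
    return indices[:n_train], indices[n_train:]


train_idx, valid_idx = holdout_split(100, v_rate=0.08)
```

At the default v_rate of 0.08, a training set of 100 samples would leave 92 for training and 8 for weight estimation.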

training_structures.gradient_blend.train_multimodal(models, head, fuse, optim, trains, valids, epoch, batch_size)

Train multimodal gradient-blending model.

Parameters:
  • models (list) – List of nn.Module encoders

  • head (nn.Module) – Classifier, post fusion layer

  • fuse (nn.Module) – Fusion module

  • optim (torch.optim.Optimizer) – Optimizer instance.

  • trains (torch.utils.data.DataLoader) – Training data dataloader

  • valids (torch.utils.data.DataLoader) – Validation data dataloader

  • epoch (int) – Number of epochs to train on

  • batch_size (int) – Batch size

Returns:

metric

Return type:

float

training_structures.gradient_blend.train_unimodal(model, head, optim, trains, valids, monum, epoch, batch_size)

Train unimodal gradient blending module.

Parameters:
  • model (nn.Module) – Unimodal encoder

  • head (nn.Module) – Classifier instance

  • optim (torch.optim.Optimizer) – Optimizer instance

  • trains (torch.utils.data.DataLoader) – Training Dataloader Instance

  • valids (torch.utils.data.DataLoader) – Validation DataLoader Instance

  • monum (int) – Modality index

  • epoch (int) – Number of epochs to train on

  • batch_size (int) – Batch size of data loaders

Returns:

Metric

Return type:

float

training_structures.unimodal module

Implements training pipeline for unimodal comparison.

training_structures.unimodal.single_test(encoder, head, test_dataloader, auprc=False, modalnum=0, task='classification', criterion=None)

Test unimodal model on one dataloader.

Parameters:
  • encoder (nn.Module) – Unimodal encoder module

  • head (nn.Module) – Module which takes in encoded unimodal input and predicts output.

  • test_dataloader (torch.utils.data.DataLoader) – Data Loader for test set.

  • auprc (bool, optional) – Whether to output AUPRC or not. Defaults to False.

  • modalnum (int, optional) – Index of modality to consider for the test with the given encoder. Defaults to 0.

  • task (str, optional) – Type of task to try. Supports “classification”, “regression”, or “multilabel”. Defaults to ‘classification’.

  • criterion (nn.Module, optional) – Loss module. Defaults to None.

Returns:

Dictionary of (metric, value) relations.

Return type:

dict
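
To illustrate how modalnum selects the input for a unimodal test, the sketch below runs a stand-in encoder and head on one modality of each batch and reports accuracy in a dictionary. It assumes the labels are the last element of each batch. The name unimodal_accuracy, the toy encoder and head, and the "Accuracy" key are all hypothetical.

```python
def unimodal_accuracy(encoder, head, batches, modalnum=0):
    """Evaluate encoder + head on a single modality of each batch."""
    correct = 0
    total = 0
    for batch in batches:
        inputs = batch[modalnum]   # only the chosen modality is used
        labels = batch[-1]         # assumption: labels come last
        for x, y in zip(inputs, labels):
            pred = head(encoder(x))
            correct += int(pred == y)
            total += 1
    return {"Accuracy": correct / total}


encoder = lambda x: sum(x)            # toy encoder: sum the features
head = lambda z: 1 if z > 0 else 0     # toy head: threshold at zero

# Two batches, each laid out as [modality 0, modality 1, labels].
batches = [
    [[[1.0, 2.0], [-3.0, 1.0]], [[0.0], [0.0]], [1, 0]],
    [[[-1.0, -1.0], [4.0, 0.5]], [[0.0], [0.0]], [1, 1]],
]
result = unimodal_accuracy(encoder, head, batches, modalnum=0)
```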

training_structures.unimodal.test(encoder, head, test_dataloaders_all, dataset='default', method_name='My method', auprc=False, modalnum=0, task='classification', criterion=None, no_robust=False)

Test unimodal model on all provided dataloaders.

Parameters:
  • encoder (nn.Module) – Encoder module

  • head (nn.Module) – Module which takes in encoded unimodal input and predicts output.

  • test_dataloaders_all (dict) – Dictionary of noisetype, dataloader to test.

  • dataset (str, optional) – Dataset to test on. Defaults to ‘default’.

  • method_name (str, optional) – Method name. Defaults to ‘My method’.

  • auprc (bool, optional) – Whether to output AUPRC scores or not. Defaults to False.

  • modalnum (int, optional) – Index of modality to test on. Defaults to 0.

  • task (str, optional) – Type of task to try. Supports “classification”, “regression”, or “multilabel”. Defaults to ‘classification’.

  • criterion (nn.Module, optional) – Loss module. Defaults to None.

  • no_robust (bool, optional) – If True, skip the robustness methods. Defaults to False.

training_structures.unimodal.train(encoder, head, train_dataloader, valid_dataloader, total_epochs, early_stop=False, optimtype=<class 'torch.optim.rmsprop.RMSprop'>, lr=0.001, weight_decay=0.0, criterion=CrossEntropyLoss(), auprc=False, save_encoder='encoder.pt', save_head='head.pt', modalnum=0, task='classification', track_complexity=True)

Train unimodal module.

Parameters:
  • encoder (nn.Module) – Unimodal encoder for the modality

  • head (nn.Module) – Takes in the unimodal encoder output and produces the final prediction.

  • train_dataloader (torch.utils.data.DataLoader) – Training data dataloader

  • valid_dataloader (torch.utils.data.DataLoader) – Validation set dataloader

  • total_epochs (int) – Total number of epochs

  • early_stop (bool, optional) – Whether to apply early-stopping or not. Defaults to False.

  • optimtype (torch.optim.Optimizer, optional) – Type of optimizer to use. Defaults to torch.optim.RMSprop.

  • lr (float, optional) – Learning rate. Defaults to 0.001.

  • weight_decay (float, optional) – Weight decay of optimizer. Defaults to 0.0.

  • criterion (nn.Module, optional) – Loss module. Defaults to nn.CrossEntropyLoss().

  • auprc (bool, optional) – Whether to compute AUPRC score or not. Defaults to False.

  • save_encoder (str, optional) – Path of file to save model with best validation performance. Defaults to ‘encoder.pt’.

  • save_head (str, optional) – Path of file to save head with best validation performance. Defaults to ‘head.pt’.

  • modalnum (int, optional) – Which modality to apply encoder to. Defaults to 0.

  • task (str, optional) – Type of task to try. Supports “classification”, “regression”, or “multilabel”. Defaults to ‘classification’.

  • track_complexity (bool, optional) – Whether to track the model’s complexity or not. Defaults to True.

Module contents