6.6. UCTB.train package

6.6.1. UCTB.train.EarlyStopping module

class UCTB.train.EarlyStopping.EarlyStopping(patience)

Bases: object

Stops training early when none of the most recent records is better than the current best record.

Parameters: patience (int) – The number of most recent records to check.
__record_list

list – List of records.

__best

float – The current best record.

__patience

int – The number of most recent records to check.

__p

int – The number of newest records that are worse than the current best record.

stop(new_value)

Appends the new record to the record list and checks whether the number of new records that are worse than the best record exceeds the limit.

Parameters: new_value (float) – The new record generated by the newest model.
Returns: True if the number of new records that are worse than the best record exceeds the limit, which triggers early stop; otherwise False.
Return type: bool
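
The patience rule described above can be sketched in a few lines. This is a minimal illustration, not the UCTB source: the class name PatienceStopper is hypothetical, and it assumes that a lower record is better (for example a validation loss) and that the counter resets whenever a new best record arrives.

```python
class PatienceStopper:
    """Hypothetical stand-in for the patience-based rule described above."""

    def __init__(self, patience):
        self.records = []        # every value passed to stop()
        self.best = None         # best (lowest) record seen so far
        self.patience = patience
        self.worse_count = 0     # records since the last improvement

    def stop(self, new_value):
        self.records.append(new_value)
        # Assumption: lower is better (e.g. validation loss).
        if self.best is None or new_value < self.best:
            self.best = new_value
            self.worse_count = 0
            return False
        self.worse_count += 1
        # Stop only once the count exceeds the patience limit.
        return self.worse_count > self.patience


losses = [0.9, 0.7, 0.8, 0.75, 0.71, 0.6]
stopper = PatienceStopper(patience=2)
decisions = [stopper.stop(v) for v in losses]
```

In a real training loop the caller would break out as soon as stop() returns True; the list comprehension here keeps going only to show that an improvement resets the counter.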
class UCTB.train.EarlyStopping.EarlyStoppingTTest(length, p_value_threshold)

Bases: object

Early Stop by t-test.

The t-test is a two-sided test of the null hypothesis that two independent samples have identical average (expected) values. This class takes two intervals of size length from the record list and tests whether they have identical average values. If so, training stops early.

Parameters:
  • length (int) – The length of each checked interval.
  • p_value_threshold (float) – The p-value threshold to decide whether to do early stop.
__record_list

list – List of records.

__best

float – The current best record.

__test_length

int – The length of each checked interval.

__p_value_threshold

float – The p-value threshold to decide whether to do early stop.

stop(new_value)

Takes two intervals from the record list and performs a t-test on them.

Parameters: new_value (float) – The new record generated by the newest model.
Returns: True if the p-value of the t-test is smaller than the threshold, which triggers early stop; otherwise False.
Return type: bool
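
To make the interval comparison concrete, the sketch below picks two windows from a record list and computes Welch's t statistic using only the standard library. It is an illustration under assumptions: the function name welch_t_statistic is hypothetical, and the choice of windows (the last length records versus the length records before them) is a guess at what "two intervals according to length" means. Turning the statistic into a p-value needs the t-distribution CDF, which a statistics library such as SciPy provides (scipy.stats.ttest_ind with equal_var=False); whether UCTB uses that call is not stated here.

```python
import math
import statistics


def welch_t_statistic(records, length):
    """Welch's t statistic between the two most recent windows of `length` records.

    Returns None until at least 2 * length records are available.
    """
    if length < 2 or len(records) < 2 * length:
        return None
    # Assumed window choice: newest `length` records vs. the `length` before them.
    recent = records[-length:]
    previous = records[-2 * length:-length]
    # Standard error under unequal variances (Welch's form).
    se = math.sqrt(statistics.variance(recent) / length
                   + statistics.variance(previous) / length)
    diff = statistics.mean(recent) - statistics.mean(previous)
    if se == 0.0:
        # Both windows are constant: no spread to test against.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


history = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0]
t_value = welch_t_statistic(history, length=3)
too_short = welch_t_statistic([1.0, 2.0], length=3)
```

A statistic far from zero means the two windows have clearly different averages; a statistic near zero means the records have flattened out.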

6.6.2. UCTB.train.MiniBatchTrain module

class UCTB.train.MiniBatchTrain.MiniBatchFeedDict(feed_dict, sequence_length, batch_size, shuffle=True)

Bases: object

Gets small batches of data from a dict for training, one batch at a time.

Parameters:
  • feed_dict (dict) – Data dictionary consisting of key-value pairs.
  • sequence_length (int) – Only values in feed_dict whose length equals sequence_length are divided into batches.
  • batch_size (int) – The number of samples used in one training step.
  • shuffle (bool) – If True, the input dict is shuffled. Default: True.
get_batch()

For each value in feed_dict whose length equals sequence_length, divides the value into batches and returns one batch per call, in order. Values whose length differs from sequence_length are returned unchanged. An internal counter records the number of batches generated so far. When the remaining data is not enough to fill a batch, a batch of data from the tail is returned.

restart()

Resets the internal batch counter to 0, so that get_batch generates training batches from the beginning again.

static shuffle(data)
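
The slicing rule of get_batch can be sketched as a generator over a plain dict. This is a rough illustration, not the library code: the function name feed_dict_batches is hypothetical, plain lists stand in for arrays, and the tail rule (the final batch reuses the last batch_size items so that it stays full-sized) is one reading of "a batch of data from the tail is returned".

```python
def feed_dict_batches(feed_dict, sequence_length, batch_size):
    """Yield one dict per batch, slicing only values of length `sequence_length`."""
    start = 0
    while start < sequence_length:
        end = start + batch_size
        if end > sequence_length:
            # Assumed tail rule: reuse the last `batch_size` items so the
            # final batch is still full-sized.
            start, end = max(sequence_length - batch_size, 0), sequence_length
        batch = {}
        for key, value in feed_dict.items():
            sliceable = hasattr(value, "__len__") and len(value) == sequence_length
            # Values of any other length (or scalars) pass through unchanged.
            batch[key] = value[start:end] if sliceable else value
        yield batch
        start = end


feed = {"x": [0, 1, 2, 3, 4], "y": [10, 11, 12, 13, 14], "lr": 0.01, "mask": [1, 1]}
batches = list(feed_dict_batches(feed, sequence_length=5, batch_size=2))
```

Here "x" and "y" are split into batches because their length is 5, while "lr" and "mask" appear unchanged in every batch.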
class UCTB.train.MiniBatchTrain.MiniBatchTrain(X, Y, batch_size)

Bases: object

Gets small batches of data for training, one batch at a time.

Parameters:
  • X (ndarray) – Input features. The first dimension of X should be sample size.
  • Y (ndarray) – Target values. The first dimension of Y should be sample size.
  • batch_size (int) – The number of samples used in one training step.
get_batch()

Returns one batch of (X, Y) pairs per call. An internal counter records the number of batches generated so far. When the remaining data is not enough to fill a batch, a batch of data from the tail is returned.

restart()

Resets the internal batch counter to 0, so that get_batch generates training batches from the beginning again.

static shuffle(X, Y)

Shuffles the input (X, Y) pairs and returns them.
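
The sketch below illustrates two points from this class: shuffling X and Y with one shared permutation so that pairs stay aligned, and the get_batch / restart cycle. The names shuffle_pairs and PairBatcher are hypothetical, plain lists stand in for ndarray to keep the example dependency-free, and the tail rule (return the last batch_size samples) is an assumed reading of the description above.

```python
import random


def shuffle_pairs(X, Y, seed=None):
    """Shuffle X and Y with one shared permutation so pairs stay aligned."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    return [X[i] for i in order], [Y[i] for i in order]


class PairBatcher:
    """Hypothetical stand-in for the get_batch / restart cycle."""

    def __init__(self, X, Y, batch_size):
        assert len(X) == len(Y), "X and Y need the same sample size"
        self.X, self.Y, self.batch_size = X, Y, batch_size
        self.start = 0  # index of the next unread sample

    def get_batch(self):
        end = self.start + self.batch_size
        if end > len(self.X):
            # Assumed tail rule: return the last `batch_size` samples.
            # This sketch keeps returning that tail until restart() is called.
            self.start = len(self.X)
            return self.X[-self.batch_size:], self.Y[-self.batch_size:]
        batch = self.X[self.start:end], self.Y[self.start:end]
        self.start = end
        return batch

    def restart(self):
        self.start = 0  # begin again from the first sample


X = [[0], [1], [2], [3], [4]]
Y = [0, 10, 20, 30, 40]
X_s, Y_s = shuffle_pairs(X, Y, seed=0)

batcher = PairBatcher(X, Y, batch_size=2)
first = batcher.get_batch()
second = batcher.get_batch()
tail = batcher.get_batch()
batcher.restart()
again = batcher.get_batch()
```

With five samples and a batch size of 2, the third call cannot fill a batch from unread data, so it returns the last two samples; after restart() the first batch comes back.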

class UCTB.train.MiniBatchTrain.MiniBatchTrainMultiData(data, batch_size, shuffle=True)

Bases: object

Gets small batches of data for training, one batch at a time.

Parameters:
  • data (ndarray) – Input data. Its first dimension should be sample size.
  • batch_size (int) – The number of samples used in one training step.
  • shuffle (bool) – If True, the input data is shuffled. Default: True.
get_batch()

Returns one batch of data per call. An internal counter records the number of batches generated so far. When the remaining data is not enough to fill a batch, a batch of data from the tail is returned.

restart()

Resets the internal batch counter to 0, so that get_batch generates training batches from the beginning again.

static shuffle(data)