6.6. UCTB.train package

6.6.1. UCTB.train.EarlyStopping module

class UCTB.train.EarlyStopping.EarlyStopping(patience)

Bases: object

Stops training early when a span of the newest records is not better than the current best record.

Parameters:	patience (int) – The span of checked newest records.

__record_list: list – List of records.

__best: float – The current best record.

__patience: int – The span of checked newest records.

__p: int – The number of newest records that are worse than the current best record.

stop(new_value)

Appends the new record to the record list and checks whether the number of new records that are worse than the best record exceeds the limit.

Parameters:	new_value (float) – The new record generated by the newest model.
Returns:	`True` if the number of new records that are worse than the best record exceeds the limit, which triggers early stopping; otherwise `False`.
Return type:	bool
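
The patience rule described above can be sketched in a few lines. This is an illustrative stand-in, not UCTB's implementation: the class name `PatienceStopper` is hypothetical, lower records are assumed to be better (as with a validation loss), and "exceeds the limit" is read as strictly more than `patience` worse records in a row.

```python
class PatienceStopper:
    """Illustrative stand-in for the behaviour described above (not UCTB's code)."""

    def __init__(self, patience):
        self.records = []      # every value passed to stop()
        self.best = None       # best record so far; "lower is better" is assumed
        self.patience = patience
        self.worse_count = 0   # newest records that failed to beat the best

    def stop(self, new_value):
        self.records.append(new_value)
        if self.best is None or new_value < self.best:
            # A new best record resets the counter.
            self.best = new_value
            self.worse_count = 0
            return False
        self.worse_count += 1
        # "Exceeds the limit" is read here as strictly greater than patience.
        return self.worse_count > self.patience


stopper = PatienceStopper(patience=2)
validation_losses = [1.0, 0.9, 0.95, 0.96, 0.97]
decisions = [stopper.stop(loss) for loss in validation_losses]
```

In a training loop, the return value of `stop` would typically guard a `break` after each validation step.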

class UCTB.train.EarlyStopping.EarlyStoppingTTest(length, p_value_threshold)

Bases: object

Early stopping by t-test.

The t-test is a two-sided test of the null hypothesis that two independent samples have identical average (expected) values. This method takes two intervals of `length` records each from the record list and tests whether they have identical average values. If so, training stops early.

Parameters:	length (int) – The length of each checked interval.
	p_value_threshold (float) – The p-value threshold used to decide whether to stop early.

__record_list: list – List of records.

__best: float – The current best record.

__test_length: int – The length of checked interval.

__p_value_threshold: float – The p-value threshold to decide whether to do early stop.

stop(new_value)

Takes two intervals from the record list and runs a t-test on them.

Parameters:	new_value (float) – The new record generated by the newest model.
Returns:	`True` if the p-value of the t-test is smaller than the threshold, which triggers early stopping; otherwise `False`.
Return type:	bool
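
To make the two-interval comparison concrete, the sketch below computes a p-value for the two newest windows of a record list. It makes several assumptions: `window_p_value` is a hypothetical helper, the two intervals are taken to be the last `length` records and the `length` records before them, and SciPy's `ttest_ind` with `equal_var=False` is used as the test. It only computes the p-value; how that value is compared with `p_value_threshold` is left to the class.

```python
from scipy.stats import ttest_ind


def window_p_value(records, length):
    """Return the t-test p-value for the two newest windows, or None if too few records.

    Hypothetical helper for illustration; it is not part of UCTB.
    """
    if len(records) < 2 * length:
        return None
    newer = records[-length:]
    older = records[-2 * length:-length]
    # equal_var=False selects Welch's t-test; this choice is an assumption.
    return ttest_ind(newer, older, equal_var=False).pvalue


# Records that are still falling: the two windows have clearly different means.
improving = [1.0, 0.9, 0.8, 0.7, 0.6, 0.50, 0.51, 0.49, 0.50, 0.50]
# Records that have levelled off: the two windows have the same mean.
plateau = [0.50, 0.51, 0.49, 0.50, 0.52, 0.51, 0.50, 0.50, 0.49, 0.52]

p_improving = window_p_value(improving, length=5)
p_plateau = window_p_value(plateau, length=5)
```

A small p-value indicates that the two windows differ in mean, while a large one is consistent with a plateau.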

6.6.2. UCTB.train.MiniBatchTrain module

class UCTB.train.MiniBatchTrain.MiniBatchFeedDict(feed_dict, sequence_length, batch_size, shuffle=True)

Bases: object

Gets small batches of data from a dict for training.

Parameters:	feed_dict (dict) – Data dictionary consisting of key-value pairs.
	sequence_length (int) – Only values in feed_dict whose length equals sequence_length are divided into batches.
	batch_size (int) – The number of samples in each batch.
	shuffle (bool) – If True, the input dict is shuffled. Default: True.

get_batch(): For each value in feed_dict whose length equals sequence_length, splits the value into batches and returns one batch per call, in order. Values whose length does not equal sequence_length are returned unchanged. An internal counter tracks how many batches have been generated. When the remaining data cannot fill a whole batch, a batch taken from the tail of the data is returned.

restart(): Resets the internal batch counter to 0 so that get_batch generates batches from the beginning again.

static shuffle(data)
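
The splitting rule for dict values can be sketched as a generator. This is an illustration under stated assumptions, not UCTB's code: `iter_feed_dict_batches` is a hypothetical helper, shuffling is left out, and "a batch of data from the tail" is read as the last `batch_size` samples, which may overlap the previous batch.

```python
import numpy as np


def iter_feed_dict_batches(feed_dict, sequence_length, batch_size):
    """Yield one dict per batch (hypothetical helper, not part of UCTB)."""
    start = 0
    while start < sequence_length:
        if start + batch_size <= sequence_length:
            window = slice(start, start + batch_size)
        else:
            # Assumed tail rule: reuse the last batch_size samples.
            window = slice(max(sequence_length - batch_size, 0), sequence_length)
        yield {
            # Only values whose length matches sequence_length are sliced.
            key: value[window] if len(value) == sequence_length else value
            for key, value in feed_dict.items()
        }
        start += batch_size


feed = {
    "x": np.arange(5),        # length 5: split into batches
    "y": np.arange(5) * 10,   # length 5: split into batches
    "adj": np.eye(3),         # length 3: passed through unchanged
}
batches = list(iter_feed_dict_batches(feed, sequence_length=5, batch_size=2))
```

Here the third batch holds the samples at indices 3 and 4, so index 3 appears in both the second and third batches.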

class UCTB.train.MiniBatchTrain.MiniBatchTrain(X, Y, batch_size)

Bases: object

Gets small batches of data for training.

Parameters:	X (ndarray) – Input features. The first dimension of X should be the sample size.
	Y (ndarray) – Target values. The first dimension of Y should be the sample size.
	batch_size (int) – The number of samples in each batch.

get_batch(): Returns one batch of (X, Y) pairs per call. An internal counter tracks how many batches have been generated. When the remaining data cannot fill a whole batch, a batch taken from the tail of the data is returned.

restart(): Resets the internal batch counter to 0 so that get_batch generates batches from the beginning again.

static shuffle(X, Y): Shuffles the input (X, Y) pairs and returns them.
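
The following sketch shows how paired shuffling and the get_batch/restart cycle might fit together. The names `shuffle_pairs` and `PairBatcher` are hypothetical, the tail batch is assumed to be the last `batch_size` samples, and the caller is assumed to call `restart()` between epochs; none of this is taken from UCTB's source.

```python
import numpy as np


def shuffle_pairs(X, Y, seed=None):
    """Shuffle X and Y with one shared permutation so that pairs stay aligned."""
    order = np.random.default_rng(seed).permutation(len(X))
    return X[order], Y[order]


class PairBatcher:
    """Illustrative stand-in for the batching behaviour described above."""

    def __init__(self, X, Y, batch_size):
        self.X, self.Y = X, Y
        self.batch_size = batch_size
        self.cursor = 0  # index of the first sample of the next full batch

    def get_batch(self):
        n = len(self.X)
        if self.cursor + self.batch_size <= n:
            lo = self.cursor
            self.cursor += self.batch_size
        else:
            # Assumed tail rule: fall back to the last batch_size samples.
            lo = max(n - self.batch_size, 0)
        return self.X[lo:lo + self.batch_size], self.Y[lo:lo + self.batch_size]

    def restart(self):
        self.cursor = 0


X = np.arange(10).reshape(5, 2)   # 5 samples, 2 features each
Y = np.arange(5)                  # target i belongs to row i of X
Xs, Ys = shuffle_pairs(X, Y, seed=0)

batcher = PairBatcher(Xs, Ys, batch_size=2)
epoch = [batcher.get_batch() for _ in range(3)]  # ceil(5 / 2) = 3 batches
batcher.restart()
first_again = batcher.get_batch()
```

Using one permutation for both arrays is what keeps each target attached to its feature row after shuffling.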

class UCTB.train.MiniBatchTrain.MiniBatchTrainMultiData(data, batch_size, shuffle=True)

Bases: object

Gets small batches of data for training.

Parameters:	data (ndarray) – Input data. Its first dimension should be the sample size.
	batch_size (int) – The number of samples in each batch.
	shuffle (bool) – If True, the input data is shuffled. Default: True.

get_batch(): Returns one batch of data per call. An internal counter tracks how many batches have been generated. When the remaining data cannot fill a whole batch, a batch taken from the tail of the data is returned.

restart(): Resets the internal batch counter to 0 so that get_batch generates batches from the beginning again.

static shuffle(data)