utils.samplers.multipack
Multipack Batch Sampler
Classes
Name | Description |
---|---|
MultipackBatchSampler | Batch sampler class for multipack |
MultipackBatchSampler
utils.samplers.multipack.MultipackBatchSampler(self,
sampler,
batch_size,
batch_max_len,
lengths,
packing_efficiency_estimate=1.0,
drop_last=False,
num_count_samples=16,
sequential=False,
**kwargs,
)
Batch sampler class for multipack
Functions
Name | Description |
---|---|
allocate_sequentially | Sequential allocator that preserves example order |
allocate_sequentially
utils.samplers.multipack.allocate_sequentially(lengths, rank, c, n)
Sequential allocator that preserves example order
Parameters:
- lengths: The lengths of all examples
- rank: The current rank (for distributed training)
- c: The capacity of each bin (maximum sequence length)
- n: Number of ranks

Returns:
- result: List of batches for the current rank
- total_used: Number of actual example tokens
- total_slots: Maximum theoretical number of example tokens (number of bins * bin capacity)
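
To illustrate the parameters and return values described above, here is a minimal pure-Python sketch of an order-preserving allocator. It is not the library's implementation: the function name `sequential_pack_sketch` is hypothetical, and assigning every n-th bin to a rank is an assumption made for the example.

```python
from typing import List, Sequence, Tuple


def sequential_pack_sketch(
    lengths: Sequence[int], rank: int, c: int, n: int
) -> Tuple[List[List[int]], int, int]:
    """Hypothetical sketch: pack examples into bins of capacity c, in order."""
    bins: List[List[int]] = []
    current: List[int] = []
    remaining = c
    total_used = 0

    for idx, size in enumerate(lengths):
        if size > remaining and current:
            # The next example does not fit: close the bin and open a new one.
            bins.append(current)
            current = []
            remaining = c
        # Note: an example longer than c ends up alone in its own bin.
        current.append(idx)
        remaining -= size
        total_used += size

    if current:
        bins.append(current)

    # Assumption: bins are handed out round-robin, so rank r takes
    # bins r, r + n, r + 2n, ...
    result = [bins[i] for i in range(rank, len(bins), n)]
    total_slots = len(bins) * c
    return result, total_used, total_slots


lengths = [4, 3, 5, 2, 6]
batches, used, slots = sequential_pack_sketch(lengths, rank=0, c=8, n=2)
print(batches)       # indices of the examples in each bin for rank 0
print(used / slots)  # fraction of bin capacity filled with real tokens
```

The ratio `total_used / total_slots` gives the packing efficiency of the allocation, which is the kind of quantity `packing_efficiency_estimate` refers to in the sampler signature.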