6.7. UCTB.utils package

6.7.1. UCTB.utils.multi_threads module

UCTB.utils.multi_threads.multiple_process(distribute_list, partition_func, task_func, n_jobs, reduce_func, parameters)
Parameters:
  • distribute_list (list) – The “data” list to be partitioned, such as a list of files that is distributed among the tasks so that each task processes part of the files.
  • partition_func (function) – Function used to split distribute_list. It should accept three inputs, distribute_list, i, and n_job, where i is the job index (an integer from 0 to n_jobs-1) and n_job is the number of threads. It should return the data_list for job i.
  • task_func (function) – Task function. Its inputs are share_queue, locker, data, and parameters; it returns nothing. Please refer to the DiDi-Data processing code for more information.
  • n_jobs (int) – Number of threads
  • reduce_func (function) – Reduce function that combines the outputs from all the threads into one final output.
  • parameters (list) – Parameters sent to the task function.
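
The three callbacks described above might be written as in the following minimal sketch. It makes several assumptions that the parameter list does not confirm: that each task hands its result back by putting it on share_queue while holding locker, and that reduce_func receives a list with one entry per job. The helper run_locally is a hypothetical stand-in driver built on the standard threading and queue modules so that the example runs on its own; it is not the UCTB implementation and only mirrors the argument order of multiple_process.

```python
import queue
import threading


def partition_func(distribute_list, i, n_job):
    # Job i takes every n_job-th item, starting at index i.
    return distribute_list[i::n_job]


def task_func(share_queue, locker, data, parameters):
    # Hypothetical task: multiply every item by parameters[0].
    scale = parameters[0]
    partial = [x * scale for x in data]
    # Assumption: results go back through the shared queue,
    # with the lock guarding the write. Nothing is returned.
    with locker:
        share_queue.put(partial)


def reduce_func(outputs):
    # Assumption: outputs is a list with one entry per job.
    return sorted(x for part in outputs for x in part)


def run_locally(distribute_list, partition_func, task_func,
                n_jobs, reduce_func, parameters):
    # Hypothetical stand-in driver, NOT the UCTB implementation.
    share_queue = queue.Queue()
    locker = threading.Lock()
    workers = [
        threading.Thread(
            target=task_func,
            args=(share_queue, locker,
                  partition_func(distribute_list, i, n_jobs),
                  parameters),
        )
        for i in range(n_jobs)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # All workers have finished, so the queue can be drained safely.
    outputs = []
    while not share_queue.empty():
        outputs.append(share_queue.get())
    return reduce_func(outputs)


result = run_locally(list(range(10)), partition_func, task_func,
                     3, reduce_func, [2])
print(result)
```

With the real function, the same callbacks would presumably be passed in the same order to UCTB.utils.multi_threads.multiple_process; how results are collected and handed to reduce_func should be checked against the DiDi-Data processing code mentioned above.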