Parameters: |
- distribute_list (list) – The list of data to be partitioned, such as a list of files to be
distributed among tasks so that each task processes a subset of the files.
- partition_func (function) – Partition function used to split distribute_list. It should accept
three inputs: distribute_list, i, n_job, where i is the index of the job (an integer from 0 to n_jobs-1)
and n_job is the number of threads. It should return the data list for job i.
- task_func (function) – Task function. Its inputs are share_queue, locker, data, parameters; it has no return value.
Please refer to the DiDi-Data processing codes for more information.
- n_jobs (int) – Number of threads
- reduce_func (function) – Reduce function that combines the outputs from all the threads into one final output.
- parameters (list) – Parameters passed to the task function.
|
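
The three callbacks described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: it assumes share_queue behaves like a `queue.Queue`, locker like a `threading.Lock`, and that reduce_func receives a list of the per-thread outputs. The driver `run_jobs` is a hypothetical stand-in that only mirrors the parameter list documented here.

```python
import queue
import threading


def partition_func(distribute_list, i, n_job):
    # Strided split: job i takes items i, i + n_job, i + 2 * n_job, ...
    return distribute_list[i::n_job]


def task_func(share_queue, locker, data, parameters):
    # No return value; the result is handed back through share_queue.
    scale = parameters[0]
    result = [x * scale for x in data]
    # queue.Queue is already thread-safe; the lock is taken only to show
    # where the locker argument would guard shared state.
    with locker:
        share_queue.put(result)


def reduce_func(outputs):
    # Assumed signature: a list with one output per thread.
    return sorted(x for part in outputs for x in part)


def run_jobs(distribute_list, partition_func, task_func, n_jobs,
             reduce_func, parameters):
    # Hypothetical driver, written only for this example.
    share_queue = queue.Queue()
    locker = threading.Lock()
    threads = []
    for i in range(n_jobs):
        data = partition_func(distribute_list, i, n_jobs)
        t = threading.Thread(target=task_func,
                             args=(share_queue, locker, data, parameters))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    # Each thread put exactly one item on the queue.
    outputs = [share_queue.get() for _ in range(n_jobs)]
    return reduce_func(outputs)


total = run_jobs(list(range(10)), partition_func, task_func, 3,
                 reduce_func, [2])
```

A strided partition is used here because it keeps the chunks balanced when the list length is not a multiple of n_jobs; a contiguous split would work equally well as long as every item is assigned to exactly one job.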