Python Basics
Multi-Processing
The example below uses Python's multiprocessing module to parallelize feature engineering, here creating features from lagged windows.
Terms
Pool: A process pool object that controls a pool of worker processes to which jobs can be submitted.
map: A parallel equivalent of the built-in map() function. It accepts a single iterable and blocks until all results are ready.
close: Prevents any more tasks from being submitted to the pool. Once all tasks have completed, the worker processes exit.
join: Waits for the worker processes to exit. close() or terminate() must be called before join().
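The lifecycle behind these four terms can be sketched with the standard library alone. This is a minimal illustration, not the feature-engineering code itself: the worker square and the helper parallel_squares are hypothetical names introduced here.

```python
from multiprocessing import Pool


def square(x):
    """Hypothetical worker; defined at module level so it can be pickled."""
    return x * x


def parallel_squares(values, processes=2):
    """Apply square() to each value across a pool of worker processes."""
    pool = Pool(processes=processes)        # Pool: start the worker processes
    try:
        results = pool.map(square, values)  # map: blocks until every result is back, in input order
    finally:
        pool.close()                        # close: no more tasks will be submitted
        pool.join()                         # join: wait for the workers to exit
    return results


if __name__ == "__main__":
    print(parallel_squares([1, 2, 3]))  # prints [1, 4, 9]
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module (Windows, and macOS by default). Without it, each worker would try to create its own pool.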
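import pandas as pd
from multiprocessing import Pool

# prep_func is assumed to be defined elsewhere at module level; it takes one lag and returns that lag's features.
if __name__ == "__main__":
    pool = Pool(processes=4)
    out = pool.map(prep_func, [1, 2, 3])  # pool.map takes the iterable positionally; one task per lag
    df = pd.concat(out, axis=1).reset_index(drop=False)  # join the per-lag results column-wise
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for the worker processes to exit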
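The snippet above leaves prep_func undefined. One possible shape for it is sketched below, assuming the input is a single pandas Series and each task returns one lagged column. The names make_lag_feature and build_lag_features and the sample data are hypothetical, not taken from the original. functools.partial binds the data so that pool.map only has to supply the lag.

```python
from functools import partial
from multiprocessing import Pool

import pandas as pd


def make_lag_feature(series, lag):
    """Hypothetical stand-in for prep_func: the series shifted down by `lag` rows."""
    return series.shift(lag).rename(f"{series.name}_lag{lag}")


def build_lag_features(series, lags, processes=4):
    """Compute one lagged column per lag in parallel and join them column-wise."""
    worker = partial(make_lag_feature, series)  # bind the data; map supplies the lag
    with Pool(processes=processes) as pool:     # the with-block shuts the pool down on exit
        out = pool.map(worker, lags)            # results come back in the order of `lags`
    return pd.concat(out, axis=1).reset_index(drop=False)


if __name__ == "__main__":
    sales = pd.Series([10, 20, 30, 40], name="sales")  # made-up sample data
    print(build_lag_features(sales, lags=[1, 2, 3]))
```

Each task receives its own pickled copy of the series, so this pattern is likely to pay off only when the per-lag work (for example, rolling-window statistics) outweighs the cost of shipping the data to the workers.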