Without going into the calculations (which are not relevant here): even a small data feed multiplied by the number of bars, plus the indicators (and their associated sub-indicators) multiplied by the number of bars, ends up being something big.
The 10 processes can be broken down as: 4 cores with 2 threads per core, for 4 x 2 = 8 workers, plus 2 additional Python processes (the multiprocessing module may create a master), for a grand total of 10. Seems right.
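As a quick sanity check, the standard library reports the logical core count directly, and a default `multiprocessing.Pool` spawns one worker per logical core (a minimal sketch; the exact extra-process count seen in the OS depends on the platform and pool internals):

```python
import multiprocessing

# Logical cores = physical cores x threads per core (e.g. 4 x 2 = 8).
logical = multiprocessing.cpu_count()
print(logical)

# Pool() defaults to `logical` workers; add the parent process (and any
# helper the module creates) and the OS process count can exceed that.
```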
If you release the memory from the previous iterations, you lose the results. If you don't have complex resample/replay scenarios, the suggestion is to use exactbars=1 when creating/running cerebro, which tries to reduce the buffers to the minimum.
Or you break your optimization into several smaller runs, so that each one fits within the limits of your machine.
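Splitting the run can be as simple as slicing the parameter grid into chunks and handing each chunk to its own optimization pass. A minimal sketch (the parameter names and chunk size here are hypothetical examples, not part of any API):

```python
from itertools import product

# Hypothetical parameter grid for an optimization.
periods = range(10, 31)        # 21 values
devfactors = [1.5, 2.0, 2.5]   # 3 values

grid = list(product(periods, devfactors))  # 63 combinations in total

def chunks(seq, size):
    """Yield successive slices of `seq` with at most `size` items."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

# Run each chunk as a separate optimization pass (e.g. 16 combinations
# at a time), so the memory of one pass is released before the next.
for batch in chunks(grid, 16):
    # each `batch` would be fed to a fresh cerebro/optstrategy run
    print(len(batch))
```

Because every pass starts with a fresh process, the peak memory footprint is bounded by the largest chunk rather than the whole grid.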