Multiple Processes Writing To The Same CSV File: How To Avoid Conflicts?
Solution 1:
There is no direct way that I know of.
One common workaround is to split the responsibility between "producer" processes and a single "outputter".
Dedicate one extra process to writing the CSV: it consumes rows from a multiprocessing queue, and all the "producer" processes push their rows onto that queue instead of touching the file.
I'd advise looking at Python's multiprocessing module, and especially the part about queues. If you get stuck trying it, raise new questions here, as this can become tricky.
An alternative is to use a "giant lock", which requires each process to wait for the resource to become available (using a system mutex, for example). This makes the code simpler but less scalable.
Solution 2:
The only proven solution is, as Bruce explained, to have a single process accept output from the "producer" processes and write it to the file. That could be a queue / messaging system, or just a plain old SQL database (from which it is easy to export CSV files).
Solution 3:
As a first and easiest attempt, I would try to always flush() the output; this forces the buffered data to be written to the file before the next data is accepted. Note that this only shrinks the window for interleaved writes; it does not eliminate it.
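A sketch of that flush-per-row approach (the file name `results.csv` and the row are illustrative). Each process appends and flushes immediately, so each row leaves the Python-level buffer as one small write, but as noted this is best-effort rather than a real synchronization mechanism:

```python
import csv

# Append mode plus an immediate flush keeps each process's row as a
# single small OS-level write, which reduces (but does not eliminate)
# the chance of two processes' lines interleaving.
with open("results.csv", "a", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["worker-1", 42])
    f.flush()  # push the buffered row to the OS before writing more
```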