November 4, 2020

Headache-free concurrency in Python with Ori

At Neocrym, we write a lot of Python code that we want to run really fast, and we typically have a great many CPU cores available to us. So we wrote a new Python library to help make that happen.

Concurrent execution is harder in CPython than in languages like C++, Java, Go, or Rust for several reasons:

  • The CPython interpreter has the Global Interpreter Lock, which enforces that only one thread of Python bytecode can run at any given time. Note that this doesn't prevent thread-level parallelism when threads are waiting on I/O. The GIL also does not apply to code that isn't Python bytecode--such as Python extensions written in C or Cython--and libraries like NumPy, Pandas, and TensorFlow use this to their advantage.
  • Running multiple processes is resource-intensive because each one requires a full copy of the Python interpreter and (usually) any relevant data. Other languages can fork processes that share memory copy-on-write, but CPython's reference counting writes to objects even when the programmer only intends to read them, which forces those pages to be copied anyway. Objects also cannot be passed between Python processes unless they are pickleable.
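
The pickling constraint in particular trips people up. A minimal illustration with the standard pickle module (the names here are illustrative, not part of Ori):

```python
import pickle

# Data sent to a worker process is serialized with pickle, so
# unpicklable objects (lambdas, open sockets, etc.) cannot cross
# the process boundary.

def square(x):  # a module-level function pickles fine (by reference)
    return x * x

round_trip = pickle.loads(pickle.dumps([1, 2, 3]))  # plain data is fine

try:
    pickle.dumps(lambda x: x * x)  # lambdas are not pickleable
    lambda_is_pickleable = True
except (pickle.PicklingError, AttributeError):
    lambda_is_pickleable = False
```

This is why multiprocessing-based tools generally require module-level functions and plain-data arguments.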

There are many tools for concurrency in Python. The Python standard library has threading, multiprocessing, asyncio, and concurrent.futures. There are many third-party libraries like Twisted, Trio, eventlet, and gevent--typically focused on using asynchronous I/O to handle many network connections at once. Libraries like Celery and dask.distributed help scale multiprocessing across many machines in a cluster.

Despite all of these options, it is difficult to effectively mix I/O-heavy and compute-heavy workloads in the same Python program. It is easy to run into issues with function color, or with compute-heavy code blocking the event loop and holding the GIL.

To solve this, we created Ori--a high-level wrapper around the concurrency primitives in the Python standard library. A few of Ori's most interesting submodules are:

ori.concurrency

ori.concurrency takes a given function and runs it in a background thread or process. Note that a background thread is still subject to the GIL, and a background process requires that the function and the data you pass to it be pickleable.
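
The underlying pattern can be sketched with the standard library's concurrent.futures--this is the idea ori.concurrency wraps, not Ori's actual API, so check the documentation for the real function names:

```python
from concurrent.futures import ThreadPoolExecutor

def slow_lookup(key):
    # Stand-in for an I/O-bound call (network, disk, etc.).
    return f"value-for-{key}"

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(slow_lookup, "alpha")  # returns immediately
# ... the main thread is free to do other work here ...
result = future.result()  # blocks until the background thread finishes
executor.shutdown()
```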

ori.poolchain

ori.poolchain contains PoolChain, a class for building pipelines of thread pools and process pools that iterate over a list of elements.

For example, if you wanted to download some files from the Internet for some compute-heavy processing and save the results, you could organize it with a three-step PoolChain, which would:

  1. create a thread pool to download the files from the Internet.
  2. create a process pool to do the compute-heavy processing.
  3. create another thread pool to save the results to, say, Amazon S3 or a network-attached filesystem.

ori.subprocess

ori.subprocess is a tool for running shell commands in the background while collecting the standard output and standard error in real time. This makes it easy to integrate external commands into Python, responding to or logging their I/O just like native Python code.
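The pattern underneath is reading stdout and stderr on separate threads so that neither pipe fills up and blocks the child. A stdlib sketch of that idea (Ori wraps this; see its documentation for the actual interface):

```python
import subprocess
import sys
import threading

def drain(pipe, sink):
    # Read lines as they arrive so the child never blocks on a full pipe.
    for line in iter(pipe.readline, ""):
        sink.append(line.rstrip("\n"))
    pipe.close()

proc = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; print('out'); print('err', file=sys.stderr)"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True,
)
stdout_lines, stderr_lines = [], []
t_out = threading.Thread(target=drain, args=(proc.stdout, stdout_lines))
t_err = threading.Thread(target=drain, args=(proc.stderr, stderr_lines))
t_out.start(); t_err.start()
proc.wait()
t_out.join(); t_err.join()
```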


You can find Ori's documentation at ori.technology.neocrym.com and the source code at github.com/neocrym/ori.

And just in case you were wondering, the name "Ori" comes from the god-like space alien villains in the television show Stargate SG-1.