Multithreading vs Multiprocessing in Python 🐍

Bosco Noronha
3 min read · Nov 27, 2017
Aren’t they the same!?!?!

Executive Summary

The Python threading module uses threads instead of processes. Threads run in the same memory heap, whereas processes run in separate memory heaps. This makes sharing information, including object instances, harder between processes. Sharing a heap brings its own problem: multiple threads can write to the same memory location at the same time. That is why the default Python interpreter, CPython, has a thread-safety mechanism, the “GIL” (Global Interpreter Lock). It keeps threads from corrupting the interpreter's internal state by letting only one thread execute Python bytecode at a time, so the Python code in a multithreaded program is effectively executed serially.

The Global Interpreter Lock (GIL) in CPython prevents threads from executing Python code in parallel on multiple cores, so threading in Python is useful mostly for concurrent, I/O-bound work such as handling requests in web servers.

What’s Multithreading?

The threading module is lightweight, shares memory between threads, helps keep a UI responsive, and works well for I/O-bound applications. However, threads can't be killed from the outside and are subject to the GIL.
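
To make the I/O-bound case concrete, here is a minimal sketch in which `time.sleep` stands in for a network or disk wait. The names `fake_io` and `run_threaded` and the 0.3 second delay are made up for illustration. A thread that is blocked in a wait releases the GIL, so the waits overlap instead of adding up.

```python
import threading
import time


def fake_io(delay, results, index):
    # time.sleep stands in for a blocking I/O call (hypothetical workload);
    # the GIL is released while the thread sleeps
    time.sleep(delay)
    results[index] = delay


def run_threaded(n_tasks=4, delay=0.3):
    results = [None] * n_tasks
    threads = [
        threading.Thread(target=fake_io, args=(delay, results, i))
        for i in range(n_tasks)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_threaded()
    print(f"{len(results)} waits finished in {elapsed:.2f}s")
```

Run one after another, four waits of 0.3 seconds would take about 1.2 seconds. With threads, the total should land much closer to a single wait.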

Threading library in Python

Multiple threads live in the same process and the same address space. Each thread does a specific task and has its own code path, stack memory, and instruction pointer, while sharing heap memory with the others. If one thread leaks or corrupts memory, it can damage the other threads and the parent process.

import threading

def calc_square(number):
    print('Square:', number * number)

def calc_quad(number):
    print('Quad:', number * number * number * number)

if __name__ == "__main__":
    number = 7
    thread1 = threading.Thread(target=calc_square, args=(number,))
    thread2 = threading.Thread(target=calc_quad, args=(number,))
    # Start both threads; they run concurrently, but under the GIL
    # only one of them executes Python bytecode at any moment
    thread1.start()
    thread2.start()
    # Wait for both threads to finish before the main thread
    # (this program) continues
    thread1.join()
    thread2.join()
    # Threads cut wall-clock time when tasks spend their time waiting on I/O;
    # CPU-bound work like this arithmetic gains nothing from them in CPython
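
Because threads share heap memory, every thread sees the same objects. The sketch below illustrates that with a hypothetical shared `counter` and a helper `add_many`, neither of which is from the program above. A `threading.Lock` guards the read-modify-write step so that concurrent updates are not lost.

```python
import threading

counter = {"value": 0}   # lives on the shared heap, visible to every thread
lock = threading.Lock()


def add_many(times):
    for _ in range(times):
        # The lock makes the read-modify-write step safe across threads
        with lock:
            counter["value"] += 1


def run(n_threads=4, times=10_000):
    counter["value"] = 0
    threads = [
        threading.Thread(target=add_many, args=(times,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print(run())  # 40000
```

All four threads update the same dictionary, which is exactly the kind of sharing that separate processes do not get for free.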

What’s multiprocessing?

The multiprocessing library gives each process its own memory space, can use multiple CPU cores, and bypasses the GIL limitation in CPython. Child processes are killable (for example, a function call running in a child can be terminated from the parent), and because nothing is shared by default, the code is often easier to reason about. The caveats are a larger memory footprint and inter-process communication (IPC) that is a little more complicated and carries more overhead.

Check out the multiprocessing library in the Python docs
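
As a rough illustration of the "killable" point, the sketch below starts a child process that would never finish on its own and stops it from the parent with `Process.terminate()`. The functions `sleep_forever` and `start_and_kill` are hypothetical names chosen for this example.

```python
import multiprocessing
import time


def sleep_forever():
    # Hypothetical long-running task that never returns on its own
    while True:
        time.sleep(0.1)


def start_and_kill():
    child = multiprocessing.Process(target=sleep_forever)
    child.start()
    time.sleep(0.5)      # let the child run for a moment
    child.terminate()    # ask the operating system to stop the child
    child.join()         # wait for it to exit so exitcode is populated
    return child.is_alive(), child.exitcode


if __name__ == "__main__":
    alive, exitcode = start_and_kill()
    print("alive:", alive, "exit code:", exitcode)
```

`threading.Thread` has no equivalent of `terminate()`, so a thread has to cooperate, for example by checking a flag, in order to stop early.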

import multiprocessing

def calc_square(number):
    global result
    print('Square:', number * number)
    # This assignment only changes the child process's copy of result
    result = number * number
    print(result)

def calc_quad(number):
    print('Quad:', number * number * number * number)

if __name__ == "__main__":
    number = 7
    result = None
    p1 = multiprocessing.Process(target=calc_square, args=(number,))
    p2 = multiprocessing.Process(target=calc_quad, args=(number,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

    # Prints None: each process has its own memory, so the value
    # assigned in the child never reaches the parent
    print(result)
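
One way to get a value back from a child process is to send it through a `multiprocessing.Queue`. This is a minimal sketch and not the only option; the extra `queue` parameter and the helper `square_in_child` are additions made for this example.

```python
import multiprocessing


def calc_square(number, queue):
    # Runs in the child process; put() pickles the value and sends it to the parent
    queue.put(number * number)


def square_in_child(number):
    queue = multiprocessing.Queue()
    child = multiprocessing.Process(target=calc_square, args=(number, queue))
    child.start()
    result = queue.get()   # blocks until the child has sent its value
    child.join()
    return result


if __name__ == "__main__":
    print(square_in_child(7))  # 49
```

This is the IPC overhead mentioned earlier in practice: the value is serialized in the child, copied across the process boundary, and rebuilt in the parent.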

As an exercise, run these programs and measure the difference in execution time between the multiprocessing and threading versions, relative to a version that uses neither library.
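
A possible starting point for that measurement is sketched below. It assumes a made-up CPU-bound function, `busy_work`, and arbitrary sizes for `n` and `tasks`; the helper names `run_serial`, `run_with`, and `timed` are also invented here.

```python
import multiprocessing
import threading
import time


def busy_work(n):
    # Pure-Python CPU-bound loop; it holds the GIL while it runs
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_serial(n, tasks):
    for _ in range(tasks):
        busy_work(n)


def run_with(worker_cls, n, tasks):
    # worker_cls is threading.Thread or multiprocessing.Process;
    # both accept target/args and provide start() and join()
    workers = [worker_cls(target=busy_work, args=(n,)) for _ in range(tasks)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


def timed(fn, *args):
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


if __name__ == "__main__":
    n, tasks = 1_000_000, 4
    print(f"serial:    {timed(run_serial, n, tasks):.2f}s")
    print(f"threads:   {timed(run_with, threading.Thread, n, tasks):.2f}s")
    print(f"processes: {timed(run_with, multiprocessing.Process, n, tasks):.2f}s")
```

On a multi-core machine running a standard CPython build with the GIL, one would typically expect the thread timing to land close to the serial one and the process timing to be lower, but the actual numbers depend on the machine and the workload.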

This is my first technical blog post, let me know if you found it interesting to read. It’s mostly a quick brain dump I did on a whim, I can keep doing more if you found it useful.
