PYTHON CONCURRENCY FOR HIGH-LOAD MULTICORE PROCESSING

Zholdybay A., Aituov A.
Cite as:
Zholdybay A., Aituov A. PYTHON CONCURRENCY FOR HIGH-LOAD MULTICORE PROCESSING // Universum: технические науки: electronic scientific journal. 2025. 5(134). URL: https://7universum.com/ru/tech/archive/item/20073 (accessed: 05.12.2025).
DOI: 10.32743/UniTech.2025.134.5.20073

 

ABSTRACT

This article evaluates Python’s concurrency methods for processor-intensive tasks on multicore systems. Multithreading, multiprocessing, and a hybrid strategy were compared on CPU-bound, I/O-bound, and mixed workloads. Benchmarks showed that Python’s Global Interpreter Lock (GIL) prevents threads from accelerating CPU-bound tasks, while multiprocessing achieves near-linear speedup (3.7× on 4 cores) at the cost of higher memory use. For I/O-bound tasks, both methods raise throughput by more than 3×, with threads incurring lower overhead. The hybrid approach performs best on mixed workloads, reaching a 3.3× speedup over sequential execution and finishing about 14% sooner than multiprocessing alone. The results, discussed in light of Amdahl’s Law, highlight the limitations imposed by the GIL and offer guidance on choosing a concurrency strategy in Python.


 

Keywords: Python, concurrency, multithreading, multiprocessing, Global Interpreter Lock, multi-core, parallel computing, performance


 

Introduction

Modern multicore CPUs require parallel programming to use their hardware effectively. However, Python's Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, which limits concurrent processing in CPU-bound contexts. Because of this restriction, multithreading is effective mainly for I/O-bound tasks, in which threads spend most of their time waiting on external resources.

Multiprocessing gets around the GIL by running a separate Python interpreter in each process. This works well for computation-heavy workloads, but it adds overhead and uses more memory. Hybrid concurrency runs threads inside multiple processes, aiming to balance CPU and I/O performance.

This study compares all three approaches across a range of workloads and uses Amdahl's Law to relate the measured speedups to their theoretical limits.

Materials and methods

Tests ran on a system with a 4-core (8-thread) Intel CPU and 16 GB of RAM, using Python 3.10. At most 4 workers were used, matching the number of physical cores.

Workloads:

  • CPU-bound: Intensive calculations (prime finding and SHA-256 hashing).
  • I/O-bound: File and network I/O, where tasks wait for disk or network responses.
  • Hybrid: Image processing (I/O) mixed with CPU filters, simulating real-world mixed workloads.
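The CPU-bound workloads listed above can be pictured with a minimal sketch. The function names `count_primes` and `hash_chain`, the trial-division method, and the chained-hash structure are assumptions made for illustration; the article does not specify the exact routines that were benchmarked.

```python
import hashlib


def count_primes(limit: int) -> int:
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, floor(sqrt(n))] divides it
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def hash_chain(seed: bytes, rounds: int) -> str:
    """Apply SHA-256 repeatedly, feeding each digest into the next round."""
    digest = seed
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()
```

Both functions spend their time in pure computation with no waiting on external resources, which is what makes a task CPU-bound in the sense used here.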

Concurrency Models:

  • Sequential: Baseline single-threaded execution.
  • Multithreading: 4 threads in a single process.
  • Multiprocessing: 4 independent processes.
  • Hybrid: 2 processes with 2 threads each.
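One way to express the sequential, multithreading, and multiprocessing models is with the standard-library `concurrent.futures` executors. This is a sketch under stated assumptions: `busy_sum` is a hypothetical stand-in task, and the article does not say which API its benchmarks used.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def busy_sum(n: int) -> int:
    """Hypothetical CPU-bound task: sum of squares below n."""
    return sum(i * i for i in range(n))


def run_sequential(tasks):
    """Baseline: one task after another in the calling thread."""
    return [busy_sum(n) for n in tasks]


def run_threads(tasks, workers=4):
    """4 threads in a single process (all share one GIL)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(busy_sum, tasks))


def run_processes(tasks, workers=4):
    """4 independent processes, each with its own interpreter."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(busy_sum, tasks))


if __name__ == "__main__":
    tasks = [50_000] * 8
    # All three models must produce identical results; only timing differs.
    assert run_sequential(tasks) == run_threads(tasks) == run_processes(tasks)
```

The `if __name__ == "__main__":` guard matters on platforms that start worker processes in a fresh interpreter, because the main module is imported again in each child.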

Measurements: execution time, CPU usage, and memory consumption were recorded for each method and averaged over five runs to reduce run-to-run variation.
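The timing part of this procedure might look like the following sketch. The helper names `mean_runtime` and `speedup` are hypothetical, and the use of `time.perf_counter` is an assumption; the article does not name its measurement tools, and CPU and memory sampling are left out here.

```python
import statistics
import time


def mean_runtime(func, *args, runs: int = 5) -> float:
    """Average wall-clock time of func(*args) over `runs` repetitions."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func(*args)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples)


def speedup(baseline_s: float, measured_s: float) -> float:
    """Speedup relative to the sequential baseline (values above 1 are faster)."""
    return baseline_s / measured_s
```

With this definition, a method that takes 27 s against a 100 s baseline has a speedup of about 3.7, and one that takes 105 s has a speedup just under 1.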

Results

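Table 1.

CPU-Bound Tasks

| Method          | Time (s) | Speedup | CPU Usage (%) | Memory (MB) |
|-----------------|----------|---------|---------------|-------------|
| Sequential      | 100      | 1.0     | 100           | 100         |
| Multithreading  | 105      | 0.95    | 100           | 110         |
| Multiprocessing | 27       | 3.7     | 390           | 350         |
| Hybrid          | 55       | 1.8     | 200           | 180         |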

Findings:

  • Multithreading provided no benefit and was slightly slower due to overhead.
  • Multiprocessing came close to the ideal 4× speedup (3.7×), keeping all four cores busy.
  • Hybrid used about two cores effectively, offering moderate improvement.
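The findings above can be restated as parallel efficiency, meaning speedup divided by the number of workers. The sketch below uses the speedups from Table 1 and assumes 4 workers for every concurrent method (for the hybrid model, 2 processes × 2 threads); the metric itself is an addition for illustration and is not reported in the article.

```python
WORKERS = 4  # assumption: every concurrent method ran 4 workers in total

# Speedups as reported in Table 1 (CPU-bound tasks)
table1_speedup = {
    "Multithreading": 0.95,
    "Multiprocessing": 3.7,
    "Hybrid": 1.8,
}


def parallel_efficiency(speedup: float, workers: int) -> float:
    """Fraction of the ideal linear speedup that was actually achieved."""
    return speedup / workers


efficiency = {
    method: parallel_efficiency(s, WORKERS)
    for method, s in table1_speedup.items()
}

if __name__ == "__main__":
    for method, value in efficiency.items():
        print(f"{method:<16} {value:.1%}")
```

An efficiency near 1 means each added worker contributed almost a full core of useful work, which here holds only for multiprocessing.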

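Table 2.

I/O-Bound Tasks

| Method          | Time (s) | Speedup | CPU Usage (%) | Memory (MB) |
|-----------------|----------|---------|---------------|-------------|
| Sequential      | 100      | 1.0     | 100           | 100         |
| Multithreading  | 28       | 3.57    | 15            | 110         |
| Multiprocessing | 32       | 3.13    | 20            | 300         |
| Hybrid          | 30       | 3.33    | 20            | 180         |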

Findings:

  • Multithreading performed best due to lower overhead and ability to overlap I/O.
  • Multiprocessing added extra overhead but still significantly reduced execution time.
  • Hybrid method achieved balanced performance between processes and threads.
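The overlap of I/O waits that favors threads can be demonstrated with a small sketch. Here `time.sleep` stands in for a blocking disk or network call, which is an assumption made to keep the example self-contained; `fake_io` and `timed` are hypothetical helper names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay: float) -> float:
    """Hypothetical I/O task: sleeping releases the GIL, as a blocking read does."""
    time.sleep(delay)
    return delay


def timed(func, *args) -> float:
    """Wall-clock duration of a single call to func(*args)."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


def run_sequential(delays):
    return [fake_io(d) for d in delays]


def run_threaded(delays, workers=4):
    # While one thread waits, the others can start or finish their own waits.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_io, delays))


if __name__ == "__main__":
    delays = [0.05] * 8
    print(f"sequential: {timed(run_sequential, delays):.2f} s")
    print(f"4 threads:  {timed(run_threaded, delays):.2f} s")
```

Because the waiting tasks do not compete for the GIL, four threads can keep four waits in flight at once without the start-up and memory cost of separate processes.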

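Table 3.

Hybrid Workloads (Mixed CPU and I/O)

| Method          | Time (s) | Speedup | CPU Usage (%) | Memory (MB) |
|-----------------|----------|---------|---------------|-------------|
| Sequential      | 100      | 1.0     | 100           | 100         |
| Multithreading  | 55       | 1.8     | 100           | 110         |
| Multiprocessing | 35       | 2.86    | 320           | 300         |
| Hybrid          | 30       | 3.3     | 370           | 180         |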

Findings:

  • Hybrid model outperformed pure multiprocessing due to better task overlap.
  • Multiprocessing was still effective, but idle I/O periods reduced CPU utilization.
  • Multithreading could not fully utilize cores during CPU-heavy phases.
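A hybrid layout of 2 processes with 2 threads each could be sketched as follows. The names `load_and_filter`, `process_batch`, and `run_hybrid` are hypothetical, the sleep and the sum of squares are stand-ins for image loading and filtering, and the interleaved batching is an assumption; the article does not describe its implementation.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def load_and_filter(item: int) -> int:
    """Hypothetical stand-in for one image: an I/O wait, then a CPU filter."""
    time.sleep(0.01)                        # simulated read, releases the GIL
    return sum(i * i for i in range(item))  # simulated CPU-bound filter


def process_batch(batch, threads=2):
    """Runs inside one worker process; threads overlap the I/O waits."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(load_and_filter, batch))


def run_hybrid(items, processes=2, threads=2):
    """Split items across processes; each process fans out to threads."""
    batches = [items[i::processes] for i in range(processes)]
    with ProcessPoolExecutor(max_workers=processes) as pool:
        per_batch = list(pool.map(process_batch, batches, [threads] * processes))
    # Put results back into the original item order.
    out = [None] * len(items)
    for i, batch_result in enumerate(per_batch):
        out[i::processes] = batch_result
    return out
```

Call `run_hybrid` from under an `if __name__ == "__main__":` guard. The processes provide parallel CPU work across cores, while the threads inside each process keep it busy during I/O waits, which matches the task overlap credited in the findings above.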

Discussion

The results confirm that Python’s GIL prevents multithreading from speeding up CPU-bound tasks, which makes multiprocessing the best choice there despite its higher memory usage (350 MB versus 110 MB for threads). In I/O-bound scenarios, multithreading performs best because of its lower overhead and its ability to overlap I/O waits. For mixed workloads, the hybrid approach combines both advantages and finishes about 14% sooner than multiprocessing alone (30 s versus 35 s).

Amdahl’s Law gives an upper bound on the speedup achievable for a given parallel fraction of the work; the measured results fell short of the ideal 4× because of unavoidable overhead. PEP 703, which makes the GIL optional in CPython, may remove this restriction in future releases and improve multithreading performance for CPU-heavy tasks.
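For reference, Amdahl's Law states that with a parallelizable fraction p of the work and n workers, the speedup is at most 1 / ((1 − p) + p / n). The sketch below implements that formula and its inverse. Applying the inverse to the measured 3.7× on 4 cores gives a parallel fraction of roughly 0.97, but this is only an illustrative reading, since it folds all overhead into the serial part; the function names are hypothetical.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Upper bound on speedup for parallel fraction p on n workers."""
    return 1.0 / ((1.0 - p) + p / n)


def implied_parallel_fraction(speedup: float, n: int) -> float:
    """Invert Amdahl's Law: p = (1 - 1/S) / (1 - 1/n)."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n)


if __name__ == "__main__":
    p = implied_parallel_fraction(3.7, 4)  # CPU-bound multiprocessing result
    print(f"implied parallel fraction: {p:.3f}")
    print(f"predicted speedup on 8 workers: {amdahl_speedup(p, 8):.2f}")
```

The inverse follows from rearranging 1/S = 1 − p(1 − 1/n), so the two functions round-trip exactly up to floating-point error.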

Conclusion

This study demonstrated that selecting an appropriate concurrency strategy in Python depends on workload type:

  • CPU-bound → Multiprocessing delivers near-linear speedup.
  • I/O-bound → Multithreading offers excellent throughput with minimal overhead.
  • Mixed workloads → Hybrid models achieve the best balance.

While Python's GIL presents a challenge, combining multiprocessing and threading effectively mitigates its impact. Future Python versions may resolve these limitations, making concurrency more straightforward.
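Whether a given interpreter still runs with the GIL can be checked at runtime. The sketch below relies on `sys._is_gil_enabled()`, which CPython provides from version 3.13, and on the `Py_GIL_DISABLED` build variable; the wrapper names are hypothetical, and on older interpreters the code simply assumes the GIL is present.

```python
import sys
import sysconfig


def free_threaded_build() -> bool:
    """True if this interpreter was built with the GIL made optional (PEP 703)."""
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_enabled() -> bool:
    """True if the GIL is active now; interpreters before 3.13 always have it."""
    check = getattr(sys, "_is_gil_enabled", None)  # available in CPython 3.13+
    return True if check is None else bool(check())


if __name__ == "__main__":
    print(f"free-threaded build: {free_threaded_build()}")
    print(f"GIL currently enabled: {gil_enabled()}")
```

A program could use such a check to prefer threads for CPU-bound work only when the GIL is actually disabled, and fall back to processes otherwise.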

 

References:

  1. Meier R., Gross T. Reflections on the compatibility, performance, and scalability of parallel Python // Proceedings of the 15th ACM SIGPLAN International Symposium on Dynamic Languages (DLS ’19). New York: ACM. – 2019. – pp. 129–140.
  2. Aziz Z. A., et al. Python parallel processing and multiprocessing: A review // Academic Journal of Nawroz University. – 2021. – Vol. 10, No. 3. – pp. 345–354.
  3. Rocklin M. Dask: Parallel computation with blocked algorithms and task scheduling // Proceedings of the 14th Python in Science Conference (SciPy 2015). – 2015. – pp. 126–132.
  4. Pérez F., Granger B. E. IPython: A system for interactive scientific computing // Computing in Science & Engineering. – 2011. – Vol. 13, No. 2. – pp. 21–29.
  5. Sodian L., et al. Concurrency and parallelism in speeding up I/O and CPU-bound tasks in Python 3.10 // Proceedings of the 2nd International Conference on Computer Science, Electronic Information Engineering & Intelligent Control (CEI 2022). – 2022. – pp. XX–XX.
  6. Krivtsov S., et al. Performance evaluation of Python libraries for multithreading data processing // Modern Information Systems. – 2024. – Vol. 8, No. 1. – pp. 37–45.
  7. Gustafson J. L. Reevaluating Amdahl’s law // Communications of the ACM. – 1988. – Vol. 31, No. 5. – pp. 532–533.
  8. Hill M. D., Marty M. R. Amdahl’s law in the multicore era // Computer. – 2008. – Vol. 41, No. 7. – pp. 33–38.
  9. Gross S. PEP 703 – Making the global interpreter lock optional in CPython // Python Enhancement Proposal. – 2023. – Available at: https://peps.python.org/pep-0703/ (accessed 20.04.2025).
  10. Castro O., Bruneau P., Sottet J.-S., Torregrossa D. Landscape of high-performance Python to develop data science and machine learning applications // ACM Computing Surveys. – 2023. – Vol. 56, No. 3. – pp. 1–30.
Information about the authors

Master's student, School of Information Technologies and Engineering, Kazakh-British Technical University, Almaty, Kazakhstan

PhD, Senior Lecturer, School of Information Technologies and Engineering, Kazakh-British Technical University, Almaty, Kazakhstan
