Skip to content Skip to sidebar Skip to footer

Why Standard Logging Module Is So Slow When Disabled?

I did this benchmark: import logging import copy import pstats import cProfile from io import StringIO def difference(first_stats, second_stats): first_stats = copy.deepcopy

Solution 1:

I guess it's not about logging module it's about Python in general. Consider the following code. Four simple functions. The only difference between them is how many times attribute access is performed. See result below.

class A:
    def __init__(self):
        self.a = False
        self.a_b = B()

class B:
    def __init__(self):
        self.b = False
        self.b_c = C()

class C:
    def __init__(self):
        self.c = False

import time
N = 10_000_000
a = A()

def f_0(N):
    start = time.time()
    for dummy in range(N):
        if False:
            print(111)  # never printed
    stop = time.time()
    return stop - start

def f_1(N):
    start = time.time()
    for dummy in range(N):
        if a.a:
            print(111)
    stop = time.time()
    return stop - start

def f_2(N):
    start = time.time()
    for dummy in range(N):
        if a.a_b.b:
            print(111)
    stop = time.time()
    return stop - start

def f_3(N):
    start = time.time()
    for dummy in range(N):
        if a.a_b.b_c.c:
            print(111)
    stop = time.time()
    return stop - start

result = {'x': list(), 'y': list()}
for run in range(10):
    for x, f in enumerate([f_0, f_1, f_2, f_3]):
        result['x'].append(str(x))
        result['y'].append(f(N))
    print(run)

import matplotlib.pyplot as plt
plt.figure()
plt.plot(result['x'], result['y'], '.')
plt.xlabel('argument access chain length')
plt.ylabel('time, seconds')
plt.show()

Result is (Python 3.6.8 / CPython / Win10 / CPU i5-7600):

enter image description here

The reason is well-known - pure python is slow when cpu is limiting issue. That's why python is not really suitable as a number grinder, for instance. From my experience, switching from custom classes and pandas to just built-in dictionaries made my python code (part with heavy calculations) almost 10 times faster.

Also, you may have a look at https://hackernoon.com/yes-python-is-slow-and-i-dont-care-13763980b5a1 - applicable to the logging part of your code.


Solution 2:

I'm not really able to reproduce your results.

If the logging module is absolutely not configured whatsoever, it's... pretty fast.

test code

import logging
import argparse
import time
import sys


log = logging.getLogger("blep")


def prepare_off():
    pass


def prepare_on():
    logging.basicConfig(level=logging.DEBUG, stream=sys.stdout)


def step(x):
    log.info("now running step %s", x)


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("-n", type=int, default=500_000)
    ap.add_argument(
        "--mode", required=True, choices=("off", "on")
    )
    args = ap.parse_args()

    prepare = globals()[f"prepare_{args.mode}"]

    prepare()
    t0 = time.perf_counter_ns()
    for x in range(args.n):
        step(x)
    t1 = time.perf_counter_ns()
    print(
        vars(args),
        (t1 - t0) / args.n,
        "ns/iter",
        file=sys.stderr,
    )


if __name__ == "__main__":
    main()

transcript

$ time python3 so54470102.py --mode off >/dev/null
{'n': 500000, 'mode': 'off'} 541.53571 ns/iter
        0.32 real         0.30 user         0.01 sys
$ time python3 so54470102.py --mode on >/dev/null
{'n': 500000, 'mode': 'on'} 26932.005352 ns/iter
       13.52 real        13.01 user         0.47 sys

Remember running under a profiler has a significant overhead (much more when there are more calls involved, e.g. when the logging module does things):

$ time python3 -m cProfile -o /dev/null so54470102.py --mode on >/dev/null
{'n': 500000, 'mode': 'on'} 45733.711544 ns/iter
       22.93 real        22.23 user         0.57 sys
$ time python3 -m cProfile -o /dev/null so54470102.py --mode off >/dev/null
{'n': 500000, 'mode': 'off'} 962.122468 ns/iter
        0.54 real         0.52 user         0.01 sys

Post a Comment for "Why Standard Logging Module Is So Slow When Disabled?"