
Logging and capturing messages from libraries and nested modules

Logging in Python is, for the most part, a simple task: import logging and logging.warning() your way to victory. For most basic scripts this is fine, and you end up with sensible defaults: logging to the console, for all messages at WARNING level and above.

import logging

logging.info("starting demo")
logging.warning("yolo logging")

Running this script produces:

$ python demo.py
WARNING:root:yolo logging

Once you start using libraries, and more specifically libraries that implement the recommended Python logging configuration for libraries, logging via module name can get confusing. Unless you know better and configure logging correctly, you will most likely be throwing away all log messages from those libraries. That is not a huge problem for most third-party libraries, but if these are your own libraries that you are re-using, you are probably quite interested in their messages.
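
For reference, the "recommended configuration for libraries" mentioned above boils down to this pattern at the top of a library module (a minimal sketch of what the stdlib logging docs suggest):

import logging

logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())

# The library then logs as usual and leaves all handler and level decisions
# to the application that imports it.
logger.debug("library doing library things")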

Python helpfully allows you to create a new logger by simply calling logger = logging.getLogger(__name__). This gives you a logger object that slots into a hierarchy based on its dotted module name and inherits the configuration of its ancestors in that hierarchy, all the way up to the root logger.
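
A quick interpreter-style sketch of that name-based inheritance (the logger names here are purely for illustration):

import logging

parent = logging.getLogger("main_app")
child = logging.getLogger("main_app.thing")

parent.setLevel(logging.WARNING)

# The child sets no level of its own, so it falls back to its parent's.
print(child.getEffectiveLevel())  # 30 == logging.WARNING
print(child.parent is parent)     # True -- related by name, not by call stack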

All files used in this demo can be found in this git repo for anyone interested in following along or playing around with the configuration.

Consider the following simple, yet verbose, configuration file config.ini:

[loggers]
keys = root, main_app

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = INFO
handlers = console

[logger_main_app]
level = INFO
handlers =
qualname = main_app

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(asctime)s %(levelname)-5.5s [%(name)s] [%(threadName)s] %(message)s

This will give us a basic hierarchy of loggers, typical of a simple modular application: a root logger with a specific handler and a main_app logger with no handler specified. Due to the hierarchy, a logger without its own handlers or level defers to its parent, in this case root, unless propagate = 0 is set as well (in which case messages stop at that logger).

Additionally, there is a single handler, console, which is referenced by the root logger, and a single formatter, generic, referenced by the console handler.

So this example flows as expected: any messages logged from within main_app cascade up to the root logger, are handed off to the console handler, formatted via the generic formatter, and eventually emitted on the stderr stream.
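
For anyone who prefers dictConfig, roughly the same hierarchy could be expressed as below (a sketch only; the rest of this post sticks with config.ini and fileConfig):

import logging.config

logging.config.dictConfig({
    "version": 1,
    "formatters": {
        "generic": {
            "format": "%(asctime)s %(levelname)-5.5s [%(name)s] [%(threadName)s] %(message)s",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stderr",
            "formatter": "generic",
        },
    },
    "loggers": {
        # No handlers of its own; records propagate up to root.
        "main_app": {"level": "INFO"},
    },
    "root": {"level": "INFO", "handlers": ["console"]},
})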

This post aims to take us from a simple application through to imported libraries and sub module logging configurations, with the end result having the following layout on disk:

├── config.ini
├── first_library
│   └── __init__.py
├── main_app
│   ├── app.py
│   ├── __init__.py
│   └── thing.py
├── run.py
└── second_library
    ├── __init__.py
    └── nested_thing.py

All the following examples will be run via the following script run.py:

from main_app.app import MainApp

app = MainApp()

def main():
    app.run()


if __name__ == "__main__":
    main()

First off, a demo of how log messages follow the logger, handler, and formatter hierarchies, starting with the simple configuration above.

main_app/app.py:

import logging
import logging.config

logger = logging.getLogger(__name__)

class MainApp(object):
    def __init__(self):
        logging.config.fileConfig('config.ini')

    def run(self):
        logger.info("main_app checking in")

Executing this simplistic class with the above config should result in the following output:

$ python run.py
2018-06-13 18:48:44,775 INFO  [main_app.app] [MainThread] main_app checking in

Excellent, we have emitted a log message with some useful context. Granted, this configuration file is rather more complex than needed simply to get console logging, but it will give us the structure needed to continue into the rest of the demo.
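
For comparison, if all we wanted was that formatted console output, a single basicConfig() call would do (a minimal sketch, no config file involved):

import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)-5.5s [%(name)s] [%(threadName)s] %(message)s",
)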

Alright, how about a sub module of main_app? With no modification to the config, we add a new sub module, main_app/thing.py:

import logging

logger = logging.getLogger(__name__)

class AppThing(object):
    def thingy(self):
        logger.info("AppThing.thingy() checking in")

Additionally, main_app/app.py has been changed to reflect this new module:

import logging
import logging.config


from main_app.thing import AppThing

logger = logging.getLogger(__name__)

class MainApp(object):
    def __init__(self):
        logging.config.fileConfig('config.ini')

    def run(self):
        logger.info("main_app checking in")
        app_thing = AppThing()
        app_thing.thingy()

$ python run.py
2018-06-13 19:01:55,487 INFO  [main_app.app] [MainThread] main_app checking in
2018-06-13 19:01:55,487 INFO  [main_app.thing] [MainThread] AppThing.thingy() checking in

Sweet. Logging in sub modules for free, no additional config. Magical propagation up the logger hierarchy, all loggers and handlers doing the right thing.

Now, let's add in a separate module that logs in the exact same way, with no additional configuration. Meet first_library/__init__.py:

import logging

logger = logging.getLogger(__name__)

class ThingOne(object):
    def thingy(self):
        logger.info("ThingOne.thingy() checking in")

$ python run.py
2018-06-13 19:15:37,027 INFO  [main_app.app] [MainThread] main_app checking in
2018-06-13 19:15:37,027 INFO  [main_app.thing] [MainThread] AppThing.thingy() checking in

Hmm… no new log lines. This is because fileConfig() by default disables any loggers that already exist but are not named in the configuration (or children of loggers that are), and first_library's module-level logger was created at import time, before MainApp ran fileConfig(). So we need to tell the logging configuration about this new library. Let's add a new logger to the list of loggers at the top and then define it. New config.ini:

[loggers]
keys = root, main_app, first_library

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = INFO
handlers = console

[logger_main_app]
level = INFO
handlers =
qualname = main_app

[logger_first_library]
level = INFO
handlers =
qualname = first_library

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(asctime)s %(levelname)-5.5s [%(name)s] [%(threadName)s] %(message)s

Most of this should look familiar; we are just adding a single new logger that will also cascade up to the root logger. Let's try again with this new config:

$ python run.py
2018-06-13 19:17:43,659 INFO  [main_app.app] [MainThread] main_app checking in
2018-06-13 19:17:43,659 INFO  [main_app.thing] [MainThread] AppThing.thingy() checking in
2018-06-13 19:17:43,659 INFO  [first_library] [MainThread] ThingOne.thingy() checking in

Cool, we are now collecting messages from a (for example's sake) third-party library.
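
As an aside, there is another way to get the same result without naming every library in config.ini: fileConfig() accepts a disable_existing_loggers argument, so MainApp.__init__() could instead load the config as sketched below and leave pre-existing library loggers enabled and propagating up to root:

import logging.config

# Keep loggers created before fileConfig() runs (such as the module-level
# logger in first_library) enabled, instead of naming each one in config.ini.
logging.config.fileConfig('config.ini', disable_existing_loggers=False)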

Now, what about libraries with sub modules? Consider second_library:

second_library
├── __init__.py
└── nested_thing.py

For the sake of this example, __init__.py is blank. nested_thing.py contains:

import logging

logger = logging.getLogger(__name__)

class ThingTwo(object):
    def thingy(self):
        logger.info("ThingTwo.thingy() checking in")

As we learned before, we will need a new logger to collect messages from this new library, so let's add the logger to the list of loggers and append the following to the config:

[logger_second_library]
level = INFO
handlers =
qualname = second_library

Now to test with the new library!

$ python run.py
2018-06-13 19:28:02,328 INFO  [main_app.app] [MainThread] main_app checking in
2018-06-13 19:28:02,328 INFO  [main_app.thing] [MainThread] AppThing.thingy() checking in
2018-06-13 19:28:02,328 INFO  [first_library] [MainThread] ThingOne.thingy() checking in
2018-06-13 19:28:02,328 INFO  [second_library.nested_thing] [MainThread] ThingTwo.thingy() checking in

Everything looks good: we have two libraries, one logging from a sub module and one logging from a top-level module.

Note: The above config will not function correctly on Python 2.7.6 (the default shipping with Ubuntu 14.04.x LTS) due to an ordering bug in the logging config parsing, which is the entire reason I went down this path of building a minimal proof of concept to show that I am not crazy. Lo and behold, when I moved the project off of my 14.04 instance and onto a more up-to-date system, the issue I had just reproduced was gone. Apparently I was hitting a case that was fixed in Python issue 17508, where handlers were not being associated correctly when specified in certain orders in the config file.

Other (still reproducible) caveats: if you reference the same handler from multiple loggers in the hierarchy, without setting propagate to 0 or adjusting levels, you WILL get duplicate messages, as the handler is attached at multiple points and called once at each level of propagation. As an example, if we give logger_main_app an explicit handler like so:

[logger_main_app]
level = INFO
handlers = console
qualname = main_app

We now get the following output:

$ python run.py
2018-06-14 07:48:34,635 INFO  [main_app.app] [MainThread] main_app checking in
2018-06-14 07:48:34,635 INFO  [main_app.app] [MainThread] main_app checking in
2018-06-14 07:48:34,635 INFO  [main_app.thing] [MainThread] AppThing.thingy() checking in
2018-06-14 07:48:34,635 INFO  [main_app.thing] [MainThread] AppThing.thingy() checking in
2018-06-14 07:48:34,635 INFO  [first_library] [MainThread] ThingOne.thingy() checking in
2018-06-14 07:48:34,635 INFO  [second_library.nested_thing] [MainThread] ThingTwo.thingy() checking in

Note that only messages picked up by the main_app logger are duplicated, as the other loggers still defer their handler selection to the root logger. This behavior, while unhelpful in this particular configuration, is useful when you want to log to an additional handler while still cascading up to the default one. For example, if we were interested in out-of-band notification of every log message from main_app, but not as concerned about anything logged by libraries, then instead of adding a console handler to logger_main_app we could add an SMTPHandler so that we are emailed the contents of each message (a rough sketch of such a handler section follows the final demo output below). One last demo showing this, still using console handlers for the sake of display, with the following config and alteration to main_app:

config.ini:

[loggers]
keys = root, main_app, first_library, second_library

[handlers]
keys = console, console_warn

[formatters]
keys = generic, notgeneric

[logger_root]
level = INFO
handlers = console

[logger_main_app]
level = INFO
handlers = console_warn
qualname = main_app

[logger_first_library]
level = INFO
handlers =
qualname = first_library

[logger_second_library]
level = INFO
handlers =
qualname = second_library

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[handler_console_warn]
class = StreamHandler
args = (sys.stderr,)
level = WARN
formatter = notgeneric

[formatter_generic]
format = %(asctime)s %(levelname)-5.5s [%(name)s] [%(threadName)s] %(message)s

[formatter_notgeneric]
format = "OMG" %(asctime)s %(levelname)-5.5s [%(name)s] [%(threadName)s] %(message)s

main_app/thing.py:

@@ -5,3 +5,4 @@ logger = logging.getLogger(__name__)
 class AppThing(object):
     def thingy(self):
         logger.info("AppThing.thingy() checking in")
+        logger.warning("AppThing.thingy() checking in AS A WARNING")

$ python run.py
2018-06-14 08:32:11,744 INFO  [main_app.app] [MainThread] main_app checking in
2018-06-14 08:32:11,745 INFO  [main_app.thing] [MainThread] AppThing.thingy() checking in
"OMG" 2018-06-14 08:32:11,745 WARNI [main_app.thing] [MainThread] AppThing.thingy() checking in AS A WARNING
2018-06-14 08:32:11,745 WARNI [main_app.thing] [MainThread] AppThing.thingy() checking in AS A WARNING
2018-06-14 08:32:11,745 INFO  [first_library] [MainThread] ThingOne.thingy() checking in
2018-06-14 08:32:11,745 INFO  [second_library.nested_thing] [MainThread] ThingTwo.thingy() checking in

As you can see, the second handler has both its level set to WARN and a different formatter, which for visibility in the demo prepends “OMG” to anything sent its way. As a result, we get one copy of the message through the new handler and formatter, and we still get a copy of the warning sent up to the root handler as before. If we had instead set the level on logger_main_app to WARN and left the handler level at NOTSET, we would have gotten that prepended warning message, but not the INFO-level messages from main_app.app or main_app.thing, since the logger's level (inherited by its children as their effective level) filters messages before they reach any handler. Additionally, if we set propagate = 0 on logger_main_app, we would see similar behavior, but without the warning message making it to the root handler.

With the above configuration changes:

$ python run.py
"OMG" 2018-06-14 08:40:17,626 WARNI [main_app.thing] [MainThread] AppThing.thingy() checking in AS A WARNING
2018-06-14 08:40:17,626 WARNI [main_app.thing] [MainThread] AppThing.thingy() checking in AS A WARNING
2018-06-14 08:40:17,626 INFO  [first_library] [MainThread] ThingOne.thingy() checking in
2018-06-14 08:40:17,626 INFO  [second_library.nested_thing] [MainThread] ThingTwo.thingy() checking in

And with the above reverted and just propagate = 0 set on logger_main_app:

$ python run.py
"OMG" 2018-06-14 08:37:46,202 WARNI [main_app.thing] [MainThread] AppThing.thingy() checking in AS A WARNING
2018-06-14 08:37:46,202 INFO  [first_library] [MainThread] ThingOne.thingy() checking in
2018-06-14 08:37:46,202 INFO  [second_library.nested_thing] [MainThread] ThingTwo.thingy() checking in
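
For completeness, the SMTPHandler variant mentioned earlier might look roughly like the handler section below (a sketch with placeholder addresses; it would also need to be listed under [handlers] and referenced from logger_main_app in place of console_warn):

[handler_email]
class = handlers.SMTPHandler
level = INFO
formatter = generic
args = (('localhost', 25), 'main_app@example.com', ['ops@example.com'], 'main_app log message')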

Well, that wraps up what was going to be a proof-of-concept post about erratic behavior in the configuration of Python logging, which ended up leading me to discover that the issue I had spent several hours digging into was not only fixed, but had been fixed in 2013, and I can't even reproduce it anymore.

Typical.



