Python's logging library has been part of the standard library for a very long time. In many aspects, this logging framework bears similarities with the log4j libraries in Java. In this blog post, I will go over some good practices for logging in Python.
Avoid using root logger
Avoiding the root logger provides the following benefits:
- More descriptive names that show where a log originated, which helps when searching logs
- Better per-module log customization
Many times, we find it easiest to start logging with the logging.info/logging.error methods. However, calling any of the methods on the logging module invokes the root logger. Take a look at the examples below involving two modules: logger_usage.py and user_record.py.

logger_usage.py file:
```python
import logging
import user_record

logging.basicConfig(level=logging.DEBUG)
logging.info("Hello World from root logger")
logging.error("failed to process\n")
user_record.process(-100)
user_record.process(100)
```
user_record.py file:

```python
import logging

def process(user_id):
    logging.info("processing user record = %s", user_id)
    if user_id < 0:
        logging.error("Invalid user id = %s", user_id)
```
After running the logger_usage.py script, we get the following output:
```
python logger_usage.py
INFO:root:Hello World from root logger
ERROR:root:failed to process
INFO:root:processing user record = -100
ERROR:root:Invalid user id = -100
INFO:root:processing user record = 100
```
If you observe the output carefully, the logs from both modules contain the entry root (which is the root logger's name).
If you change user_record.py to the following:
```python
import logging

# You could use any name; however, it is good convention to use the module name.
# Here __name__ is the module name (as this is not invoked as the startup script).
# https://docs.python.org/3/reference/import.html?highlight=__name__#name__
logger = logging.getLogger(__name__)

def process(user_id):
    logger.info("processing user record = %s", user_id)
    if user_id < 0:
        logger.error("Invalid user id = %s", user_id)
```
After making this change, notice how the output changes: instead of root, we get user_record in the logs originating from the user_record module.
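A useful detail behind this pattern: logging.getLogger returns the same logger instance for a given name, and dotted names form a hierarchy in which children inherit the effective level of their parents. A small sketch (the "app"/"app.db" names are illustrative assumptions, not from the example above):

```python
import logging

# getLogger returns the same instance for the same name,
# so every module asking for "user_record" shares one logger.
a = logging.getLogger("user_record")
b = logging.getLogger("user_record")
print(a is b)  # True

# Dotted names form a hierarchy: "app.db" is a child of "app"
# and inherits its effective level unless it sets its own.
parent = logging.getLogger("app")
child = logging.getLogger("app.db")
parent.setLevel(logging.WARNING)
print(child.getEffectiveLevel() == logging.WARNING)  # True
```

This is why naming a logger after its module pays off: you can later tune a whole subtree of loggers by configuring a single parent name.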
```
python logger_usage.py
INFO:root:Hello World from root logger
ERROR:root:failed to process
INFO:user_record:processing user record = -100
ERROR:user_record:Invalid user id = -100
INFO:user_record:processing user record = 100
```
Now, let’s say in the future you realize that there are too many processing user record logs, and you only want error or higher-priority logs from the user_record module. You can achieve this by customizing the logger in user_record.py:
```python
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.ERROR)

def process(user_id):
    logger.info("processing user record = %s", user_id)
    if user_id < 0:
        logger.error("Invalid user id = %s", user_id)
```
Running the script again outputs the following. Note how the info logs from the user_record module are now missing.
```
python logger_usage.py
INFO:root:Hello World from root logger
ERROR:root:failed to process
ERROR:user_record:Invalid user id = -100
```
This is just the tip of the iceberg of what we can achieve with a custom logger. Although this is a basic example of how we can customize logging, in most cases one would use a centralized logging config (for example, via logging.config.dictConfig) rather than configuring loggers inside each module.
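A minimal sketch of what such a centralized config could look like with logging.config.dictConfig; the specific logger names and levels here are illustrative assumptions, not requirements:

```python
import logging
import logging.config

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        # Mirrors the default "LEVEL:name:message" layout seen above
        "default": {"format": "%(levelname)s:%(name)s:%(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "default",
        },
    },
    "root": {"level": "INFO", "handlers": ["console"]},
    "loggers": {
        # Quiet a noisy module in one place, instead of editing the module itself
        "user_record": {"level": "ERROR"},
    },
}

logging.config.dictConfig(LOGGING_CONFIG)
```

With this in place, logging.getLogger("user_record") is already capped at ERROR, and changing a module's verbosity becomes a config edit rather than a code change.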
Use %s string formatting instead of constructing a string
Using %s string formatting is more efficient when the log is never emitted, because the final string is only built if needed. Creating unnecessary string objects puts pressure on the garbage collector.
Consider the following different ways to format a log string containing a user_id:

logger.info("processing user record id =%s", user_id)
- No extra string is created: this uses delayed evaluation, and the final string is only constructed if the corresponding level is enabled.

logger.info(f"processing user record id = {user_id}")
- An extra string is constructed eagerly, then discarded if the level is disabled.

logger.info("processing user record id={user_id}".format(user_id=user_id))
- Same here as well: an extra string is constructed and then discarded.
Use the isEnabledFor method to avoid expensive computations
Consider a scenario where you want to output some debugging information, and computing or fetching that information takes a lot of time.
```python
prev_config = fetch_previous_state(new_config.id)  # makes a DB call
new_formatted_config = format_config_as_json(new_config)  # just formats as JSON (CPU-bound)
prev_formatted_config = format_config_as_json(prev_config)
logger.debug("new_config=%s old_config=%s", new_formatted_config, prev_formatted_config)
```
In the above scenario, the log line is never emitted if the logging level is above DEBUG. In that case, the time spent on the call to fetch_previous_state and the two calls to format_config_as_json is wasted. A better way to handle this is to use isEnabledFor:
```python
if logger.isEnabledFor(logging.DEBUG):
    prev_config = fetch_previous_state(new_config.id)
    new_formatted_config = format_config_as_json(new_config)
    prev_formatted_config = format_config_as_json(prev_config)
    logger.debug("new_config=%s old_config=%s", new_formatted_config, prev_formatted_config)
```