Some Python tricks
command line
Beside calling python <some script file>
directly, we can
give extra options to the python command.
The -c
option executes the string input directly. For example,
python3 -c "print('hello world')"
The -m
option executes the module as a script. This requires the module to
have <pkg>.__main__
entry point. For example,
python3 -m pip
python3 -m venv
python3 -m IPython
It is more convenient to alias
them in the .bashrc
.
The -O
option removes assert
statements.
More information about these options can be found on the page of Command line and environment.
profile
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. — Donald Knuth
The python standard library ships two profilers cProfile
and profile
with
the same interface. cProfile
is faster than profile
, but less convenient to
extend.
I use a tool called snakeviz to visualize the profile results interactively in a webpage. To install, run
pip3 install snakeviz
To do the profiling and generate the visualization, use the following commands
python3 -m cProfile -o program.prof my_program.py
snakeviz program.prof
More information about the profiler can be found on the page of The Python Profilers.
pytest
The pytest library uses fixtures to manage resources and dependency injection. For example, if several tests use the same resource, we can load it once and share among them:
import pytest
@pytest.fixture(scope='session')
def data1():
d = load_data('some file name')
return preprocess(d)
This code can be placed in a file called conftest.py
, which is loaded by
pytest automatically. Any test files located in sub-directories of the conftest.py
’s
direction can use this fixture. We can also create multiple conftest.py
at
different locations to control the accessibility of different fixtures.
In the test code, one can pass data1
as argument to the test function directly.
The scope could be session
, module
, class
, and function
.
More information is on the pytest fixture page. For example, one can even parametrize the fixture.
In the following example, I combine several tricks together
import pytest
from unittest.mock import patch, mock_open
@patch("import/path/to/a/function")
@pytest.mark.parametrize('input, expected', [
(100, ['a', 123]),
(200, ['b', 345]),
pytest.param((999, ['c', 222]), marks=pytest.mark.xfail(reason="some reason")),
])
def test_something(mock_f, input, expected, data1, data2, tmpdir):
some_arg = get_arg(data2)
with tmpdir.as_cwd():
result = some_func(input, data1)
assert expected == result
with patch('builtins.open',
mock_open(read_data='some data')) as mock_file:
result = another_func()
assert result == 'some result'
mock_file.assert_called_once_with('/expected/file/path/to/open')
patch
mocks a functionparametrize
generates multiple tests with the same structureparam
andxfail
denotes failing testtmpdir
is a fixture defined by pytest to simplify the use of temporary directorydata1
anddata2
are fixtures defined elsewheremock_open
patches the fileopen
function andread_data
is the content of the file
It’s easy to get test coverage and parallelization too. To install
pip3 install pytest-cov, pytest-xdist
To use, run
pytest tests/ --cov=./gita -n=auto
Here tests/
is the directory containing the test files, ./gita
is the package
being tested, and -n
specified the number of processes for parallelization.
Pytest also has some useful options
-k <name>
: only do tests that (partially) matchname
-x
: fail fast--lf
: only run the last failed tests
logging
In some cases we need to log the python traceback.
import sys
import traceback
import logging
logger = logging.getLogger('nos')
logger.setLevel(logging.DEBUG)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)
if __name__ == '__main__':
try:
1/0
except Exception as e:
logger.exception('Got exception on main handler')
# logger.error(e, exc_info=True)
# print(''.join(traceback.format_tb(e.__traceback__)))