Research Software Engineering Summer School

Advanced Testing Techniques

Unit and integration tests are great, but they are often not enough for large and complex codebases. Several more advanced testing techniques are increasingly being adopted across organisations. We discuss a few of them below.

Mocking

Mock: verb,

  1. to tease or laugh at in a scornful or contemptuous manner
  2. to make a replica or imitation of something

Mocking

Replace a real object with a pretend object that records how it is called and can raise an assertion error if it is called incorrectly.

Mocking frameworks

Recording calls with mock

Mock objects record the calls made to them:

In [1]:
from unittest.mock import Mock
function = Mock(name="myroutine", return_value=2)
In [2]:
function(1)
Out[2]:
2
In [3]:
function(5, "hello", a=True)
Out[3]:
2
In [4]:
function.mock_calls
Out[4]:
[call(1), call(5, 'hello', a=True)]

The arguments of each call can be recovered:

In [5]:
name, args, kwargs = function.mock_calls[1]
args, kwargs
Out[5]:
((5, 'hello'), {'a': True})
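The recorded calls can also be checked with the mock's own assertion methods, which is what makes a mock useful inside a test. The following is a minimal sketch that reuses the `myroutine` mock from above; the variable names `last_args`, `last_kwargs` and `mismatch_detected` are only illustrative.

```python
from unittest.mock import Mock, call

function = Mock(name="myroutine", return_value=2)
function(1)
function(5, "hello", a=True)

# call_args holds the most recent call; .args and .kwargs split it up.
last_args = function.call_args.args      # (5, 'hello')
last_kwargs = function.call_args.kwargs  # {'a': True}

# Assertion methods pass silently when the recorded calls match ...
function.assert_called_with(5, "hello", a=True)  # checks the last call only
function.assert_any_call(1)                      # checks any recorded call
assert function.call_count == 2
assert function.call_args_list == [call(1), call(5, "hello", a=True)]

# ... and raise AssertionError when they do not.
try:
    function.assert_called_with(5, "goodbye")
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```

Note that `assert_called_with` only looks at the most recent call, whereas `assert_any_call` searches the whole call history.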

Mock objects can return different values for each call:

In [6]:
function = Mock(name="myroutine", side_effect=[2, "xyz"])
In [7]:
function(1)
Out[7]:
2
In [8]:
function(1, "hello", {'a': True})
Out[8]:
'xyz'

We expect an error if there are no return values left in the list:

In [9]:
function()
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Cell In[9], line 1
----> 1 function()

File /opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/unittest/mock.py:1139, in CallableMixin.__call__(self, *args, **kwargs)
   1137 self._mock_check_sig(*args, **kwargs)
   1138 self._increment_mock_call(*args, **kwargs)
-> 1139 return self._mock_call(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/unittest/mock.py:1143, in CallableMixin._mock_call(self, *args, **kwargs)
   1142 def _mock_call(self, /, *args, **kwargs):
-> 1143     return self._execute_mock_call(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/unittest/mock.py:1200, in CallableMixin._execute_mock_call(self, *args, **kwargs)
   1198     raise effect
   1199 elif not _callable(effect):
-> 1200     result = next(effect)
   1201     if _is_exception(result):
   1202         raise result

StopIteration: 
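`side_effect` is not limited to a list of return values. As a hedged sketch: an exception class or instance given as `side_effect` is raised when the mock is called, and exceptions placed inside the list are raised in turn, which is handy for simulating a resource that fails once and then recovers. The names `flaky_download` and `retrying_download` and the URL are made up for illustration.

```python
from unittest.mock import Mock

# An exception instance as side_effect is raised on every call.
flaky_download = Mock(name="download", side_effect=ConnectionError("network down"))
try:
    flaky_download("https://example.org")
except ConnectionError as error:
    message = str(error)

# Exceptions inside the list are raised; other items are returned.
retrying_download = Mock(side_effect=[TimeoutError("slow"), "ok"])
outcomes = []
for _ in range(2):
    try:
        outcomes.append(retrying_download())
    except TimeoutError:
        outcomes.append("timed out")

# The call is still recorded even though it raised.
flaky_download.assert_called_once_with("https://example.org")
```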

Using mocks to model test resources

Often we want to write tests for code that interacts with remote resources (e.g. databases, the internet, or data files).

We don't want our tests to actually interact with the remote resource, as they would then fail for unrelated reasons, such as a lost internet connection.

Instead, we can use mocks to assert that our code does the right thing in terms of the messages it sends: the parameters of the function calls it makes to the remote resource.

For example, consider the following code that downloads a map from the internet:

In [10]:
# sending requests to the web is not fully supported on jupyterlite yet, and the
# cells below might error out on the browser (jupyterlite) version of this notebook
import requests

def map_at(lat, long, satellite=False, zoom=12, 
           size=(400, 400)):

    base = "https://static-maps.yandex.ru/1.x/?"
    
    params = dict(
        z = zoom,
        size = ",".join(map(str,size)),
        ll = ",".join(map(str,(long,lat))),
        lang = "en_US")
    
    if satellite:
        params["l"] = "sat"
    else:
        params["l"] = "map"
        
    return requests.get(base, params=params)
In [11]:
london_map = map_at(51.5073509, -0.1277583)
from IPython.display import Image
In [12]:
%matplotlib inline
Image(london_map.content)
Out[12]:
[Image: the map of London returned by the request]

We would like to test that it builds the parameters correctly. We can do this by temporarily replacing the library's get function with a mock, so that no real request is sent. We can use "patch" to do this:

In [13]:
from unittest.mock import patch
with patch.object(requests,'get') as mock_get:
    london_map = map_at(51.5073509, -0.1277583)
    print(mock_get.mock_calls)
[call('https://static-maps.yandex.ru/1.x/?', params={'z': 12, 'size': '400,400', 'll': '-0.1277583,51.5073509', 'lang': 'en_US', 'l': 'map'})]

Our tests then look like:

In [14]:
def test_build_default_params():
    with patch.object(requests,'get') as mock_get:
        default_map = map_at(51.0, 0.0)
        mock_get.assert_called_with(
            "https://static-maps.yandex.ru/1.x/?",
            params={
                'z':12,
                'size':'400,400',
                'll':'0.0,51.0',
                'lang':'en_US',
                'l': 'map'
            }
        )
test_build_default_params()

That was quiet, so it passed. When I'm writing tests, I usually modify one of the expectations, to something 'wrong', just to check it's not passing "by accident", run the tests, then change it back!
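The same pattern extends to the non-default arguments, and the "deliberately wrong expectation" check can be written down as well. The sketch below assumes a hypothetical stand-in for the `requests` module (a `SimpleNamespace` with a `get` attribute) so that it runs without the package or a network connection; with the real library, `import requests` would take its place. `map_at` is repeated in condensed form, and `wrong_expectation_is_caught` is an illustrative helper, not part of the original tests.

```python
from types import SimpleNamespace
from unittest.mock import patch

# Hypothetical stand-in for the real module; replace with `import requests`.
requests = SimpleNamespace(get=lambda url, params=None: None)

def map_at(lat, long, satellite=False, zoom=12, size=(400, 400)):
    base = "https://static-maps.yandex.ru/1.x/?"
    params = dict(
        z=zoom,
        size=",".join(map(str, size)),
        ll=",".join(map(str, (long, lat))),
        lang="en_US",
    )
    params["l"] = "sat" if satellite else "map"
    return requests.get(base, params=params)

def test_build_satellite_params():
    with patch.object(requests, "get") as mock_get:
        map_at(51.0, 0.0, satellite=True, zoom=10)
    # Inspect the single recorded call instead of spelling out every key.
    assert mock_get.call_args.args == ("https://static-maps.yandex.ru/1.x/?",)
    sent = mock_get.call_args.kwargs["params"]
    assert sent["l"] == "sat"
    assert sent["z"] == 10

def wrong_expectation_is_caught():
    with patch.object(requests, "get") as mock_get:
        map_at(51.0, 0.0)
    try:
        # Deliberately wrong expectation: this must not pass "by accident".
        mock_get.assert_called_with("https://example.invalid/", params={})
    except AssertionError:
        return True
    return False

test_build_satellite_params()
```

Checking individual keys keeps the test focused on the behaviour under test (here, the satellite flag and zoom), at the cost of not noticing unexpected extra parameters.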

Testing functions that call other functions

In [15]:
def partial_derivative(function, at, direction, delta=1.0):
    f_x = function(at)
    x_plus_delta = at[:]
    x_plus_delta[direction] += delta
    f_x_plus_delta = function(x_plus_delta)
    return (f_x_plus_delta - f_x) / delta

We want to test that the above function does the right thing. It is supposed to compute the derivative of a function of a vector in a particular direction.

E.g.:

In [16]:
partial_derivative(sum, [0,0,0], 1)
Out[16]:
1.0

How do we assert that it is doing the right thing? With tests like this:

In [17]:
from unittest.mock import MagicMock

def test_derivative_2d_y_direction():
    func = MagicMock()
    partial_derivative(func, [0,0], 1)
    func.assert_any_call([0, 1.0])
    func.assert_any_call([0, 0])
    

test_derivative_2d_y_direction()

We made our mock a "Magic Mock" because otherwise, the mock results f_x_plus_delta and f_x can't be subtracted:

In [18]:
MagicMock() - MagicMock()
Out[18]:
<MagicMock name='mock.__sub__()' id='140313841025632'>
In [19]:
Mock() - Mock()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[19], line 1
----> 1 Mock() - Mock()

TypeError: unsupported operand type(s) for -: 'Mock' and 'Mock'
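If we also want to check the arithmetic, and not only the arguments, one option is a plain `Mock` whose `side_effect` supplies real numbers for the two evaluations. This is a sketch under that assumption; `partial_derivative` is repeated so the block runs on its own, and the test name is illustrative.

```python
from unittest.mock import Mock, call

def partial_derivative(function, at, direction, delta=1.0):
    f_x = function(at)
    x_plus_delta = at[:]
    x_plus_delta[direction] += delta
    f_x_plus_delta = function(x_plus_delta)
    return (f_x_plus_delta - f_x) / delta

def test_derivative_uses_both_evaluations():
    # First call returns f(x) = 1.0, second call returns f(x + delta) = 3.0.
    func = Mock(side_effect=[1.0, 3.0])
    result = partial_derivative(func, [0, 0], 1, delta=0.5)
    # (3.0 - 1.0) / 0.5
    assert result == 4.0
    # The calls must happen in this order, with these arguments.
    assert func.mock_calls == [call([0, 0]), call([0, 0.5])]

test_derivative_uses_both_evaluations()
```

Because the mock returns floats, the subtraction works without `MagicMock`, and the test now fails if the function mixes up the two evaluations or forgets to divide by `delta`.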

Static type hints

Although static type hints are not actual "tests," they can be checked during test runs (or in CI pipelines) using static type checkers. Checking that the codebase is typed and that the types are correct can help in finding silent bugs, dead code, and unreachable statements, which are often missed during unit and integration testing.

Detecting dead code

For example, let's consider the following piece of code:

In [20]:
%%writefile static_types_example.py
def smart_square(a: float | int | bool | str) -> int | float:
    if isinstance(a, (float, int)):
        return a * a
    elif isinstance(a, (str, bool)):
        try:
            result = float(a) * float(a)
            return result
        except ValueError:
            raise ValueError(f"a should be of type float/int or convertible to float; got {type(a)}")
    elif not isinstance(a, (float, int, bool, str)):
        raise NotImplementedError
Writing static_types_example.py

The code looks good enough: it squares the argument if it is of type float or int, and attempts to convert it to float if it is not. It looks clean, and testing it gives us no errors either -

In [21]:
%%writefile test_static_types_example.py
import pytest
from static_types_example import smart_square

def test_smart_square():
    assert smart_square(2) == 4
    assert isinstance(smart_square(2), int)
    assert smart_square(2.) == 4.
    assert isinstance(smart_square(2.), float)
    assert smart_square("2") == 4.
    assert smart_square(True) == 1.

    with pytest.raises(ValueError, match="float/int or convertible to float; got <class 'str'>"):
        smart_square("false")
Writing test_static_types_example.py
In [22]:
%%bash
pytest test_static_types_example.py
============================= test session starts ==============================
platform linux -- Python 3.12.8, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/runner/work/rsd-summerschool/rsd-summerschool
configfile: pyproject.toml
plugins: anyio-3.7.1, cov-6.0.0, mimesis-18.0.0
collected 1 item

test_static_types_example.py .                                           [100%]

============================== 1 passed in 0.02s ===============================

Even though the tests look good, we can notice one peculiar behavior. We cannot test the NotImplementedError because it is not reachable: the argument type cannot be anything other than float, int, bool, or str, so either the if or the first elif condition will always be met, and the code will never reach the final elif branch.
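The reasoning above can be checked by hand for each permitted type. The following sketch uses a hypothetical helper, `branch_taken`, that mirrors the isinstance chain of smart_square and reports which branch a value lands in.

```python
def branch_taken(a):
    """Hypothetical helper mirroring the isinstance chain in smart_square."""
    if isinstance(a, (float, int)):
        return "first"
    elif isinstance(a, (str, bool)):
        return "second"
    elif not isinstance(a, (float, int, bool, str)):
        return "third"
    return "none"

# One sample value for each type allowed by the annotation.
branches = {repr(value): branch_taken(value) for value in (2, 2.0, True, "2")}
# → {'2': 'first', '2.0': 'first', 'True': 'first', "'2'": 'second'}
```

No permitted value reaches the third branch. Note also that `True` lands in the first branch, because `bool` is a subclass of `int` in Python.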

This is called "unreachable" or "dead" code, and having it in your codebase is a bad practice. How do we detect it? Static types!

Let's run mypy with --warn-unreachable -

In [23]:
%%bash
mypy static_types_example.py --warn-unreachable
[Output not captured: mypy was not installed in the environment that rendered this page, so the command exited with status 127.]

The type checker points out that the final branch is in fact unreachable. This could either be a bug (code that should be reachable but for some reason is not) or just dead code (code that will never be reached and can be removed). In our case it is dead code, and it can be removed safely, given that the type hints explicitly tell users what types of arguments may be passed in.

In [24]:
%%writefile static_types_example.py
def smart_square(a: float | int | bool | str) -> int | float:
    if isinstance(a, (float, int)):
        return a * a
    elif isinstance(a, (str, bool)):
        try:
            result = float(a) * float(a)
            return result
        except ValueError:
            raise ValueError(f"a should be of type float/int or convertible to float; got {type(a)}")
Overwriting static_types_example.py
In [25]:
%%bash
mypy static_types_example.py --warn-unreachable
[Output not captured: mypy was not installed in the environment that rendered this page, so the command exited with status 127.]

No errors!

Large real-life codebases typically benefit from adding static types and checking them with tools like mypy. These checks can be automated in CI using pre-commit hooks (for instance, the mypy pre-commit hook) and pre-commit.ci.

Property based testing

Property-based testing is a testing method that automatically generates and tests a wide range of inputs, often missed by tests written by humans.
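The idea can be illustrated with the standard library alone. The sketch below is a hand-rolled approximation, not a property-based testing library: it draws random integer vectors and checks one property of the `partial_derivative` function from earlier (repeated here so the block runs on its own), namely that the derivative of `sum` is 1.0 in every direction. The function name `check_sum_derivative_property`, the value ranges, and the seed are arbitrary choices for illustration.

```python
import random

def partial_derivative(function, at, direction, delta=1.0):
    f_x = function(at)
    x_plus_delta = at[:]
    x_plus_delta[direction] += delta
    f_x_plus_delta = function(x_plus_delta)
    return (f_x_plus_delta - f_x) / delta

def check_sum_derivative_property(trials=200, seed=0):
    rng = random.Random(seed)  # seeded so that failures are reproducible
    for _ in range(trials):
        # Generate an input: a vector of 1 to 5 small integers and a direction.
        at = [rng.randint(-1000, 1000) for _ in range(rng.randint(1, 5))]
        direction = rng.randrange(len(at))
        # Property: d(sum)/dx_i == 1 for every vector and every direction.
        assert partial_derivative(sum, at, direction) == 1.0, (at, direction)
    return trials

check_sum_derivative_property()
```

A dedicated library adds what this sketch lacks, such as reusable input generators and shrinking a failing input down to a minimal counterexample.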

Hypothesis is a modern property-based testing implementation for Python. In a nutshell, Hypothesis parametrizes tests: it runs a test function over a wide range of matching data drawn from a "search strategy" provided by the library. Through this parametrization, Hypothesis can catch bugs that might go unnoticed when test inputs are written by hand.

Property-based testing is being adopted for software written in various languages, especially in industry, to ensure the effectiveness of the tests written for a software suite.

Mutation testing

Mutation testing checks the effectiveness of your tests by making minor modifications to the codebase and running the test suite. If the tests still pass with the modifications made by the mutation testing framework, they were not specific or thorough enough.
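A single mutation can be played through by hand to see what such a framework automates. In this sketch every name (`add`, `add_mutant`, `weak_test`, `strong_test`, `survives`) is hypothetical, and the mutant is written out manually rather than generated by a tool.

```python
def add(a, b):
    return a + b

def add_mutant(a, b):
    return a - b  # the "mutation": + has been replaced by -

def weak_test(fn):
    # Too unspecific: 0 + 0 and 0 - 0 give the same answer.
    assert fn(0, 0) == 0

def strong_test(fn):
    # Specific enough to tell addition from subtraction.
    assert fn(2, 3) == 5

def survives(test, mutant):
    """Return True if the test still passes on the mutated code."""
    try:
        test(mutant)
    except AssertionError:
        return False
    return True

# Both tests pass on the original code.
weak_test(add)
strong_test(add)

weak_result = survives(weak_test, add_mutant)      # True: the mutant survives
strong_result = survives(strong_test, add_mutant)  # False: the mutant is killed
```

A surviving mutant points at a gap in the test suite: here, `weak_test` would not notice if addition were replaced by subtraction.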

mutmut is a mutation testing library for Python that has recently been adopted in the test setups of large projects. Other related tools include -

  • pytest-testmon: pytest plug-in which automatically selects and re-executes only tests affected by recent changes
  • MutPy (unmaintained): mutation testing tool for Python 3.3+ source code
  • Cosmic Ray: mutation testing for Python

Overall, mutation testing is a very powerful way to check whether your tests are actually working, and this form of testing is beneficial for projects with a large test suite.