3
votes

I have a lambda (python) which returns a json representation of a customer profile. It does this by starting with a top level account json, then reading linked json files until it runs out of links. The function that does the reading from s3 is recursive, but the recursion will only ever be one level deep.

Here's the method that actually gets the json content from a given key. (bucket is known)

def get_index_from_s3(key):
    try:
        response = s3.get_object(
            Bucket=bucket,
            Key=key
        )
        body = response.get('Body')
        content = body.read().decode('utf-8')
    except ClientError as ex:
        # print 'EXCEPTION MESSAGE: {}'.format(ex.response['Error']['Code'])
        content = '{}'

    message = json.loads(content)
    return message

The code returns the json found at the specified key, or an empty dictionary in the event that get_object fails due to a ClientError (which is what results from NoSuchKey).

I've tested this, and it works. The first call to the function gets a chunk of json. That json is parsed, a link is found, a second call is made, and the profile is built. If I delete the object at the linked key, I just get a default, empty representation, as intended.

My problem comes from testing this. I've written a couple of test classes, each has an arrange method, and they share an act method.

For my happy path, I use the following arrange:

def arrange(self):
    super(WhenCognitoAndNerfFoundTestCase, self).arrange()
    # self.s3_response = self.s3.get_object.return_value
    self.s3_body = self.s3.get_object.return_value.get.return_value
    self.s3_body.read.return_value.decode.side_effect = [
        self.cognito_content,
        self.nerf_content]
    signed_url = "https://this.is/a/signed/url/index.html"
    self.s3.generate_presigned_url.return_value = signed_url

This does exactly what I want. s3_response is the return_value of get_object; calling get on it returns the Body, and reading and decoding that Body returns a json string. I set side_effect to a list of json strings so that each of the two calls to content = body.read().decode('utf-8') returns a different string.
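To make the chain concrete, here is a minimal standalone sketch of the same arrangement using only unittest.mock. The json strings, the key names, and the read_content helper are placeholders made up for illustration, not the real profile content or code.

```python
from unittest.mock import MagicMock

s3 = MagicMock()

# Every call to get_object returns the same child mock, and so does
# .get('Body') on it, so a single body mock serves both calls.
body = s3.get_object.return_value.get.return_value

# A list side_effect hands out one item per call, in order.
body.read.return_value.decode.side_effect = [
    '{"source": "cognito"}',  # placeholder content
    '{"source": "nerf"}',     # placeholder content
]


def read_content(key):
    # Hypothetical helper mirroring the body-reading lines of get_index_from_s3.
    response = s3.get_object(Bucket="bucket", Key=key)
    return response.get("Body").read().decode("utf-8")


first = read_content("first-key")
second = read_content("second-key")
```

Each call to decode consumes the next item of the list, which is why the two reads see different strings.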

But when I want to test the case of missing content at the second key, I get stymied. My current attempt at this arrangement is as follows:

def arrange(self):
    super(WhenCognitoOnlyFoundTestCase, self).arrange()
    # self.s3_response = MagicMock()
    # botocore.response.StreamingBody
    self.s3.get_object.side_effect = [{},
                                      ClientError]
    # self.s3_response = self.s3.get_object.return_value
    self.s3_body = self.s3.get_object.return_value.get.return_value
    self.s3_body.read.return_value.decode.return_value = \
        self.cognito_content

Running the test yields this:

    def get_index_from_s3(key):
        try:
            response  = s3.get_object(
                Bucket=bucket,
                Key=key
            )
            body = response.get('Body')
>           content = body.read().decode('utf-8')
E            AttributeError: 'NoneType' object has no attribute 'read'

master_profile.py:66: AttributeError

This makes sense: read is called on the Body entry of the s3.get_object response, and because the first side_effect item is an empty dictionary, response.get('Body') returns None in this scenario.
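A minimal sketch of that behaviour, using only unittest.mock: once side_effect is set on get_object, whatever was configured on get_object.return_value is no longer what the call returns. The string "configured body" and the key name are placeholders.

```python
from unittest.mock import MagicMock

s3 = MagicMock()

# This configures the mock hanging off get_object.return_value ...
s3.get_object.return_value.get.return_value = "configured body"

# ... but side_effect takes precedence over return_value, so the
# call returns the plain empty dict from the list instead.
s3.get_object.side_effect = [{}]

response = s3.get_object(Bucket="bucket", Key="key")
body = response.get("Body")  # dict.get on a missing key gives None
```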

So my question is, how do I mock this thing so that I can test it? The difficulty of mocking a response of get_object is that, although it's just a dictionary, the Body attribute is a botocore.response.StreamingBody which I do not know how to mock.

1
You might want to check out the botocore stubber - Jordon Phillips
@JordonPhillips : stubber really has some nice new tricks. Another way of doing a "not so mock" mock run is setting up fake-s3. github.com/jubos/fake-s3 - mootmoot

1 Answer

5
votes

As a rule of thumb you should aim to make your questions self-contained. To illustrate some of the things you were doing wrong I slightly modified your initial function to make it self-contained.

Let's imagine we have the s3_module we want to test defined as follows:

import boto3
from botocore.exceptions import ClientError
import json

s3 = boto3.client('s3')

def get_index_from_s3(key):
    try:
        response = s3.get_object(
            Bucket='bucket',
            Key=key
        )
        body = response.get('Body')
        content = body.read().decode('utf-8')
    except ClientError as ex:
        # print 'EXCEPTION MESSAGE: {}'.format(ex.response['Error']['Code'])
        content = '{}'

    message = json.loads(content)
    return message

In order to test it, we could write another module, s3_test, with a test similar to this:

import pytest
from unittest.mock import patch, Mock, MagicMock
from botocore.exceptions import ClientError
import json

from s3_module import get_index_from_s3


@patch('s3_module.s3.get_object')
def test_get_index_from_s3(s3_get_mock):

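    # Stand-in for the StreamingBody: read() returns an object whose decode() yields json text.
    body_mock = Mock()
    body_mock.read.return_value.decode.return_value = json.dumps('first_response')
    # ClientError takes the parsed error response and the operation name.
    missing_key = ClientError({'Error': {'Code': 'NoSuchKey', 'Message': 'Not found'}}, 'GetObject')
    # The first call returns the dictionary; the second call raises the exception.
    s3_get_mock.side_effect = [{'Body': body_mock}, missing_key]

    first_response = get_index_from_s3('key1')
    assert first_response == 'first_response'
    second_response = get_index_from_s3('key2')
    assert second_response == {}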

Your solution was missing two points:

  • The first item of self.s3.get_object.side_effect should be an object that works with the rest of your code, i.e. a dictionary with a Body key whose value supports read(), whose result supports decode(), and whose decoded text can be parsed by json.loads()

  • The second item of self.s3.get_object.side_effect should be a properly constructed ClientError instance; an exception in a side_effect list is raised by the mock rather than returned
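Both points can be sketched without any AWS libraries. Since the function only calls read() on the Body, an io.BytesIO can stand in for the StreamingBody in a unit test. In this sketch, FakeClientError and get_index are hypothetical stand-ins invented for illustration (for botocore's ClientError and for get_index_from_s3 respectively), and the payload is made up.

```python
import io
import json
from unittest.mock import MagicMock


class FakeClientError(Exception):
    """Hypothetical stand-in for botocore.exceptions.ClientError."""


def get_index(s3, key):
    # Simplified, hypothetical variant of get_index_from_s3 that takes
    # the client as an argument so the sketch needs no AWS libraries.
    try:
        response = s3.get_object(Bucket="bucket", Key=key)
        content = response.get("Body").read().decode("utf-8")
    except FakeClientError:
        content = "{}"
    return json.loads(content)


s3 = MagicMock()
payload = json.dumps({"name": "example"}).encode("utf-8")
s3.get_object.side_effect = [
    {"Body": io.BytesIO(payload)},   # first call: returned as-is
    FakeClientError("no such key"),  # second call: raised by the mock
]

found = get_index(s3, "key1")
missing = get_index(s3, "key2")
```

Using a real file-like object for the body avoids chaining return_value through read and decode, at the cost of encoding the payload to bytes first.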

You can check more about how to build the ClientError exception in the botocore docs: http://botocore.readthedocs.io/en/latest/client_upgrades.html#error-handling

You can find more about patching and mocking in the docs: https://docs.python.org/3/library/unittest.mock.html.

Usually the section about where to patch is really useful: https://docs.python.org/3/library/unittest.mock.html#where-to-patch