Using Amazon AWS Cognito `.well-known/jwks.json` data fails to base64 decode some fields

Question

When using Amazon AWS Cognito Federated Identities, I parse the data at
https://cognito-identity.amazonaws.com/.well-known/jwks_uri, which looks like:

{"keys":[
    {"kty":"RSA",
     "alg":"RS512",
     "use":"sig",
     "kid":"ap-northeast-11",
     "n":"AI7mc1assO5n6yB4b7jPCFgVLYPSnwt4qp2BhJVAmlXRntRZ5w4910oKNZDOr4fe/BWOI2Z7upUTE/ICXdqirEkjiPbBN/duVy5YcHsQ5+GrxQ/UbytNVN/NsFhdG8W31lsE4dnrGds5cSshLaohyU/aChgaIMbmtU0NSWQ+jwrW8q1PTvnThVQbpte59a0dAwLeOCfrx6kVvs0Y7fX7NXBbFxe8yL+JR3SMJvxBFuYC+/om5EIRIlRexjWpNu7gJnaFFwbxCBNwFHahcg5gdtSkCHJy8Gj78rsgrkEbgoHk29pk8jUzo/O/GuSDGw8qXb6w0R1+UsXPYACOXM8C8+E=",
     "e":"AQAB"}, 
 ... }

Decoding the n field works fine with this code (Kotlin calling the JDK 8 Base64 class):

Base64.getDecoder().decode(encodedN.toByteArray())

But the same approach fails with Cognito User Pools, which serves its keys at a URL of the form:
https://cognito-idp.${REGION}.amazonaws.com/${POOLID}/.well-known/jwks.json

It has the same type of data, but it will not decode. Instead I end up with errors such as:

Illegal base64 character 5f

Since 0x5f is an underscore (_), which belongs to the Base64 URL alphabet, I tried changing my decoding to:

Base64.getUrlDecoder().decode(encodedN.toByteArray())

But then the first set of data no longer decodes, because it contains / and other characters that are invalid in the Base64 URL alphabet.

Is there a method that can handle both of these JWKS data sets with the same decoder?

Note: this question is intentionally written and answered by the author (Self-Answered Questions), so that solutions to interesting problems are shared on SO.

Jayson Minard · Accepted Answer · 2016-10-11T14:38:29

The issue is that the Amazon AWS Cognito team is using two different Base64 alphabets for essentially the same kind of data, so you need to detect which one is in use.

If the encoded string ends with = or contains + or /, it is definitely standard Base64, so use Base64.getDecoder(). If it contains - or _, it is definitely URL-safe Base64, so use Base64.getUrlDecoder(). Otherwise nothing distinguishes the two alphabets, and it is best to use Base64.getUrlDecoder() because you do not know whether the length would need padding or not.

In Kotlin this translates to the following, though the logic applies to any language:

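import java.util.Base64

// A trailing '=' or any '+' or '/' is treated as standard Base64;
// anything else goes to the URL-safe decoder.
fun base64SafeDecoder(encoded: String): ByteArray {
    val decoder = if (encoded.endsWith('=') || encoded.any { it == '+' || it == '/' }) {
        Base64.getDecoder()
    } else {
        Base64.getUrlDecoder()
    }
    return decoder.decode(encoded.toByteArray())
}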
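
As a rough sketch of where the decoded bytes end up, the snippet below turns the n and e fields of a JWK into a java.security RSA public key. The names decodeJwkField and rsaKeyFromJwk are hypothetical, introduced only for illustration, and the first function simply repeats the alphabet detection so the snippet compiles on its own. It assumes n and e are big-endian unsigned integers, as in the JWK format, which is why BigInteger is built with an explicit positive sign.

```kotlin
import java.math.BigInteger
import java.security.KeyFactory
import java.security.interfaces.RSAPublicKey
import java.security.spec.RSAPublicKeySpec
import java.util.Base64

// Hypothetical name; same alphabet detection as base64SafeDecoder,
// repeated here so this sketch is self-contained.
fun decodeJwkField(encoded: String): ByteArray {
    val decoder = if (encoded.endsWith('=') || encoded.any { it == '+' || it == '/' }) {
        Base64.getDecoder()
    } else {
        Base64.getUrlDecoder()
    }
    return decoder.decode(encoded)
}

// Hypothetical helper: builds an RSA public key from a JWK's "n" and "e" strings.
fun rsaKeyFromJwk(n: String, e: String): RSAPublicKey {
    // signum = 1 forces a positive value even when the first byte has its high bit set.
    val modulus = BigInteger(1, decodeJwkField(n))
    val exponent = BigInteger(1, decodeJwkField(e))
    val spec = RSAPublicKeySpec(modulus, exponent)
    return KeyFactory.getInstance("RSA").generatePublic(spec) as RSAPublicKey
}
```

With the sample data from the question, the e value "AQAB" decodes to the bytes 01 00 01, the common RSA public exponent 65537.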

This is a problem in any language with Base64 decoding: a decoder might be loose and silently ignore invalid characters (some do), or it might be strict and throw an exception. Some Base64 test websites exhibit both of these behaviors as well. Silently ignoring invalid characters is dangerous, because the error only surfaces later, when you use the decoded result.
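
The JDK decoders are on the strict side, which the sketch below makes visible. It also shows one possible alternative to choosing a decoder: map the two URL-safe characters onto the standard alphabet and always use the standard decoder. Both function names are hypothetical, and the approach relies on the JDK decoder accepting input without trailing = padding, which its Javadoc describes as accepted but not required.

```kotlin
import java.util.Base64

// Hypothetical helper: true when the strict standard decoder refuses the input.
fun standardDecoderRejects(encoded: String): Boolean =
    try {
        Base64.getDecoder().decode(encoded)
        false
    } catch (e: IllegalArgumentException) {
        // Thrown for characters outside the standard alphabet, such as '-' or '_'.
        true
    }

// Hypothetical alternative: normalize URL-safe characters to the standard
// alphabet, then decode everything with the standard decoder.
fun decodeEitherAlphabet(encoded: String): ByteArray {
    val normalized = encoded.replace('-', '+').replace('_', '/')
    return Base64.getDecoder().decode(normalized)
}
```

Here standardDecoderRejects("-_8") reports a rejection, while decodeEitherAlphabet yields the same bytes for "-_8" and "+/8=". Normalizing trades the detection branch for two string replacements; which of the two reads better is a matter of taste.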
