Tackling the PKCE Puzzle in Robot Framework API Testing
The Security Upgrade & the Testing Headache
Testing APIs secured with the OAuth 2.0 authorization code flow with PKCE (Proof Key for Code Exchange) using Robot Framework and libraries like REST or the RequestsLibrary can be quite challenging. The flow expects a user to log in interactively through a browser, which plain HTTP testing libraries do not handle out of the box.
Stumbling Upon a Solution
After some head-scratching and digging around, a colleague stumbled upon a blog post by Stefaan Lippens (https://www.stefaanlippens.net/oauth-code-flow-pkce.html) that offered a glimmer of hope. Lippens provides a Python script that handles the PKCE flow with Keycloak 7. With some tweaks to the regex, I was able to adapt the script to work with Keycloak 25, our current version.
Sharing the Solution
In the following sections, I’ll delve into the Python script that helped me successfully implement the PKCE flow within Robot Framework. I’ll provide the adapted code and explain its inner workings.
Hopefully, this post saves fellow testers some frustration and empowers them to seamlessly test their secure APIs with Robot Framework.
The Python Script
import base64
import hashlib
import html
import json
import os
import re
import urllib.parse
import requests
These imports cover base64 encoding and decoding, hashing, HTML entity handling, JSON parsing, operating system access (used here for random bytes), regular expressions, URL parsing, and HTTP requests.
provider = "https://<YOUR_URL>/auth/realms/<YOUR_REALM>"
client_id = "<CLIENT_ID>"
username = "<USERNAME>"
password = "<PASSWORD>"
redirect_uri = "http://localhost/foobar"
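These variables define the authorization server URL, the client ID, the user credentials, and the redirect URI used in the OAuth flow. Replace the placeholders (<YOUR_URL>, <YOUR_REALM>, <CLIENT_ID>, <USERNAME>, <PASSWORD>) with your actual values. Note that recent Keycloak versions (the Quarkus-based distribution) serve realms under /realms/<YOUR_REALM> by default, so keep the /auth prefix only if your installation is configured with that relative path.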
code_verifier = base64.urlsafe_b64encode(os.urandom(40)).decode('utf-8')
code_verifier = re.sub('[^a-zA-Z0-9]+', '', code_verifier)
code_verifier, len(code_verifier)
Generates a code verifier from secure random bytes using os.urandom, encodes it as URL-safe base64, and removes any non-alphanumeric characters with a regex. The last line displays the code verifier and its length when run interactively, for example in a notebook.
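As a cross-check, RFC 7636 requires the verifier to be 43 to 128 characters long and drawn from letters, digits, and the characters `-`, `.`, `_`, and `~`. The sketch below validates a verifier against that rule. The helper name `is_valid_code_verifier` is made up for this illustration, and `secrets.token_urlsafe` is shown only as a possible alternative to the `os.urandom` approach, not as something the original script uses.

```python
import base64
import os
import re
import secrets

# RFC 7636, section 4.1: 43-128 characters from A-Z, a-z, 0-9, "-", ".", "_", "~".
_VERIFIER_RE = re.compile(r"[A-Za-z0-9\-._~]{43,128}")


def is_valid_code_verifier(verifier: str) -> bool:
    """Hypothetical helper: check a verifier against the RFC 7636 rules."""
    return _VERIFIER_RE.fullmatch(verifier) is not None


# The approach used in this post: random bytes, base64, strip non-alphanumerics.
encoded = base64.urlsafe_b64encode(os.urandom(40)).decode("utf-8")
from_urandom = re.sub("[^a-zA-Z0-9]+", "", encoded)

# Standard-library alternative: already URL-safe, so no stripping is needed.
from_secrets = secrets.token_urlsafe(64)

print(is_valid_code_verifier(from_urandom), len(from_urandom))
print(is_valid_code_verifier(from_secrets), len(from_secrets))
```

Both variants should pass the check; the exact length of the first one varies slightly from run to run because the stripped characters are random.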
code_challenge = hashlib.sha256(code_verifier.encode('utf-8')).digest()
code_challenge = base64.urlsafe_b64encode(code_challenge).decode('utf-8')
code_challenge = code_challenge.replace('=', '')
code_challenge, len(code_challenge)
Computes the code challenge by hashing the code verifier with SHA-256, encoding the digest as URL-safe base64, and removing the padding characters (‘=’). The last line displays the code challenge and its length when run interactively.
state = "fooobarbaz"
resp = requests.get(
    url=provider + "/protocol/openid-connect/auth",
    params={
        "response_type": "code",
        "client_id": client_id,
        "scope": "openid",
        "redirect_uri": redirect_uri,
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    },
    allow_redirects=False
)
resp.status_code
Initiates the authorization request by sending a GET request to the authorization endpoint. The parameters include ‘code_challenge’ (computed earlier), ‘code_challenge_method’ set to ‘S256’, and ‘state’, an opaque value that the server echoes back in the redirect so the client can detect forged responses. Redirects are disabled to capture the intermediate response, which should be the login page. Finally, the status code of the response is retrieved.
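To see what this request looks like on the wire, the same query string can be built by hand with the standard library. This is only an illustrative sketch: the host, realm, and client ID are placeholders, the challenge is the RFC 7636 example value, and generating `state` with `secrets.token_urlsafe` is a suggestion rather than what the script above does.

```python
import secrets
import urllib.parse

# Placeholder values for illustration only.
provider = "https://keycloak.example.com/realms/demo"
params = {
    "response_type": "code",
    "client_id": "my-client",
    "scope": "openid",
    "redirect_uri": "http://localhost/foobar",
    "state": secrets.token_urlsafe(16),  # fresh random value per request
    "code_challenge": "E9Melhoa2OwvFrEMTJguCH5f6YqEhtLD36JNdKwhlNM",
    "code_challenge_method": "S256",
}

# urlencode percent-encodes the values, e.g. the ":" and "/" in redirect_uri.
auth_url = provider + "/protocol/openid-connect/auth?" + urllib.parse.urlencode(params)
print(auth_url)
```

Passing the dictionary as `params=` to `requests.get` performs the same encoding, so building the URL manually is mainly useful for debugging.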
cookie = resp.headers['Set-Cookie']
cookie = '; '.join(c.split(';')[0] for c in cookie.split(', '))
cookie
Extracts the ‘Set-Cookie’ header from the response, parses it to keep only the essential cookie name-value pairs, and formats them into a semicolon-separated string for further use in requests.
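The effect of that one-liner is easier to see on a sample value. The sketch below wraps it in a function (the name `cookie_header_from_set_cookie` is hypothetical) and applies it to a made-up header; real Keycloak cookie names and attributes may differ.

```python
def cookie_header_from_set_cookie(set_cookie: str) -> str:
    """Hypothetical helper mirroring the one-liner above.

    Splits the merged Set-Cookie value on ", " and keeps only the
    leading name=value pair of each cookie.
    """
    return "; ".join(c.split(";")[0] for c in set_cookie.split(", "))


# Made-up header value, as it would look after several Set-Cookie headers are merged.
sample = (
    "AUTH_SESSION_ID=abc123; Path=/realms/demo/; HttpOnly, "
    "KC_RESTART=xyz789; Path=/realms/demo/; HttpOnly"
)
print(cookie_header_from_set_cookie(sample))
# → AUTH_SESSION_ID=abc123; KC_RESTART=xyz789
```

One caveat: a cookie with an attribute such as `Expires=Wed, 01 Jan 2030 00:00:00 GMT` contains ", " itself, so the split would produce a stray fragment. If that shows up in practice, using a `requests.Session`, which tracks cookies across requests on its own, may be the more robust option.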
page = resp.text
form_action = html.unescape(re.search('"loginAction": "(.*?)"', page, re.DOTALL).group(1))
form_action
Extracts the login form action URL from the HTML response using regular expressions. The html.unescape function ensures any HTML entities are properly decoded.
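Because this step depends on the markup of the login page, it tends to be the first thing to break after a Keycloak upgrade. The sketch below runs the same regex against an invented page fragment and fails with a readable message when nothing matches; the fragment is an assumption for illustration, not a copy of a real Keycloak page.

```python
import html
import re

# Made-up fragment standing in for the Keycloak login page.
page = '''
<script>
  const context = {
    "loginAction": "https://keycloak.example.com/realms/demo/login-actions/authenticate?session_code=abc&amp;execution=123&amp;client_id=my-client",
  };
</script>
'''

match = re.search(r'"loginAction": "(.*?)"', page, re.DOTALL)
if match is None:
    # Without this guard, .group(1) would raise an opaque AttributeError.
    raise RuntimeError("loginAction not found; the login page markup may have changed")

# html.unescape turns "&amp;" back into "&" so the URL can be requested.
form_action = html.unescape(match.group(1))
print(form_action)
```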
resp = requests.post(
    url=form_action,
    data={
        "username": username,
        "password": password,
    },
    headers={"Cookie": cookie},
    allow_redirects=False
)
resp.status_code
Submits the login form using a POST request to the extracted form_action URL. Includes user credentials and the previously obtained cookie in the request headers. Redirects are again disabled to capture the response. The status code of this response is then retrieved.
redirect = resp.headers['Location']
redirect
Extracts the ‘Location’ header from the response, which contains the redirect URL after successful login. This URL will be used to obtain the authorization code.
assert redirect.startswith(redirect_uri)
Verifies that the redirect URL starts with the expected redirect_uri to ensure the flow is proceeding as intended.
query = urllib.parse.urlparse(redirect).query
redirect_params = urllib.parse.parse_qs(query)
redirect_params
Parses the query parameters from the redirect URL using urllib.parse. The urlparse function extracts the query string, and parse_qs converts it into a dictionary where keys are parameter names and values are lists of corresponding values.
auth_code = redirect_params['code'][0]
auth_code
Extracts the authorization code from the parsed query parameters. The authorization code is essential for the next step in the OAuth flow, exchanging it for an access token.
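The parsing and extraction steps can be tried offline with a sample redirect. The sketch below uses an invented redirect URL and adds two checks that the script above does not perform: it looks for an `error` parameter and compares the returned `state` with the one that was sent. Treat both checks as a suggestion, not as part of the original script.

```python
import urllib.parse

# Made-up values for illustration.
state = "fooobarbaz"
redirect = "http://localhost/foobar?state=fooobarbaz&session_state=1234&code=abcd.efgh.ijkl"

query = urllib.parse.urlparse(redirect).query
redirect_params = urllib.parse.parse_qs(query)
print(redirect_params)

# Each value is a list, so take the first element when reading a parameter.
if "error" in redirect_params:
    raise RuntimeError(f"authorization failed: {redirect_params['error'][0]}")
if redirect_params.get("state", [None])[0] != state:
    raise RuntimeError("state mismatch between request and redirect")

auth_code = redirect_params["code"][0]
print(auth_code)
```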
resp = requests.post(
    url=provider + "/protocol/openid-connect/token",
    data={
        "grant_type": "authorization_code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "code": auth_code,
        "code_verifier": code_verifier,
    },
    allow_redirects=False
)
resp.status_code
Exchanges the obtained authorization code for an access token by sending a POST request to the token endpoint. Includes necessary parameters like ‘grant_type’, ‘client_id’, ‘redirect_uri’, ‘code’, and ‘code_verifier’. Redirects are disabled, and the status code of the response is retrieved to check if the token exchange was successful.
result = resp.json()
result
Parses the JSON response from the token endpoint, which should contain the access token and potentially other information like refresh token and token expiration time. The result variable now holds a Python dictionary representing the parsed JSON data.
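For API testing, the part of this dictionary that matters most is the access token, which is typically sent as a bearer token on each request. A minimal sketch of that step follows; the helper name `bearer_headers` and the sample response are assumptions made for illustration, and a real response comes from `resp.json()`.

```python
# Made-up token response; the token strings are placeholders, not real JWTs.
result = {
    "access_token": "header.payload.signature",
    "expires_in": 300,
    "refresh_token": "header.payload.signature",
    "token_type": "Bearer",
    "id_token": "header.payload.signature",
}


def bearer_headers(token_response: dict) -> dict:
    """Hypothetical helper: build request headers from a token response."""
    if "access_token" not in token_response:
        # Error responses carry "error" and usually "error_description".
        raise RuntimeError(f"token request failed: {token_response}")
    return {"Authorization": f"Bearer {token_response['access_token']}"}


print(bearer_headers(result))
# → {'Authorization': 'Bearer header.payload.signature'}
```

Robot Framework can import a Python module as a library, so a function like this would be available as the keyword `Bearer Headers`, and the returned dictionary can then be handed to the HTTP library used in the test.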
def _b64_decode(data):
    # JWT segments are base64url-encoded without padding; restore it first.
    data += '=' * (-len(data) % 4)
    return base64.urlsafe_b64decode(data).decode('utf-8')
Defines a helper function _b64_decode to handle base64 decoding. JWT segments are transmitted without padding, so it adds padding characters (‘=’) if necessary to bring the input to a length that can be decoded, then decodes it and returns the result as a UTF-8 string.
def jwt_payload_decode(jwt):
    _, payload, _ = jwt.split('.')
    return json.loads(_b64_decode(payload))
Defines a function jwt_payload_decode to extract and decode the payload from a JWT (JSON Web Token). It splits the JWT into its header, payload, and signature parts, decodes the base64-encoded payload using the _b64_decode helper function, and parses the decoded payload as JSON, returning the resulting Python dictionary.
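To try the decoding without a live Keycloak, the sketch below builds an unsigned sample token and decodes it again. It is a self-contained variant of the two helpers that uses `base64.urlsafe_b64decode`, since JWT segments are base64url-encoded. The `b64url_encode` helper and the claims are invented for this example, and note that nothing here verifies the signature.

```python
import base64
import json


def b64url_decode(data: str) -> str:
    """Decode unpadded base64url text (as used in JWTs) to a UTF-8 string."""
    data += "=" * (-len(data) % 4)  # pad to a multiple of 4
    return base64.urlsafe_b64decode(data).decode("utf-8")


def jwt_payload_decode(jwt: str) -> dict:
    """Return the claims of a JWT without verifying its signature."""
    _, payload, _ = jwt.split(".")
    return json.loads(b64url_decode(payload))


def b64url_encode(obj: dict) -> str:
    """Hypothetical helper, used only to build the sample token."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Unsigned sample token with invented claims.
claims = {"sub": "1234", "preferred_username": "tester"}
token = ".".join([b64url_encode({"alg": "none"}), b64url_encode(claims), "sig"])

print(jwt_payload_decode(token))
```

Decoding like this is fine for inspecting claims in a test, but it should not be mistaken for token validation.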
print(jwt_payload_decode(result['access_token']))
Decodes and prints the payload of the obtained access token using the jwt_payload_decode function. This allows you to inspect the claims and information embedded within the access token.
print(jwt_payload_decode(result['id_token']))
Decodes and prints the payload of the obtained ID token using the jwt_payload_decode function. This allows you to inspect the claims and user information embedded within the ID token.
Wrapping It Up
Hopefully, this Python script and the accompanying breakdown make the process of obtaining an access token for API testing with Robot Framework using PKCE and Keycloak a bit smoother. Remember to replace the placeholders in the code with your actual Keycloak configuration details.
Happy testing!