105 changes: 105 additions & 0 deletions gateway/it/features/token-based-ratelimit.feature
@@ -272,6 +272,111 @@ Feature: Token-Based Rate Limiting
When I delete the LLM provider template "multi-quota-template"
Then the response status code should be 200

Scenario: Token-based rate limiting extracts tokens from gzipped backend responses
Given I authenticate using basic auth as "admin"

When I create this LLM provider template:
"""
apiVersion: gateway.api-platform.wso2.com/v1alpha1
kind: LlmProviderTemplate
metadata:
name: gzip-response-template
spec:
displayName: Gzip Response Template
totalTokens:
location: payload
identifier: $.args.total_tokens[0]
requestModel:
location: payload
identifier: $.args.model[0]
responseModel:
location: payload
identifier: $.args.model[0]
"""
Then the response status code should be 201
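
The `payload` identifiers above are JSONPath expressions evaluated against the upstream response body. They presume the echo backend reflects query parameters back as arrays under an `args` key; the assumed (not documented) body shape for a request carrying `?model=gpt-4&total_tokens=1` would be roughly:

```json
{
  "args": {
    "model": ["gpt-4"],
    "total_tokens": ["1"]
  }
}
```

Under that shape, `$.args.total_tokens[0]` selects the string `"1"`, which the gateway parses as the token count, and both model identifiers resolve to `"gpt-4"`.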

Given I authenticate using basic auth as "admin"
When I create this LLM provider:
"""
apiVersion: gateway.api-platform.wso2.com/v1alpha1
kind: LlmProvider
metadata:
name: gzip-response-provider
spec:
displayName: Gzip Response Provider
version: v1.0
context: /gzip-response
template: gzip-response-template
upstream:
url: http://echo-backend-multi-arch:8080
auth:
type: api-key
header: Authorization
value: test-api-key
accessControl:
mode: deny_all
exceptions:
- path: /chat/completions
methods: [POST, GET]
policies:
- name: request-rewrite
version: v0
paths:
- path: /chat/completions
methods: [POST, GET]
params:
pathRewrite:
type: ReplaceFullPath
replaceFullPath: "/gzip"
- name: token-based-ratelimit
version: v0
paths:
- path: /chat/completions
methods: [POST]
params:
totalTokenLimits:
- count: 2
duration: "1m"
algorithm: fixed-window
backend: memory
"""
Then the response status code should be 201
And I wait for the endpoint "http://localhost:8080/gzip-response/chat/completions" to be ready
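
The `fixed-window` / `memory` policy above is what produces the header sequence asserted below (remaining 1, then 0, then a 429). As a minimal in-memory sketch, assuming nothing about the gateway's real implementation (class and method names here are invented for illustration), a fixed-window counter with `count: 2` per `1m` behaves like this:

```python
class FixedWindowLimiter:
    """Fixed-window token counter mirroring the policy above:
    2 total tokens per 1-minute window, held in memory.
    (Names are illustrative, not the gateway's actual types.)"""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window = window_seconds
        self.window_start = 0.0
        self.used = 0

    def consume(self, tokens: int, now: float) -> tuple[bool, int]:
        """Return (allowed, remaining). The counter resets only when a
        new window begins; inside a window it just accumulates."""
        if now - self.window_start >= self.window:
            self.window_start = now
            self.used = 0
        if self.used >= self.limit:
            return False, 0
        self.used += tokens
        return True, max(self.limit - self.used, 0)

limiter = FixedWindowLimiter(limit=2, window_seconds=60)
print(limiter.consume(1, now=0.0))   # (True, 1)  -> X-Ratelimit-Remaining: 1
print(limiter.consume(1, now=5.0))   # (True, 0)  -> X-Ratelimit-Remaining: 0
print(limiter.consume(1, now=10.0))  # (False, 0) -> HTTP 429
```

A fixed window resets the counter at window boundaries rather than refilling gradually, which is why the third request inside the same minute is rejected outright.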

Given I set header "Content-Type" to "application/json"
And I set header "Accept-Encoding" to "gzip"

# First request: consume 1 token from gzipped response body
When I send a POST request to "http://localhost:8080/gzip-response/chat/completions?model=gpt-4&total_tokens=1" with body:
"""
{}
"""
Then the response status code should be 200
And the response header "Content-Encoding" should contain "gzip"
And the response header "X-Ratelimit-Remaining" should be "1"
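
The `Content-Encoding: gzip` assertion matters because the token count lives inside the compressed body: the gateway must inflate the response before the `$.args.total_tokens[0]` lookup can succeed. A small Python sketch of that decode step (the body shape is an assumption about the echo backend, not its documented format):

```python
import gzip
import json

# Hypothetical upstream body, shaped the way the template's JSONPath
# identifiers expect (the echo backend's real schema is an assumption).
body = {"args": {"model": ["gpt-4"], "total_tokens": ["1"]}}
compressed = gzip.compress(json.dumps(body).encode("utf-8"))

# Before inflation the bytes are not JSON at all: they begin with the
# gzip magic number, so a naive parser would see garbage.
assert compressed[:2] == b"\x1f\x8b"

# The gateway-side decode step: inflate first, then parse, then apply
# the equivalent of `$.args.total_tokens[0]`.
decoded = json.loads(gzip.decompress(compressed))
total_tokens = int(decoded["args"]["total_tokens"][0])
print(total_tokens)  # 1
```

If the filter skipped decompression, JSON parsing would fail on the raw gzip bytes and no tokens would be counted, so this scenario would never see the remaining count drop.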

# Second request: consume final token
When I send a POST request to "http://localhost:8080/gzip-response/chat/completions?model=gpt-4&total_tokens=1" with body:
"""
{}
"""
Then the response status code should be 200
And the response header "X-Ratelimit-Remaining" should be "0"

# Third request should now be blocked
When I send a POST request to "http://localhost:8080/gzip-response/chat/completions?model=gpt-4&total_tokens=1" with body:
"""
{}
"""
Then the response status code should be 429
Comment on lines +367 to +371
⚠️ Potential issue | 🟡 Minor

Missing response body assertion on the 429 response.

Every other 429 in this file also asserts And the response body should contain "Rate limit exceeded" (lines 153, 797, 1102). The third request here only checks the status code, leaving the error message unverified.

🛠️ Proposed addition
     Then the response status code should be 429
+    And the response body should contain "Rate limit exceeded"
 
     And I clear all headers
📝 Committable suggestion

Suggested change
     When I send a POST request to "http://localhost:8080/gzip-response/chat/completions?model=gpt-4&total_tokens=1" with body:
     """
     {}
     """
     Then the response status code should be 429
+    And the response body should contain "Rate limit exceeded"


And I clear all headers
Given I authenticate using basic auth as "admin"
When I delete the LLM provider "gzip-response-provider"
Then the response status code should be 200
When I delete the LLM provider template "gzip-response-template"
Then the response status code should be 200

Scenario: Token-based rate limit returns proper headers
Given I authenticate using basic auth as "admin"
