Conversation

@abhishekg999
Collaborator

No description provided.

@gemini-code-assist

Summary of Changes

Hello @abhishekg999, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant architectural change to the Judgeval Java SDK by making the client project-scoped. This enhancement streamlines interactions with the Judgment platform by associating all operations with a specific project, improving organization and data isolation. The underlying API communication layer has been overhauled to support a new versioned API, expanding the SDK's capabilities for project, dataset, prompt, and custom scorer management. These changes aim to provide a more robust and feature-rich experience for developers using the Judgeval SDK.

Highlights

  • Project-Scoped Client Initialization: The Judgeval client now requires a projectName during initialization, which is then used to resolve a projectId for all subsequent API interactions, enabling project-specific operations.
  • API Client Refactoring and New Endpoints: The internal API clients (JudgmentAsyncClient, JudgmentSyncClient) have been significantly refactored to align with a new /v1 API structure, introducing numerous new endpoints for managing projects, datasets, prompts, and custom scorers.
  • Scorer Model and Factory Updates: The ScorerData and ScoringResult models have been refactored for clarity and better Jackson integration. APIScorer now explicitly handles thresholds and result types, and PromptScorerFactory and CustomScorerFactory have been updated to leverage the new project-scoped client.
  • Build and Development Environment Enhancements: The .gitignore file now ignores .metals (IDE metadata), and .vscode/settings.json excludes the **/target directory from file watching. A bun.lock file was also added, which suggests Bun is now used for JavaScript/TypeScript dependency management.
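
The project-scoped initialization described in the first highlight can be pictured with a minimal sketch. This is not the SDK's code: ProjectScopedClient, create, and the resolver parameter are hypothetical names used only for illustration, and the resolver stands in for whatever HTTP call maps a project name to an ID.

```java
import java.util.Objects;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical stand-in for a project-scoped client; not the SDK's actual class. */
final class ProjectScopedClient {
    private final String projectName;
    private final String projectId;

    private ProjectScopedClient(String projectName, String projectId) {
        this.projectName = projectName;
        this.projectId = projectId;
    }

    /**
     * Resolves the project once, up front, and fails fast if it cannot be found.
     * The resolver is an assumption standing in for the name-to-ID API call.
     */
    static ProjectScopedClient create(String projectName,
                                      Function<String, Optional<String>> resolver) {
        Objects.requireNonNull(projectName, "projectName is required");
        String id = resolver.apply(projectName)
                .orElseThrow(() -> new IllegalStateException(
                        "Could not resolve project: " + projectName));
        return new ProjectScopedClient(projectName, id);
    }

    String projectName() { return projectName; }

    String projectId() { return projectId; }
}
```

Resolving the ID once at construction means every factory created from the client can receive the same projectId, which matches the changelog entries where factories and tracers now accept projectId instead of resolving it themselves.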

Changelog
  • .gitignore
    • Added '.metals' to the list of ignored files.
  • .vscode/settings.json
    • Added '**/target' to 'files.watcherExclude' to prevent VS Code from watching the target directory.
  • bun.lock
    • Added a new Bun lock file, indicating a change in JavaScript/TypeScript dependency management.
  • examples/pom.xml
    • Updated the 'judgeval-java' dependency version to '0.5.0'.
  • examples/src/main/java/examples/simple_chat/SimpleChat.java
    • Modified the Judgeval.builder() call to include projectName.
    • Adjusted the tracer creation to remove redundant projectName setting.
  • judgeval-java/pom.xml
    • Updated the project version to '0.5.0'.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/Judgeval.java
    • Modified the constructor to require projectName and internally resolve projectId.
    • Updated tracer(), scorers(), and evaluation() factories to pass projectId and projectName.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/data/ScorerData.java
    • Refactored the class to no longer extend an internal API model, adding explicit Jackson annotations for properties.
    • Removed Javadoc comments for brevity.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/data/ScoringResult.java
    • Refactored the class to no longer extend an internal API model, adding explicit Jackson annotations for properties.
    • Removed Javadoc comments for brevity.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/evaluation/Evaluation.java
    • Added projectId and projectName fields to the class.
    • Removed Javadoc comments for brevity.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/evaluation/EvaluationFactory.java
    • Updated the constructor to accept projectId and projectName.
    • Modified the create() method to pass projectId and projectName to Evaluation.builder().
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/JudgmentAsyncClient.java
    • Added new asynchronous API endpoints for projects, datasets, prompts, custom scorers, and trace operations.
    • Updated ObjectMapper configuration to exclude null values during serialization.
    • Added logging for HTTP requests and responses.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/JudgmentSyncClient.java
    • Added new synchronous API endpoints for projects, datasets, prompts, custom scorers, and trace operations.
    • Updated ObjectMapper configuration to exclude null values during serialization.
    • Added logging for HTTP requests and responses.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/AddProjectRequest.java
    • Renamed from ResolveProjectNameRequest.java.
    • Modified class name and references within the file.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/AddProjectResponse.java
    • Renamed from ResolveProjectNameResponse.java.
    • Modified class name and references within the file.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/AddToRunEvalQueueExamplesResponse.java
    • Added a new model for the response of adding example evaluations to the queue.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/AddToRunEvalQueueTracesResponse.java
    • Added a new model for the response of adding trace evaluations to the queue.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/AddTraceTagsRequest.java
    • Added a new model for requests to add tags to a trace.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/AddTraceTagsResponse.java
    • Renamed from SavePromptScorerResponse.java.
    • Modified class name and fields to represent a response for adding trace tags.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/BaseScorer.java
    • Modified fields: removed threshold, modelClient, strictMode from the base definition.
    • Added minimumScoreRange, maximumScoreRange, requiredParams, strictMode, and usingNativeModel fields.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/CreateDatasetRequest.java
    • Added a new model for requests to create a dataset.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/CreateDatasetResponse.java
    • Added a new model for the response of creating a dataset.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/CustomScorerExistsResponse.java
    • Added a new model for the response checking if a custom scorer exists.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/DatasetInfo.java
    • Added a new model to represent dataset information.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/DeleteProjectResponse.java
    • Added a new model for the response of deleting a project.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/E2EFetchSpanScoreRequest.java
    • Added a new model for requests to fetch span scores in end-to-end traces.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/E2EFetchTraceRequest.java
    • Renamed from EvalResultsFetch.java.
    • Modified class name and fields to represent a request for fetching end-to-end traces.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/ErrorResponse.java
    • Added a new model to represent API error responses.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/ExampleEvaluationRun.java
    • Modified to use projectId instead of projectName.
    • Added userId and scorers fields.
    • Updated the type of traceAndSpanIds to List<List<String>>.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/ExampleScoringResult.java
    • Added a new model to represent scoring results for examples.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/ExperimentRunItem.java
    • Added a new model to represent an item in an experiment run.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/ExperimentScorer.java
    • Renamed from ScorerData.java.
    • Modified class name and fields to represent an experiment scorer.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/FetchExperimentRunResponse.java
    • Added a new model for the response of fetching an experiment run.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/FetchPromptResponse.java
    • Added a new model for the response of fetching a prompt.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/GetPromptVersionsResponse.java
    • Added a new model for the response of getting prompt versions.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/InsertExamplesRequest.java
    • Added a new model for requests to insert examples into a dataset.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/InsertExamplesResponse.java
    • Added a new model for the response of inserting examples.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/InsertPromptRequest.java
    • Added a new model for requests to insert a prompt.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/InsertPromptResponse.java
    • Added a new model for the response of inserting a prompt.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/LogEvalResultsRequest.java
    • Renamed from EvalResults.java.
    • Modified class name and fields to represent a request for logging evaluation results.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/LogEvalResultsResponse.java
    • Added a new model for the response of logging evaluation results.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/PromptCommitInfo.java
    • Added a new model to represent information about a prompt commit.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/PromptScorer.java
    • Removed the isBucketRubric field.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/PullDatasetResponse.java
    • Added a new model for the response of pulling a dataset.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/ResolveProjectRequest.java
    • Added a new model for requests to resolve a project by name.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/ResolveProjectResponse.java
    • Added a new model for the response of resolving a project.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/SavePromptScorerRequest.java
    • Removed the SavePromptScorerRequest model.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/ScorerConfig.java
    • Removed the strictMode field.
    • Added the resultType field.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/ScoringResult.java
    • Simplified the model by removing many fields, retaining only additionalProperties.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/TagPromptRequest.java
    • Renamed from FetchPromptScorersRequest.java.
    • Modified class name and fields to represent a request for tagging a prompt.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/TagPromptResponse.java
    • Added a new model for the response of tagging a prompt.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/TraceEvaluationRun.java
    • Modified to use projectId instead of projectName.
    • Added userId and scorers fields.
    • Updated the type of traceAndSpanIds to List<List<String>>.
    • Renamed isBucketRun to isBehavior.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/TraceInfo.java
    • Added a new model to represent trace information.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/TraceScoringResult.java
    • Added a new model to represent scoring results for traces.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/TraceSpan.java
    • Renamed from OtelTraceSpan.java.
    • Modified class name and fields to represent a trace span.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/TriggerRootSpanRulesRequest.java
    • Added a new model for requests to trigger root span rules.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/TriggerRootSpanRulesResponse.java
    • Added a new model for the response of triggering root span rules.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/UntagPromptRequest.java
    • Renamed from ScorerExistsRequest.java.
    • Modified class name and fields to represent a request for untagging a prompt.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/UntagPromptResponse.java
    • Added a new model for the response of untagging a prompt.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/UploadCustomScorerRequest.java
    • Added a new model for requests to upload a custom scorer.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/internal/api/models/UploadCustomScorerResponse.java
    • Added a new model for the response of uploading a custom scorer.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/scorers/APIScorer.java
    • Added a threshold field and initialized it to 0.5.
    • Updated getScorerConfig() to set resultType to 'numeric' and removed strictMode.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/scorers/ScorersFactory.java
    • Updated the constructor to accept projectId.
    • Modified promptScorer() and customScorer() methods to pass projectId to their respective factories.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/scorers/custom_scorer/CustomScorer.java
    • Added a projectId field to the class.
    • Modified the builder to accept and set projectId.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/scorers/custom_scorer/CustomScorerFactory.java
    • Updated the constructor to accept projectId.
    • Modified get() methods to pass projectId to CustomScorer.builder().
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/scorers/prompt_scorer/PromptScorer.java
    • Removed strictMode from the getScorerConfig() method.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/scorers/prompt_scorer/PromptScorerFactory.java
    • Updated the constructor to accept projectId.
    • Modified the get() method to use the new getProjectsScorers API call with projectId.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/tracer/BaseTracer.java
    • Modified the constructor to accept projectId directly, removing internal project resolution logic.
    • Updated ObjectMapper to set JsonInclude.Include.NON_NULL for serialization.
    • Modified createEvaluationRun and enqueueEvaluation to use projectId.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/tracer/Tracer.java
    • Modified the constructor and builder to accept and utilize projectId.
  • judgeval-java/src/main/java/com/judgmentlabs/judgeval/tracer/TracerFactory.java
    • Updated the constructor to accept projectName and projectId.
    • Modified the create() method to pass projectName and projectId to Tracer.builder().
  • judgeval-java/src/test/java/com/judgmentlabs/judgeval/JudgevalTest.java
    • Updated tests to include projectName in Judgeval.builder() calls.
  • judgeval-java/src/test/java/com/judgmentlabs/judgeval/evaluation/EvaluationFactoryTest.java
    • Updated setUp method to pass projectId and projectName to EvaluationFactory.
  • judgeval-java/src/test/java/com/judgmentlabs/judgeval/scorers/ScorersFactoryTest.java
    • Updated setUp method to pass projectId to ScorersFactory.
  • judgeval-java/src/test/java/com/judgmentlabs/judgeval/tracer/BaseTracerTest.java
    • Updated constructor calls to TestableBaseTracer to include projectId.
    • Removed mocking for project resolution as it's now handled externally.
  • judgeval-java/src/test/java/com/judgmentlabs/judgeval/tracer/TracerFactoryTest.java
    • Updated setUp method to pass projectName and projectId to TracerFactory.
  • judgeval-java/src/test/java/com/judgmentlabs/judgeval/tracer/TracerTest.java
    • Updated constructor calls to Tracer.builder() to include projectId.
    • Removed mocking for project resolution as it's now handled externally.
  • scripts/generate-client.sh
    • Updated the OpenAPI JSON endpoint URL from http://localhost:8000/openapi.json to http://localhost:10001/openapi/json.
  • scripts/generate_client.py
    • Major refactoring of the client generation script to support new API structure, schema collection, and dependency resolution.
    • Introduced INCLUDE_PREFIXES to filter API paths.
    • Updated schema resolution and Java type mapping logic.
    • Improved method name generation from OpenAPI operation IDs.
    • Added path parameter extraction.
    • Enhanced client class generation to include JsonInclude.Include.NON_NULL for ObjectMapper and added logging.
Activity
  • The pull request is currently marked as a work in progress, indicating ongoing development.
  • The author, abhishekg999, has initiated these changes.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request bumps the SDK to version 0.5.0, makes the Judgeval client project-scoped, and overhauls the internal API client and data models. projectName is now a required top-level configuration value, and the API client gains new endpoints, request and response logging, and better type safety. However, the review identified security vulnerabilities. The API clients are susceptible to path traversal and URL parameter injection because dynamic parameters are concatenated into URLs without encoding. The code generation script is also vulnerable to code injection if it is run against a malicious OpenAPI spec. Fixing these requires URL-encoding dynamic values in the generated clients and escaping strings in the code generation script. The review also noted a widespread removal of Javadoc, which is a regression in maintainability, and one place where an exception is caught and swallowed without logging. The architectural changes are positive, but the documentation loss and the exception handling need to be resolved.

}

public DeleteProjectResponse deleteProjects(String projectId) throws IOException, InterruptedException {
String url = buildUrl("/v1/projects/" + projectId);

security-high high

The SDK constructs API URLs by directly concatenating dynamic parameters such as projectId, datasetName, and traceId into the URL path. If these parameters contain path traversal sequences (e.g., ../), an attacker could potentially access or manipulate unintended API endpoints. It is recommended to URL-encode all dynamic path segments using URLEncoder.encode(value, StandardCharsets.UTF_8) before appending them to the URL path.
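
A minimal sketch of the recommended encoding follows. PathSegments and encode are hypothetical helper names, not part of the SDK. Two caveats are worth stating: URLEncoder performs form encoding, so spaces come back as "+" and are rewritten to "%20" here, and it leaves "." untouched, so a bare ".." segment is rejected explicitly rather than encoded.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for encoding one URL path segment; not part of the SDK. */
final class PathSegments {
    private PathSegments() {}

    static String encode(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("path segment must not be empty");
        }
        // URLEncoder does not escape '.', so dot segments need an explicit check.
        if (value.equals(".") || value.equals("..")) {
            throw new IllegalArgumentException("path segment must not be a dot segment");
        }
        // A literal '+' in the input is already encoded as %2B, so any '+' left
        // in the output stands for a space and can safely become %20.
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }
}
```

With a helper like this, the flagged line could become buildUrl("/v1/projects/" + PathSegments.encode(projectId)), so that an input such as "../admin" is sent as the single segment "..%2Fadmin" instead of changing the request path.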

public CompletableFuture<ScorerExistsResponse> scorerExists(ScorerExistsRequest payload) {
String url = buildUrl("/scorer_exists/");
public CompletableFuture<DeleteProjectResponse> deleteProjects(String projectId) {
String url = buildUrl("/v1/projects/" + projectId);

security-high high

The SDK constructs API URLs by directly concatenating dynamic parameters such as projectId into the URL path. If these parameters contain path traversal sequences (e.g., ../), an attacker could potentially access or manipulate unintended API endpoints. It is recommended to URL-encode all dynamic path segments using URLEncoder.encode(value, StandardCharsets.UTF_8) before appending them to the URL path.

placeholder = f"{{{param['name']}}}"
idx = remaining.index(placeholder)
if idx > 0:
    parts.append(f'"{remaining[:idx]}"')

security-medium medium

The generate_url_expr function embeds strings directly from the OpenAPI spec into Java code without escaping double quotes. A malicious OpenAPI spec could exploit this to inject arbitrary Java code into the generated client files. Proper escaping should be implemented for all strings retrieved from the OpenAPI spec before they are used in the generated Java source code.
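
The escaping this comment asks for can be sketched as follows. The generator itself is a Python script, so this Java version is only an illustration of the rule set, written in Java for consistency with the rest of the SDK; JavaLiterals and quote are hypothetical names, and the same logic would need to be ported to generate_client.py.

```java
/** Hypothetical helper that renders text as a safe Java string literal. */
final class JavaLiterals {
    private JavaLiterals() {}

    static String quote(String text) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20 || c > 0x7e) {
                        // Other control or non-ASCII characters become Unicode escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }
}
```

Escaping the backslash and the double quote is what prevents a path fragment from closing the literal early; a spec value such as " + evil() + " stays inside the quotes as plain text in the generated source.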

* Main entry point for the Judgment SDK. Provides access to tracer, scorer, and
* evaluation factories.
*/
public class Judgeval {

medium

The Javadoc for the Judgeval class and its public methods (tracer, scorers, evaluation, builder) has been removed. This reduces the clarity and usability of the SDK for developers. Please consider restoring the Javadoc to document the purpose and usage of these public APIs.

Comment on lines +66 to +68
} catch (Exception e) {
return Optional.empty();
}

medium

Catching a generic Exception and returning an empty Optional without logging can hide important errors during project ID resolution, such as network issues or authentication failures. This makes debugging difficult. It's recommended to log the exception to provide visibility into what went wrong.

        } catch (Exception e) {
            Logger.error("Failed to resolve project ID for project '" + name + "'.", e);
            return Optional.empty();
        }

/**
* Factory for creating evaluation builders.
*/
public final class EvaluationFactory {

medium

The Javadoc for EvaluationFactory and its create method has been removed. Please restore it to maintain code clarity and document the factory's role in creating Evaluation instances.

/**
* Factory for creating scorer builders and accessing scorer types.
*/
public final class ScorersFactory {

medium

The Javadoc for ScorersFactory and its methods has been removed. Please restore the documentation to explain the purpose of this factory and its methods for creating different types of scorers.

* @see <a href="https://docs.judgment.ai/judgeval/cli/upload-scorers">Judgment
* Docs: Upload Scorers</a>
*/
public final class CustomScorer extends APIScorer {

medium

The class-level Javadoc explaining what a CustomScorer is and how it's used has been removed. This information is valuable for developers using the SDK. Please consider restoring it, including the link to the documentation.

*/
import java.util.Optional;

public final class CustomScorerFactory {

medium

The Javadoc for CustomScorerFactory and its get methods has been removed. Please restore it to document how to create custom scorers, which is important for SDK usability.

/**
* Factory for creating tracer builders.
*/
public final class TracerFactory {

medium

The Javadoc for TracerFactory and its create method has been removed. Please restore it to explain the factory's purpose and how it's used to create Tracer instances.
