@esafwan esafwan commented Feb 7, 2026

Summary

This PR extends the HUF image generation API with two configuration parameters (aspect_ratio and image_size), richer return metadata, and a new image retrieval tool. Together these close the gaps found when comparing it against slide generation workflows.

Motivation

The current image generation API lacks:

  1. Fine-grained control over aspect ratio and resolution (needed for slide generation)
  2. Comprehensive metadata in return values (for tracking and referencing)
  3. Ability for parent agents to access images generated by child agents via run_agent

These limitations prevent effective slide-optimized image generation and parent-child agent workflows.

Changes

1. Advanced Image Configuration Parameters ✅

Added Parameters:

  • aspect_ratio: Control image aspect ratio (e.g., "16:9", "9:16", "1:1", "4:3", "3:4")
  • image_size: Control resolution quality (e.g., "1K", "2K", "4K")

Benefits:

  • Enables slide-optimized generation (16:9 @ 2K resolution)
  • Improves text readability in generated images
  • Maintains backward compatibility (optional parameters)
  • Automatically applied for Google/Gemini providers

Usage Example:

generate_image(
    prompt="Create a professional slide with title 'Q4 Results'",
    aspect_ratio="16:9",
    image_size="2K",
    quality="high"
)

2. Enhanced Return Value with Metadata ✅

New Fields:

  • message_id: Reference to Agent Message document
  • conversation_id: For querying images later
  • Generation parameters: prompt, size, quality, aspect_ratio, image_size

Benefits:

  • Enables precise image referencing via message IDs
  • Full metadata for debugging and logging
  • Allows querying images later using conversation ID
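
As a rough illustration of the extended return value, the sketch below assembles a result dictionary holding the fields listed above. The function name build_image_result, the "url" key, and the sample IDs are assumptions made for illustration; the actual structure returned by handle_generate_image() may differ.

```python
from typing import Any, Optional


def build_image_result(
    url: str,
    message_id: str,
    conversation_id: str,
    prompt: str,
    size: Optional[str] = None,
    quality: Optional[str] = None,
    aspect_ratio: Optional[str] = None,
    image_size: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: bundle the image reference with its generation context."""
    return {
        "url": url,  # assumed key, mirroring the img['url'] usage elsewhere in this PR
        "message_id": message_id,  # reference to the Agent Message document
        "conversation_id": conversation_id,  # lets callers query images later
        # Generation parameters, kept for debugging and logging
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "aspect_ratio": aspect_ratio,
        "image_size": image_size,
    }


result = build_image_result(
    url="/files/q4-slide.png",  # made-up sample values
    message_id="MSG-0001",
    conversation_id="CONV-0001",
    prompt="Create a professional slide with title 'Q4 Results'",
    quality="high",
    aspect_ratio="16:9",
    image_size="2K",
)
```

Because the older fields stay in place and the new ones are only added, callers that read just the original keys would keep working.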

3. New Tool: get_conversation_images ✅

Purpose: Retrieve images from any conversation, solving the run_agent limitation.

Problem Solved:
The run_agent tool cannot return images to parent agents because images are saved as separate Agent Message documents, not included in the text response.

Solution:
New tool retrieves images from any conversation by conversation_id.

Usage Example:

# Parent runs child agent for image generation
result = run_agent(agent_name="image_generator", prompt="Generate logo")

# Parent retrieves images from child's conversation
images = get_conversation_images(conversation_id=result["conversation_id"])

# Parent can now access the images
for img in images["images"]:
    print(f"Image: {img['url']}, Message ID: {img['message_id']}")
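
To make the retrieval step concrete, here is a minimal in-memory sketch of the filtering that a tool like get_conversation_images might perform. The MESSAGES list, the "kind" field, and the function name get_conversation_images_sketch are stand-ins invented for this example; the real tool reads Agent Message documents and may use different field names.

```python
from typing import Any

# Hypothetical stand-in for stored Agent Message documents.
MESSAGES: list[dict[str, Any]] = [
    {"message_id": "MSG-1", "conversation_id": "CONV-A", "kind": "text", "url": None},
    {"message_id": "MSG-2", "conversation_id": "CONV-A", "kind": "image", "url": "/files/logo-1.png"},
    {"message_id": "MSG-3", "conversation_id": "CONV-B", "kind": "image", "url": "/files/other.png"},
    {"message_id": "MSG-4", "conversation_id": "CONV-A", "kind": "image", "url": "/files/logo-2.png"},
]


def get_conversation_images_sketch(
    messages: list[dict[str, Any]], conversation_id: str, limit: int = 10
) -> dict[str, Any]:
    """Return up to `limit` image entries that belong to one conversation."""
    images = [
        {"url": m["url"], "message_id": m["message_id"]}
        for m in messages
        if m["conversation_id"] == conversation_id and m["kind"] == "image"
    ]
    return {"conversation_id": conversation_id, "images": images[:limit]}


found = get_conversation_images_sketch(MESSAGES, "CONV-A")
first_only = get_conversation_images_sketch(MESSAGES, "CONV-A", limit=1)
```

The key point is that images are looked up by conversation_id rather than read from the child's text response, which is why the parent only needs the ID returned by run_agent.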

Technical Details

Files Modified

  1. huf/ai/sdk_tools.py

    • Updated handle_generate_image() with new parameters
    • Added handle_get_conversation_images() function
    • Enhanced return values with comprehensive metadata
  2. huf/install.py

    • Updated tool registration with new parameters
    • Added create_get_conversation_images_tool() function
    • Updated installation hooks
  3. huf/patches/v1/add_image_tool_advanced_params.py (NEW)

    • Migration patch for existing installations
    • Adds new parameters to existing generate_image tool
  4. huf/patches.txt

    • Registered migration patch

Commit Structure

  1. feat: add aspect_ratio and image_size parameters to image generation

    • Core functionality for advanced image configuration
    • Enhanced return value with metadata
  2. feat: add tool registration for enhanced image generation

    • Tool registration for new parameters
    • New get_conversation_images tool
  3. feat: add migration patch for image generation parameters

    • Database migration for existing installations
    • Ensures smooth upgrade path

Testing

Test Cases

  • Generate image with aspect_ratio="16:9" and image_size="2K"
  • Verify parameters are passed to Google/Gemini models
  • Test backward compatibility (without new parameters)
  • Verify enhanced return value includes message_id and conversation_id
  • Test get_conversation_images() retrieves images correctly
  • Test parent-child agent workflow with images
  • Verify migration patch is idempotent

Manual Testing Steps

  1. Test Advanced Parameters:

    generate_image(
        prompt="Professional slide: Q4 Results",
        aspect_ratio="16:9",
        image_size="2K"
    )
  2. Test Image Retrieval:

    result = run_agent(agent_name="test_agent", prompt="Generate image")
    images = get_conversation_images(conversation_id=result["conversation_id"])
  3. Test Migration:

    bench migrate
    # Verify new parameters in Agent Tool Function

Deployment

Migration Path

# Pull latest code
git pull

# Run migration (applies patch automatically)
bench migrate

# Restart services
bench restart

Rollback Plan

If issues arise:

  1. Revert commits in reverse order
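  2. Run bench migrate to re-sync tool definitions (Frappe does not reverse patches that were already applied, so remove the added parameters manually if needed)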
  3. Restart services

Breaking Changes

None. All changes are backward compatible:

  • New parameters are optional
  • Existing code continues to work without modifications
  • Return value structure is extended, not changed

Documentation

  • Comprehensive documentation added in /workspace/development/Docs/HUF_Imagen.md
  • Code references and usage examples included
  • Comparison with AISlideAppGooGAPI documented

Related Issues

Closes gaps identified in image generation comparison:

  • ✅ Aspect ratio control
  • ✅ Resolution control
  • ✅ Run agent image access
  • ✅ Image metadata tracking

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Comments added for complex logic
  • Documentation updated
  • No linter errors
  • Backward compatibility maintained
  • Migration patch tested
  • Commit messages are clear and descriptive

Screenshots/Examples

Before:

# Limited configuration
generate_image(prompt="...", size="1024x1024")

# No way to access child agent images
result = run_agent(...)  # Images lost

After:

# Advanced configuration
generate_image(
    prompt="...", 
    aspect_ratio="16:9", 
    image_size="2K"
)

# Parent can access child images
result = run_agent(...)
images = get_conversation_images(conversation_id=result["conversation_id"])

Additional Notes

  • Parameters are provider-specific (Google/Gemini/Vertex AI)
  • Other providers ignore new parameters gracefully
  • Tool registry automatically syncs new tool
  • Patch is safe to run multiple times

Reviewers: Please test with Google/Gemini models for full functionality.

Add support for advanced image configuration parameters to enable
slide-optimized image generation with better text readability.

Changes:
- Add aspect_ratio parameter (e.g., "16:9", "9:16", "1:1")
- Add image_size parameter (e.g., "2K", "4K", "1K")
- Parameters are optional and maintain backward compatibility
- Automatically applied for Google/Gemini providers
- Enhanced return value with message_id, conversation_id, and metadata

Benefits:
- Enables slide-optimized generation (16:9 @ 2K resolution)
- Improves text readability in generated images
- Provides comprehensive metadata for image tracking
- Follows AISlideAppGooGAPI pattern for advanced configuration

Technical Details:
- Parameters passed as provider-specific kwargs to LiteLLM
- Only applied when provider is Google/Gemini/Vertex AI
- Return value now includes full generation context

Register new parameters and get_conversation_images tool in the
Agent Tool Function system.

Changes:
- Add aspect_ratio and image_size parameters to generate_image tool
- Create new get_conversation_images tool for image retrieval
- Update after_install() and after_migrate() hooks

New Tool: get_conversation_images
- Retrieves images from any conversation by conversation_id
- Returns image URLs, message IDs, and metadata
- Solves run_agent limitation (parent can access child images)
- Enables parent-child agent image workflows

Tool Parameters:
- conversation_id (optional): Target conversation
- limit (optional): Max images to return (default: 10)

Use Case:
Parent agent can now retrieve images generated by child agents:
1. Parent calls run_agent(agent_name="image_gen", prompt="...")
2. Parent calls get_conversation_images(conversation_id=result["conversation_id"])
3. Parent accesses child's generated images

Add database migration patch to update existing installations with
new image generation parameters.

Changes:
- Create patch: add_image_tool_advanced_params.py
- Register patch in patches.txt
- Patch adds aspect_ratio and image_size to existing generate_image tool

Patch Behavior:
- Checks if generate_image tool exists
- Adds aspect_ratio parameter if not present
- Adds image_size parameter if not present
- Safe to run multiple times (idempotent)
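
The patch behavior above can be sketched with plain dictionaries. This is only an illustration of the idempotency check: the registry layout, the parameter dict shape, and the function name apply_patch are assumptions, not the actual Frappe document API used by the patch.

```python
# Hypothetical parameter definitions; the real field names may differ.
NEW_PARAMS = [
    {"name": "aspect_ratio", "type": "string", "required": False},
    {"name": "image_size", "type": "string", "required": False},
]


def apply_patch(registry: dict) -> int:
    """Add missing parameters to generate_image; return how many were added."""
    tool = registry.get("generate_image")
    if tool is None:
        # Tool not installed: nothing to update.
        return 0
    existing = {p["name"] for p in tool["parameters"]}
    added = 0
    for param in NEW_PARAMS:
        if param["name"] not in existing:
            tool["parameters"].append(dict(param))
            existing.add(param["name"])
            added += 1
    return added


registry = {
    "generate_image": {
        "parameters": [{"name": "prompt", "type": "string", "required": True}]
    }
}
first_run = apply_patch(registry)   # adds both parameters
second_run = apply_patch(registry)  # finds them present and adds nothing
```

Checking by parameter name before appending is what makes a second run a no-op.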

Deployment:
- Runs automatically on 'bench migrate'
- Updates existing installations without breaking changes
- New installations get parameters from create_image_generation_tool()

Migration Path:
[post_model_sync]
huf.patches.add_tool_types
huf.patches.v1.update_image_tool
huf.patches.v1.update_agent_background_color
huf.patches.v1.add_image_tool_advanced_params  <- NEW
…_ratio/image_size

Instead of conditionally applying aspect_ratio and image_size only for
Google/Gemini providers, pass them unconditionally to LiteLLM and let
it handle provider-specific forwarding.

Benefits:
- Follows LiteLLM's design: "Any non-openai params will be treated as
  provider-specific params and sent as kwargs to the provider"
- More future-proof: If other providers add support, they'll work automatically
- Cleaner code: No provider-specific conditionals needed
- Providers that don't support these params will ignore them gracefully

Technical Details:
- Removed provider check before adding aspect_ratio/image_size
- LiteLLM automatically forwards these to providers that support them
- Updated documentation to reflect universal parameter support
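
A minimal sketch of the unconditional approach, assuming the call arguments are collected in a dict before being handed to LiteLLM. The helper name build_image_kwargs and the model string are hypothetical; the sketch only builds the kwargs and does not call LiteLLM.

```python
from typing import Any, Optional


def build_image_kwargs(
    prompt: str,
    model: str,
    aspect_ratio: Optional[str] = None,
    image_size: Optional[str] = None,
    **base: Any,
) -> dict[str, Any]:
    """Collect call arguments, adding the new parameters only when the caller set them."""
    kwargs: dict[str, Any] = {"prompt": prompt, "model": model, **base}
    # No provider check here: forwarding is left to LiteLLM.
    if aspect_ratio is not None:
        kwargs["aspect_ratio"] = aspect_ratio
    if image_size is not None:
        kwargs["image_size"] = image_size
    return kwargs


legacy_call = build_image_kwargs("A logo", "example-provider/example-model")
slide_call = build_image_kwargs(
    "Professional slide: Q4 Results",
    "example-provider/example-model",  # placeholder, not a real model name
    aspect_ratio="16:9",
    image_size="2K",
)
# In real code the dict would then be passed on, e.g. litellm.image_generation(**slide_call).
```

Omitting unset parameters keeps existing calls byte-for-byte identical, which is what preserves backward compatibility.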
…ad of patch

Instead of using a separate migration patch, update create_image_generation_tool()
to handle both creation and updates. This is cleaner and eliminates the need
for a separate patch file.

Changes:
- Function now checks if tool exists and updates it if needed
- Adds missing aspect_ratio and image_size parameters to existing tools
- Removed add_image_tool_advanced_params.py patch (no longer needed)
- Removed patch registration from patches.txt

Benefits:
- Simpler: One function handles both creation and updates
- No separate patch needed - function runs in after_migrate() hook
- Idempotent: Safe to run multiple times
- Cleaner codebase: Fewer files to maintain

The function is called in after_install() and after_migrate() hooks, so
existing installations will automatically get updated during migration.
@esafwan esafwan force-pushed the feature/enhance-image-generation-api branch from 5245810 to de7d2f4 Compare February 7, 2026 16:40
@esafwan esafwan changed the title Feature/enhance image generation api feat: enhance image generation api Feb 7, 2026
@esafwan esafwan force-pushed the feature/enhance-image-generation-api branch from de7d2f4 to 9617a3e Compare February 7, 2026 17:42