How can I balance the diffusion steps? #5

@ynicle

I ran your default sample code on our H20 GPU.

With 768 tokens and 768 steps, it takes 100 s to get a result, which is too slow for our use.

So I tried reducing the number of steps to see whether that improves the speed.

  • 768 tokens + 256 steps: it takes 41 s and gives a correct result:
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[len(arr) // 2]  # Choose the pivot element
        left = [x for x in arr if x < pivot]  # Elements less than pivot
        middle = [x for x in arr if x == pivot]  # Elements equal to pivot
        right = [x for x in arr if x > pivot]  # Elements greater than pivot
        return quick_sort(left) + middle + quick_sort(right)

# Example usage:
arr = [3, 6, 8, 10, 1, 2, 1]
sorted_arr = quick_sort(arr)
print(sorted_arr)
  • 768 tokens + 128 steps: it takes 24 s and gives a wrong result:
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else =        pivot = arr[len(arr) 2 Choose 2  # Choose the pivot element as the middle element
        left = [x for x in arr if x < pivot]
        middle = [x for x in arr if x == pivot]
        right = [x for x in arr if x > pivot]
        return quick_sort(left) + middle + quick_sort(right)

# Example usage:
arr = [3, 6, 8, 10, 1, 2, 1]
sorted_arr = quick_sort(arr)
print(sorted_arr)
  • 768 tokens + 64 steps: it takes 16 s and gives a very wrong result:

Sure! Below is is one of the the algorithm algorithm... Quick. Quick.. Quick....... Quick......... Quick.......... Quick.......... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick........... Quick.....
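
The difference between the 256-step and 128-step outputs can be checked mechanically: one parses as Python and the other does not. Below is a minimal sketch using the standard-library `ast.parse`. The helper name `is_valid_python` is hypothetical, and a syntax check is only a lower bar; it does not show that the generated code sorts correctly.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python. Says nothing about correctness."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Line copied from the 256-step output above.
good_line = "pivot = arr[len(arr) // 2]  # Choose the pivot element"

# Line copied verbatim from the 128-step output above.
bad_line = (
    "else =        pivot = arr[len(arr) 2 Choose 2"
    "  # Choose the pivot element as the middle element"
)

print("256-step line parses:", is_valid_python(good_line))
print("128-step line parses:", is_valid_python(bad_line))
```

A check like this could flag step counts that are too low for code prompts without reading every output by hand.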

Reducing the steps clearly speeds up generation, and 256 steps still gives a correct result for this prompt.
But I'm not sure whether 256 steps will be enough for other prompts.
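
As a rough sanity check on the four timings above, here is a small sketch that fits run time as a fixed overhead plus a per-step cost. The linear model is an assumption, not something the sample code states, and the tokens-per-step figure is just 768 divided by the step count, assuming the sampler spreads tokens evenly across steps.

```python
# Measured (steps, seconds) pairs reported above, all with 768 tokens.
runs = [(768, 100.0), (256, 41.0), (128, 24.0), (64, 16.0)]
TOKENS = 768


def least_squares_line(points):
    """Fit seconds = overhead + per_step * steps by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    per_step = sxy / sxx
    overhead = mean_y - per_step * mean_x
    return overhead, per_step


overhead, per_step = least_squares_line(runs)
print(f"fixed overhead ~ {overhead:.1f} s, per-step cost ~ {per_step:.3f} s")

for steps, seconds in runs:
    # Average tokens finalized per step, under the even-spread assumption.
    print(f"{steps:>4} steps: {TOKENS / steps:.0f} tokens/step, {seconds:.0f} s measured")
```

Under these assumptions the cost comes out to roughly 0.12 s per step, so time scales almost linearly with steps, while the 128-step and 64-step runs ask for 6 and 12 tokens per step and are the ones that break.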

So what's your recommendation on setting the steps?
