Wednesday, June 25, 2014

Quicksort

Sorting algorithms are a frequent element to computer science education, conversation amongst programmers, and job interviews. There are many different versions with varying tradeoffs of performance and technique.

I noticed that Rosetta Code has a page on Quicksort implementations. I thought it might make a nice example of translating pseudocode to Factor.

simple quicksort

The "simple quicksort algorithm" has the following pseudocode:

function quicksort(array)
    less, equal, greater := three empty arrays
    if length(array) > 1  
        pivot := select any element of array
        for each x in array
            if x < pivot then add x to less
            if x = pivot then add x to equal
            if x > pivot then add x to greater
        quicksort(less)
        quicksort(greater)
        array := concatenate(less, equal, greater)

We can copy it verbatim using the ability to have named local variables:

:: quicksort ( seq -- sorted-seq )
    seq length 1 > [
        V{ } clone :> less
        V{ } clone :> equal
        V{ } clone :> greater
        seq first :> pivot
        seq [| x |
            x pivot <=> {
                { +lt+ [ x less push ] }
                { +eq+ [ x equal push ] }
                { +gt+ [ x greater push ] }
            } case
        ] each
        less quicksort equal greater quicksort 3append
    ] [ seq ] if ;

Even though local variables can be convenient, we discourage using them if library words or simpler concepts can express the same logic. Noticing that this partitions the sequence, and then joins the parts, we can make it a bit shorter using some available library words:

: quicksort ( seq -- sorted-seq )
    dup empty? [
      unclip [
          '[ _ before? ] partition [ quicksort ] bi@
      ] keep prefix append
    ] unless ;

Neither of these is particularly fast, since they involve the creation of a lot of temporary sequences. There is a better (meaning faster and not really more complex) version available.

better quicksort

The "better quicksort algorithm" is an in-place version that uses swaps to move items into a sorted order. It has the following pseudocode:

function quicksort(array)
    if length(array) > 1
        pivot := select any element of array
        left := first index of array
        right := last index of array
        while left ≤ right
            while array[left] < pivot
                left := left + 1
            while array[right] > pivot
                right := right - 1
            if left ≤ right
                swap array[left] with array[right]
                left := left + 1
                right := right - 1
        quicksort(array from first index to right)
        quicksort(array from left to last index)

We can take a similar translation approach to the first example (using some unsafe words to avoid bounds-checking and mutable local variables) to create this version:

:: (quicksort) ( seq from to -- )
    from to < [
        from to + 2/ seq nth-unsafe :> pivot
        from :> left!
        to :> right!

        [ left right <= ] [
            [ left seq nth-unsafe pivot before? ]
            [ left 1 + left! ] while
            [ right seq nth-unsafe pivot after? ]
            [ right 1 - right! ] while
            left right <= [
                left right seq exchange-unsafe
                left 1 + left!
                right 1 - right!
            ] when
        ] while

        seq from right (quicksort)
        seq left to (quicksort)
    ] when ; inline recursive

: quicksort ( seq -- )
    0 over length 1 - (quicksort) ;

This is faster, although about 3x slower than our current merge sort algorithm. There are probably ways we could make it faster (one I noticed and filed an issue to track that also makes merge sort faster).

I have committed a version of this in the sorting.quick vocabulary that I hope to use for faster in-place sorting in the standard library.

No comments: