Experience is wasted if history does not repeat itself.
Sometime in 2009, during my Student Industrial Training days, I was working on a sorting algorithm. I got it working to a certain extent, but had to set it aside for a while.
It had to do with something that occurred to me then, which I called sorted clusters.
Later, some online research showed that much work had been done in the field, and it belonged to a class of sorting algorithms called distribution sorts.
Earlier this year, I was the lead developer on a software development team that needed to work on an algorithm. We settled on sorting algorithms, and I remembered the work I had done earlier and decided to rework it and see what could come of it.
That is where COBSort came from.
It employs a three-stage cache optimization and a formula that suggests sort indices for the elements of an unsorted array, producing a high-speed distribution sort.
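The post does not give the index-suggestion formula itself. A common choice in distribution sorts is linear interpolation over the value range, so purely as an illustration, a hypothetical `guessIndex` might look like this (the real COBSort formula may differ):

```java
public class GuessIndexSketch {
    // Hypothetical index-suggestion formula: maps a value proportionally
    // into [0, n-1] using the array's minimum and maximum. This is a
    // standard distribution-sort choice, shown only for illustration;
    // COBSort's actual formula is not published in this post.
    static int guessIndex(int value, int min, int max, int n) {
        if (max == min) return 0;               // all values equal: one bucket
        return (int) ((long) (value - min) * (n - 1) / (max - min));
    }

    public static void main(String[] args) {
        // With min=0, max=100, n=11: value 50 maps to index 5.
        System.out.println(guessIndex(50, 0, 100, 11)); // prints 5
    }
}
```

Because the mapping is monotonic in the value, elements placed by suggested index end up in roughly sorted order, leaving only small local runs to clean up.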
There are 2 flavors of this algorithm.
1. Speed Optimized COBSort
2. Memory Optimized COBSort
Only the first is discussed here. The difference between the two is in the extra memory needed to hold data while the sorting is being carried out.
The first needs 2 extra arrays to do this, whereas the second needs only one, making it one of the few distribution sorts to achieve this. It does so at a certain cost, however: its speed is usually a bit behind that of the first algorithm. A great benefit of using only 1 extra array is that the second algorithm can handle larger input arrays than the first on limited-memory systems.
Both handle floating-point and integer data arrays. Their speeds rival those of the quicksorts, mergesorts, shellsorts and heapsorts of this world, and outperform some of these algorithms in some cases.
Java implementations of both algorithms are ready, but here I give a generic algorithm for the Speed Optimized COBSort. I will post more later on.
The time complexity of these algorithms is similar to that of the bucket sort algorithm:
Best case = Average case = O(k*n); Worst case = O(n^2).
Here is the algorithm for the Speed Optimized COBSort:
Pseudocode
CobSort(inputArray, bucketSizesArray, bucketLocationsArray)
 1. n ← length[inputArray]
 2. MAX_INSERTION_SORT ← 25
 3. IF n > 1000000
 4.     FOR pass = 1 TO 3 DO
 5.         partition(inputArray)    // cache-optimization passes
        // END FOR
    // Stage 1: count how many elements each suggested index receives
 6. FOR j = 0 TO n-1 DO
 7.     ++bucketSizesArray[guessIndex(inputArray[j])]
    // END FOR
    // Stage 2: prefix-sum the counts into starting offsets
 8. x ← 0, bucketSizesAccumulator ← 0
 9. FOR k = 0 TO n-1 DO
10.     IF (x ← bucketSizesArray[k]) > 0
11.         bucketLocationsArray[k] ← bucketSizesAccumulator
12.         bucketSizesAccumulator ← bucketSizesAccumulator + x
    // END FOR
    // Stage 3: scatter elements by suggested index;
    // bucketSizesArray is reused as the output buffer
13. FOR i = 0 TO n-1 DO
14.     bucketSizesArray[bucketLocationsArray[guessIndex(inputArray[i])]++] ← inputArray[i]
    // END FOR
15. FOR i = 0 TO n-1 DO
16.     inputArray[i] ← bucketSizesArray[i]
    // END FOR
    // Stage 4: finish each run of elements sharing a suggested index
17. FOR index = 0 TO n-1 DO
18.     start ← index
19.     suggestedIndex ← guessIndex(inputArray[index])
20.     WHILE index < n DO
21.         IF suggestedIndex EQUALS guessIndex(inputArray[index])
22.             index ← index + 1
23.         ELSE
24.             break
        // END WHILE
25.     IF index GREATER THAN OR EQUALS n
26.         index ← n - 1
27.     bucketSize ← index - start + 1
28.     IF bucketSize EQUALS 1 AND index EQUALS n-1
29.         break
30.     ELSE IF bucketSize GREATER THAN OR EQUALS 2 AND bucketSize LESS THAN OR EQUALS MAX_INSERTION_SORT
31.         insertionSort(inputArray, start, index)
32.     ELSE
33.         bucketSort(inputArray, start, index)
    // END FOR
34. END CobSort
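To make the flow above concrete, here is a minimal Java sketch of the distribution phases. It is not the published implementation: the `guessIndex` formula is assumed to be linear interpolation over the value range (the post does not give the real formula), and the cache-optimization partition passes and the `bucketSort` fallback for large runs are omitted for brevity, with insertion sort finishing every run instead.

```java
import java.util.Arrays;

public class CobSortSketch {

    // Assumed index-suggestion formula: proportional mapping into [0, n-1].
    static int guessIndex(int value, int min, int max, int n) {
        if (max == min) return 0;
        return (int) ((long) (value - min) * (n - 1) / (max - min));
    }

    static void cobSort(int[] a) {
        int n = a.length;
        if (n < 2) return;
        int min = a[0], max = a[0];
        for (int v : a) { if (v < min) min = v; if (v > max) max = v; }

        int[] bucketSizes = new int[n];      // extra array 1 (counts, then output buffer)
        int[] bucketLocations = new int[n];  // extra array 2 (starting offsets)

        // Stage 1: count how many elements each suggested index receives.
        for (int v : a) bucketSizes[guessIndex(v, min, max, n)]++;

        // Stage 2: prefix-sum the counts into starting offsets.
        int acc = 0;
        for (int k = 0; k < n; k++) {
            bucketLocations[k] = acc;
            acc += bucketSizes[k];
        }

        // Stage 3: scatter elements into a buffer by suggested index, copy back.
        int[] out = new int[n];
        for (int v : a) out[bucketLocations[guessIndex(v, min, max, n)]++] = v;
        System.arraycopy(out, 0, a, 0, n);

        // Stage 4: elements sharing a suggested index form small unsorted
        // runs; finish each run with insertion sort.
        for (int start = 0; start < n; ) {
            int g = guessIndex(a[start], min, max, n);
            int end = start + 1;
            while (end < n && guessIndex(a[end], min, max, n) == g) end++;
            insertionSort(a, start, end - 1);
            start = end;
        }
    }

    static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int key = a[i], j = i - 1;
            while (j >= lo && a[j] > key) { a[j + 1] = a[j]; j--; }
            a[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        int[] data = {42, 7, 19, 3, 88, 19, 1, 56};
        cobSort(data);
        System.out.println(Arrays.toString(data));
        // prints [1, 3, 7, 19, 19, 42, 56, 88]
    }
}
```

Because the assumed `guessIndex` is monotonic in the value, scattering by suggested index leaves the array globally ordered between runs, so the final insertion sorts only ever work on short, nearly local stretches.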
Thanks. Comments are welcome.