Opened 15 years ago
Closed 14 years ago
#65 closed enhancement (fixed)
Use nth_element instead of sort for finding median
Reported by: | MatthewWhiting | Owned by: | MatthewWhiting |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Code base | Version: | 1.1.8 |
Severity: | normal | Keywords: | |
Cc: |
Description
Anywhere we are calculating the median (or madfm) in Duchamp, we are sorting the entire dataset. This is unnecessary, as we only want the middle element - we don't care how things are ordered away from the middle.
A better tool would be the nth_element function in the STL. It shows a factor of ~6.8x speedup over sorting (using teststats.x).
Change History (5)
comment:1 Changed 15 years ago by
Status: | new → assigned |
---|
comment:2 Changed 14 years ago by
comment:3 Changed 14 years ago by
Updated setCubeStats() in [647]. However there seems to be a difference in the calculation of the std.dev. Taking the difference in the .hdr files produced before & after this change shows:
1c1 < Results of the Duchamp source finder: Thu Feb 4 15:17:12 2010 --- > Results of the Duchamp source finder: Thu Feb 4 14:19:35 2010 51c51 < Mean = 0.420022 Std.Dev. = 12.7747 --- > Mean = 0.420022 Std.Dev. = 12.7598
This needs chasing up!
comment:4 Changed 14 years ago by
This last error ended up being (I think) because in the old setCubeStats() function the array was being sorted before the std.dev was calculated. The difference in order of the elements, combined with floating-point precision, is probably enough to give the slight difference in the values (recall stddev is the sum of the squares of the differences).
This means that the new version of the code should be OK.
comment:5 Changed 14 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
[650] incorporated the changes to the atrous reconstruction functions, making use of the masked findMedianStats functions. This is probably enough to close this ticket, as that is all the instances of std::sort in this fashion.
Done this for the getStats.cc functions in [646]. There are a number of other places that use it, most importantly the setCubeStats() function in cubes.cc. This function should also make use of the masked versions of the getStats functions, rather than doing it all itself.
Doing some profiling with Shark shows this is quite effective, but that there a few other related changes that could be made to speed things up a bit, particularly in the reconstruction functions, where again the masked versions of the functions should be used.