Opened 15 years ago

Closed 14 years ago

#65 closed enhancement (fixed)

Use nth_element instead of sort for finding median

Reported by: MatthewWhiting Owned by: MatthewWhiting
Priority: normal Milestone:
Component: Code base Version: 1.1.8
Severity: normal Keywords:
Cc:

Description

Anywhere we are calculating the median (or madfm) in Duchamp, we are sorting the entire dataset. This is unnecessary, as we only want the middle element - we don't care how things are ordered away from the middle.

A better tool would be the nth_element function in the STL. It shows a factor of ~6.8x speedup over sorting (using teststats.x).

Change History (5)

comment:1 Changed 15 years ago by MatthewWhiting

Status: newassigned

comment:2 Changed 14 years ago by MatthewWhiting

Done this for the getStats.cc functions in [646]. There are a number of other places that use it, most importantly the setCubeStats() function in cubes.cc. This function should also make use of the masked versions of the getStats functions, rather than doing it all itself.

Doing some profiling with Shark shows this is quite effective, but that there a few other related changes that could be made to speed things up a bit, particularly in the reconstruction functions, where again the masked versions of the functions should be used.

comment:3 Changed 14 years ago by MatthewWhiting

Updated setCubeStats() in [647]. However there seems to be a difference in the calculation of the std.dev. Taking the difference in the .hdr files produced before & after this change shows:

1c1
< Results of the Duchamp source finder: Thu Feb  4 15:17:12 2010
---
> Results of the Duchamp source finder: Thu Feb  4 14:19:35 2010
51c51
< Mean   = 0.420022     Std.Dev. = 12.7747
---
> Mean   = 0.420022     Std.Dev. = 12.7598

This needs chasing up!

comment:4 Changed 14 years ago by MatthewWhiting

This last error ended up being (I think) because in the old setCubeStats() function the array was being sorted before the std.dev was calculated. The difference in order of the elements, combined with floating-point precision, is probably enough to give the slight difference in the values (recall stddev is the sum of the squares of the differences).

This means that the new version of the code should be OK.

comment:5 Changed 14 years ago by MatthewWhiting

Resolution: fixed
Status: assignedclosed

[650] incorporated the changes to the atrous reconstruction functions, making use of the masked findMedianStats functions. This is probably enough to close this ticket, as that is all the instances of std::sort in this fashion.

Note: See TracTickets for help on using tickets.