Function Objects

# Example 4: Sorting by user-specified order

Suppose we would like to sort the rows of a 2-d matrix by the last column (representing "age"). This can be done with:

// sort by last column
sorted = matrix.viewSorted(matrix.columns()-1);
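To make the row-ordering idea concrete without the library, here is a minimal plain-Java sketch on a double[][]. The class and method names (SortRowsByLastColumn, sortedByColumn) are hypothetical and not part of Colt. It only mimics the idea of returning a reordered view: the row arrays are shared with the input rather than copied.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-in for the idea behind viewSorted(column); not the Colt API.
final class SortRowsByLastColumn {

    /** Returns the rows reordered by the given column; row arrays are shared, not copied. */
    static double[][] sortedByColumn(double[][] matrix, int column) {
        double[][] view = matrix.clone(); // shallow copy: only the row references
        // Arrays.sort on object arrays is stable, so ties keep their input order.
        Arrays.sort(view, Comparator.comparingDouble((double[] row) -> row[column]));
        return view;
    }

    public static void main(String[] args) {
        // columns: id, height, age
        double[][] people = {
            {1, 180, 42},
            {2, 165, 17},
            {3, 172, 30},
        };
        double[][] sorted = sortedByColumn(people, people[0].length - 1);
        System.out.println(Arrays.deepToString(sorted));
    }
}
```

The input matrix keeps its original row order; only the returned array of row references is rearranged.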

Or suppose we would like to sort the columns of a 2-d matrix by the last row. Unfortunately, there is no convenience method to sort directly by row. So we need to view columns as rows and rows as columns, then sort, then adjust our view again.

// sort by last row
int lastRow = matrix.rows()-1;
sorted = matrix.viewDice().viewSorted(lastRow).viewDice();
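The dice, sort, dice sequence can be illustrated in plain Java as transpose, sort rows, transpose back. This is only a sketch under assumptions: the names SortColumnsByRow, transpose and sortedByRow are hypothetical, the matrix is assumed non-empty and rectangular, and the sketch copies data where the snippet above works on views.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration of "view columns as rows, sort, view back"; not the Colt API.
final class SortColumnsByRow {

    /** Returns a transposed copy of a non-empty rectangular matrix. */
    static double[][] transpose(double[][] m) {
        double[][] t = new double[m[0].length][m.length];
        for (int r = 0; r < m.length; r++) {
            for (int c = 0; c < m[0].length; c++) {
                t[c][r] = m[r][c];
            }
        }
        return t;
    }

    /** Returns a copy of the matrix with its columns ordered by the values in the given row. */
    static double[][] sortedByRow(double[][] matrix, int row) {
        double[][] diced = transpose(matrix); // columns become rows
        // After transposing, the original row index is a column index.
        Arrays.sort(diced, Comparator.comparingDouble((double[] r) -> r[row]));
        return transpose(diced);              // rows become columns again
    }

    public static void main(String[] args) {
        double[][] m = {
            {3, 1, 2},
            {30, 10, 20},
        };
        System.out.println(Arrays.deepToString(sortedByRow(m, m.length - 1)));
    }
}
```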

Next, we would like to sort the rows of a 2d matrix by the aggregate sum of values in a row. A comparator object is used to do the job:

// sort by sum of values in a row
DoubleMatrix1DComparator comp = new DoubleMatrix1DComparator() {
    public int compare(DoubleMatrix1D a, DoubleMatrix1D b) {
        double as = a.zSum();
        double bs = b.zSum();
        return as < bs ? -1 : as == bs ? 0 : 1;
    }
};
sorted = cern.colt.matrix.tdouble.algo.Sorting.quickSort(matrix,comp);

Further, we would like to sort the rows of a 2d matrix by the aggregate sum of logarithms in a row (which is a way to achieve sorting by geometric mean when viewing a row as a series of samples). A slightly more complex comparator object is needed:

// sort by sum of logarithms in a row
DoubleMatrix1DComparator comp = new DoubleMatrix1DComparator() {
    public int compare(DoubleMatrix1D a, DoubleMatrix1D b) {
        double as = a.aggregate(cern.jet.math.Functions.plus,cern.jet.math.Functions.log);
        double bs = b.aggregate(cern.jet.math.Functions.plus,cern.jet.math.Functions.log);
        return as < bs ? -1 : as == bs ? 0 : 1;
    }
};
sorted = cern.colt.matrix.tdouble.algo.Sorting.quickSort(matrix,comp);

This is certainly not the most efficient approach, since row sums are recomputed many times (2*rows*log(rows) times, on average), but it will suffice as an example. An efficient application will precompute the sums and use cern.colt.GenericSorting to sort the matrix. In general, if comparisons are expensive, precomputation boosts performance by a factor of 2*log(rows).
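The precomputation idea can be sketched in plain Java as follows. This is an illustration only: it does not use cern.colt.GenericSorting but sorts an index array with java.util.Arrays.sort instead, and the names PrecomputedKeySort, sumLog and sortedBySumLog are hypothetical. Each row's key is computed exactly once, so the comparisons during sorting only read precomputed numbers.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch of "precompute the keys, then sort"; not the Colt API.
final class PrecomputedKeySort {

    /** Sum of natural logarithms of the row values (assumes positive entries). */
    static double sumLog(double[] row) {
        double sum = 0;
        for (double v : row) {
            sum += Math.log(v);
        }
        return sum;
    }

    /** Returns the rows ordered by their sum of logarithms; row arrays are shared. */
    static double[][] sortedBySumLog(double[][] matrix) {
        int rows = matrix.length;

        // Step 1: one expensive aggregate per row, computed once.
        double[] keys = new double[rows];
        for (int i = 0; i < rows; i++) {
            keys[i] = sumLog(matrix[i]);
        }

        // Step 2: sort row indexes by the cheap precomputed key.
        Integer[] order = new Integer[rows];
        for (int i = 0; i < rows; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> keys[i]));

        // Step 3: assemble the rows in sorted order.
        double[][] sorted = new double[rows][];
        for (int i = 0; i < rows; i++) {
            sorted[i] = matrix[order[i]];
        }
        return sorted;
    }
}
```

The comparator version evaluates the aggregate on every comparison, while this version evaluates it once per row, which is where the saving described above comes from.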

Recently, two methods that do exactly that were added to cern.colt.matrix.tdouble.algo.DoubleSorting. One of them works by filling a row into a so-called "bin", which is a multiset with statistics operations defined on it. Aggregate measures over the row are then computed via a DoubleBinFunction1D. Some prefabricated functions are contained in DoubleBinFunctions1D. Here is how to solve the problem efficiently:

// sort by sum of logarithms in a row
sorted = cern.colt.matrix.tdouble.algo.Sorting.quickSort(matrix,hep.aida.bin.DoubleBinFunctions1D.sumLog);

// sort by median in a row
sorted = cern.colt.matrix.tdouble.algo.Sorting.quickSort(matrix,hep.aida.bin.DoubleBinFunctions1D.median);

// sort by maximum in a row
sorted = cern.colt.matrix.tdouble.algo.Sorting.quickSort(matrix,hep.aida.bin.DoubleBinFunctions1D.max);
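The pattern behind these calls, passing an aggregate function object and sorting rows by its value, can be sketched in plain Java. Here java.util.function.ToDoubleFunction<double[]> is used as a hypothetical stand-in for DoubleBinFunction1D, and the names AggregateSort, MAX, MEDIAN and sortedBy are assumptions for illustration, not library identifiers. Rows are assumed to be non-empty.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch of sorting rows by an aggregate function object; not the Colt API.
final class AggregateSort {

    /** Largest value in a non-empty row. */
    static final ToDoubleFunction<double[]> MAX =
            row -> Arrays.stream(row).max().getAsDouble();

    /** Median of a non-empty row (mean of the two middle values for even length). */
    static final ToDoubleFunction<double[]> MEDIAN = row -> {
        double[] copy = row.clone(); // do not disturb the caller's row
        Arrays.sort(copy);
        int n = copy.length;
        return n % 2 == 1 ? copy[n / 2] : (copy[n / 2 - 1] + copy[n / 2]) / 2.0;
    };

    /** Returns the rows ordered by the aggregate, evaluated once per row. */
    static double[][] sortedBy(double[][] matrix, ToDoubleFunction<double[]> aggregate) {
        int rows = matrix.length;
        double[] keys = new double[rows];
        Integer[] order = new Integer[rows];
        for (int i = 0; i < rows; i++) {
            keys[i] = aggregate.applyAsDouble(matrix[i]);
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> keys[i]));

        double[][] sorted = new double[rows][];
        for (int i = 0; i < rows; i++) {
            sorted[i] = matrix[order[i]];
        }
        return sorted;
    }
}
```

A call such as AggregateSort.sortedBy(matrix, AggregateSort.MEDIAN) then plays the role of the median example above; any other aggregate can be supplied as a lambda.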