hats.pixel_math.partition_stats

Utilities for generating and manipulating object count histograms

Functions

empty_histogram(highest_order)

Use numpy to create a histogram array with the right shape, filled with zeros.

generate_histogram(data, highest_order[, ra_column, ...])

Generate a histogram of counts for objects found in data

generate_alignment(row_count_histogram[, ...])

Generate alignment from high order pixels to those of equal or lower order

generate_incremental_alignment(row_count_histogram, ...)

Generate alignment for an incremental catalog.

Module Contents

empty_histogram(highest_order)

Use numpy to create a histogram array with the right shape, filled with zeros.

Parameters:
highest_order : int

the highest healpix order (e.g. 0-10)

Returns:
np.ndarray

one-dimensional numpy array of long integers, with one entry per pixel in a healpix map at highest_order (12 * 4**highest_order entries), all set to 0.
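The shape rule can be illustrated with a minimal sketch. It is not the library's implementation: the name sketch_empty_histogram is hypothetical, and the sketch relies only on the standard HEALPix fact that a map at a given order has 12 * 4**order pixels.

```python
import numpy as np


def sketch_empty_histogram(highest_order):
    """Hypothetical stand-in: zero-filled count array, one bin per pixel."""
    # A HEALPix map has 12 base pixels, and each order splits every pixel in 4.
    n_pixels = 12 * 4**highest_order
    return np.zeros(n_pixels, dtype=np.int64)


histogram = sketch_empty_histogram(2)
print(len(histogram), histogram.dtype)
```

At order 0 the array has 12 bins, and each additional order multiplies the length by 4.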

generate_histogram(data: pandas.DataFrame, highest_order, ra_column='ra', dec_column='dec')

Generate a histogram of counts for objects found in data

Parameters:
data : pd.DataFrame

tabular object data

highest_order : int

the highest healpix order (e.g. 0-10)

ra_column : str

name of the column in the input that holds the right ascension (Default value = "ra")

dec_column : str

name of the column in the input that holds the declination (Default value = "dec")

Returns:
np.ndarray

one-dimensional numpy array of long integers where the value at each index corresponds to the number of objects found at the healpix pixel.

Raises:
ValueError

if the ra_column or dec_column cannot be found in the input data.
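The counting step behind such a histogram can be sketched as follows. This is an illustration only: the helper count_pixels is hypothetical, and it assumes the healpix pixel index of each object at highest_order has already been computed from its coordinates by some other means.

```python
import numpy as np


def count_pixels(pixel_ids, highest_order):
    """Hypothetical helper: count objects per pixel from precomputed pixel ids."""
    n_pixels = 12 * 4**highest_order
    pixel_ids = np.asarray(pixel_ids, dtype=np.int64)
    # bincount tallies how often each index occurs; minlength pads the
    # result so that pixels without any objects still get a zero entry.
    return np.bincount(pixel_ids, minlength=n_pixels).astype(np.int64)


counts = count_pixels([0, 0, 5, 47], highest_order=1)
print(counts[0], counts[5], counts[47], counts.sum())
```

The result has the same length as the array returned by empty_histogram for the same order, so counts from several batches of data can be summed element by element.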

generate_alignment(row_count_histogram, highest_order=10, lowest_order=0, threshold=1000000, drop_empty_siblings=False, mem_size_histogram=None)

Generate alignment from high order pixels to those of equal or lower order

We may initially find healpix pixels at order 10, but after aggregating up to the pixel threshold, some final pixels may be at order 4 or 7. This method provides a map from pixels at order 10 to their destination pixel. This map may be used as an input to later partitioning map-reduce steps.

Parameters:
row_count_histogram : np.array

one-dimensional numpy array of long integers where the value at each index corresponds to the number of objects found at the healpix pixel.

highest_order : int

the highest healpix order (e.g. 5-10) (Default value = 10)

lowest_order : int

the lowest healpix order (e.g. 1-5). Specifying a lowest order constrains the partitioning to prevent spatially large pixels. (Default value = 0)

threshold : int

the maximum number of objects allowed in a single pixel (Default value = 1_000_000)

drop_empty_siblings : bool

if 3 of 4 sibling pixels are empty, keep only the non-empty pixel (Default value = False)

mem_size_histogram : np.array or None

one-dimensional numpy array of long integers where the value at each index corresponds to the memory size (in bytes) of objects found at the healpix pixel. If provided, this will be used to determine the thresholding instead of row_count_histogram. (Default value = None)

Returns:
tuple

one-dimensional numpy array of integer 3-tuples, where the value at each index corresponds to the destination pixel at order less than or equal to the highest_order. The tuple contains three integers:

  • order of the destination pixel

  • pixel number at the above order

  • the number of objects in the pixel

Raises:
ValueError

if the histogram is the wrong size, or some initial histogram bins exceed threshold.
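The aggregation described above can be sketched with a simplified greedy pass. This is a hypothetical stand-in, not the library's implementation: the name sketch_alignment is invented, NESTED pixel ordering is assumed (so the ancestor of a pixel is found by shifting its index right by two bits per order), empty bins are simply left as None, and drop_empty_siblings and mem_size_histogram are not handled.

```python
import numpy as np


def sketch_alignment(histogram, highest_order, lowest_order=0, threshold=1_000_000):
    """Hypothetical, simplified illustration of pixel alignment."""
    histogram = np.asarray(histogram, dtype=np.int64)
    expected = 12 * 4**highest_order
    if len(histogram) != expected:
        raise ValueError(f"histogram must have {expected} bins")
    if histogram.max() > threshold:
        raise ValueError("some histogram bins exceed threshold")

    # sums[k][p] is the number of objects inside pixel p at order k.
    sums = {highest_order: histogram}
    for order in range(highest_order - 1, lowest_order - 1, -1):
        # Assumes NESTED ordering: the children of pixel p are 4p .. 4p+3.
        sums[order] = sums[order + 1].reshape(-1, 4).sum(axis=1)

    alignment = [None] * expected
    for pixel in range(expected):
        if histogram[pixel] == 0:
            continue  # this sketch gives empty bins no destination
        # Try the coarsest allowed order first and stop at the first
        # ancestor whose total count fits under the threshold.
        for order in range(lowest_order, highest_order + 1):
            ancestor = pixel >> (2 * (highest_order - order))
            count = int(sums[order][ancestor])
            if count <= threshold:
                alignment[pixel] = (order, ancestor, count)
                break
    return alignment


hist = np.zeros(48, dtype=np.int64)
hist[0:4] = 3  # order-0 pixel 0 holds 12 objects in total
hist[4] = 1    # order-0 pixel 1 holds 3 objects in total
hist[6] = 2
result = sketch_alignment(hist, highest_order=1, threshold=10)
print(result[0], result[4], result[6])
```

With a threshold of 10, the four bins under order-0 pixel 0 stay at order 1 because their combined count of 12 is too large, while the two non-empty bins under order-0 pixel 1 both map to that coarser pixel.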

generate_incremental_alignment(row_count_histogram: numpy.ndarray, existing_pixels: Sequence[tuple[int, int]], highest_order: int = 10, lowest_order: int = 0, threshold: int = 1000000, mem_size_histogram: numpy.ndarray | None = None)

Generate alignment for an incremental catalog.

We will keep the existing pixels and add new pixels for the points in the histogram that fall outside the existing coverage. Those pixels will be the largest possible non-overlapping pixels that obey the defined pixel threshold.

Unlike generate_alignment, there is no global guarantee that the number of points per pixel remains under the previous pixel_threshold.

Parameters:
row_count_histogram : np.ndarray

one-dimensional numpy array of long integers where the value at each index corresponds to the number of objects found at the healpix pixel.

existing_pixels : Sequence[tuple[int, int]]

the list of pixels in the existing catalog that we want to keep

highest_order : int

the highest healpix order (e.g. 5-10) (Default value = 10)

lowest_order : int

the lowest healpix order (e.g. 1-5). Specifying a lowest order constrains the partitioning to prevent spatially large pixels. (Default value = 0)

threshold : int

the maximum number of objects allowed in a single pixel (Default value = 1_000_000)

mem_size_histogram : np.ndarray or None

one-dimensional numpy array of long integers where the value at each index corresponds to the memory size (in bytes) of objects found at the healpix pixel. If provided, this will be used to determine the thresholding instead of row_count_histogram. (Default value = None)

Returns:
tuple

one-dimensional numpy array of integer 3-tuples, where the value at each index corresponds to the destination pixel at order less than or equal to the highest_order. The tuple contains three integers:

  • order of the destination pixel

  • pixel number at the above order

  • the number of objects in the pixel
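The idea of keeping existing coverage and aligning only the remaining bins can be sketched as below. This is an illustration under stated assumptions, not the library's implementation: the helper covered_mask is hypothetical, each entry of existing_pixels is assumed to be an (order, pixel) pair, and NESTED pixel ordering is assumed so that the descendants of a pixel form one contiguous block of indices.

```python
import numpy as np


def covered_mask(existing_pixels, highest_order):
    """Hypothetical helper: flag highest-order bins inside existing pixels."""
    mask = np.zeros(12 * 4**highest_order, dtype=bool)
    for order, pixel in existing_pixels:  # assumed (order, pixel) pairs
        if order > highest_order:
            raise ValueError("existing pixel is finer than highest_order")
        # Assumes NESTED ordering: a pixel at `order` covers a contiguous
        # block of 4**(highest_order - order) bins at highest_order.
        span = 4 ** (highest_order - order)
        mask[pixel * span : (pixel + 1) * span] = True
    return mask


histogram = np.ones(48, dtype=np.int64)
mask = covered_mask([(0, 1), (1, 0)], highest_order=1)
# Bins inside existing pixels keep their current destination; only the
# counts outside the existing coverage need new pixels.
new_counts = np.where(mask, 0, histogram)
print(int(mask.sum()), int(new_counts.sum()))
```

In this sketch the order-0 pixel 1 covers bins 4 to 7 and the order-1 pixel 0 covers bin 0, so 5 of the 48 bins are already covered and the other 43 would be candidates for new pixels.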