reciprocalspaceship.algorithms.scale_merged_intensities

reciprocalspaceship.algorithms.scale_merged_intensities(ds, intensity_key, sigma_key, output_columns=None, dropna=True, inplace=False, mean_intensity_method='isotropic', bins=100, bw=2.0, minimum_sigma=-inf)

Scales merged intensities using Bayesian statistics to estimate structure factor amplitudes. This method is based on the approach of French and Wilson [1]. It improves the estimates of negative and small intensities so that the resulting structure factor moduli are strictly positive.

The mean and standard deviation of acentric reflections are computed analytically from a truncated normal distribution. The mean and standard deviation for centric reflections are computed by numerical integration of the posterior intensity distribution under a Wilson prior, and then by interpolation with a kernel smoother.
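The acentric case can be illustrated with a short standard-library sketch. It assumes an exponential Wilson prior with mean intensity S and a normal measurement error, so the posterior is a normal distribution truncated at zero. The function name `acentric_posterior` and its arguments are hypothetical and are not part of reciprocalspaceship; the library's own implementation may differ.

```python
import math


def acentric_posterior(I, sigma, mean_intensity):
    """Posterior mean and std of the true intensity J (hypothetical helper).

    Assumes the prior p(J) = exp(-J / S) / S for J >= 0 and the likelihood
    N(I | J, sigma^2). Their product is a normal distribution with location
    I - sigma^2 / S and scale sigma, truncated to J >= 0.
    """
    mu = I - sigma**2 / mean_intensity  # location of the untruncated normal
    alpha = -mu / sigma                 # standardized truncation point (J = 0)
    pdf = math.exp(-0.5 * alpha**2) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(alpha / math.sqrt(2.0))  # P(Z > alpha)
    ratio = pdf / tail                  # inverse Mills ratio
    mean = mu + sigma * ratio
    var = sigma**2 * (1.0 + alpha * ratio - ratio**2)
    return mean, math.sqrt(max(var, 0.0))


# A negative measurement still yields a positive posterior mean.
weak_mean, weak_std = acentric_posterior(-1.0, 1.0, 10.0)
# A strong measurement is left almost unchanged.
strong_mean, strong_std = acentric_posterior(1000.0, 1.0, 10.0)
```

This sketch is not numerically hardened: for measurements many standard deviations below zero, `tail` underflows to zero and the division fails.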

Notes

This method follows the same approach as French and Wilson, with the following modifications:

  • Numerical integration under a Wilson prior is used to estimate the mean and standard deviation of centric reflections at runtime, rather than using precomputed results and a look-up table.

  • The same procedure is used for all centric reflections; the original work handled high-intensity centric reflections differently.
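The runtime integration for centric reflections might look like the following sketch. It assumes the centric Wilson prior, proportional to J^(-1/2) exp(-J / (2 S)), and a normal likelihood, and it uses a simple midpoint rule. The helper name `centric_posterior`, the substitution J = u^2, and the choice of integration window are illustrative assumptions, not the quadrature scheme used by the library.

```python
import math


def centric_posterior(I, sigma, mean_intensity, n=4000):
    """Posterior mean and std of J for a centric reflection (hypothetical helper).

    Substituting J = u^2 turns J^(-1/2) dJ into 2 du, which removes the
    singularity of the centric Wilson prior at J = 0.
    """
    # Prior times likelihood is Gaussian in J around mu, so integrate over
    # mu +/- 10 sigma, clipped to the physical region J >= 0.
    mu = I - sigma**2 / (2.0 * mean_intensity)
    u_lo = math.sqrt(max(mu - 10.0 * sigma, 0.0))
    u_hi = math.sqrt(max(mu, 0.0) + 10.0 * sigma)
    du = (u_hi - u_lo) / n
    us = [u_lo + (k + 0.5) * du for k in range(n)]  # midpoint nodes

    # Log of (prior * likelihood) in the variable u, up to a constant.
    logw = [
        -(u * u) / (2.0 * mean_intensity) - (I - u * u) ** 2 / (2.0 * sigma**2)
        for u in us
    ]
    top = max(logw)  # subtract the maximum to avoid underflow
    w = [math.exp(lw - top) for lw in logw]

    norm = sum(w)
    m1 = sum(wi * u**2 for wi, u in zip(w, us)) / norm  # E[J]
    m2 = sum(wi * u**4 for wi, u in zip(w, us)) / norm  # E[J^2]
    return m1, math.sqrt(max(m2 - m1 * m1, 0.0))


weak_mean, weak_std = centric_posterior(-1.0, 1.0, 10.0)
strong_mean, strong_std = centric_posterior(100.0, 1.0, 50.0)
```

As in the acentric case, a negative measurement gives a small positive posterior mean, while a strong measurement is nearly unchanged.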

Parameters:
  • ds (DataSet) – Input DataSet containing columns with intensity_key and sigma_key labels

  • intensity_key (str) – Column label for intensities to be scaled

  • sigma_key (str) – Column label for error estimates of intensities being scaled

  • output_columns (list or tuple of column names) – Column labels to be added to ds for recording scaled I, SigI, F, and SigF, respectively. output_columns must have len=4.

  • dropna (bool) – Whether to drop reflections with NaNs in intensity_key or sigma_key columns

  • inplace (bool) – Whether to modify the DataSet in place or create a copy

  • mean_intensity_method (str [“isotropic” or “anisotropic”]) – If “isotropic”, mean intensity is determined by resolution bin. If “anisotropic”, mean intensity is determined by Miller index using provided bandwidth.

  • bins (int or array) – Either an integer number of bins, n, or an array of bin edges with shape==(n, 2). Only affects output if mean_intensity_method is “isotropic”.

  • bw (float) – Bandwidth to use for computing anisotropic mean intensity. This parameter controls the distance that each reflection impacts in reciprocal space. Only affects output if mean_intensity_method is “anisotropic”.

  • minimum_sigma (float) – Minimum value imposed on the sigma_key values (default: -np.inf, i.e., no minimum).
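To make the “isotropic” option and the bins parameter concrete, here is a rough standard-library sketch of a per-bin mean intensity. It assumes equally populated resolution bins and assumes that each row of a (n, 2) edge array holds the two resolution limits of one bin. The helper `mean_intensity_by_bin` is hypothetical, and the library's actual binning may differ.

```python
def mean_intensity_by_bin(d_spacings, intensities, n_bins):
    """Assign each reflection the mean intensity of its resolution bin.

    Hypothetical helper: bins are equally populated and ordered from low
    resolution (large d) to high resolution (small d).
    """
    n = len(d_spacings)
    order = sorted(range(n), key=lambda i: d_spacings[i], reverse=True)
    means = [0.0] * n
    edges = []  # one (d_max, d_min) pair per non-empty bin
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not members:
            continue
        avg = sum(intensities[i] for i in members) / len(members)
        for i in members:
            means[i] = avg
        edges.append((d_spacings[members[0]], d_spacings[members[-1]]))
    return means, edges


means, edges = mean_intensity_by_bin(
    [4.0, 3.0, 2.0, 1.5], [100.0, 80.0, 20.0, 10.0], n_bins=2
)
# means -> [90.0, 90.0, 15.0, 15.0]
# edges -> [(4.0, 3.0), (2.0, 1.5)]
```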

Returns:

DataSet – DataSet with 4 additional columns corresponding to scaled I, SigI, F, and SigF.
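A typical call could be wrapped as in the sketch below. The wrapper name `scale_from_mtz`, the input column labels "IMEAN" and "SIGIMEAN", and the four output labels are illustrative assumptions; substitute the labels present in your own data. The import sits inside the function so that the snippet can be defined without the library installed.

```python
# Hypothetical labels for scaled I, SigI, F, and SigF, in that order.
OUTPUT_COLUMNS = ["I_scaled", "SigI_scaled", "F_scaled", "SigF_scaled"]


def scale_from_mtz(mtz_path, intensity_key="IMEAN", sigma_key="SIGIMEAN"):
    """Read a merged MTZ file and return a copy with the 4 scaled columns."""
    import reciprocalspaceship as rs  # imported lazily; requires the package

    ds = rs.read_mtz(mtz_path)
    return rs.algorithms.scale_merged_intensities(
        ds,
        intensity_key,
        sigma_key,
        output_columns=OUTPUT_COLUMNS,  # must contain exactly 4 labels
        dropna=True,                    # drop reflections with NaN I or SigI
        inplace=False,                  # leave the input DataSet untouched
        mean_intensity_method="isotropic",
        bins=100,
    )
```

Because inplace=False, the DataSet read from disk is left unmodified and the scaled columns appear only in the returned copy.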

References