reciprocalspaceship.stats.compute_completeness

reciprocalspaceship.stats.compute_completeness(dataset, bins=10, anomalous='auto', dmin=None, unobserved_value=nan)

Compute completeness of DataSet by resolution bin.

This function computes the completeness of a given unmerged or merged DataSet. If an unmerged DataSet is provided, the function will compute the completeness in the DataSet’s spacegroup, regardless of whether the reflections are specified in the reciprocal ASU or in P1. If a merged DataSet is provided, the reflections must be specified in the reciprocal ASU.

There are several types of completeness that can be computed using this function:

  • completeness (all): completeness of data before merging Friedel pairs (all reflections in +/- ASU)

  • completeness (non-anomalous): completeness after merging Friedel pairs (all reflections in +ASU)

  • completeness (anomalous): completeness of the anomalous data. Only accounts for acentric Bijvoet mates measured in both +/- ASU. Centric reflections do not factor into this calculation.
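The three definitions above can be illustrated with a small, library-independent sketch. It assumes space group P1, where every reflection is acentric and the Bijvoet mate of (h, k, l) is simply its Friedel mate (-h, -k, -l), and it ignores resolution binning. The names `friedel_mate` and `toy_completeness` are hypothetical helpers for this illustration only and are not part of reciprocalspaceship.

```python
def friedel_mate(hkl):
    """Return the Friedel mate (-h, -k, -l) of a Miller index."""
    h, k, l = hkl
    return (-h, -k, -l)


def toy_completeness(plus_asu, observed):
    """Toy completeness metrics for space group P1, without binning.

    plus_asu: Miller indices expected in the +ASU (all acentric in P1).
    observed: Miller indices that were actually measured, in either half.
    """
    plus = set(plus_asu)
    observed = set(observed)
    expected_all = plus | {friedel_mate(hkl) for hkl in plus}

    # all: every reflection in the +/- ASU is counted separately
    comp_all = len(observed & expected_all) / len(expected_all)

    # non-anomalous: a +ASU reflection counts if either Friedel mate was measured
    n_merged = sum(
        1 for hkl in plus if hkl in observed or friedel_mate(hkl) in observed
    )
    comp_non_anom = n_merged / len(plus)

    # anomalous: an acentric reflection counts only if both mates were measured
    n_pairs = sum(
        1 for hkl in plus if hkl in observed and friedel_mate(hkl) in observed
    )
    comp_anom = n_pairs / len(plus)

    return {
        "all": comp_all,
        "non-anomalous": comp_non_anom,
        "anomalous": comp_anom,
    }


plus_asu = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
observed = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, 0, -1)]
result = toy_completeness(plus_asu, observed)
print(result)  # {'all': 0.5, 'non-anomalous': 0.75, 'anomalous': 0.25}
```

In this toy data set only (1, 0, 0) has both mates measured, so the anomalous completeness is much lower than the non-anomalous completeness even though half of all expected reflections were observed.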

Notes

  • If the anomalous flag is ‘auto’, it is automatically set to True if the input DataSet is unmerged or contains columns with Friedel dtypes.

  • If anomalous=False, the completeness (non-anomalous) is computed.

  • If anomalous=True, all three completeness metrics are computed.

  • unobserved_value is only used if anomalous=True. In 1-col anomalous mode, any Friedel observations equal to this value are filtered out. This filtering is only applied to merged DataSets because unmerged data should not use fill values.

  • MTZ files from sources such as phenix.refine may have additional filled columns that do not reflect data completeness. Pre-filtering may be required in such cases to only include “obs” suffixed columns in order to get the desired results.

  • If anomalous=True and a merged DataSet is provided, MTZInt data columns are removed to avoid potential issues with unobserved_value filtering. For example, this avoids issues with R-free flags in cases when unobserved_value=0, which can be the case for aimless output.
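The flag and fill-value rules in the notes above can be sketched in plain Python. This is only an illustration of the described behavior, not the library's implementation: `resolve_anomalous` and `is_unobserved` are hypothetical names, and the `merged` and `has_friedel_columns` arguments stand in for properties that the real function would read from the DataSet. The sketch also shows why a NaN fill value needs special handling, since NaN never compares equal to itself.

```python
import math


def resolve_anomalous(anomalous, merged, has_friedel_columns):
    """Resolve the 'auto' flag following the rule stated in the notes.

    merged and has_friedel_columns are hypothetical stand-ins for
    information that would come from the DataSet itself.
    """
    if anomalous == "auto":
        # True for unmerged data or when Friedel-dtype columns are present
        return (not merged) or has_friedel_columns
    return bool(anomalous)


def is_unobserved(value, unobserved_value=math.nan):
    """Return True if value matches the fill value for unobserved data."""
    # NaN != NaN, so a NaN fill value requires an explicit isnan check
    if math.isnan(unobserved_value):
        return math.isnan(value)
    return value == unobserved_value


# Default fill value (NaN): the second observation is dropped
kept_nan = [v for v in [10.2, math.nan, 7.5, 3.1] if not is_unobserved(v)]

# Fill value of 0, as the notes mention can occur for aimless output
kept_zero = [
    v for v in [10.2, 0.0, 7.5] if not is_unobserved(v, unobserved_value=0.0)
]

print(resolve_anomalous("auto", merged=False, has_friedel_columns=False))
print(kept_nan, kept_zero)
```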

Parameters:
  • dataset (rs.DataSet) – DataSet object to be analyzed

  • bins (int) – Number of resolution bins to use

  • anomalous (bool or ‘auto’) – Whether to compute the anomalous completeness

  • dmin (float) – Resolution cutoff to use. If no dmin is supplied, the resolution of the highest-resolution reflection in the DataSet is used

  • unobserved_value (float) – Value of unobserved Friedel mates in dataset. Will be used if anomalous=True for removing unobserved reflections from merged DataSet objects.
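To make the roles of bins and dmin concrete, here is a minimal sketch of applying a resolution cutoff and splitting reflections into bins. It makes two assumptions that are not stated in this page: the cell is orthorhombic, so that 1/d² = h²/a² + k²/b² + l²/c², and the bins hold roughly equal numbers of reflections. The binning scheme actually used by the library may differ, and `d_spacing` and `bin_by_resolution` are hypothetical helpers.

```python
import math


def d_spacing(hkl, cell):
    """d-spacing for an orthorhombic cell with edge lengths (a, b, c)."""
    h, k, l = hkl
    a, b, c = cell
    return 1.0 / math.sqrt((h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2)


def bin_by_resolution(hkls, cell, bins, dmin=None):
    """Split reflections into up to `bins` groups, lowest resolution first."""
    spacings = {hkl: d_spacing(hkl, cell) for hkl in hkls}
    if dmin is None:
        # No cutoff supplied: fall back to the highest-resolution reflection
        dmin = min(spacings.values())
    kept = sorted(
        (hkl for hkl in hkls if spacings[hkl] >= dmin),
        key=lambda hkl: -spacings[hkl],
    )
    # Equal-count split (an assumption made for this sketch)
    size = math.ceil(len(kept) / bins)
    return [kept[i:i + size] for i in range(0, len(kept), size)]


cell = (10.0, 20.0, 30.0)
hkls = [(1, 0, 0), (2, 0, 0), (4, 0, 0), (5, 0, 0), (0, 1, 0), (0, 0, 1)]

no_cutoff = bin_by_resolution(hkls, cell, bins=2)
with_cutoff = bin_by_resolution(hkls, cell, bins=2, dmin=2.2)
print(no_cutoff)
print(with_cutoff)
```

With dmin=2.2 the (5, 0, 0) reflection, whose d-spacing is about 2.0, falls outside the cutoff and is excluded before binning.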

Returns:

rs.DataSet – DataSet object summarizing the completeness by resolution bin