reciprocalspaceship.read_crystfel

reciprocalspaceship.read_crystfel(streamfile: str, spacegroup=None, encoding='utf-8', columns=None, parallel=True, num_cpus=None, address='local', **ray_kwargs) → DataSet

Initialize attributes and populate the DataSet object with data from a CrystFEL stream with indexed reflections. This is the output format used by CrystFEL software when processing still diffraction data.

This method is parallelized across CPUs to speed up parsing. Parallelization depends on the ray library (https://www.ray.io/). If ray is unavailable, this method falls back to serial processing on one CPU. Ray is not a dependency of reciprocalspaceship and will not be installed automatically; users who want parallel parsing must install it manually before calling this method.
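Because ray is optional, it can be useful to check whether it is importable before relying on parallel parsing. The following is a minimal sketch using only the standard library; the helper name ray_is_installed is hypothetical and is not part of reciprocalspaceship, and this is not how the library itself performs the check.

```python
import importlib.util


def ray_is_installed() -> bool:
    """Return True if the optional 'ray' package can be found for import.

    Hypothetical helper for illustration; not part of reciprocalspaceship.
    """
    return importlib.util.find_spec("ray") is not None


# Report which mode a caller should expect from read_crystfel.
use_parallel = ray_is_installed()
if use_parallel:
    print("ray found: parallel parsing is possible")
else:
    print("ray not found: expect serial parsing on one CPU")
```

The result could be passed as the parallel argument, although the documented fallback means that leaving parallel=True is also safe when ray is missing.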

Parameters:
  • streamfile (str) – name of a .stream file

  • spacegroup (gemmi.SpaceGroup or int or string (optional)) – optionally set the spacegroup of the returned DataSet.

  • encoding (str) – The byte encoding of the stream file (optional, default ‘utf-8’).

  • columns (list (optional)) – Optionally specify the columns of the output by a list of strings. The default list is: [ “H”, “K”, “L”, “I”, “SigI”, “BATCH”, “s1x”, “s1y”, “s1z”, “ewald_offset”, “angular_ewald_offset”, “XDET”, “YDET” ]. See rs.io.crystfel.StreamLoader().available_column_names for a list of available column names and Notes for a description of the returned columns.

  • parallel (bool (optional)) – Read the stream file in parallel using [ray.io](https://docs.ray.io) if it is available.

  • num_cpus (int (optional)) – By default, this method uses all available cores. For very large CPU counts, this may consume too much memory; decreasing num_cpus may help. If ray is not installed, a single core will be used.

  • address (str (optional)) – Optionally specify the ray instance to connect to. By default, start a new local instance.

  • ray_kwargs (optional) – Additional keyword arguments to pass to [ray.init](https://docs.ray.io/en/latest/ray-core/api/doc/ray.init.html#ray.init).
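As a hedged illustration of how the parameters above fit together, the sketch below wraps the call in a small function. The wrapper name load_stream and the constant MINIMAL_COLUMNS are hypothetical; the keyword arguments are the ones documented in the signature, and the column names are a subset of the documented default list.

```python
# Subset of the default column names listed in the parameter description.
MINIMAL_COLUMNS = ["H", "K", "L", "I", "SigI", "BATCH"]


def load_stream(streamfile, spacegroup=None, num_cpus=2):
    """Read a CrystFEL stream with a reduced column set and a capped CPU count.

    Hypothetical convenience wrapper; not part of reciprocalspaceship.
    """
    # Imported lazily so this module can be loaded without the package installed.
    import reciprocalspaceship as rs

    return rs.read_crystfel(
        streamfile,
        spacegroup=spacegroup,    # gemmi.SpaceGroup, int, or string
        columns=MINIMAL_COLUMNS,  # restrict the output columns
        parallel=True,            # falls back to serial parsing if ray is missing
        num_cpus=num_cpus,        # cap the core count to limit memory use
    )
```

A call might look like load_stream("indexed.stream", spacegroup=19), where the file name is a placeholder and 19 is the space group number of P 21 21 21.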

Returns:

rs.DataSet

Notes

The following columns are included in the returned DataSet object:

  • H, K, L: Miller indices of each reflection

  • I, SigI: Intensity and associated uncertainty

  • BATCH: Image number

  • s1x, s1y, s1z: scattered beam wavevector, which points from the sample to the Bragg peak

  • ewald_offset: the distance in Cartesian space (1/angstroms) between the observed reflection and the Ewald sphere

  • angular_ewald_offset: the distance in polar coordinates (degrees) between the observed reflection and the Ewald sphere

  • XDET, YDET: Internal detector panel coordinates
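The columns listed above can be used with ordinary pandas operations. The sketch below assumes the returned DataSet behaves like a pandas DataFrame for boolean indexing and groupby, and that ewald_offset may be signed; both helper names are hypothetical. A plain DataFrame with made-up values stands in for a real DataSet so the example runs without a stream file.

```python
import pandas as pd


def near_ewald_sphere(ds, max_offset):
    """Keep reflections whose |ewald_offset| (1/angstroms) is at most max_offset."""
    return ds[ds["ewald_offset"].abs() <= max_offset]


def reflections_per_image(ds):
    """Count the reflections recorded on each image, keyed by BATCH."""
    return ds.groupby("BATCH").size()


# Stand-in table with made-up values, used only to exercise the helpers.
demo = pd.DataFrame(
    {
        "H": [1, 2, 3],
        "K": [0, 1, 1],
        "L": [0, 0, 2],
        "BATCH": [0, 0, 1],
        "ewald_offset": [0.001, -0.010, -0.002],
    }
)

kept = near_ewald_sphere(demo, max_offset=0.005)
counts = reflections_per_image(demo)
print(kept[["H", "K", "L"]])
print(counts)
```

The threshold of 0.005 is arbitrary; a suitable cutoff depends on the experiment.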