Installation
The easiest way to install laue-dials and its dependencies is using Anaconda. First, update conda and install the libmamba solver with
conda update -n base conda
conda install -n base conda-libmamba-solver
conda config --set solver libmamba
With Anaconda, we can then create and activate a custom environment for the install by running
conda create --name laue-dials
conda activate laue-dials
Now we are ready to install the main dependency and framework: DIALS. After installing that, we can install laue-dials using pip, as below:
conda install -c conda-forge dials
pip install laue-dials
All other dependencies will then be installed automatically, and you’ll be ready to analyze your first Laue data set! Reopen this notebook with the appropriate environment activated when ready.
Documentation for laue-dials is available online, and entering any laue-dials command with no arguments on the command line will also print a help page!
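Before moving on, it can help to confirm that the notebook kernel actually sees the freshly installed programs. The following is a minimal sketch using only the Python standard library; the list of program names is copied from the commands used later in this tutorial, and the helper name find_missing is ours, not part of laue-dials.

```python
import shutil

# Program names taken from the commands used in this tutorial.
PROGRAMS = (
    "dials.import",
    "dials.image_viewer",
    "laue.find_spots",
    "laue.index",
    "laue.sequence_to_stills",
    "laue.optimize_indexing",
    "laue.refine",
    "laue.predict",
    "laue.plot_wavelengths",
    "laue.integrate",
)


def find_missing(programs=PROGRAMS):
    """Return the programs that cannot be found on the current PATH."""
    return [name for name in programs if shutil.which(name) is None]


missing = find_missing()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All tutorial programs were found on PATH.")
```

If anything is reported as missing, the most likely cause is that the notebook was started from a different conda environment than the one created above.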
Introduction
In this notebook, we will process an anomalous HEWL (hen egg-white lysozyme) dataset. The data comprise 3049 frames that constitute several rotations of a HEWL crystal.
At the end of processing, we would like an integrated .mtz file that we can then process with careless for merging. We must run laue-dials on the whole dataset for completeness, but we can use fewer images for tutorial purposes if needed. The data are spaced by one degree per frame, so 180 frames cover 180° of rotation.
Data processing will rely on images found in ./data and scripts found in ./scripts.
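When choosing a smaller subset of images for a trial run, it helps to translate between frame counts and rotation angle. The sketch below only restates the numbers given above (3049 frames, one degree per frame); the function names are hypothetical helpers, not part of laue-dials.

```python
import math

DEGREES_PER_FRAME = 1.0  # frame spacing stated in the tutorial text
TOTAL_FRAMES = 3049      # dataset size stated in the tutorial text


def sweep_degrees(n_frames, degrees_per_frame=DEGREES_PER_FRAME):
    """Total rotation angle covered by n_frames consecutive frames."""
    return n_frames * degrees_per_frame


def frames_for_sweep(sweep, degrees_per_frame=DEGREES_PER_FRAME):
    """Smallest number of frames that covers at least `sweep` degrees."""
    return math.ceil(sweep / degrees_per_frame)


print(sweep_degrees(180))         # angle covered by the first 180 frames
print(frames_for_sweep(90))       # frames needed for a 90 degree test wedge
print(divmod(TOTAL_FRAMES, 180))  # complete 180-frame blocks and leftover frames
```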
Importing Data
We can use dials.import to import the data files written at experimental facilities into a format that is friendly to both DIALS and laue-dials. We provide the scan oscillation, the goniometer axes, the peak wavelength of the beam, and the detector pixel size (in mm).
[ ]:
%%time
%%bash
# Import data
dials.import geometry.scan.oscillation=0,1 \
geometry.goniometer.axes=-1,0,0 \
geometry.beam.wavelength=1.05 \
geometry.detector.panel.pixel=0.08854,0.08854 \
input.template="$(pwd)/data/HEWL_NaI_3_2_####.mccd" \
output.experiments=imported_HEWL_anom_3049.expt \
output.log=dials.import_HEWL_anom_3049.log
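If the import fails, one thing worth ruling out is a gap in the image files. The sketch below expands the template used above and lists frames whose files are absent. It assumes the usual convention that each # stands for one digit of a zero-padded frame number, and that numbering runs from 1 to 3049; both are assumptions to check against your own ./data directory. The helper names are ours.

```python
import os
import re


def expand_template(template, frame):
    """Replace the run of '#' characters with the zero-padded frame number."""
    match = re.search(r"#+", template)
    if match is None:
        raise ValueError("template contains no '#' placeholder")
    width = len(match.group(0))
    return template[:match.start()] + str(frame).zfill(width) + template[match.end():]


def missing_frames(template, first, last):
    """Return the frame numbers in [first, last] whose image file does not exist."""
    return [n for n in range(first, last + 1)
            if not os.path.exists(expand_template(template, n))]


# Assumed numbering: frames 1..3049 (adjust if your files start at 0).
missing = missing_frames("data/HEWL_NaI_3_2_####.mccd", 1, 3049)
print(f"{len(missing)} of 3049 expected image files are missing")
```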
Getting an Initial Estimate
After importing our data, the first thing we need to do is get an initial estimate for the experimental geometry. Here, we’ll use some monochromatic algorithms from DIALS to help! This step can be tricky. If it fails, here are a few common causes and remedies:
The spotfinding gain is either too high or too low. Try looking at the results of dials.image_viewer imported.expt strong.refl as below to see whether you have too many (or too few) reflections. A lower gain gives you more spots, but is also more likely to produce false positives.
Supplying the space group or unit cell during indexing can be helpful. When supplying the unit cell, allow for some variation in the lengths of the axes, since the monochromatic algorithms may result in a slightly scaled unit cell depending on the chosen wavelength.
You may have intensities that need to be masked. These can come from bad panels or extraneous scatter. You can use dials.image_viewer (described below) to create a mask file for your data, and then add spotfinder.lookup.mask="pixels.mask" to the spotfinding command below to use that mask.
First we will run the spotfinding algorithms, then check with the DIALS image viewer that the gain is set to an appropriate level, and then try our hand at laue.index to see the quality of the indexed unit cell.
[ ]:
%%time
%%bash
laue.find_spots imported_HEWL_anom_3049.expt \
spotfinder.mp.nproc=60 \
spotfinder.threshold.dispersion.gain=0.15 \
spotfinder.filter.max_separation=10 \
output.reflections=strong_HEWL_anom_3049.refl \
output.log=laue.find_spots_HEWL_anom_3049.log
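The value nproc=60 above suits the machine this notebook was written on. A minimal sketch for picking a value that fits your own machine is shown below; choose_nproc and its reserve argument are hypothetical helpers, and os.cpu_count() reports the cores of the whole machine, which may be more than a cluster scheduler has actually allocated to you.

```python
import os


def choose_nproc(requested=60, reserve=1):
    """Cap the requested process count at the available cores minus a reserve."""
    available = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, min(requested, available - reserve))


print(f"spotfinder.mp.nproc={choose_nproc()}")
```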
Viewing Images
Sometimes it’s helpful to be able to see the analysis data overlaid on the raw data. DIALS has a utility for viewing spot information on the raw images called dials.image_viewer. For example, the spotfinding gain parameter can be lowered to capture more spots, but lowering it too much finds nonexistent spots. To check this, we can use the image viewer to see which spots were found on each image. We need to provide an expt file and a refl file; the imported.expt and strong.refl files will do for checking spotfinding. This program also has utilities for generating masks if they are needed. The red dots shown by the “Mark centers of mass” checkbox are the spots found by laue.find_spots (which in turn makes a call to dials.find_spots). These are best used for judging whether you need to adjust the gain higher (for fewer spots) or lower (for more) during spotfinding. You can find more details on the image viewer in the DIALS tutorial.
[ ]:
%%time
%%bash
# I found it helpful to set the brightness to 30
dials.image_viewer imported_HEWL_anom_3049.expt strong_HEWL_anom_3049.refl
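If the gain needs tuning, it can be convenient to generate one spotfinding command per candidate value and compare the results in the viewer. The sketch below only builds and prints the command lines; it does not run them. The parameter names are the ones used in the spotfinding cell above, while the candidate gain values and the output naming scheme are illustrative assumptions.

```python
import shlex


def find_spots_command(gain, expt="imported_HEWL_anom_3049.expt"):
    """Build one laue.find_spots command line for a given gain value."""
    tag = f"gain{gain:g}"  # hypothetical naming scheme for the outputs
    args = [
        "laue.find_spots", expt,
        f"spotfinder.threshold.dispersion.gain={gain:g}",
        "spotfinder.filter.max_separation=10",
        f"output.reflections=strong_{tag}.refl",
        f"output.log=laue.find_spots_{tag}.log",
    ]
    return shlex.join(args)


# Candidate gains are examples only; 0.15 is the value used in this tutorial.
for gain in (0.1, 0.15, 0.3):
    print(find_spots_command(gain))
```

Each printed line can be pasted into a %%bash cell, and the resulting refl file opened alongside the imported expt file in dials.image_viewer.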
[ ]:
%%time
%%bash
laue.index imported_HEWL_anom_3049.expt strong_HEWL_anom_3049.refl \
indexer.indexing.nproc=60 \
indexer.indexing.known_symmetry.space_group=96 \
indexer.indexing.refinement_protocol.mode=refine_shells \
indexer.refinement.parameterisation.auto_reduction.action=fix \
laue_output.index_only=False \
laue_output.indexed.experiments=indexed_HEWL_anom_3049.expt \
laue_output.indexed.reflections=indexed_HEWL_anom_3049.refl \
laue_output.refined.experiments=refined_HEWL_anom_3049.expt \
laue_output.refined.reflections=refined_HEWL_anom_3049.refl \
laue_output.final_output.experiments=monochromatic_HEWL_anom_3049.expt \
laue_output.final_output.reflections=monochromatic_HEWL_anom_3049.refl \
laue_output.log=laue.index_HEWL_anom_3049.log
Making Stills
Next we split our monochromatic estimate into a series of stills to prepare it for the polychromatic pipeline. There is a useful utility called laue.sequence_to_stills for this.
NOTE: Do not use dials.sequence_to_stills, as there are data columns which do not match between the two programs.
[ ]:
%%time
%%bash
laue.sequence_to_stills monochromatic_HEWL_anom_3049.expt \
monochromatic_HEWL_anom_3049.refl \
output.experiments=stills_HEWL_anom_3049.expt \
output.reflections=stills_HEWL_anom_3049.refl \
output.log=laue.sequence_to_stills_HEWL_anom_3049.log
Polychromatic Analysis
Here we will use four other programs in laue-dials to create a polychromatic experimental geometry using our initial monochromatic estimate. Each of the programs does the following:
laue.optimize_indexing assigns wavelengths to reflections and refines the crystal orientation jointly.
laue.refine is a polychromatic wrapper for dials.refine and refines the overall experimental geometry to one suitable for spot prediction and integration.
laue.predict takes the refined experimental geometry and predicts the centroids of all strong and weak reflections on the detector.
laue.integrate then builds spot profiles and integrates intensities on the detector.
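Because each program consumes files written by an earlier one, a mistyped file name tends to surface only when a later cell fails. The sketch below writes the four stages down as data, with file names copied from the cells in this notebook, and checks that every input is produced by an earlier stage. The STAGES table and the unmet_inputs helper are ours and only describe this tutorial's file flow, not how laue-dials works internally.

```python
# (program, input files, output files), copied from the cells in this notebook.
STAGES = [
    ("laue.optimize_indexing",
     ["stills_HEWL_anom_3049.expt", "stills_HEWL_anom_3049.refl"],
     ["optimized_HEWL_anom_3049.expt", "optimized_HEWL_anom_3049.refl"]),
    ("laue.refine",
     ["optimized_HEWL_anom_3049.expt", "optimized_HEWL_anom_3049.refl"],
     ["poly_refined_HEWL_anom_3049.expt", "poly_refined_HEWL_anom_3049.refl"]),
    ("laue.predict",
     ["poly_refined_HEWL_anom_3049.expt", "poly_refined_HEWL_anom_3049.refl"],
     ["predicted_HEWL_anom_3049.refl"]),
    ("laue.integrate",
     ["poly_refined_HEWL_anom_3049.expt", "predicted_HEWL_anom_3049.refl"],
     ["integrated_HEWL_anom_3049.mtz"]),
]


def unmet_inputs(stages, initial_files):
    """Return (program, file) pairs whose input is not produced by an earlier stage."""
    available = set(initial_files)
    problems = []
    for program, inputs, outputs in stages:
        for name in inputs:
            if name not in available:
                problems.append((program, name))
        available.update(outputs)
    return problems


start = ["stills_HEWL_anom_3049.expt", "stills_HEWL_anom_3049.refl"]
print(unmet_inputs(STAGES, start) or "every input is produced by an earlier stage")
```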
[ ]:
%%time
%%bash
laue.optimize_indexing stills_HEWL_anom_3049.refl \
stills_HEWL_anom_3049.expt \
output.experiments=optimized_HEWL_anom_3049.expt \
output.reflections=optimized_HEWL_anom_3049.refl \
output.log=laue.optimize_indexing_HEWL_anom_3049.log \
wavelengths.lam_min=0.97 \
wavelengths.lam_max=1.25 \
reciprocal_grid.d_min=1.4 \
nproc=60
[ ]:
%%time
%%bash
laue.refine optimized_HEWL_anom_3049.expt \
optimized_HEWL_anom_3049.refl \
output.experiments=poly_refined_HEWL_anom_3049.expt \
output.reflections=poly_refined_HEWL_anom_3049.refl \
output.log=laue.poly_refined_HEWL_anom_3049.log \
nproc=60
Note: even without maxing out the available cores, JupyterLab has a tendency to crash or to report the above cell as running indefinitely. After confirming from a terminal that the experiment and reflection files had been written successfully, I had to interrupt the kernel, restart it, and then resume processing below.
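The same confirmation can be done from a fresh notebook cell after restarting the kernel. This is a minimal sketch that checks that the refinement outputs exist and are non-empty; written_ok is a hypothetical helper, and a non-empty file is not proof that the program finished cleanly, so the log file is still worth a look.

```python
from pathlib import Path


def written_ok(paths):
    """Map each path to True if it exists as a non-empty regular file."""
    return {str(p): Path(p).is_file() and Path(p).stat().st_size > 0 for p in paths}


status = written_ok([
    "poly_refined_HEWL_anom_3049.expt",
    "poly_refined_HEWL_anom_3049.refl",
])
for name, ok in status.items():
    print(f"{name}: {'ok' if ok else 'missing or empty'}")
```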
Check results in image viewer
[ ]:
%%time
%%bash
dials.image_viewer monochromatic_HEWL_anom_3049.expt monochromatic_HEWL_anom_3049.refl
The monochromatic predictions do not look great: many shoeboxes do not have a predicted spot, and some predicted spots are off-target or outright false positives.
[ ]:
%%time
%%bash
dials.image_viewer poly_refined_HEWL_anom_3049.expt poly_refined_HEWL_anom_3049.refl
The polychromatic predictions look much better!
Check wavelength spectrum
There is a utility in laue-dials called laue.plot_wavelengths. This command generates a histogram of the assigned wavelength spectrum. If you know approximately the shape of your beam spectrum, this can be a useful check to ensure that nothing has gone wrong with wavelength assignment at this stage before predicting the full set of reflections.
[ ]:
%%time
%%bash
laue.plot_wavelengths poly_refined_HEWL_anom_3049.refl \
refined_only=True \
save=True \
show=False \
output=wavelengths_HEWL_anom_3049.png \
log=laue.plot_wavelengths_HEWL_anom_3049.log
[ ]:
import os
from IPython.display import Image

# Display the histogram saved by laue.plot_wavelengths in the working directory
Image(filename=os.path.join(os.getcwd(), 'wavelengths_HEWL_anom_3049.png'))
Spot prediction
Since the assigned spectrum looks good, we can move on to predicting the full set of reflections. If the assigned beam spectrum ends up narrower than the wavelength limits you provided in laue.optimize_indexing, you can always narrow down the limits here for laue.predict. The predictor will find the locations of all feasible spots and build profiles for the weak spots based on the observed strong spots. The output reflection table can then be fed, along with the refined expt file, into laue.integrate to generate mtz files suitable for merging in a program like careless.
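One simple way to decide on narrower limits is to trim a small fraction from each tail of the assigned wavelengths and never go outside the limits used for laue.optimize_indexing. The sketch below assumes you already have the assigned wavelengths as a plain Python list (extracting them from a refl file is not shown), and the trimming rule is our own heuristic rather than anything laue-dials does. The example values are made up.

```python
def suggest_limits(wavelengths, lam_min=0.97, lam_max=1.25, trim=0.01):
    """Suggest narrowed (lam_min, lam_max) from assigned wavelengths.

    Drops a fraction `trim` of the values from each tail, then clamps the
    result so it never extends beyond the original limits.
    """
    values = sorted(wavelengths)
    if not values:
        raise ValueError("no wavelengths given")
    k = int(len(values) * trim)
    low, high = values[k], values[-1 - k]
    return max(lam_min, low), min(lam_max, high)


# Made-up example values, evenly spaced between 1.000 and 1.199 Angstrom.
example = [1.0 + 0.001 * i for i in range(200)]
low, high = suggest_limits(example)
print(f"wavelengths.lam_min={low:.3f} wavelengths.lam_max={high:.3f}")
```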
[ ]:
%%time
%%bash
laue.predict poly_refined_HEWL_anom_3049.expt \
poly_refined_HEWL_anom_3049.refl \
output.reflections=predicted_HEWL_anom_3049.refl \
output.log=laue.predict_HEWL_anom_3049.log \
wavelengths.lam_min=0.97 \
wavelengths.lam_max=1.25 \
reciprocal_grid.d_min=1.4 \
nproc=60
Integration
[ ]:
%%time
%%bash
laue.integrate poly_refined_HEWL_anom_3049.expt \
predicted_HEWL_anom_3049.refl \
output.filename=integrated_HEWL_anom_3049.mtz \
output.log=laue.integrate_HEWL_anom_3049.log \
nproc=12
Conclusion
At this point, you have an integrated mtz file that you can pass to careless for scaling and merging. We provide an example SLURM-compatible careless script, found at scripts/sbatch_careless_varied_frames.sh. Several other scripts for further processing are described in README.txt.
Note that throughout this pipeline, you can use DIALS utilities like dials.image_viewer or dials.report to check progress and ensure your data are being analyzed properly. We recommend regularly checking the analysis by looking at the data on images, which can be done with dials.image_viewer FILE.expt FILE.refl.
These files are generally written as pairs with the same base name, with the exception of combining imported.expt + strong.refl, or poly_refined.expt + predicted.refl.
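To see at a glance which pairs are available for viewing, the working directory can be scanned for expt files that have a refl file with the same base name. This is a minimal sketch with a hypothetical helper, paired_files; the two exceptions named above will show up as unpaired and need to be matched by hand.

```python
from pathlib import Path


def paired_files(directory="."):
    """Return (expt, refl) pairs sharing a base name, plus expt files with no partner."""
    directory = Path(directory)
    pairs, unpaired = [], []
    for expt in sorted(directory.glob("*.expt")):
        refl = expt.with_suffix(".refl")
        if refl.is_file():
            pairs.append((expt, refl))
        else:
            unpaired.append(expt)
    return pairs, unpaired


pairs, unpaired = paired_files(".")
for expt, refl in pairs:
    print(f"dials.image_viewer {expt.name} {refl.name}")
for expt in unpaired:
    print(f"# no matching refl file for {expt.name}")
```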
Also note that you can enter any program name on the command line without arguments for further help. For example, running laue.optimize_indexing will print a help page for the program. You can see all configurable parameters by using laue.optimize_indexing -c. This applies to all laue-dials command-line programs.
For further processing of these data in programs like careless, the README.txt file includes instructions for using the programs in ./scripts/ (reproduced below).
Congratulations! This tutorial is now over. For further questions, feel free to consult the documentation or email the authors.
Post-Laue-DIALS processing
All HEWL anomalous data analysis and figure generation post-laue-dials was done using the 5 scripts below, in order:
1. HEWL_anom_cut_friedelize_careless.sh
   - Copies the integrated (unmerged) mtz file produced by the HEWL_anom_laue_dials_processing_final.ipynb notebook into the working directory
   - Calls the cut_unmerged_mtz_by_frames.py utility to create mtzs with only a subset of the overall images
   - Calls the friedelize.py utility to split the Friedel mates into two mtz files (*_plus.mtz and *_minus.mtz)
   - Copies those split mtzs into the appropriate directory
   - Calls the sbatch_careless_varied_frames.sh utility to scale those mtzs
2. HEWL_anom_unfriedelize.sh
   - Calls the unfriedelize.py utility to recombine the Friedel mates into a single mtz file
   - Moves the resulting mtz to the refinement directory
3. HEWL_anom_refine.sh
   - Copies files with a set of custom refinement parameters for each step of refinement in Phenix into the appropriate directory. Refinement 1 is a rigid-body refinement only, while Refinement 2 also refines individual B-factors.
   - Calls the utility sbatch_phenix_Refine.sh to run Phenix refinement
4. HEWL_anom_peak_heights.sh
   - Calls the anomalous_peak_heights.py utility to calculate the anomalous peak heights for each I and S atom across all frame-count subsets and store the resulting outputs in csv files
   - Calls the concatenate_anomalous_peak_csv.py utility to concatenate the resulting 13 csv files into one
5. HEWL_anom_figures.sh
   - Calls the HEWL_anom_peaks.pml utility to generate the PyMOL figure showing anomalous density
   - Calls the careless.ccanom and careless.cchalf functions to prepare data for subsequent plotting in Jupyter notebooks
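To illustrate the csv concatenation step, here is a minimal standard-library sketch of joining several csv files that share a header into one. It is not the concatenate_anomalous_peak_csv.py script itself, whose contents are not shown here; the function name and the example file names are hypothetical.

```python
import csv


def concatenate_csvs(inputs, output):
    """Write the rows of all input csv files to `output`, keeping a single header.

    Returns the number of data rows written. Raises ValueError if a file's
    header differs from the first header seen.
    """
    header = None
    n_rows = 0
    with open(output, "w", newline="") as out_handle:
        writer = csv.writer(out_handle)
        for path in inputs:
            with open(path, newline="") as in_handle:
                reader = csv.reader(in_handle)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                for row in reader:
                    writer.writerow(row)
                    n_rows += 1
    return n_rows
```

A call would look like concatenate_csvs(["peaks_a.csv", "peaks_b.csv"], "peaks_all.csv"), with placeholder file names replaced by the csv files produced in the previous step.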