{ "cells": [ { "cell_type": "markdown", "id": "7f3f2491-8be2-4e3f-b17a-fb66c48558e7", "metadata": {}, "source": [ "# Installation\n", "\n", "The easiest way to install `laue-dials` and its dependencies is using [Anaconda](https://docs.anaconda.com/free/anaconda/install/index.html). First we update and install the libmamba solver with\n", "\n", "```\n", "conda update -n base conda\n", "conda install -n base conda-libmamba-solver\n", "conda config --set solver libmamba\n", "```\n", "\n", "With Anaconda, we can then create and activate a custom environment for your install by running \n", "\n", "```\n", "conda create --name laue-dials\n", "conda activate laue-dials\n", "```\n", "\n", "Now we are ready to install the main dependency and framework: [DIALS](https://dials.github.io). After installing that, we can install `laue-dials` using pip, as below:\n", "\n", "```\n", "conda install -c conda-forge dials\n", "pip install laue-dials\n", "```\n", "\n", "All other dependencies will then be automatically installed for us, and we'll be ready to analyze your first Laue data set! Reopen this notebook with the appropriate environment activated when ready.\n", "\n", "Documentation for `laue-dials` can be found at [here](https://rs-station.github.io/laue-dials/index.html), and entering a command with no arguments on the command line will also print a help page!" ] }, { "cell_type": "markdown", "id": "2b283ddc-4d0b-41f0-bcf1-276c93233db7", "metadata": { "tags": [] }, "source": [ "# Introduction\n", "In this notebook, we will process an anomalous HEWL dataset. The data is comprised of 3049 frames that constitute several rotations of a HEWL crystal.\n", "\n", "At the end of processing, we would like an integrated `.mtz` file we can then process with `careless` for merging. We must run `laue-dials` on the whole dataset for completeness, but we can use fewer images for tutorial purposes if needed. The data are spaced by one degree per frame, so 180 frames represents a single full rotation.\n", "\n", "Data processing will rely on images found in `./data` and scripts found in `./scripts`. \n", "\n", "# Importing Data\n", "\n", "We can use `dials.import` as a way to import the data files written at experimental facilities into a format that is friendly to both `DIALS` and `laue-dials`. We provide the peak wavelength of the beam, the detector pixel measurements (in mm), and the goniometer axes. " ] }, { "cell_type": "code", "execution_count": null, "id": "c347b644-87ab-4c8e-8a20-fbfc10a67902", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "# Import data\n", "dials.import geometry.scan.oscillation=0,1 \\\n", " geometry.goniometer.axes=-1,0,0 \\\n", " geometry.beam.wavelength=1.05 \\\n", " geometry.detector.panel.pixel=0.08854,0.08854 \\\n", " input.template=$(pwd)'/data/HEWL_NaI_3_2_####.mccd' \\\n", " output.experiments=imported_HEWL_anom_3049.expt \\\n", " output.log=dials.import_HEWL_anom_3049.log" ] }, { "cell_type": "markdown", "id": "0643abd5-541c-4113-a17c-04a27cda4b1e", "metadata": { "tags": [] }, "source": [ "# Getting an Initial Estimate\n", "\n", "After importing our data, the first thing we need to do is get an initial estimate for the experimental geometry. Here, we'll use some monochromatic algorithms from DIALS to help! This step can be tricky -- failure can be due to several causes. In the event of failure, here are a few common causes:\n", "\n", "1. The spotfinding gain is either too high or too low. Try looking at the results of `dials.image_viewer imported.expt strong.refl` as below and seeing if you have too many (or too few) reflections. Lower gain gives you more spots, but also more likely to give false positives.\n", "2. Supplying the space group or unit cell during indexing can be helpful. When supplying the unit cell, allow for some variation in the lengths of the axes, since the monochromatic algorithms may result in a slightly scaled unit cell depending on the chosen wavelength.\n", "3. You may have intensities that need to be masked. These can come from bad panels or extraneous scatter. You can use `dials.image_viewer` (described below) to create a mask file for your data, and then provide the `spotfinder.lookup.mask=\"pixels.mask\"` command below to use that mask during spotfinding.\n", "\n", "First we will run spotfinding algorithms, then check using the DIALS image viewer to ensure we have the gain set to an appropriate level, and then try our hand at `laue.index` to see the quality of the indexed unit cell." ] }, { "cell_type": "code", "execution_count": null, "id": "d9e016aa-236e-4048-aeb4-c5dbe69ec649", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "laue.find_spots imported_HEWL_anom_3049.expt \\\n", " spotfinder.mp.nproc=60 \\\n", " spotfinder.threshold.dispersion.gain=0.15 \\\n", " spotfinder.filter.max_separation=10 \\\n", " output.reflections=strong_HEWL_anom_3049.refl \\\n", " output.log=laue.find_spots_HEWL_anom_3049.log" ] }, { "cell_type": "markdown", "id": "09d55f48-db2d-40fb-af3d-c1e568730c84", "metadata": { "tags": [] }, "source": [ "# Viewing Images\n", "\n", "Sometimes it's helpful to be able to see the analysis data overlayed on the raw data. DIALS has a utility for viewing spot information on the raw images called `dials.image_viewer`. For example, the spotfinding gain parameter can be tuned to capture more spots, but lowering it too much finds nonexistent spots. To check this, we can use the image viewer to see what spots were found on images. We need to provide an `expt` file and a `refl` file -- the `imported.expt` and `strong.refl` files will do for checking spotfinding. This program also has utilities for generating masks if they are needed. The red dots from the checkbox \"Mark centers of mass\" are the spots found by `laue.find_spots` (which in turn makes a call to `dials.find_spots`). These are best used for judging whether you need to adjust the gain higher (for fewer spots) or lower (for more) during spotfinding. You can find more details on the image viewer in the [DIALS tutorial here](https://dials.github.io/documentation/tutorials/processing_in_detail_betalactamase.html)." ] }, { "cell_type": "code", "execution_count": null, "id": "0f4f6850-8e4f-41ec-bf61-41a2cb3ef7e2", "metadata": {}, "outputs": [], "source": [ "%%time\n", "%%bash\n", "# I found it helpful to set the brightness to 30\n", "\n", "dials.image_viewer imported_HEWL_anom_3049.expt strong_HEWL_anom_3049.refl" ] }, { "cell_type": "code", "execution_count": null, "id": "fd7d0154-7cbb-4fa7-9aa6-d52cfc3d41e0", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "laue.index imported_HEWL_anom_3049.expt strong_HEWL_anom_3049.refl \\\n", " indexer.indexing.nproc=60 \\\n", " indexer.indexing.known_symmetry.space_group=96 \\\n", " indexer.indexing.refinement_protocol.mode=refine_shells \\\n", " indexer.refinement.parameterisation.auto_reduction.action=fix \\\n", " laue_output.index_only=False \\\n", " laue_output.indexed.experiments=indexed_HEWL_anom_3049.expt \\\n", " laue_output.indexed.reflections=indexed_HEWL_anom_3049.refl \\\n", " laue_output.refined.experiments=refined_HEWL_anom_3049.expt \\\n", " laue_output.refined.reflections=refined_HEWL_anom_3049.refl \\\n", " laue_output.final_output.experiments=monochromatic_HEWL_anom_3049.expt \\\n", " laue_output.final_output.reflections=monochromatic_HEWL_anom_3049.refl \\\n", " laue_output.log=laue.index_HEWL_anom_3049.log" ] }, { "cell_type": "markdown", "id": "ecb0b330-e569-4012-ac9d-077b585b45fe", "metadata": {}, "source": [ "# Making Stills\n", "\n", "Here we will now split our monochromatic estimate into a series of stills to prepare it for the polychromatic pipeline. There is a useful utility called `laue.sequence_to_stills` for this.\n", "\n", "NOTE: Do not use `dials.sequence_to_stills`, as there are data columns which do not match between the two programs." ] }, { "cell_type": "code", "execution_count": null, "id": "3e16f007-5fc7-42bf-abf0-7d5584def634", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "laue.sequence_to_stills monochromatic_HEWL_anom_3049.expt \\\n", " monochromatic_HEWL_anom_3049.refl \\\n", " output.experiments=stills_HEWL_anom_3049.expt \\\n", " output.reflections=stills_HEWL_anom_3049.refl \\\n", " output.log=laue.sequence_to_stills_HEWL_anom_3049.log" ] }, { "cell_type": "markdown", "id": "84ecbd84-ba4a-460a-98d9-017273e3f3dd", "metadata": {}, "source": [ "# Polychromatic Analysis\n", "\n", "Here we will use four other programs in `laue-dials` to create a polychromatic experimental geometry using our initial monochromatic estimate. Each of the programs does the following:\n", "\n", "`laue.optimize_indexing` assigns wavelengths to reflections and refines the crystal orientation jointly.\n", "\n", "`laue.refine` is a polychromatic wrapper for `dials.refine` and allows for refining the experimental geometry overall to one suitable for spot prediction and integration.\n", "\n", "`laue.predict` takes the refined experimental geometry and predicts the centroids of all strong and weak reflections on the detector.\n", "\n", "`laue.integrate` then builds spot profiles and integrates intensities on the detector." ] }, { "cell_type": "code", "execution_count": null, "id": "7c7e051e-4d17-4b60-8781-5068d99a530d", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "laue.optimize_indexing stills_HEWL_anom_3049.refl \\\n", " stills_HEWL_anom_3049.expt \\\n", " output.experiments=optimized_HEWL_anom_3049.expt \\\n", " output.reflections=optimized_HEWL_anom_3049.refl \\\n", " output.log=laue.optimize_indexing_HEWL_anom_3049.log \\\n", " wavelengths.lam_min=0.97 \\\n", " wavelengths.lam_max=1.25 \\\n", " reciprocal_grid.d_min=1.4 \\\n", " nproc=60" ] }, { "cell_type": "code", "execution_count": null, "id": "54134d6c-495e-43c9-972f-4f35fa4437ed", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "laue.refine optimized_HEWL_anom_3049.expt \\\n", " optimized_HEWL_anom_3049.refl \\\n", " output.experiments=poly_refined_HEWL_anom_3049.expt \\\n", " output.reflections=poly_refined_HEWL_anom_3049.refl \\\n", " output.log=laue.poly_refined_HEWL_anom_3049.log \\\n", " nproc=60" ] }, { "cell_type": "markdown", "id": "104cd7e9-b2ca-495d-b35e-4675b6e11541", "metadata": {}, "source": [ "Note: even without maxing out the available cores, jupyterlab has a tendency to crash/think that the above cell is running indefinitely. After confirming that the experiment & reflections files had been successfully written out via terminal, I had to interrupt the kernel, restart it, and then resume processing below." ] }, { "cell_type": "markdown", "id": "c9279bac-665c-4f6d-b1bb-dbd38cfaa48c", "metadata": {}, "source": [ "## Check results in image viewer" ] }, { "cell_type": "code", "execution_count": null, "id": "c9efe7e7-5b2a-43cc-9276-6f89d4a3c9b7", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "dials.image_viewer monochromatic_HEWL_anom_3049.expt monochromatic_HEWL_anom_3049.refl" ] }, { "cell_type": "markdown", "id": "87b79ab1-9442-45be-8f1d-4d1a0bd2d489", "metadata": {}, "source": [ "Predictions do not look great - many shoeboxes do not have a predicted spot, and there are also some predicted spots that are off-target or fully false positives." ] }, { "cell_type": "code", "execution_count": null, "id": "bed52176-3151-4089-9e00-37698bb0ffdc", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "dials.image_viewer poly_refined_HEWL_anom_3049.expt poly_refined_HEWL_anom_3049.refl" ] }, { "cell_type": "markdown", "id": "dd609a5e-d213-4e79-a693-ba7d88cefa5b", "metadata": {}, "source": [ "The polychromatic predictions look much better!" ] }, { "cell_type": "markdown", "id": "9c4ce69d-cd27-4dcf-91e7-c407110260b5", "metadata": {}, "source": [ "## Check wavelength spectrum\n", "\n", "There is a utility in `laue-dials` called `laue.plot_wavelengths`. This command generates a histogram of the assigned wavelength spectrum. If you know approximately the shape of your beam spectrum, this can be a useful check to ensure that nothing has gone wrong with wavelength assignment at this stage before predicting the full set of reflections." ] }, { "cell_type": "code", "execution_count": null, "id": "c656a10d-05f1-4897-b5e4-e3c80a2ec6eb", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "laue.plot_wavelengths poly_refined_HEWL_anom_3049.refl \\\n", " refined_only=True \\\n", " save=True \\\n", " show=False \\\n", " output=wavelengths_HEWL_anom_3049.png \\\n", " log=laue.plot_wavelengths_HEWL_anom_3049.log" ] }, { "cell_type": "code", "execution_count": null, "id": "9918e710-d786-4be0-8969-768d39db25ce", "metadata": {}, "outputs": [], "source": [ "from IPython.display import Image\n", "import os\n", "cwd = os.getcwd()\n", "Image(filename=cwd+'/wavelengths_HEWL_anom_3049.png') " ] }, { "cell_type": "markdown", "id": "36c1b62e-004f-453f-bc7e-0b759d874e2b", "metadata": {}, "source": [ "## Spot prediction\n", "\n", "Since the assigned spectrum looks good, we can move on to predicting the full set of reflections. If the assigned beam spectrum ends up narrower than the wavelength limits you provided in `laue.optimize_indexing`, you can always narrow down the spectrum here for `laue.predict`. The predictor will find the locations of all feasible spots and build profiles for the weak spots based on the observed strong spots. The output reflection table can then be fed along with the refined `expt` file into `laue.integrate` to generate `mtz` files suitable for merging in a program like `careless`." ] }, { "cell_type": "code", "execution_count": null, "id": "3e09c0be-dd7d-4a86-a90f-c6b5e83c07e8", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "laue.predict poly_refined_HEWL_anom_3049.expt \\\n", " poly_refined_HEWL_anom_3049.refl \\\n", " output.reflections=predicted_HEWL_anom_3049.refl \\\n", " output.log=laue.predict_HEWL_anom_3049.log \\\n", " wavelengths.lam_min=0.97 \\\n", " wavelengths.lam_max=1.25 \\\n", " reciprocal_grid.d_min=1.4 \\\n", " nproc=60" ] }, { "cell_type": "markdown", "id": "75f043fa-ad0e-40fe-a163-b5ce0994bfe9", "metadata": {}, "source": [ "## Integration" ] }, { "cell_type": "code", "execution_count": null, "id": "97640ef6-f3ad-4882-b45a-13b20cb0c27c", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "laue.integrate poly_refined_HEWL_anom_3049.expt \\\n", " predicted_HEWL_anom_3049.refl \\\n", " output.filename=integrated_HEWL_anom_3049.mtz \\\n", " output.log=laue.integrate_HEWL_anom_3049.log \\\n", " nproc=12" ] }, { "cell_type": "markdown", "id": "de455bc5-911c-4dd8-9b99-5eb51610ef0d", "metadata": { "tags": [] }, "source": [ "# Conclusion\n", "\n", "At this point, you now have integrated `mtz` files that you can pass to [careless](https://github.com/rs-station/careless) for scaling and merging. We provide an example SLURM-compatible `careless` script, found at `scripts/sbatch_careless_varied_frames.sh`. There are also several other scripts that can be used for further processing that are described by `README.txt`.\n", "\n", "Note that throughout this pipeline, you can use DIALS utilities like `dials.image_viewer` or `dials.report` to check progress and ensure your data is being analyzed properly. We recommend regularly checking the analysis by looking at the data on images, which can be done by\n", "\n", "`dials.image_viewer FILE.expt FILE.refl`.\n", "\n", "These files are generally written as pairs with the same base name, with the exception of combining `imported.expt` + `strong.refl`, or `poly_refined.expt` + `predicted.refl`.\n", "\n", "Also note that you can take any program and enter it on the command-line for further help. For example, writing\n", "\n", "`laue.optimize_indexing`\n", "\n", "will print a help page for the program. You can see all configurable parameters by using \n", "\n", "`laue.optimize_indexing -c`.\n", "\n", "This applies to all `laue-dials` command-line programs.\n", "\n", "For further processing of these data in programs like `careless`, the `README.txt` file includes instructions for using the programs in `/scripts/` (reproduced below). \n", "\n", "Congratulations! This tutorial is now over. For further questions, feel free to consult documentation or email the [authors](https://pypi.org/project/laue-dials/)." ] }, { "cell_type": "markdown", "id": "ecdb7eef-6bf7-4ab3-994e-8997cf758057", "metadata": {}, "source": [ "# Post-Laue-DIALS processing\n", "\n", "All HEWL anomalous data analysis and figure generation post-laue-dials was done using the 5 scripts below, in order:\n", "\n", "1. HEWL_anom_cut_friedelize_careless.sh\n", " - Copies the integrated (unmerged) mtz file produced by the HEWL_anom_laue_dials_processing_final.ipynb notebook into the working directory\n", " - Calls the cut_unmerged_mtz_by_frames.py utility to create mtzs with only a subset of the overall images\n", " - Calls the friedelize.py utility to split the Friedel mates into two mtz files (*_plus.mtz and *_minus.mtz)\n", " - Copies those split mtzs into the appropriate directory\n", " - Calls the sbatch_careless_varied_frames.sh utility to scale those mtzs\n", "\n", "2. HEWL_anom_unfriedelize.sh\n", " - Calls the unfriedelize.py utility to recombine the Friedel mates into a single mtz file\n", " - Moves the resulting mtz to the refinement directory\n", "\n", "3. HEWL_anom_refine.sh\n", " - Copies files with a set of custom refinement parameters for each step of refinement in Phenix into the appropriate directory. Refinement 1 is a rigid-body refinment only, while Refinement 2 also refines individual B-factors.\n", " - Calls the utility sbatch_phenix_Refine.sh to run Phenix refinement\n", "\n", "4. HEWL_anom_peak_heights.sh\n", " - Calls the anomalous_peak_heights.py utility to calculate the anomalous peak heights for each I and S atom accross all frame number sizes and store the resulting outputs in csv files\n", " - Calls the concatenate_anomalous_peak_csv.py utility to concatenate the resulting 13 csv files into one\n", "\n", "5. HEWL_anom_figures.sh\n", " - Calls the HEWL_anom_peaks.pml utility to generate the PyMOL figure showing anomalous density\n", " - Calls the careless.ccanom and careless.cchalf function to prepare data for subsequent plotting in Jupyter notebooks\n" ] } ], "metadata": { "kernelspec": { "display_name": "dev", "language": "python", "name": "dev" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.6" } }, "nbformat": 4, "nbformat_minor": 5 }