{ "cells": [ { "cell_type": "markdown", "id": "c566775b-2672-4bad-a9ac-2dfdac0fca41", "metadata": {}, "source": [ "## Installation\n", "\n", "The easiest way to install `laue-dials` and its dependencies is using [Anaconda](https://docs.anaconda.com/free/anaconda/install/index.html). First we update and install the libmamba solver with\n", "\n", "```\n", "conda update -n base conda\n", "conda install -n base conda-libmamba-solver\n", "conda config --set solver libmamba\n", "```\n", "\n", "With Anaconda, we can then create and activate a custom environment for your install by running \n", "\n", "```\n", "conda create --name laue-dials\n", "conda activate laue-dials\n", "```\n", "\n", "Now we are ready to install the main dependency and framework: [DIALS](https://dials.github.io). After installing that, we can install `laue-dials` using pip, as below:\n", "\n", "```\n", "conda install -c conda-forge dials\n", "pip install laue-dials\n", "```\n", "\n", "All other dependencies will then be automatically installed for us, and we'll be ready to analyze your first Laue data set! Reopen this notebook with the appropriate environment activated when ready.\n", "\n", "Documentation for `laue-dials` can be found at [here](https://rs-station.github.io/laue-dials/index.html), and entering a command with no arguments on the command line will also print a help page!" ] }, { "cell_type": "markdown", "id": "50269767-09e6-448a-9e8e-8562ba87604d", "metadata": { "tags": [] }, "source": [ "# Introduction\n", "In this notebook, we will process a time-resolved EF-X dataset.\n", " \n", "The data is comprised of four passes of four timepoints each. Each of the four electric-field timepoints is taken for a given `phi` angle, then the crystal is rotated aaround the phi goniometer axis and then the four timepoints are taken again. This is done over four passes, `c,d,e,f`. The start angle and the step can be found in the below table. \n", "\n", "| pass | start phi angle (deg) | rotation phi step (deg) | \n", "|------|-------------|----|\n", "| c | 0 | 2 |\n", "| d | 92 | 2 |\n", "| e | 181 | 2 |\n", "| f | 361.5 | 1 |\n", "\n", "At the end of processing, we would like four `.mtz` files, one for each timepoint. We must run `laue-dials` on each timepoint individually, then combine all passes for a given timepoint at the end. We start with the `off` timepoint pass `c` only. Then, we will analyze all sixteen passes in a single script, and combine the output mtzs into a single mtz file. \n", "\n", "Data processing will rely on images found in `./images` and scripts found in `./scripts`. \n", "\n", "# Importing Data\n", "\n", "We can use `dials.import` as a way to import the data files written at experimental facilities into a format that is friendly to both `DIALS` and `laue-dials`. Feel free to use any data set you'd like below, but a [sample time-resolved EF-X data set](https://zenodo.org/record/6407157) has been uploaded to Zenodo for your convenience, and this notebook has been tested using that dataset.\n", "\n", "First, we create two dictionaries, `START_ANGLES` and `OSCS`, that respectively map the pass names to the start angles and rotation steps (treated as oscillations in `dials.import)`. \n", "```\n", "declare -A START_ANGLES=( [\"c\"]=0 [\"d\"]=92 [\"e\"]=181 [\"f\"]=361.5)\n", "declare -A OSCS=( [\"c\"]=2 [\"d\"]=2 [\"e\"]=2 [\"f\"]=1)\n", "```\n", "\n", "Then, we import the files for the `c`,`off` images using `dials.import`. " ] }, { "cell_type": "code", "execution_count": null, "id": "6f84f097-1d3e-4d26-ad3d-2bef23dee63f", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "#these start-angles and sweep angles are to be manually input by the user using information from their particular experimental design. \n", "declare -A START_ANGLES=( [\"c\"]=0 [\"d\"]=92 [\"e\"]=181 [\"f\"]=361.5)\n", "declare -A OSCS=( [\"c\"]=2 [\"d\"]=2 [\"e\"]=2 [\"f\"]=1)\n", "\n", "#this is the delay time. \n", "TIME=\"off\"\n", "# this is the pass. \n", "pass=\"c\"\n", "FILE_INPUT_TEMPLATE=\"data/e35${pass}_${TIME}_###.mccd\"\n", "# Import data into DIALS files\n", "dials.import geometry.scan.oscillation=${START_ANGLES[$pass]},${OSCS[$pass]}\\\n", " geometry.goniometer.invert_rotation_axis=True \\\n", " geometry.goniometer.axes=0,1,0 \\\n", " geometry.beam.wavelength=1.04 \\\n", " geometry.detector.panel.pixel_size=0.08854,0.08854 \\\n", " input.template=$FILE_INPUT_TEMPLATE \\\n", " output.experiments=imported.expt " ] }, { "cell_type": "markdown", "id": "c7137b40-e10e-43f8-a66d-67113f712980", "metadata": { "tags": [] }, "source": [ "# Getting an Initial Estimate\n", "\n", "After importing our data, the first thing we need to do is get an initial estimate for the experimental geometry. Here, we'll use some monochromatic algorithms from DIALS to help! This step can be tricky -- failure can be due to several causes. In the event of failure, here are a few common causes:\n", "\n", "1. The spotfinding gain is either too high or too low. Try looking at the results of `dials.image_viewer imported.expt strong.refl` and seeing if you have too many (or too few) reflections. Lower gain gives you more spots, but also more likely to give false positives.\n", "2. Supplying the space group or unit cell during indexing can be helpful. When supplying the unit cell, allow for some variation in the lengths of the axes, since the monochromatic algorithms may result in a slightly scaled unit cell depending on the chosen wavelength.\n", "3. You may have intensities that need to be masked. These can come from bad panels or extraneous scatter. You can use `dials.image_viewer` (described below) to create a mask file for your data, and then provide the `spotfinder.lookup.mask=\"pixels.mask\"` command below to use that mask during spotfinding." ] }, { "cell_type": "code", "execution_count": null, "id": "887213ac-3883-4492-8f92-78126129f37a", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "laue.find_spots imported.expt \\\n", " spotfinder.mp.nproc=8 \\\n", " spotfinder.threshold.dispersion.gain=0.3 \\\n", " spotfinder.filter.max_separation=10" ] }, { "cell_type": "code", "execution_count": null, "id": "bbd85bba-e987-47e0-8338-85b9ae0d691c", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "CELL='\"65.3,39.45,39.01,90.000,117.45,90.000\"' #this is a unit cell of PDZ2 from PDB 5E11\n", "\n", "laue.index imported.expt strong.refl \\\n", " indexer.indexing.known_symmetry.space_group=5 \\\n", " indexer.indexing.refinement_protocol.mode=refine_shells \\\n", " indexer.indexing.known_symmetry.unit_cell=$CELL \\\n", " indexer.refinement.parameterisation.auto_reduction.action=fix \\\n", " laue_output.index_only=False" ] }, { "cell_type": "markdown", "id": "7e4678da-294d-40cc-b88b-7137ad7ef6e5", "metadata": { "tags": [] }, "source": [ "# Viewing Images\n", "\n", "Sometimes it's helpful to be able to see the analysis data overlayed on the raw data. DIALS has a utility for viewing spot information on the raw images called `dials.image_viewer`. For example, the spotfinding gain parameter can be tuned to capture more spots, but lowering it too much finds nonexistent spots. To check this, we can use the image viewer to see what spots were found on images. We need to provide an `expt` file and a `refl` file -- the `imported.expt` and `strong.refl` files will do for checking spotfinding. This program also has utilities for generating masks if they are needed. The red dots from the checkbox \"Mark centers of mass\" are the spots found by `laue.find_spots` (which in turn makes a call to `dials.find_spots`). These are best used for judging whether you need to adjust the gain higher (for fewer spots) or lower (for more) during spotfinding. You can find more details on the image viewer in the [DIALS tutorial here](https://dials.github.io/documentation/tutorials/processing_in_detail_betalactamase.html)." ] }, { "cell_type": "code", "execution_count": null, "id": "d976c4ea-c674-42b1-b7bc-a7a36bec8861", "metadata": {}, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "dials.image_viewer imported.expt strong.refl" ] }, { "cell_type": "markdown", "id": "3766edd9-c6d8-4c20-acc0-22f95f9e549e", "metadata": {}, "source": [ "# Making Stills\n", "\n", "Here we will now split our monochromatic estimate into a series of stills to prepare it for the polychromatic pipeline. There is a useful utility called `laue.sequence_to_stills` for this.\n", "\n", "NOTE: Do not use `dials.sequence_to_stills`, as there are data columns which do not match between the two programs." ] }, { "cell_type": "code", "execution_count": null, "id": "6a5d19c3-db92-4635-aeea-006456656ca7", "metadata": {}, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "laue.sequence_to_stills monochromatic.*\n", "#cctbx.python scripts/sequence_to_stills-newest_ld.py monochromatic.*" ] }, { "cell_type": "markdown", "id": "fe8411d3-f6b3-4c54-a042-2f6110aeaef3", "metadata": {}, "source": [ "# Polychromatic Analysis\n", "\n", "Here we will use four other programs in `laue-dials` to create a polychromatic experimental geometry using our initial monochromatic estimate. Each of the programs does the following:\n", "\n", "`laue.optimize_indexing` assigns wavelengths to reflections and refines the crystal orientation jointly.\n", "\n", "`laue.refine` is a polychromatic wrapper for `dials.refine` and allows for refining the experimental geometry overall to one suitable for spot prediction and integration.\n", "\n", "`laue.predict` takes the refined experimental geometry and predicts the centroids of all strong and weak reflections on the detector.\n", "\n", "`laue.integrate` then builds spot profiles and integrates intensities on the detector." ] }, { "cell_type": "code", "execution_count": null, "id": "3011e012-9e90-4334-9647-7f8f29cf8461", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "N=8 # Max multiprocessing\n", "laue.optimize_indexing stills.* \\\n", " output.experiments=\"optimized.expt\" \\\n", " output.reflections=\"optimized.refl\" \\\n", " output.log=\"laue.optimize_indexing.log\" \\\n", " wavelengths.lam_min=0.95 \\\n", " wavelengths.lam_max=1.2 \\\n", " reciprocal_grid.d_min=1.7 \\\n", " nproc=$N" ] }, { "cell_type": "code", "execution_count": null, "id": "ad8b015b-8bee-42a2-9330-d2bac5273b0e", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "N=8 # Max multiprocessing\n", "laue.refine optimized.* \\\n", " output.experiments=\"poly_refined.expt\" \\\n", " output.reflections=\"poly_refined.refl\" \\\n", " output.log=\"laue.poly_refined.log\" \\\n", " nproc=$N >> sink.log" ] }, { "cell_type": "markdown", "id": "6c6e89aa-f02e-4b7f-ae93-2a2649b663f9", "metadata": {}, "source": [ "To check the refinement quality, we check the spotfinding root-mean-square deviations (rmsds) as a function of image." ] }, { "cell_type": "code", "execution_count": null, "id": "72aab4c0-e28b-42b9-a188-e00f3d1ffd1f", "metadata": {}, "outputs": [], "source": [ "%%bash\n", "laue.compute_rmsds poly_refined.* refined_only=True" ] }, { "cell_type": "markdown", "id": "ec4029e8-36e7-4b1d-8da0-ab8382823ae8", "metadata": {}, "source": [ "These `rmsd`s look good. " ] }, { "cell_type": "markdown", "id": "c70b84c0-e9be-4f86-a79c-4055b0454467", "metadata": { "jupyter": { "outputs_hidden": true }, "tags": [] }, "source": [ "## Checking the Wavelength Spectrum\n", "\n", "`laue.plot_wavelengths` allows us to plot the wavelengths assigned in stored in a reflection table. The histogram of these reflections should resemble the beam spectrum, so this is a good check to do at this time! " ] }, { "cell_type": "code", "execution_count": null, "id": "de21c592-b2cc-4110-a6e9-c89cb3a5fe32", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "laue.plot_wavelengths poly_refined.refl refined_only=True save=True show=False" ] }, { "cell_type": "code", "execution_count": null, "id": "45ebc809-2d3c-42d6-846f-d020739c04c7", "metadata": { "tags": [] }, "outputs": [], "source": [ "from IPython.display import Image\n", "Image(filename='wavelengths.png') " ] }, { "cell_type": "markdown", "id": "c72a6b9f-8bd2-4ec7-97b9-ab93c7a523ed", "metadata": {}, "source": [ "This is the expected wavelength profile, indicating successful wavelength assignment. " ] }, { "cell_type": "markdown", "id": "d539d272-9746-457c-9c2b-a65975b5b2f0", "metadata": { "tags": [] }, "source": [ "## DIALS Reports\n", "\n", "DIALS has a utility that gives useful information on various diagnostics you may be interested in while analyzing your data. The program `dials.report` generates an HTML file you can open to see information and plots regarding the status of your analyzed data. You can run it on any files generated by `DIALS` or `laue-dials`. " ] }, { "cell_type": "code", "execution_count": null, "id": "835ca5c3-7562-4599-9377-108ad25085b4", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "dials.report poly_refined.expt poly_refined.refl" ] }, { "cell_type": "markdown", "id": "36dd794c-d101-4295-8307-1e4d34bb8122", "metadata": {}, "source": [ "# Integrating Spots\n", "\n", "Now that we have a refined experiment model, we can use `laue.predict` and `laue.integrate` to get integrated intensities from the data. We will predict the locations of all feasible spots on the detector given our refined experiment model, and at each of those locations we will integrate the intensities to get an `mtz` file that we can feed into `careless`." ] }, { "cell_type": "code", "execution_count": null, "id": "f777ddaa-a3bc-4784-a825-f9a4d57bd13d", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "N=8 # Max multiprocessing\n", "laue.predict poly_refined.* \\\n", " output.reflections=\"predicted.refl\" \\\n", " output.log=\"laue.predict.log\" \\\n", " wavelengths.lam_min=0.95 \\\n", " wavelengths.lam_max=1.2 \\\n", " reciprocal_grid.d_min=1.7 \\\n", " nproc=$N" ] }, { "cell_type": "code", "execution_count": null, "id": "48273edc-a8f6-400c-a8d5-41f50fe936ec", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "N=8 # Max multiprocessing\n", "laue.integrate poly_refined.expt predicted.refl \\\n", " output.filename=\"integrated.mtz\" \\\n", " output.log=\"laue.integrate.log\" \\\n", " nproc=$N" ] }, { "cell_type": "markdown", "id": "d4364f44-e2c7-4275-a133-4ce1e49ccd19", "metadata": {}, "source": [ "# Processing and Combining All Passes" ] }, { "cell_type": "markdown", "id": "572a3f17-f16b-46f9-a4fb-b09165e00e64", "metadata": {}, "source": [ "We have successfully integrated one of the sixteen image series. Let's now process the rest. For the `off` timepoints, we process as above. Pass `e` has a different indexing solution (up to the C2 symmetry operation `-x,y,-z`) and so we reindex pass `e` using [dials.reindex](https://dials.github.io/documentation/programs/dials_reindex.html).\n", "\n", "Our strategy for the `50ns`,`100ns`,`200ns` timepoints is to transfer the `stills.expt` geometry and then refine spot positions that may have changed due to the electric field. \n", "\n", "Using the attached `../scripts/one-pass-from_off.sh` script which contains all of the above `bash` code, we iterate over all of the passes in the below cell. The below cell takes a while to run -- we don't recommend to run this in the jupyter notebook. Instead, we recommend to run it as a standalone parallel script, attached as `../scripts/process.sh`. Either proccedure will create a folder named `gain_0,3` containing subfolders of `dials` files for each pass. For example, `../gain_0,3-from_stills/dials_files_d_100ns` contains `dials` files for pass `d`, timepoint `100ns`. " ] }, { "cell_type": "code", "execution_count": null, "id": "9bb17f9c-a5c9-4ad2-a803-390321c66dfa", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "declare -A START_ANGLES=( [\"c\"]=0 [\"d\"]=92 [\"e\"]=181 [\"f\"]=361.5)\n", "declare -A OSCS=( [\"c\"]=2 [\"d\"]=2 [\"e\"]=2 [\"f\"]=1)\n", "declare -A DELAY=\"off\"\n", "\n", "gain=0.3\n", "for pass in c d e f \n", "do\n", " if [ pass == e ];then \n", " sh scripts/one_pass-from_off.sh $pass $DELAY ${START_ANGLES[$pass]} ${OSCS[$pass]} $gain -x,y,-z >> sink.log\n", " else\n", " sh scripts/one_pass-from_off.sh $pass $DELAY ${START_ANGLES[$pass]} ${OSCS[$pass]} $gain x,y,z >> sink.log\n", " fi\n", "done" ] }, { "cell_type": "markdown", "id": "d2ea60b3-033d-4581-9586-0290c62c7511", "metadata": {}, "source": [ "Once the `off` timepoint series finish, we process the remaining timepoints. " ] }, { "cell_type": "code", "execution_count": null, "id": "0328297c-b7dd-4ce5-8d49-ecd12f3d1d10", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "\n", "declare -A START_ANGLES=( [\"c\"]=0 [\"d\"]=92 [\"e\"]=181 [\"f\"]=361.5)\n", "declare -A OSCS=( [\"c\"]=2 [\"d\"]=2 [\"e\"]=2 [\"f\"]=1)\n", "gain=0.3\n", "for delay in \"50ns\" \"100ns\" \"200ns\"\n", "do \n", " for pass in \"c\" \"d\" \"e\" \"f\" \n", " do\n", " sh scripts/one_pass-from_off.sh $pass $delay ${START_ANGLES[$pass]} ${OSCS[$pass]} $gain x,y,z >> sink.log\n", " done\n", "done" ] }, { "cell_type": "markdown", "id": "1eed69cb-1e38-453a-8249-2c13d327ad4b", "metadata": {}, "source": [ "Finally, we combine all `.mtz` files for passes of a single timepoint using the attached `scripts/expt_concat.py` script. `.mtz` files can be found in `gain_0,3-from_stills/ld_0,3_mtzs`." ] }, { "cell_type": "code", "execution_count": null, "id": "565288ad-94c6-4514-9f3c-3ef93d4fee18", "metadata": {}, "outputs": [], "source": [ "%%bash\n", "python scripts/expt_concat.py 0.3" ] }, { "cell_type": "code", "execution_count": null, "id": "7ffc3a23-2f50-4755-8f75-791bdceecc5b", "metadata": {}, "outputs": [], "source": [ "import reciprocalspaceship as rs\n", "rs.read_mtz(\"gain_0,3/ld_0,3_mtzs/cdef_e35_off.mtz\")" ] }, { "cell_type": "markdown", "id": "8f552860-1a42-4158-b035-128bce2426dd", "metadata": {}, "source": [ "We expect a mtz file with about 350,000 reflections." ] }, { "cell_type": "markdown", "id": "321a3447-1d39-471e-ab7b-e2203edf402a", "metadata": { "tags": [] }, "source": [ "# Conclusion\n", "\n", "At this point, you now have integrated `mtz` files that you can pass to [careless](https://github.com/rs-station/careless) for scaling and merging. We provide an example `careless` script, found at `../scripts/careless-cdef-ohp-mlpw.sh`. However, after all Laue-DIALS files are printed out, `../scripts/reduce.sh` can also be run for a complete analysis.\n", "\n", "Note that throughout this pipeline, you can use DIALS utilities like `dials.image_viewer` or `dials.report` to check progress and ensure your data is being analyzed properly. We recommend regularly checking the analysis by looking at the data on images, which can be done by\n", "\n", "`dials.image_viewer FILE.expt FILE.refl`.\n", "\n", "These files are generally written as pairs with the same base name, with the exception of combining `imported.expt` + `strong.refl`, or `poly_refined.expt` + `predicted.refl`.\n", "\n", "Also note that you can take any program and enter it on the command-line for further help. For example, writing\n", "\n", "`laue.optimize_indexing`\n", "\n", "will print a help page for the program. You can see all configurable parameters by using \n", "\n", "`laue.optimize_indexing -c`.\n", "\n", "This applies to all `laue-dials` command-line programs.\n", "\n", "Congratulations! This tutorial is now over. For further questions, feel free to consult documentation or email the [authors](https://pypi.org/project/laue-dials/)." ] } ], "metadata": { "kernelspec": { "display_name": "Python [conda env:anaconda-manuscript]", "language": "python", "name": "conda-env-anaconda-manuscript-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 5 }