Other miscellaneous utilities
rs.scaleit
Run CCP4’s scaleit on the given data.
usage: rs.scaleit [-h] -r ref data_col sig_col -i mtz data_col sig_col
[-o OUTFILE] [--ignore-isomorphism]
- -h, --help
show this help message and exit
- -r <ref> <data_col> <sig_col>, --refmtz <ref> <data_col> <sig_col>
MTZ to be used as reference for scaling using given data columns. Specified as (filename, F, SigF) or (filename, I, SigI)
- -i <mtz> <data_col> <sig_col>, --inputmtz <mtz> <data_col> <sig_col>
MTZ to be scaled to reference using given data columns. Specified as (filename, F, SigF) or (filename, I, SigI)
- -o <outfile>, --outfile <outfile>
MTZ file to which scaleit output will be written
- --ignore-isomorphism
Allow poorly isomorphous inputs to be scaled. By default (no flag) poorly isomorphous inputs will raise an error.
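The options above can be assembled into a command line programmatically. The following is a minimal sketch, not part of reciprocalspaceship: the helper name `scaleit_command`, the file names, and the column labels are all hypothetical, and only the flags documented above are used.

```python
import shlex


def scaleit_command(ref, inp, outfile=None, ignore_isomorphism=False):
    """Build an argv list for rs.scaleit.

    `ref` and `inp` are (filename, data_col, sig_col) triples, matching
    the three values expected by -r and -i. Hypothetical helper.
    """
    argv = ["rs.scaleit", "-r", *ref, "-i", *inp]
    if outfile is not None:
        argv += ["-o", outfile]
    if ignore_isomorphism:
        argv.append("--ignore-isomorphism")
    return argv


# Hypothetical file names and column labels, for illustration only.
argv = scaleit_command(
    ref=("off.mtz", "F", "SigF"),
    inp=("on.mtz", "F", "SigF"),
    outfile="on_scaled.mtz",
)
print(shlex.join(argv))
```

The resulting list could be handed to `subprocess.run(argv, check=True)`, assuming both reciprocalspaceship and CCP4 are installed and on the path.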
rs.precog2mtz
Convert Precognition integration results to .mtz files for merging in Careless.
usage: rs.precog2mtz [-h] [--remove-sys-absences]
[--spacegroup-for-absences SPACEGROUP_FOR_ABSENCES]
--spacegroup SPACEGROUP --cell CELL CELL CELL CELL CELL
CELL [-o MTZ_OUT]
ii_in [ii_in ...]
- ii_in
Precognition .ii file(s)
- -h, --help
show this help message and exit
- --remove-sys-absences
Optionally remove systematic absences from the data according to --spacegroup, or --spacegroup-for-absences if supplied.
- --spacegroup-for-absences <spacegroup_for_absences>
Optionally use a different spacegroup to compute systematic absences. This may be useful for some EF-X data.
- --spacegroup <spacegroup>
The spacegroup of the data
- --cell <cell>
The unit cell supplied as six floats. For example, --cell 34. 45. 98. 90. 90. 90.
- -o <mtz_out>, --mtz-out <mtz_out>
Name of the output mtz file.
rs.rfree
Create an mtz containing rfree flags
usage: rs.rfree [-h] [-o OUTFILE] [-f FROM_FILE] [-c a b c alpha beta gamma]
[-sg SPACEGROUP] [-d DMIN] -r RFRACTION [-s SEED]
- -h, --help
show this help message and exit
- -o <outfile>, --outfile <outfile>
Output MTZ filename
- -f <from_file>, --from-file <from_file>
Use the cell and spacegroup from the specified mtz file. Either this or --cell and --spacegroup must be provided. If no --dmin is provided, dmin will be inferred from this file.
- -c <a> <b> <c> <alpha> <beta> <gamma>, --cell <a> <b> <c> <alpha> <beta> <gamma>
Cell for output mtz file containing rfree flags. Specified as (a, b, c, alpha, beta, gamma)
- -sg <spacegroup>, --spacegroup <spacegroup>
Spacegroup for output mtz file containing rfree flags
- -d <dmin>, --dmin <dmin>
Maximum resolution of reflections to be included
- -r <rfraction>, --rfraction <rfraction>
Fraction of reflections to be flagged as Rfree
- -s <seed>, --seed <seed>
Seed to random number generator for reproducible Rfree flags
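To illustrate what the --rfraction and --seed options control, here is a minimal sketch of seeded random flagging. It is not the implementation used by rs.rfree; the function name `sketch_rfree_flags` and the per-reflection coin-flip scheme are assumptions made for illustration.

```python
import random


def sketch_rfree_flags(n_reflections, rfraction, seed=None):
    """Flag each reflection as Rfree with probability `rfraction`.

    Hypothetical illustration: a dedicated, seeded generator makes the
    flags reproducible from run to run.
    """
    rng = random.Random(seed)
    return [rng.random() < rfraction for _ in range(n_reflections)]


flags_a = sketch_rfree_flags(10_000, rfraction=0.05, seed=42)
flags_b = sketch_rfree_flags(10_000, rfraction=0.05, seed=42)

# Same seed, same flags; the flagged fraction is close to rfraction.
print(flags_a == flags_b)
print(sum(flags_a) / len(flags_a))
```

Without a seed, two runs would generally flag different reflections, which is why a fixed seed matters when the same test set must be reused.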
rs.extrapolate
Make extrapolated structure factors for refinement.
Equations
with reference:
F_{esf} = f * (F_{on} - F_{off}) + F_{ref}
SigF_{esf} = sqrt( (f**2)*(SigF_{on}**2) + (f**2)*(SigF_{off}**2) + (SigF_{ref}**2) )
with calc:
F_{esf} = f * (F_{on} - F_{off}) + F_{calc}
SigF_{esf} = sqrt( (f**2)*(SigF_{on}**2) + (f**2)*(SigF_{off}**2) )
where f is the extrapolation factor.
Notes
F_{off} and F_{calc} can be the same MTZ file, as done in Hekstra et al, Nature (2016). In that case, the equation for SigF_{esf} is adjusted to use (f-1)**2 for SigF_{off} to avoid double-counting in the error propagation.
At most one of F_{ref} and F_{calc} can be specified. If neither is specified, F_{calc} will be set to F_{off}.
After computing F_{esf}, any negative structure factor amplitudes are converted to positive values. This ensures that they are handled correctly downstream in phenix. Because they are amplitudes of complex numbers, changing the sign is equivalent to flipping the phase by 180 degrees.
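The equations and notes above can be sketched for a single reflection as follows. This is an illustrative sketch rather than the rs.extrapolate source: the function name `extrapolate` and its argument names are hypothetical, it works on scalars instead of MTZ columns, and it applies the (f-1)**2 adjustment only when neither F_{ref} nor F_{calc} is given (that is, when F_{calc} defaults to F_{off}).

```python
import math


def extrapolate(f, f_on, sig_on, f_off, sig_off,
                f_ref=None, sig_ref=None, f_calc=None):
    """Return (F_esf, SigF_esf) for one reflection. Hypothetical helper."""
    if f_ref is not None and f_calc is not None:
        raise ValueError("At most one of F_ref and F_calc can be specified")

    diff = f * (f_on - f_off)
    if f_ref is not None:
        # "with reference": the reference carries its own uncertainty.
        f_esf = diff + f_ref
        var = f**2 * sig_on**2 + f**2 * sig_off**2 + sig_ref**2
    elif f_calc is not None:
        # "with calc": calculated amplitudes have no sigma term.
        f_esf = diff + f_calc
        var = f**2 * sig_on**2 + f**2 * sig_off**2
    else:
        # F_calc defaults to F_off, so F_esf = f*F_on - (f-1)*F_off and
        # SigF_off enters with weight (f-1)**2 to avoid double-counting.
        f_esf = diff + f_off
        var = f**2 * sig_on**2 + (f - 1) ** 2 * sig_off**2

    # Negative amplitudes are made positive (a 180 degree phase flip).
    return abs(f_esf), math.sqrt(var)


print(extrapolate(2.0, f_on=10.0, sig_on=1.0, f_off=12.0, sig_off=1.0))
print(extrapolate(2.0, 10.0, 1.0, 12.0, 1.0, f_ref=1.0, sig_ref=2.0))
```

In the second call the raw extrapolated value is negative (2 * (10 - 12) + 1 = -3), so the returned amplitude is its absolute value.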
usage: rs.extrapolate [-h] -on mtz f_col sig_col -off mtz data_col sig_col
[-calc mtz data_col] [-ref mtz data_col sig_col]
[-f FACTOR] [-o OUTFILE]
- -h, --help
show this help message and exit
- -on <mtz> <f_col> <sig_col>, --onmtz <mtz> <f_col> <sig_col>
MTZ to be used as on data. Specified as (filename, F, SigF)
- -off <mtz> <data_col> <sig_col>, --offmtz <mtz> <data_col> <sig_col>
MTZ to be used as off data. Specified as (filename, F, SigF)
- -calc <mtz> <data_col>, --calcmtz <mtz> <data_col>
MTZ to be used as calc data. Specified as (filename, F). At most one of -calc and -ref can be specified.
- -ref <mtz> <data_col> <sig_col>, --refmtz <mtz> <data_col> <sig_col>
MTZ to be used as ref data. Specified as (filename, F, SigF). At most one of -calc and -ref can be specified.
- -f <factor>, --factor <factor>
Extrapolation factor
- -o <outfile>, --outfile <outfile>
Output MTZ filename
rs.mle_dw_extrapolate
Runs maximum likelihood estimation of model parameters (r,p) for DW-Extrapolator.
Notes
Uses scipy.optimize to minimize negative log likelihood
For more efficient runs, the optimization can be run on a subset of reflections in the datasets; control this using the --subset flag
usage: rs.mle_dw_extrapolate [-h] --onmtz ONMTZ --offmtz OFFMTZ
[--use_structure_factors f_col sigf_col]
[--use_intensities i_col sigi_col]
[--nsamples NSAMPLES] [--nproc NPROC]
[--init_r INIT_R] [--init_p INIT_P]
[--bounds_r lower_bound upper_bound]
[--bounds_p lower_bound upper_bound]
[--maxiter MAXITER] [--seed SEED]
[--subset SUBSET] [--disable_progress_bar]
[--out OUT]
- -h, --help
show this help message and exit
- --onmtz <onmtz>
.mtz file for perturbed dataset
- --offmtz <offmtz>
.mtz file for ground state dataset
- --use_structure_factors <f_col> <sigf_col>, -use_SF <f_col> <sigf_col>
Use structure factors from French-Wilson scaling. Specified as (F, SigF)
- --use_intensities <i_col> <sigi_col>, -use_I <i_col> <sigi_col>
Use integrated intensities. Specified as (I, SigI)
- --nsamples <nsamples>, -n <nsamples>
Number of Monte Carlo samples (default 1e4)
- --nproc <nproc>
Number of processes (default: cpu_count)
- --init_r <init_r>
Initial guess for r
- --init_p <init_p>
Initial guess for p
- --bounds_r <lower_bound> <upper_bound>
Bounds for r
- --bounds_p <lower_bound> <upper_bound>
Bounds for p
- --maxiter <maxiter>
Max optimizer iterations
- --seed <seed>
Random seed for MC samples
- --subset <subset>
Optional number of reflections to randomly subsample for faster runs
- --disable_progress_bar
- --out <out>, -o <out>
Where to write JSON results
rs.dw_extrapolate
Runs DW-Extrapolator, a Bayesian inference procedure to infer excited state structure factors in perturbative crystallography datasets.
Equations
The underlying model assumes that ground state (GS) and excited state (ES) structure factors have correlation r and that the observed “on” state structure factors are given by F^{ON} = (1-p)*F^{GS} + p*F^{ES}.
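The mixture relation can be inverted algebraically, which shows why the extrapolation factor of this tool is described as f = 1/p. The sketch below covers only this noise-free algebra for scalar values; it is not the Bayesian inference performed by DW-Extrapolator, and the function names are hypothetical.

```python
import math


def on_state(f_gs, f_es, p):
    """Forward model: F^{ON} = (1-p)*F^{GS} + p*F^{ES}."""
    return (1.0 - p) * f_gs + p * f_es


def excited_state(f_on, f_gs, p):
    """Solve the forward model for F^{ES}.

    F^{ES} = (F^{ON} - (1-p)*F^{GS}) / p
           = F^{GS} + f * (F^{ON} - F^{GS}),  with f = 1/p
    """
    f = 1.0 / p
    return f_gs + f * (f_on - f_gs)


# Illustrative numbers only: 25% excited state fraction.
p = 0.25
f_on = on_state(f_gs=100.0, f_es=120.0, p=p)
f_es = excited_state(f_on, f_gs=100.0, p=p)
print(f_on, f_es)
assert math.isclose(f_es, 120.0)
```

With measurement noise, a small p amplifies errors in the difference F^{ON} - F^{GS} by the factor 1/p, which is one motivation for treating the problem statistically.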
Notes
At minimum, two .mtz files (one for the off data and one for the on data) must be provided
DW-Extrapolator can be run using French-Wilson scaled structure factors or integrated intensities
usage: rs.dw_extrapolate [-h] -on ONMTZ [ONMTZ ...] -off OFFMTZ [OFFMTZ ...]
[-use_SF f_col, sigf_col f_col, sigf_col]
[-use_I i_col, sigi_col i_col, sigi_col]
[-n NSAMPLES] [-r RDW] [-p ES_FRACTION] [-f FACTOR]
[-o OUTFILE] [--nproc NPROC] [--default_scan]
[--disable-progress-bar] [--seed SEED]
- -h, --help
show this help message and exit
- -on <onmtz>, --onmtz <onmtz>
.mtz file for perturbed dataset
- -off <offmtz>, --offmtz <offmtz>
.mtz file for ground state dataset
- -use_SF <f_col, sigf_col>, --use_structure_factors <f_col, sigf_col>
Use structure factors from French-Wilson scaling. Specified as (F, SigF)
- -use_I <i_col, sigi_col>, --use_intensities <i_col, sigi_col>
Use integrated intensities. Specified as (I, SigI)
- -n <nsamples>, --nsamples <nsamples>
Number of importance samples
- -r <rdw>, --rDW <rdw>
Double Wilson r (correlation) parameter
- -p <es_fraction>, --es-fraction <es_fraction>
Excited state fraction p
- -f <factor>, --factor <factor>
Extrapolation factor f = 1/p
- -o <outfile>, --outfile <outfile>
Output file name
- --nproc <nproc>
Number of processors for multiprocessing
- --default_scan
Run default scan with r=0.9 and p from 0.05 to 0.5 in steps of 0.05
- --disable-progress-bar
Disable tqdm progress bar
- --seed <seed>
Random seed for generating Monte Carlo samples
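As an illustration of the --default_scan option described above, the parameter grid can be enumerated as in the following sketch. It assumes that both endpoints (p = 0.05 and p = 0.5) are included, which the option text does not state explicitly; the variable names are hypothetical.

```python
# Fixed correlation parameter for the default scan.
r = 0.9

# p from 0.05 to 0.5 in steps of 0.05 (both endpoints assumed included).
# Integer multiples plus rounding avoid floating-point drift.
p_values = [round(0.05 * k, 2) for k in range(1, 11)]

# Each (r, p) pair corresponds to an extrapolation factor f = 1/p.
grid = [(r, p, 1.0 / p) for p in p_values]

for r_value, p_value, factor in grid:
    print(f"r={r_value:.2f}  p={p_value:.2f}  f={factor:.2f}")
```

Under this assumption the scan covers ten values of p, with f ranging from 20 down to 2.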