Command-Line Interface (CLI) Usage

This module performs TM-Score, RMSD (Root Mean Square Deviation), and RMSF (Root Mean Square Fluctuation) analyses on protein structure predictions. Each analysis is configured through a JSON configuration file, command line arguments, or both.

Running 2D TM-Score Analysis

The tmscore_mode2d function performs a 2D TM-Score analysis on the provided protein structure predictions. Each prediction, grouped by the [max_seq, extra_seq] pair that produced it, is compared against two reference structures (ref2d1 and ref2d2), so its structural similarity is described by two TM-Scores.

Example:

tmscore_mode2d --config_file config.json --output_path /path/to/output --predictions_path /path/to/predictions --jobname my_job

Command Line Arguments:

  • --config_file: Path to the JSON configuration file (default: config.json).

  • --output_path: Directory to save the results.

  • --predictions_path: Directory containing the predictions.

  • --jobname: Name of the job.

  • --seq_pairs: A list of [max_seq, extra_seq] pairs used for predictions.

  • --starting_residue: Starting residue index for reindexing (optional).

  • --slice_predictions: Slice range of predictions to analyze (optional).

  • --engine: The engine used to generate predictions (e.g., AlphaFold2, OpenFold).

  • --ref2d1: First reference structure for TM-Score calculations (optional).

  • --ref2d2: Second reference structure for TM-Score calculations (optional).

  • --n_stdevs: Number of standard deviations to consider for point closeness (optional).

  • --n_clusters: Number of clusters for TM-Score analysis (optional).

Python Usage:

You can also call the function directly in Python:

from fast_conformation.ensemble_analysis.analysis_utils import load_config
from script_name import run_2d_tmscore_analysis

config = load_config('config.json')
run_2d_tmscore_analysis(config)
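
For readers unfamiliar with the metric, the sketch below computes a TM-score for one fixed superposition, using the standard definition: the sum of 1 / (1 + (d_i / d0)^2) over aligned residues, divided by the target length, with d0 = 1.24 * (L - 15)^(1/3) - 1.8. It is an illustration only and not the code this package runs: the function names are hypothetical, the 0.5 floor on d0 for short chains is an assumption of this sketch, and a full TM-score additionally takes the maximum over all superpositions.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (angstroms) from the TM-score definition."""
    if target_length <= 15:
        # The cube-root term is undefined here; this floor is an assumption of the sketch.
        return 0.5
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    # Keep d0 positive for short chains (also an assumption of the sketch).
    return max(d0, 0.5)


def tm_score_fixed(distances, target_length):
    """TM-score for a single, already chosen superposition.

    distances: distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the reference structure.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    # Normalising by the target length means unaligned residues contribute zero.
    return total / target_length


# Hypothetical usage: score one prediction against two references.
point = (
    tm_score_fixed([0.5, 1.0, 2.0] * 40, 120),
    tm_score_fixed([3.0, 4.0, 6.0] * 40, 120),
)
```

In a 2D analysis, each prediction would yield one such pair of scores, one per reference structure.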

Running 1D TM-Score Analysis

The tmscore_mode1d function performs a 1D TM-Score analysis. This analysis identifies structural modes in the protein predictions and clusters them.

Example:

tmscore_mode1d --config_file config.json --output_path /path/to/output --jobname my_job --engine AlphaFold2

Command Line Arguments:

  • --config_file: Path to the JSON configuration file (default: config.json).

  • --output_path: Directory to save the results.

  • --predictions_path: Directory containing the predictions.

  • --jobname: Name of the job.

  • --seq_pairs: A list of [max_seq, extra_seq] pairs used for predictions.

  • --starting_residue: Starting residue index for reindexing (optional).

  • --slice_predictions: Slice range of predictions to analyze (optional).

  • --ref1: Reference structure for TM-Score calculations (optional).

  • --engine: The engine used to generate predictions (e.g., AlphaFold2, OpenFold).

Python Usage:

from fast_conformation.ensemble_analysis.analysis_utils import load_config
from script_name import run_tmscore_analysis

config = load_config('config.json')
run_tmscore_analysis(config)

Running 2D RMSD Analysis

The rmsd_mode2d function performs a 2D RMSD analysis. Each prediction, grouped by the [max_seq, extra_seq] pair that produced it, is compared against two reference structures (ref2d1 and ref2d2), and the two RMSD values place it as a point in a 2D space.

Example:

rmsd_mode2d --config_file config.json --output_path /path/to/output --predictions_path /path/to/predictions --jobname my_job

Command Line Arguments:

  • --config_file: Path to the JSON configuration file (default: config.json).

  • --output_path: Directory to save the results.

  • --mode_results: Path to the mode results CSV file.

  • --jobname: Name of the job.

  • --seq_pairs: A list of [max_seq, extra_seq] pairs used for predictions.

  • --predictions_path: Directory containing the predictions.

  • --engine: The engine used to generate predictions (e.g., AlphaFold2, OpenFold).

  • --align_range: Atom alignment range for RMSD calculations (optional).

  • --analysis_range: Atom range for RMSD calculations after alignment (optional).

  • --analysis_range_name: Name of the atom range (e.g., kinase core, helix 1, etc.).

  • --ref2d1: First reference structure for RMSD calculations (optional).

  • --ref2d2: Second reference structure for RMSD calculations (optional).

  • --n_stdevs: Number of standard deviations to consider when calculating close points (optional).

  • --n_clusters: Number of clusters to consider for RMSD analysis (optional).

Python Usage:

from fast_conformation.ensemble_analysis.analysis_utils import load_config
from script_name import run_2d_rmsd_analysis

config = load_config('config.json')
run_2d_rmsd_analysis(config)
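
To make the "2D" idea concrete, here is a minimal sketch of how a single prediction could be reduced to a pair of RMSD values, one per reference. It assumes the coordinates have already been superposed (the role of align_range) and restricted to the atoms of interest (the role of analysis_range). The function names and the tuple-based coordinate layout are hypothetical and are not taken from the package.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z) tuples.

    No superposition is done here; both inputs are assumed to be aligned already.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and the same length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def rmsd_2d_point(prediction, ref_1, ref_2):
    """Return (RMSD to first reference, RMSD to second reference) for one prediction."""
    return (rmsd(prediction, ref_1), rmsd(prediction, ref_2))


# Hypothetical toy coordinates with two atoms each.
prediction = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ref_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ref_2 = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
point = rmsd_2d_point(prediction, ref_1, ref_2)
```

Plotting one such point per prediction gives the 2D space that clustering options such as n_clusters would then operate on.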

Running 1D RMSD Analysis

The rmsd_mode1d function performs a 1D RMSD analysis. This analysis measures the structural deviation between sequences and a single reference structure.

Example:

rmsd_mode1d --config_file config.json --output_path /path/to/output --jobname my_job --engine AlphaFold2

Command Line Arguments:

  • --config_file: Path to the JSON configuration file (default: config.json).

  • --output_path: Directory to save the results.

  • --predictions_path: Directory containing the predictions.

  • --jobname: Name of the job.

  • --seq_pairs: A list of [max_seq, extra_seq] pairs used for predictions.

  • --starting_residue: Starting residue index for reindexing (optional).

  • --align_range: Atom alignment range for RMSD calculations (optional).

  • --analysis_range: Atom range for RMSD calculations after alignment (optional).

  • --analysis_range_name: Name of the atom range (e.g., kinase core, helix 1, etc.).

  • --ref1d: Reference structure for RMSD calculations (optional).

Python Usage:

from fast_conformation.ensemble_analysis.analysis_utils import load_config
from script_name import run_rmsd_analysis

config = load_config('config.json')
run_rmsd_analysis(config)

Running RMSF Analysis

The rmsf_plddt function performs RMSF analysis, which measures the flexibility of residues in the protein structure predictions.

Example:

rmsf_plddt --config_file config.json --output_path /path/to/output --jobname my_job --engine AlphaFold2 --detect_mobile True

Command Line Arguments:

  • --config_file: Path to the JSON configuration file (default: config.json).

  • --output_path: Directory to save the results.

  • --predictions_path: Directory containing the predictions.

  • --jobname: Name of the job.

  • --seq_pairs: A list of [max_seq, extra_seq] pairs used for predictions.

  • --engine: The engine used to generate predictions (e.g., AlphaFold2, OpenFold).

  • --starting_residue: Starting residue index for reindexing (optional).

  • --align_range: Atom alignment range for RMSF calculations (optional).

  • --detect_mobile: Boolean flag to detect mobile residue ranges (optional).

  • --peak_width: RMSF peak width threshold for mobile residue range detection (optional).

  • --peak_prominence: RMSF peak prominence threshold for mobile residue range detection (optional).

  • --peak_height: RMSF peak height threshold for mobile residue range detection (optional).

Python Usage:

from fast_conformation.ensemble_analysis.analysis_utils import load_config
from script_name import run_rmsf_analysis

config = load_config('config.json')
run_rmsf_analysis(config)
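
As a reminder of what RMSF measures, the sketch below computes a per-residue RMSF over an ensemble of predictions: for each residue, the square root of the mean squared distance from that residue's average position. It assumes one coordinate per residue and that all structures are already aligned; the function name and the data layout are hypothetical and not taken from the package.

```python
import math


def rmsf_per_residue(ensemble):
    """Per-residue RMSF over an ensemble of aligned structures.

    ensemble: list of structures, each a list of (x, y, z) tuples, one per residue.
    """
    n_models = len(ensemble)
    n_residues = len(ensemble[0])
    if any(len(model) != n_residues for model in ensemble):
        raise ValueError("all structures must have the same number of residues")

    values = []
    for i in range(n_residues):
        # Average position of residue i across the ensemble.
        mean = [sum(model[i][k] for model in ensemble) / n_models for k in range(3)]
        # Mean squared fluctuation around that average position.
        msf = sum(
            sum((model[i][k] - mean[k]) ** 2 for k in range(3)) for model in ensemble
        ) / n_models
        values.append(math.sqrt(msf))
    return values


# Hypothetical toy ensemble: residue 0 is fixed, residue 1 moves along x.
ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
profile = rmsf_per_residue(ensemble)
```

Peaks in such a profile are what options like peak_width, peak_prominence, and peak_height would be applied to when detect_mobile is enabled.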

Configuration

The JSON configuration file should define the parameters necessary for each analysis. Here is an example configuration:

{
    "output_path": "/path/to/output",
    "predictions_path": "/path/to/predictions",
    "jobname": "my_job",
    "seq_pairs": [["seq1", "seq2"], ["seq3", "seq4"]],
    "engine": "AlphaFold2",
    "starting_residue": 1,
    "slice_predictions": "10:100",
    "align_range": "5-100",
    "detect_mobile": true,
    "peak_width": 3,
    "peak_prominence": 0.5,
    "peak_height": 1.0
}

Each function reads its parameters from the JSON file; any value that is also given as a command line argument overrides the value in the file. You can find example configs in the sample files.
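
The override behaviour can be pictured with the following sketch, in which values from the JSON file act as defaults and any command line argument that was actually supplied wins. This is an illustration under assumptions and not the package's load_config: build_parser and merge_config are hypothetical names, and only a few of the documented flags are declared.

```python
import argparse
import json


def build_parser():
    """Declare a small subset of the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config_file", default="config.json")
    parser.add_argument("--output_path")
    parser.add_argument("--jobname")
    parser.add_argument("--engine")
    return parser


def merge_config(file_config, cli_args):
    """Return file_config with every explicitly supplied CLI value layered on top."""
    merged = dict(file_config)
    for key, value in vars(cli_args).items():
        # Flags that were not given stay None and leave the file value untouched.
        if key != "config_file" and value is not None:
            merged[key] = value
    return merged


# A JSON string stands in for the contents of config.json.
file_config = json.loads('{"jobname": "my_job", "engine": "AlphaFold2"}')
cli_args = build_parser().parse_args(["--engine", "OpenFold"])
config = merge_config(file_config, cli_args)
```

Here config keeps jobname from the file while engine takes the command line value.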