SAXS-A-FOLD Tutorial

Contributors: Emre Brookes, Mattia Rocco, and Aaron Householder


Before You Start

  • Register on the SAXS-A-FOLD website and verify your email.
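  • Prepare the required files: the SAXS I(q) file "SASDBP9_A.dat" and the P(r) file "SSASDBP9_Dmax170_gnom_bin1.dat", both of which are uploaded in Part II.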
  • Review SAXS analysis fundamentals and AlphaFold usage for structure retrieval.
  • Pop-up help for selected entries appears when you move the mouse cursor over the entry name or its associated field.
  • The "Docs" tab on the right side of the SAXS-A-FOLD website provides additional help and details.

Learning Outcomes

By completing this tutorial, you will:

  • Define and manage projects in SAXS-A-FOLD.
  • Upload and process SAXS data.
  • Retrieve AlphaFold structures and perform SAXS calculations.
  • Set flexible regions in protein structures.
  • Run Monte Carlo simulations.
  • Utilize WAXSiS to compute I(q) profiles of selected models.

Introductory Remarks

SAXS-A-FOLD is a platform for analyzing and improving protein structural models, such as those predicted by AlphaFold or derived from solved modules, against SAXS data. This tutorial guides you through the core features, including data upload, structural modeling, refinement, and analysis.

Part I: Define a Project

  1. Select the "Define Project" tab on the SAXS-A-FOLD start page.
     [Figure: Starting Page]
  2. Enter "Tutorial" in the "Project Name" field.
  3. Optionally, enter "SAS Tutorial Example" in the "Description" field to summarize the project's purpose.
  4. Press "Submit" to save the project. The project will then be available from the dropdown menu for future use.
     [Figure: Define Project tab after submit]
  5. After a successful submit, the message "Current project is Tutorial" appears in the bottom text area.

Part II: Upload SAXS Data

  1. Navigate to the "Load SAXS" tab.
     [Figure: Load SAXS tab]
  2. In the "SAXS I(q) File" section:
    • Click "Browse Local Files" and upload the "SASDBP9_A.dat" file that you downloaded in the "Before You Start" section of this tutorial.
  3. In the "SAXS P(r) File" section:
    • Click "Browse Local Files" and upload the "SSASDBP9_Dmax170_gnom_bin1.dat" file that you downloaded in the "Before You Start" section of this tutorial.
  4. Press "Submit" to process the data and generate interactive plots of I(q) and P(r).
     [Figure: Load SAXS tab after submit]
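
Before uploading, it can help to confirm that an I(q) file contains the numeric columns you expect. The following is a minimal sketch, assuming a whitespace-separated text file with q, I(q), and an optional error column, preceded by free-form header lines. The layout, the function name `parse_saxs_dat`, and the sample values are assumptions for illustration; they are not taken from SAXS-A-FOLD or from the tutorial files.

```python
def parse_saxs_dat(text):
    """Parse rows of q, I(q), and optional error; skip non-numeric lines.

    Assumed layout: whitespace-separated columns, header lines allowed.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line or a single token: not a data row
        try:
            values = [float(f) for f in fields[:3]]
        except ValueError:
            continue  # header or comment line
        q, intensity = values[0], values[1]
        error = values[2] if len(values) > 2 else None
        rows.append((q, intensity, error))
    return rows


# Hypothetical content, not real measurements.
example = """# hypothetical header line
Sample description: not real data
0.010 125.0 2.5
0.020 118.4 2.1
0.030 107.9 1.9
"""

curve = parse_saxs_dat(example)
print(len(curve), "data rows; first row:", curve[0])
```

To check a real file, read it with `open(path).read()` and pass the text to `parse_saxs_dat`; an empty result suggests the file does not match the assumed layout.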

Part III: Retrieve AlphaFold Structures

  1. Select the "Load Structure" tab.
     [Figure: Retrieve AlphaFold on Load Structures tab]
  2. In the "Select Input Source" dropdown:
    • Choose "Get AlphaFold Structure" to access the AlphaFold database.
    • Enter the UniProt accession code "AF-Q16543" in the provided field.
  3. Press "Process" to perform SAXS calculations and generate P(r) and I(q) plots.
  4. When processing is complete, scroll down to see the generated P(r) and I(q) plots.
     [Figure: Uploaded AlphaFold structure; below it, the experimental I(q) and P(r) plots overlaid with the calculated curves]

Part IV: Flexible Region Analysis

  1. Navigate to the "Structural Info / Flexible Regions" tab.
     [Figure: Flexible Region Analysis tab]
  2. Scroll down and verify that the "Auto Compute Flexible regions from AlphaFold residue confidence" switch is enabled; it automates the analysis of AlphaFold-generated structures.
     [Figure: Flexible Region Analysis switch]
  3. In the "Confidence threshold for Auto compute" field, enter "60".
  4. Press the "Compute flexible regions" button.
  5. See the figure "Flexible Region Analysis switch" for an example of the populated fields.
  6. Press "Submit" to save the regions and update the structure visualization. After submitting, the message "Flexible regions saved" should appear below the "Submit" button.
     [Figure: Flexible Region Analysis saved]
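
The idea behind a confidence threshold can be sketched in a few lines. AlphaFold models carry a per-residue confidence score (pLDDT, on a 0 to 100 scale), which is stored in the B-factor column of the PDB file. The sketch below groups consecutive residues scoring below a threshold into candidate flexible regions. It is an illustration only: the rule SAXS-A-FOLD actually applies (strict or inclusive comparison, minimum region length, merging of nearby regions) is not described in this tutorial, and the function name `flexible_regions` and the scores are hypothetical.

```python
def flexible_regions(confidence, threshold=60.0, first_residue=1):
    """Return (start, end) residue ranges whose confidence is below threshold.

    Assumption: a residue counts as flexible when its score is strictly
    below the threshold, and adjacent flexible residues form one region.
    """
    regions = []
    start = None
    for offset, score in enumerate(confidence):
        residue = first_residue + offset
        if score < threshold:
            if start is None:
                start = residue  # open a new region
        elif start is not None:
            regions.append((start, residue - 1))  # close the open region
            start = None
    if start is not None:
        regions.append((start, first_residue + len(confidence) - 1))
    return regions


# Hypothetical per-residue confidence values for a 10-residue chain.
scores = [45.0, 52.3, 71.0, 88.5, 90.1, 58.9, 40.2, 35.0, 77.4, 30.0]
print(flexible_regions(scores, threshold=60.0))
# prints [(1, 2), (6, 8), (10, 10)]
```

Raising the threshold generally produces more and longer flexible regions, so more of the structure is free to move during sampling.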

Part V: Monte Carlo Simulations

  1. Open the "Run MMC" tab.
     [Figure: Monte Carlo Simulations tab]
  2. Set the "number of trials attempts" field to "10000".
  3. The following fields on this tab are not editable:
    • return to previous structure: set to "20"; after this number of failed step attempts, the program resets to the current coordinates.
    • temperature (K): set to 300 K.
    • molecule type: set to "protein".
    • number of flexible regions to vary: defined in the previous tab.
    • residue range for each flexible region: defined in the previous tab.
    • maximum angle(s): set to 30°, the maximum angle that each torsion in each flexible region can sample in a single move.
  4. The following fields can be edited:
    • structure alignment range: the residue range used to spatially align all the MMC-generated models; it should be set to a non-flexible region.
    • overlap basis: the atoms used in the overlap checks that reject structures with clashes. The default is "heavy atoms"; the other options in the pull-down menu are "all" atoms (for structures that also have H atoms defined), "backbone" atoms, or "enter atom name".
  5. Scroll to the bottom and press "Submit" to run the simulations and view progress in real time. When the run is complete, continue with the next section.
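
To make the role of the "maximum angle(s)" and overlap settings concrete, here is a toy sketch of bounded torsion moves with rejection. It tracks a single torsion angle, proposes a random change of at most the maximum angle per trial, and keeps the move only if a check passes. Everything here is a simplification under stated assumptions: the real program moves many torsions in a 3D structure, and the energy term, the temperature, and the "return to previous structure" reset are not modeled. The names `sample_torsion` and `toy_check` are hypothetical.

```python
import random


def sample_torsion(n_trials, max_angle_deg, passes_checks, seed=0):
    """Propose bounded random changes to one torsion angle; keep accepted ones."""
    rng = random.Random(seed)
    angle = 0.0
    accepted = [angle]  # starting conformation
    n_rejected = 0
    for _ in range(n_trials):
        # Each move changes the angle by at most max_angle_deg in either direction.
        delta = rng.uniform(-max_angle_deg, max_angle_deg)
        trial = angle + delta
        if passes_checks(trial):
            angle = trial
            accepted.append(angle)
        else:
            n_rejected += 1  # rejected move: keep the current angle
    return accepted, n_rejected


def toy_check(angle_deg):
    # Hypothetical stand-in for an overlap check: angles beyond
    # +/-90 degrees are treated as clashing.
    return abs(angle_deg) <= 90.0


accepted, rejected = sample_torsion(1000, 30.0, toy_check)
print(len(accepted) - 1, "moves accepted,", rejected, "rejected")
```

A smaller maximum angle gives smaller steps and typically fewer rejections, at the cost of slower exploration of conformations.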

Part VI: Analyze MMC Results

  1. Access the "Retrieve MMC" tab.
     [Figure: Retrieve MMC tab]
  2. Set "Stride" to "10". This value keeps compute times reasonable for the tutorial; in practice, choose the stride based on the Rg histograms.
  3. Set "Offset" to "0".
  4. Press the "Submit" button to generate histograms of Rg values and evaluate the distribution of conformations. The goal is a reasonably similar Rg distribution before and after the stride is applied.
  5. When that is complete, enable the "Extract frames" switch and press "Submit" again. When extraction is complete, continue to the next section.
     [Figure: Histogram of MMC Results and Rg values]
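
The comparison described above can be sketched as follows: take every stride-th frame starting at the offset, then compare the Rg histogram of the subset with that of the full set using identical bins. This is an illustration under assumptions, not the site's implementation; the function names, the bin settings, and the synthetic Rg values are hypothetical.

```python
def select_frames(values, stride, offset=0):
    """Keep every stride-th value, starting at index offset."""
    return values[offset::stride]


def histogram_fractions(values, lo, hi, n_bins):
    """Fraction of values in each of n_bins equal-width bins on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v <= hi:
            # The top edge belongs to the last bin.
            index = min(int((v - lo) / width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Synthetic Rg values (not real results), one per generated frame.
rg_all = [30.0 + (i * 7 % 50) / 5.0 for i in range(1000)]
rg_subset = select_frames(rg_all, stride=10, offset=0)

full = histogram_fractions(rg_all, 30.0, 40.0, 5)
sub = histogram_fractions(rg_subset, 30.0, 40.0, 5)
largest_gap = max(abs(a - b) for a, b in zip(full, sub))
print(len(rg_subset), "frames kept; largest bin difference:", largest_gap)
```

A small largest bin difference suggests the strided subset still represents the full distribution; a large one suggests trying a smaller stride.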

Part VII: Compute I(q) and P(r) Profiles and Preselect the Best-Fitting Structures

  1. Go to the "Compute P(r) and I(q) / Preselect Models" tab.
     [Figure: Compute and scale I(q) and P(r) Profiles, Preselect Models]
  2. Select PEPSI-SAXS as the computation method.
  3. Enable "Perform a Second P(r) NNLS Fit" if additional fitting is required.
  4. Press "Submit" to run the computations and view progress in real time. Progress is reported first for the P(r) computations and then for the I(q) computations, both in blocks of 50 curves. When the run is complete, continue to the next section.
  5. Completed computations are shown in separate sections for P(r) and I(q).
     [Figure: P(r) results section]
     [Figure: I(q) PEPSI-SAXS results section]
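
As background for comparing computed curves with the data, the sketch below scales a computed I(q) curve to an experimental one by weighted least squares and reports a reduced chi-square, then ranks two candidate models by that value. This is a simpler, single-model comparison and is not the NNLS fit performed on this tab; the scoring used by SAXS-A-FOLD is not described in this tutorial. The function name `scale_and_chi2`, the model names, and all numbers are hypothetical.

```python
def scale_and_chi2(i_exp, sigma, i_calc):
    """Least-squares scale factor for i_calc and the reduced chi-square.

    Minimizes sum(((i_exp - scale * i_calc) / sigma) ** 2) over scale.
    """
    weights = [1.0 / (s * s) for s in sigma]
    num = sum(w * e * c for w, e, c in zip(weights, i_exp, i_calc))
    den = sum(w * c * c for w, c in zip(weights, i_calc))
    scale = num / den
    chi2 = sum(w * (e - scale * c) ** 2
               for w, e, c in zip(weights, i_exp, i_calc))
    # One fitted parameter (the scale), so N - 1 degrees of freedom.
    return scale, chi2 / (len(i_exp) - 1)


# Hypothetical curves sampled on the same q grid (not real data).
i_exp = [20.0, 16.0, 10.0, 4.0]
sigma = [0.5, 0.4, 0.3, 0.2]
models = {
    "model_a": [10.0, 8.0, 5.0, 2.0],  # same shape as i_exp, half the scale
    "model_b": [10.0, 9.0, 7.0, 4.0],  # different shape
}

results = {name: scale_and_chi2(i_exp, sigma, calc)
           for name, calc in models.items()}
best = min(results, key=lambda name: results[name][1])
print("best-fitting model:", best)
```

Because the scale factor is fitted, only the shape of each computed curve affects its ranking, not its absolute intensity.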

Part VIII: Final Model Selection

  1. Navigate to the "Final Models Selection Using WAXSiS" tab.
     [Figure: Final Model Selection tab]
  2. Click "Submit". This process can take up to a day to complete.
  3. Review the models.
     [Figure: Final models with WAXSiS computations]
  4. Review the structures.
     [Figure: Final structures with WAXSiS computations]

Challenges

Perspectives

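SAXS-A-FOLD integrates SAXS data with AlphaFold predictions for structural analysis. Its tools cover the workflow shown in this tutorial: loading SAXS data, retrieving structures, defining flexible regions, running Monte Carlo sampling, and selecting the models that best fit the data.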

Help and Feedback

For questions or issues, visit the GitHub Issues page.