Tutorial: Parameterization of Multi-metal Metalloproteins

Overview

This tutorial provides a step-by-step guide for parameterizing a multi-metal metalloprotein using Gaussian for the QM calculations. The same workflow applies to ORCA and GAMESS. The example used throughout the tutorial is 1RWJ.pdb, a protein containing three heme c groups, each with an iron (Fe) atom, as shown in the figure below.

[Figure: 1RWJ protein]

Parameterization Challenges and Strategy

This tutorial highlights the capabilities of easyPARM by demonstrating how to parameterize a complex metalloprotein structure. The main challenges include:

  1. Multi-Metal System: The protein contains three heme c groups, each with an Fe atom that needs parameterization.
  2. Ligand Coordination Complexity:
    • Two of the heme c groups have Fe coordinated to two histidines (HID), and the porphyrin is linked to the protein through thioether bonds to two cysteines (CYS). Parameterization is therefore required not only for the metal-protein interactions but also for the non-standard thioether linkage.
    • The third heme c group has a slightly different coordination, where one of the histidines is replaced by a methionine (MET).

The figure below illustrates these coordination environments:

[Figure: Heme coordination]

The workflow below produces force field parameters for each heme in turn, which makes it easier to simulate metalloproteins with complex, multi-metal coordination environments.

1. Preprocessing the PDB File

After downloading the PDB file from the Protein Data Bank, preprocessing is necessary to ensure compatibility with easyPARM and molecular dynamics (MD) simulations.

1.1 Preprocessing Tools

We will use the following tool:

  • H++ to prepare the system 1RWJ.pdb. After using H++, you will notice that the metal and its non-standard residues are removed, leaving only standard residues in the output file prepared.pdb.

Important Notes:

  • Protonation states of residues linked to the metal or the porphyrin, such as histidine and cysteine, may not be correct after preprocessing. Verify the PDB file and make the necessary adjustments:
    • Cysteine (CYS) should be renamed CYM, with the hydrogen linked to sulfur removed.
    • Histidine (HIE) should be renamed HID, with the hydrogen on the metal-coordinating nitrogen removed.
  • The two versions of the structure are referred to as:
    • Protonated form: protonated.pdb
    • Deprotonated form: deprotonated.pdb
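The cysteine adjustment described above can be sketched as a small shell helper. This is only an illustration, not part of easyPARM: the function name `cys_to_cym` is hypothetical, the file is assumed to follow the fixed-column PDB layout (residue name in columns 18-20, residue number in columns 23-26), the thiol hydrogen is assumed to be named HG, and the chain ID is ignored.

```shell
#!/bin/sh
# Hypothetical helper (not an easyPARM command): rename one cysteine to CYM
# and drop the hydrogen bound to its sulfur.
# Assumptions: fixed-column PDB records, thiol hydrogen named HG, chain ID ignored.
cys_to_cym() {
    # $1 = input PDB, $2 = residue number of the cysteine, $3 = output PDB
    awk -v res="$2" '
        /^(ATOM|HETATM)/ && substr($0, 18, 3) == "CYS" && substr($0, 23, 4) + 0 == res + 0 {
            atom = substr($0, 13, 4); gsub(/ /, "", atom)
            if (atom == "HG") next                        # remove the H on sulfur
            $0 = substr($0, 1, 17) "CYM" substr($0, 21)   # rename CYS -> CYM in place
        }
        { print }
    ' "$1" > "$3"
}
```

The function would be called once per heme-linked cysteine, each time with the residue number read from your own structure, and the result should still be checked by eye.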

1.2 Manual Preparation

Once you have the deprotonated PDB, extract the three heme groups (HEC) from 1RWJ.pdb, corresponding to residues 90, 91, and 92:

  1. ligand1.pdb
  2. ligand2.pdb
  3. ligand3.pdb
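One way to do this extraction from the command line is sketched below. The helper name `extract_residue` is hypothetical, and the sketch assumes fixed-column PDB records (residue name in columns 18-20, residue number in columns 23-26) and ignores chain IDs and alternate locations.

```shell
#!/bin/sh
# Hypothetical helper (not an easyPARM command): copy the ATOM/HETATM records
# of a single residue into its own PDB file.
# Assumptions: fixed-column PDB records; chain ID and altLoc are ignored.
extract_residue() {
    # $1 = input PDB, $2 = residue name, $3 = residue number, $4 = output PDB
    awk -v name="$2" -v num="$3" '
        /^(ATOM|HETATM)/ && substr($0, 18, 3) == name && substr($0, 23, 4) + 0 == num + 0
    ' "$1" > "$4"
}
```

With the residue numbers quoted above, the three calls would be `extract_residue 1RWJ.pdb HEC 90 ligand1.pdb`, `extract_residue 1RWJ.pdb HEC 91 ligand2.pdb`, and `extract_residue 1RWJ.pdb HEC 92 ligand3.pdb`.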

Since hydrogen atoms are missing in these structures, use reduce from AmberTools to add them:

reduce ligand1.pdb > ligand1_h.pdb
reduce ligand2.pdb > ligand2_h.pdb
reduce ligand3.pdb > ligand3_h.pdb

Then, create three copies of deprotonated.pdb:

  • protein1.pdb
  • protein2.pdb
  • protein3.pdb

Each ligand must be added to its corresponding protein file individually, giving three separate protein-ligand complexes. Make sure ligand1_h.pdb, ligand2_h.pdb, and ligand3_h.pdb are ready, then add each ligand to its protein file:

    ligand1_h.pdb → protein1.pdb
    ligand2_h.pdb → protein2.pdb
    ligand3_h.pdb → protein3.pdb

After completion, you should have three separate protein-ligand complex files:

  1. protein1.pdb
  2. protein2.pdb
  3. protein3.pdb

Note: Each file contains exactly one protein structure with its corresponding ligand. Do not combine multiple ligands into a single protein file.
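The copy-and-add step can also be scripted. The sketch below is one possible approach, not a command that easyPARM provides: the helper name `add_ligand` is hypothetical, and placing the ligand after the protein records, separated by a TER record, is an assumption about layout rather than something the tutorial prescribes.

```shell
#!/bin/sh
# Hypothetical helper (not an easyPARM command): write protein + one ligand
# into a new PDB file.
# Assumption: the ligand is appended after the protein, separated by TER.
add_ligand() {
    # $1 = protein PDB, $2 = ligand PDB (with hydrogens), $3 = output PDB
    {
        # Protein records without END/ENDMDL; add TER only if it is not already last.
        awk '!/^END/ { print; last = $0 } END { if (last !~ /^TER/) print "TER" }' "$1"
        # Ligand atoms only (header and remark lines are dropped).
        grep -E '^(ATOM|HETATM)' "$2"
        echo "END"
    } > "$3"
}
```

For example, `add_ligand deprotonated.pdb ligand1_h.pdb protein1.pdb` would build the first complex directly from the deprotonated structure, and likewise for ligands 2 and 3.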

Next Steps

After creating these files, proceed to the parameterization stage as described in the following section.

2. Generating the XYZ Structure with easyPARM

Run easyPARM and select option 2:

Select your option:
1- Generate molecular complex parameters
2- Generate metalloprotein .xyz structure
3- Convert AMBER parameters to OpenMM or GROMACS format
Enter your choice: 2
Please provide the metalloprotein pdb file: protein1.pdb
XYZ Output: initial_structure.xyz

Rename the output file:

mv initial_structure.xyz initial_structure1.xyz

Repeat this for protein2.pdb and protein3.pdb to obtain:

  1. initial_structure1.xyz
  2. initial_structure2.xyz
  3. initial_structure3.xyz

Since protein1.pdb and protein2.pdb produce equivalent XYZ structures (both hemes have the bis-histidine coordination), only one of them is used for the optimization, frequency, and charge calculations.


3. QM Calculation Outputs

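3.1 Protein 1 and Protein 2

Files used in the parameterization steps below: OPTIMIZED1_2.xyz (optimized geometry), CHARGES1_2.log (charge calculation output), and COMPLEX1_2.fchk (formatted checkpoint).

3.2 Protein 3

Files used in the parameterization steps below: OPTIMIZED3.xyz, CHARGES3.log, and COMPLEX3.fchk.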

4. Generating Molecular Parameters with easyPARM

Important Notes:

  1. We will parameterize the ligands one by one, starting with protein 1 and finishing with protein 3.
  2. It is highly recommended to select the Change Residue ID option to assign a unique residue ID to each heme c.

Run easyPARM again and select option 1:

Enter your choice: 1

This will prompt a series of configurations:

4.1 AMBER Configuration

Choose how to load AMBER:

  • Option 1: Use currently loaded AMBER
  • Option 2: Specify AMBER installation path

Since AMBER is already loaded, select option 1:

Enter your choice: 1

4.2 System Charge

The tool will request the total charge of the system:

Please provide the total charge: -2

4.3 Providing the Optimized Structure for the Seminario Method

Specify the optimized XYZ geometry file:

Please provide the optimized XYZ geometry file: OPTIMIZED1_2.xyz

4.4 Charge Calculation Method

Choose a method from the menu:

Select the charge calculation method:
1- GAUSSIAN (RESP charges)
2- ORCA (CHELPG charges)
3- ORCA (RESP Charges)
4- GAMESS (RESP Charges)
5- GAMESS (GAMESS Fit Charges)
Enter your choice: 1

4.5 mol2 File Generation

Select the charge output format:

Please select the input format:
1- Gaussian Output (.log file)
2- RESP (.gesp file)
Enter your choice: 1

Select the charge method:

Please select the charge method (recommended: RESP):
1- RESP (resp)
2- Mulliken (mul)
3- ESP (esp)
4- AM1-BCC (bcc)
Enter your choice: 1

Select the atom type:

Please select the atom type:
1- AMBER Force Field (AMBER)
2- General AMBER Force Field (GAFF)
3- General AMBER Force Field (GAFF2)
Enter your choice: 2

Provide the charge output file:

Please provide the charge output file (e.g., .log, .gesp): CHARGES1_2.log

4.6 Seminario Method Setup

Select the format for the Seminario method:

Please select the format you will provide:
1- Orca Output
2- Gaussian Output
3- Gaussian Checkpoint
4- Gaussian Formatted Checkpoint
5- Gamess Output
Enter your choice: 4

Please provide the formatted checkpoint file (.fchk): COMPLEX1_2.fchk

4.7 Metalloprotein Structure Confirmation

Does your structure belong to MetalloProtein? (y/n): y

Provide the final metalloprotein PDB file:

Please provide the metalloprotein PDB file: protein1.pdb

4.8 Change Residue ID

Would you like to change the residue ID (Default= mol)? (y/n): y
Please provide the residue name: 1HE

5. Output Generation

Upon completion, easyPARM generates the following files:

5.1 Key Outputs

5.2 Modifications for the Next Step

Copy the PDB output file:

cp easyPARM_MetalloProtein_1HE.pdb temp1.pdb

Open temp1.pdb, locate the first heme by its selected residue ID (1HE), and move the metal atom (Fe1) line to a new file, reference.pdb.

Next, add ligand2_h.pdb to temp1.pdb. At this stage, we have:

  1. temp1.pdb: Contains protein, first heme (without metal atom), second heme (with its metal atom).
  2. reference.pdb: Contains the metal atom (Fe1) of the first heme.
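Moving the metal line by hand is easy to get wrong, so a scripted version may help. The sketch below is an illustration only: the helper name `move_metal` is hypothetical, and it assumes fixed-column PDB records with the atom name in columns 13-16 and the residue name in columns 18-20. Atom names are compared case-insensitively because the exact capitalization in the easyPARM output is not assumed here.

```shell
#!/bin/sh
# Hypothetical helper (not an easyPARM command): move the metal atom line of
# one residue out of a PDB file and append it to a reference file.
# Assumptions: fixed-column PDB records; residue names are unique per heme.
move_metal() {
    # $1 = PDB to edit, $2 = residue name (e.g. 1HE), $3 = metal atom name (e.g. Fe1)
    # $4 = reference file (appended to), $5 = output PDB without the metal line
    awk -v res="$2" -v metal="$3" -v ref="$4" '
        /^(ATOM|HETATM)/ {
            atom = substr($0, 13, 4); gsub(/ /, "", atom)
            name = substr($0, 18, 3); gsub(/ /, "", name)
            if (name == res && toupper(atom) == toupper(metal)) {
                print >> ref      # keep the metal line for the final merge
                next              # and leave it out of the edited PDB
            }
        }
        { print }
    ' "$1" > "$5"
}
```

For the first iteration this would be `move_metal temp1.pdb 1HE Fe1 reference.pdb temp1_nometal.pdb` followed by `mv temp1_nometal.pdb temp1.pdb`. Because the reference file is appended to, the same call with 2HE in the next iteration accumulates both metal lines in reference.pdb.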

Run easyPARM again and select option 1. Provide the same files (OPTIMIZED1_2.xyz, CHARGES1_2.log, COMPLEX1_2.fchk) but with temp1.pdb and a new residue name.

4.7 Metalloprotein Structure Confirmation (Second Iteration)

Does your structure belong to MetalloProtein? (y/n): y

Provide the final metalloprotein PDB file:

Please provide the metalloprotein PDB file: temp1.pdb

4.8 Change Residue ID

Would you like to change the residue ID (Default= mol)? (y/n): y
Please provide the residue name: 2HE

5. Output Generation

Upon completion, easyPARM generates the following files:

5.1 Key Outputs

5.2 Modifications for the Next Step

Copy the PDB output file:

cp easyPARM_MetalloProtein_2HE.pdb temp2.pdb

Modify temp2.pdb by moving the second heme’s metal atom (Fe1) to reference.pdb, then add ligand3_h.pdb. At this stage, we have:

  1. temp2.pdb: Contains protein, first and second heme (without metal atoms), third heme (with metal atom).
  2. reference.pdb: Contains metal atoms of the first and second heme.

Run easyPARM again, providing OPTIMIZED3.xyz, CHARGES3.log, COMPLEX3.fchk, and temp2.pdb with a new residue name.

4.7 Metalloprotein Structure Confirmation (Third Iteration)

Does your structure belong to MetalloProtein? (y/n): y

Provide the final metalloprotein PDB file:

Please provide the metalloprotein PDB file: temp2.pdb

4.8 Change Residue ID

Would you like to change the residue ID (Default= mol)? (y/n): y
Please provide the residue name: 3HE

5. Output Generation

Upon completion, easyPARM generates the following files:

5.1 Key Outputs

Finalize by copying easyPARM_MetalloProtein_3HE.pdb and then merging the metal atoms from reference.pdb back into their corresponding residue IDs (1HE and 2HE) in the copy:

cp easyPARM_MetalloProtein_3HE.pdb easyPARM_MetalloProtein_Final.pdb

The final structure, easyPARM_MetalloProtein_Final.pdb, is now ready for tleap.
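The merge of the stored metal atoms can be sketched as follows. This is a hedged illustration rather than the tutorial's own procedure: the helper name `merge_metals` is hypothetical, residues are matched by residue name only (columns 18-20, relying on the unique names 1HE and 2HE), each metal line is placed first within its residue, and its chain ID and residue number (columns 22-27) are overwritten with those of the residue it rejoins. Atom serial numbers are left untouched.

```shell
#!/bin/sh
# Hypothetical helper (not an easyPARM command): put stored metal atom lines
# back into the residues they belong to.
# Assumptions: fixed-column PDB records; one stored metal line per residue name.
merge_metals() {
    # $1 = reference file with metal lines, $2 = PDB to complete, $3 = output PDB
    awk '
        FILENAME == ARGV[1] {
            # First file: remember each metal line under its residue name.
            if (/^(ATOM|HETATM)/) { r = substr($0, 18, 3); metal[r] = $0 }
            next
        }
        /^(ATOM|HETATM)/ {
            r = substr($0, 18, 3)
            if (r in metal) {
                m = metal[r]
                # Reuse chain ID and residue number of the target residue.
                print substr(m, 1, 21) substr($0, 22, 6) substr(m, 28)
                delete metal[r]
            }
        }
        { print }
    ' "$1" "$2" > "$3"
}
```

Under these assumptions, `merge_metals reference.pdb easyPARM_MetalloProtein_3HE.pdb easyPARM_MetalloProtein_Final.pdb` would replace the plain copy step. It is worth confirming afterwards that each of the three heme residues contains exactly one Fe atom.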

6. Running tleap

Once the parameters are generated, you can proceed with tleap to prepare the final topology and coordinate files, using a tleap input file that matches the generated outputs.

Streamlined Metalloprotein Library Integration

easyPARM now supports the automatic generation of a library file for metalloproteins, which removes the need to handle multiple parameter files manually. The generated COMPLEX.lib file simplifies the tleap step, since it can be loaded directly with:

loadoff COMPLEX.lib

With this approach:

  • There is no need to use the Hybridization_Info.dat file.
  • You do not need to load each individual .mol2 file manually.
  • The metalloprotein setup is now more efficient and automated.

tleap Input File

Use the following tleap input file for reference:

This completes the metalloprotein parameterization; the system is now ready for preparation and simulation.

7. Molecular Dynamics Simulation

[Figure: Simulation of the protein]


© 2025 Abdelazim M. A. Abdelgawwad. Distributed under the GNU LESSER GENERAL PUBLIC LICENSE Version 2.1, February 1999.