Single Linux Script for Molecular Docking from PubChem to spreadsheet

Necessary software:

  1. Open Babel for file conversions
    1. Python is necessary to run Open Babel
  2. AutoDock Vina for protein-ligand docking
  3. AutoDock Tools for protein file preparation
  4. DSX online to re-score ligand poses
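As a rough illustration of how the first two tools fit together, the sketch below wraps one Open Babel conversion and one Vina docking call. The function names (`run`, `sdf_to_pdbqt`, `dock_ligand`), the `DRY_RUN` switch and the output file naming are assumptions made for this example and are not taken from PUGREST_to_results.bash; the DSX re-scoring step is left out.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Convert a ligand sdf file to pdbqt with Open Babel.
# The output name (same base name, .pdbqt extension) is an assumption.
sdf_to_pdbqt() {
    local sdf="$1"
    run obabel "$sdf" -O "${sdf%.sdf}.pdbqt"
}

# Dock one ligand against the receptor with AutoDock Vina.
# Arguments: receptor ligand center_x center_y center_z size_x size_y size_z
dock_ligand() {
    local receptor="$1" ligand="$2"
    run vina --receptor "$receptor" --ligand "$ligand" \
        --center_x "$3" --center_y "$4" --center_z "$5" \
        --size_x "$6" --size_y "$7" --size_z "$8" \
        --out "${ligand%.pdbqt}_out.pdbqt"
}
```

Setting `DRY_RUN=1` before calling either function only prints the command line, which is a convenient way to check the grid box values before a long run.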

Program Prerequisites:

  1. the current working directory (the Linux path) must be set to a designated folder
  2. the receptor (protein) pdbqt file must be in the designated folder
  3. a text file listing the NSC numbers of the desired ligands, one number per line, must be in the designated folder
  4. the grid box information for the protein file must already be known (acquired manually using AutoDock Vina)
  5. the pdb_pot folder and lib files for DSX must be in the designated folder
  6. the program file itself must also be in the designated folder
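The prerequisites above can be checked before a run with a small pre-flight function. This is only a sketch: `check_prereqs` and its arguments are hypothetical, the receptor and NSC list file names are supplied by the caller, and the DSX lib files are not checked because their names are not given here.

```shell
# Hypothetical pre-flight check for the designated folder.
# Arguments: folder, receptor pdbqt file name, NSC list file name
check_prereqs() {
    local dir="$1" receptor="$2" nsc_list="$3" missing=0
    [ -f "$dir/$receptor" ] || { echo "missing receptor: $receptor"; missing=1; }
    [ -f "$dir/$nsc_list" ] || { echo "missing NSC list: $nsc_list"; missing=1; }
    [ -d "$dir/pdb_pot" ] || { echo "missing DSX potentials folder: pdb_pot"; missing=1; }
    [ -f "$dir/PUGREST_to_results.bash" ] || { echo "missing program file: PUGREST_to_results.bash"; missing=1; }
    # Return 0 when everything was found, 1 otherwise.
    return "$missing"
}
```

For example, `check_prereqs . receptor.pdbqt nsc_numbers.txt` (both file names are placeholders) prints one line per missing item and returns a non-zero status if anything is absent.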

Program file:

Note: Before running this file for the first time, you must make it executable by typing

  chmod u+x PUGREST_to_results.bash 

To run the file:

  1. type

    in the command line

  2. input the name of your receptor (protein) exactly as its pdbqt file is named (this is case sensitive), along with the grid box information previously obtained from AutoDock
  3. when prompted, input the name of your text file with NSC numbers and choose a name for the output (csv) file. This will overwrite any other csv file in the current directory with the name you choose.
  4. The first 2 ligands will run through the program, then it will stop.
    1. At this point, check the files created for each ligand and the csv file. Enter Y to continue or N to exit the program.
  5. If you continue, the program will run through the rest of the ligands from your text file. At the end, the total number of successfully analyzed ligands will be displayed, along with the NSC numbers of any ligands that were unsuccessful at either the PUGREST step or the Vina step.
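The run described in the steps above (two ligands, a Y/N checkpoint, then the rest, followed by a summary) could be structured roughly as follows. This is a sketch under stated assumptions, not the code of PUGREST_to_results.bash: `run_batch` is a hypothetical name, and `process_ligand` is a stand-in that simply fails for any non-numeric entry so that the tally has something to report.

```shell
# Stand-in for the real per-ligand work (download, convert, dock, re-score).
# Hypothetical behavior: succeed for purely numeric NSC entries, fail otherwise.
process_ligand() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Process the NSC list in file $1, pausing for confirmation after two ligands.
run_batch() {
    local list="$1" count=0 ok=0 failed="" nsc answer
    # The list is read on file descriptor 3 so the Y/N answer can come from stdin.
    while IFS= read -r nsc <&3 || [ -n "$nsc" ]; do
        [ -n "$nsc" ] || continue
        if [ "$count" -eq 2 ]; then
            printf 'Continue with the remaining ligands? (Y/N) ' >&2
            read -r answer
            case "$answer" in
                Y|y) ;;
                *) break ;;
            esac
        fi
        count=$((count + 1))
        if process_ligand "$nsc"; then
            ok=$((ok + 1))
        else
            failed="$failed $nsc"
        fi
    done 3< "$list"
    echo "Successfully analyzed: $ok"
    echo "Unsuccessful NSC numbers:$failed"
}
```

Pausing after two ligands keeps a mistyped receptor name or grid box from costing a full batch of docking runs before it is noticed.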

A note on unsuccessful ligands: In some cases the NSC number for a compound is simply not listed in the PubChem database, so the sdf file cannot be pulled from PubChem. In other cases the compound is listed but has no 3D structure in the database. These are the two main issues we have come across, but there may be others. If Vina fails to analyze a ligand, the ligand most likely contains an unsupported atom type. The only supported atom types are H, C, N, O, F, Mg, P, S, Cl, Ca, Mn, Fe, Zn, Br and I; the script lists them if any ligand fails at the Vina step.
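A ligand can be screened for unsupported atom types before docking. The sketch below is not part of PUGREST_to_results.bash; `unsupported_atoms` is a hypothetical helper that assumes the sdf file is in the V2000 format, where line 4 holds the atom count and the element symbol sits in columns 32 to 34 of each atom line. Only the first record of the file is inspected.

```shell
# Supported element symbols, taken from the list in the note above.
SUPPORTED="H C N O F Mg P S Cl Ca Mn Fe Zn Br I"

# Print each unsupported element symbol found in the first record
# of a V2000 sdf file (assumed format), one symbol per line.
unsupported_atoms() {
    awk -v ok="$SUPPORTED" '
        BEGIN {
            n = split(ok, a, " ")
            for (i = 1; i <= n; i++) allowed[a[i]] = 1
        }
        # Line 4 is the counts line; its first three characters are the atom count.
        NR == 4 { natoms = substr($0, 1, 3) + 0 }
        # The atom block follows immediately after the counts line.
        NR > 4 && NR <= 4 + natoms {
            sym = substr($0, 32, 3)
            gsub(/ /, "", sym)
            if (!(sym in allowed) && !(sym in seen)) {
                seen[sym] = 1
                print sym
            }
        }
    ' "$1"
}
```

An empty result means every atom in the record is on the supported list; any symbol printed points to a ligand that Vina is likely to reject.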

single_script_linux_instructions.txt · Last modified: 2019/03/30 21:16 by smparker