(Please see the Yang lab website.)
- 1 Protein Data Bank (PDB)
- 2 VMD: a powerful tool
- 3 Cygwin: A Linux Simulator
- 4 Basic UNIX/LINUX commands
- 5 Linux Editors
- 6 Collecting CASE IDs
- 7 APBS/VMD: Generating the electrostatic surface of the protein
- 8 Running MD simulations
- 9 APBS
- 10 MC
Protein Data Bank (PDB)
The PDB is the most comprehensive repository of information regarding 3D structures of biological molecules, including proteins and nucleic acids. This resource is powered by the Protein Data Bank archive: information about the 3D shapes of proteins, nucleic acids, and complex assemblies that helps students and researchers understand all aspects of biomedicine and agriculture, from protein synthesis to health and disease. As a member of the wwPDB, the RCSB PDB curates and annotates PDB data. The RCSB PDB builds upon the data by creating tools and resources for research and education in molecular biology, structural biology, computational biology, and beyond.
http://www.rcsb.org --- A Structural View of Biology
Each 3D structure is saved as a .pdb file. For example, my favorite system is the estrogen receptor structure 3ERD (because we have been working on the estrogen receptor in the lab).
How to download a PDB file
There are quite a few options available.
Click "Download Files" on the top-right corner --> PDB format
You can then save the file to any folder of your choice. Another simple option is
Click "Display Files" on the top-right corner --> PDB format,
then you will see the file content directly.
HEADER    NUCLEAR RECEPTOR                        31-MAR-99   3ERD
TITLE     HUMAN ESTROGEN RECEPTOR ALPHA LIGAND-BINDING DOMAIN IN
TITLE    2 COMPLEX WITH DIETHYLSTILBESTROL AND A GLUCOCORTICOID
...
...
ATOM      1  N   SER A 305      36.780 -16.046   0.284  1.00 75.45           N
ATOM      2  CA  SER A 305      36.130 -14.722   0.061  1.00 74.80           C
ATOM      3  C   SER A 305      35.329 -14.305   1.291  1.00 73.93           C
ATOM      4  O   SER A 305      34.145 -13.980   1.190  1.00 73.69           O
ATOM      5  CB  SER A 305      35.218 -14.791  -1.161  1.00 74.39           C
ATOM      6  N   LEU A 306      35.983 -14.315   2.450  1.00 72.78           N
ATOM      7  CA  LEU A 306      35.333 -13.947   3.703  1.00 70.98           C
...
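The download step can also be scripted. The sketch below is one possible way to do it in Python, not the site's official client. It assumes that RCSB serves PDB-format files at `https://files.rcsb.org/download/<ID>.pdb`; the function names `pdb_url` and `download_pdb` are made up for this example.

```python
import urllib.request
from pathlib import Path

# Assumption: RCSB serves PDB-format files at this address.
RCSB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id: str) -> str:
    """Build the download URL for a four-character PDB ID such as '3ERD'."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD_URL.format(pdb_id=pdb_id)


def download_pdb(pdb_id: str, folder: str = ".") -> Path:
    """Save <ID>.pdb into `folder` and return its path (needs network access)."""
    target = Path(folder) / f"{pdb_id.strip().upper()}.pdb"
    urllib.request.urlretrieve(pdb_url(pdb_id), target)
    return target
```

Calling `download_pdb("3ERD")` from a machine with internet access should leave a file named 3ERD.pdb in the current folder, equivalent to the "Download Files" route above.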
How to read the PDB file
Details of the PDB file format can be found in the wwPDB file format documentation.
For example, the 'ATOM' section seen above is defined as follows:
COLUMNS   DATA TYPE      FIELD        DEFINITION
-------------------------------------------------------------------------------------
 1 -  6   Record name    "ATOM  "
 7 - 11   Integer        serial       Atom serial number.
13 - 16   Atom           name         Atom name.
17        Character      altLoc       Alternate location indicator.
18 - 20   Residue name   resName      Residue name.
22        Character      chainID      Chain identifier.
23 - 26   Integer        resSeq       Residue sequence number.
27        AChar          iCode        Code for insertion of residues.
31 - 38   Real(8.3)      x            Orthogonal coordinates for X in Angstroms.
39 - 46   Real(8.3)      y            Orthogonal coordinates for Y in Angstroms.
47 - 54   Real(8.3)      z            Orthogonal coordinates for Z in Angstroms.
55 - 60   Real(6.2)      occupancy    Occupancy.
61 - 66   Real(6.2)      tempFactor   Temperature factor.
77 - 78   LString(2)     element      Element symbol, right-justified.
79 - 80   LString(2)     charge       Charge on the atom.
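Because the record is defined by fixed columns rather than by separators, it is safer to slice each line by position than to split on whitespace. The following is a minimal Python sketch of that idea, not a full PDB parser. The `Atom` tuple and `parse_atom_line` are names invented for this example; the column ranges come from the table above (1-based, inclusive), so columns a to b become the slice `line[a-1:b]`.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    alt_loc: str
    res_name: str
    chain_id: str
    res_seq: int
    i_code: str
    x: float
    y: float
    z: float
    occupancy: float
    temp_factor: float
    element: str


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM record (HETATM uses the same layout) by column position."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError(f"not a coordinate record: {record!r}")
    line = line.rstrip("\r\n").ljust(80)  # pad so short lines slice safely
    return Atom(
        serial=int(line[6:11]),          # columns  7 - 11
        name=line[12:16].strip(),        # columns 13 - 16
        alt_loc=line[16].strip(),        # column  17
        res_name=line[17:20].strip(),    # columns 18 - 20
        chain_id=line[21].strip(),       # column  22
        res_seq=int(line[22:26]),        # columns 23 - 26
        i_code=line[26].strip(),         # column  27
        x=float(line[30:38]),            # columns 31 - 38
        y=float(line[38:46]),            # columns 39 - 46
        z=float(line[46:54]),            # columns 47 - 54
        occupancy=float(line[54:60]),    # columns 55 - 60
        temp_factor=float(line[60:66]),  # columns 61 - 66
        element=line[76:78].strip(),     # columns 77 - 78
    )


# First ATOM record of 3ERD, as shown earlier on this page.
first_atom = parse_atom_line(
    "ATOM      1  N   SER A 305      36.780 -16.046   0.284  1.00 75.45           N"
)
print(first_atom.res_name, first_atom.res_seq, first_atom.x, first_atom.element)
```

Slicing by column keeps working when neighboring fields touch (for example, a coordinate of -100.000 directly after another number), which is where whitespace splitting breaks down.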
The 3D structure encoded in a PDB file can, thanks to its standardized format, be visualized by many different software packages, such as VMD discussed below.
VMD: a powerful tool
VMD is designed for modeling, visualization, and analysis of biological systems such as proteins, nucleic acids, lipid bilayer assemblies, etc. It may be used to view more general molecules, as VMD can read standard PDB files and display the contained structure. VMD provides a wide variety of methods for rendering and coloring a molecule: simple points and lines, CPK spheres and cylinders, licorice bonds, backbone tubes and ribbons, cartoon drawings, and others. VMD can be used to animate and analyze the trajectory of a molecular dynamics (MD) simulation. In particular, VMD can act as a graphical front end for an external MD program by displaying and animating a molecule undergoing simulation on a remote computer. More details can be found at the VMD website.
The aim here is to very quickly get you familiar enough with VMD to be able to view individual protein structures and the sorts of trajectories containing many structures that are produced by molecular dynamics and other simulation techniques. This document is deliberately designed to cover only the most basic features of VMD. Excellent tutorials teaching the full range of the functionality provided by the program can be found at the VMD website above.
How to install VMD on your local laptop
You can download the file needed for your OS, but for the sake of saving time, you can click the link below. If you use Windows,
If you use OS X,
The Windows and OS X installations are pretty self-explanatory. The small extra effort involved in Linux installation is documented in the README file distributed with the program. Note that there is a copy of VMD already installed on the computer cluster we are going to use later.
How to run VMD
To start the program VMD,
OS X: Double click on the VMD icon in the Applications directory.
Linux: Type vmd in a terminal window.
Windows: Select Start -> Programs -> VMD.
On startup, VMD opens three windows: the Main, OpenGL Display, and Console (or Terminal on OS X) windows. To end a VMD session, go to the Main window and choose File -> Quit. You can also quit VMD by closing the Console or Main window.
How to load a PDB file into VMD
From the menu bar in the Main window,
select File -> New Molecule
A window, called Molecule File Browser, should now open.
select Browse --> pick 3ERD.pdb,
which you downloaded earlier.
select it and click Open.
You should now be back in the Molecule File Browser window (the structure will not yet have been loaded). To load the file, click Load.
The structure should now be loaded into the main window.
Beautifying the protein structure
From the menubar of the Main window,
Select Graphics -> Representations
A new Graphical Representations window will open showing the current representation being highlighted.
Under "Draw Style," there are two main drop-down menus. One is for
changing the Coloring Method by choosing different options from the drop-down menu.
The other is for
altering the Drawing Method of the graphics used to display the molecule with the drop-down menu.
A good choice (and my favorite) is 'New Cartoon' which highlights different elements of protein secondary structure.
Adding more Graphical Representations
From VMD Main window,
Click Graphics --> Representations --> Create Rep
You are encouraged to explore different layers of representations using Atom Selections. For example, under Selected Atoms, enter the following text
not protein and not water
Of course, you can pick a different color. Now you should be able to see any binding ligand.
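To get a rough idea of what that selection will show before opening VMD, the PDB file can be scanned directly. The sketch below is only an approximation of VMD's selection logic, not a reimplementation: it assumes that ligands are stored as HETATM records and that waters use the residue name HOH. The function name `ligand_residues` and the two sample records (including their coordinates) are invented for illustration.

```python
WATER_NAMES = {"HOH"}  # assumption: waters in the file are named HOH


def ligand_residues(pdb_lines):
    """Return sorted (resName, chainID, resSeq) tuples for non-water HETATM records."""
    found = set()
    for line in pdb_lines:
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()  # columns 18 - 20
        if res_name in WATER_NAMES:
            continue
        found.add((res_name, line[21], int(line[22:26])))  # columns 22, 23 - 26
    return sorted(found)


# Hypothetical records for illustration only; the coordinates are made up.
sample = [
    "ATOM      1  N   SER A 305      36.780 -16.046   0.284  1.00 75.45           N",
    "HETATM 4001  C1  DES A 600      30.000  -1.000  25.000  1.00 30.00           C",
    "HETATM 4100  O   HOH A 701      10.000  12.000  14.000  1.00 40.00           O",
]
print(ligand_residues(sample))
```

With a real file, pass `open("3ERD.pdb")` instead of `sample`. Residues listed here are the ones you would expect the new representation to highlight, although VMD decides what counts as protein from atom names rather than record types, so the two can differ for modified residues.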
View the protein sequence
To launch the Sequence Viewer, from the VMD Main window click
Extensions --> Analysis --> Sequence Viewer
The different color scales beside the sequence correspond to the B-factor and Secondary structure type (the major ones being Extended (beta) in yellow and Helix in purple).
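The same sequence can be recovered from the coordinate records themselves, which is a handy cross-check on what the Sequence Viewer displays. The sketch below is a simple approach under stated assumptions, not how VMD does it internally: it takes one residue per CA atom, handles only the 20 standard amino acids (anything else becomes X), and the name `chain_sequences` is made up for this example.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequences(pdb_lines):
    """Map each chain ID to the one-letter sequence of residues with a CA atom."""
    seen = set()
    residues = {}
    for line in pdb_lines:
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain_id = line[21]
        key = (chain_id, line[22:27])  # resSeq + iCode identifies the residue
        if key in seen:                # skip alternate locations of the same CA
            continue
        seen.add(key)
        residues.setdefault(chain_id, []).append(THREE_TO_ONE.get(line[17:20], "X"))
    return {chain: "".join(letters) for chain, letters in residues.items()}


# The first records of 3ERD shown earlier on this page.
sample = [
    "ATOM      1  N   SER A 305      36.780 -16.046   0.284  1.00 75.45           N",
    "ATOM      2  CA  SER A 305      36.130 -14.722   0.061  1.00 74.80           C",
    "ATOM      7  CA  LEU A 306      35.333 -13.947   3.703  1.00 70.98           C",
]
print(chain_sequences(sample))
```

Note that this gives only the residues that were resolved in the structure; residues missing from the coordinates will not appear.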
How to save the figure
Normally, you change the default background color to white by
Click Graphics --> Colors --> Display --> Background --> White
Also, you can turn off the axes indicator at the bottom by
Click Display --> Axes --> Off
Finally, you can save the structure view into a file by
Click File --> Render ... using "Tachyon (internal, in-memory rendering)"
The file should now be saved to the location you selected. Note that Tachyon (internal) is commonly used for publication-quality images.
Many advanced topics are available. Here is a simple one:
Two PDB files used for class demonstration
Cygwin: A Linux Simulator
Cygwin is one of the best options for providing functionality similar to a Linux distribution on Windows. Another good option is the NX-based NoMachine, if you would like to give it a try (just google NoMachine to learn more). If you use OS X, just use your terminal; there is no need for Cygwin.
Installation should be straightforward. The default installation does not seem to include SSH (needed to connect to the HPCC cluster we will use later). You can update the installation by re-running the installer (.exe file). During the installation, search for the package OPENSSH and make sure it is selected (instead of skipped), then finish the rest of the update process. You may need the package scp as well.
include openssh and scp
Cygwin runs just like any other Windows program.
Basic UNIX/LINUX commands
ls                  # list files/content in current directory
mkdir dirname       # make a new directory/folder
cd dir              # change directory to the directory 'dir'
pwd                 # print the current working directory on the screen
rm file             # delete (remove) 'file'
mv oldfile newfile  # rename oldfile to newfile
cat file            # print the contents of file to the screen
head filename       # display the first n lines of a file (the default is 10)
tail filename       # display the last n lines of a file (the default is 10)
more filename       # print file to the screen one screenful at a time
less filename       # view the contents of a file one page at a time, with more navigation
...
man command         # display the help information for the specified command
info command        # alternative way to display the help information
Linux Editors
Just as you would use Microsoft Word on Windows, you can choose an editor under LINUX/Cygwin. You can pick any one you are familiar with. The default editor that comes with the UNIX operating system is called vi (visual editor). [Alternate editors for UNIX environments include pico and emacs, a product of GNU.]
The UNIX vi editor is a full screen editor and has two modes of operation:
Command mode, in which commands cause action to be taken on the file, and
Insert mode, in which entered text is inserted into the file.
In the command mode, every character typed is a command that does something to the text file being edited; a character typed in the command mode may even cause the vi editor to enter the insert mode. In the insert mode, every character typed is added to the text in the file; pressing the <Esc> (Escape) key turns off the Insert mode.
While there are a number of vi commands, just a handful of these are usually sufficient for beginning vi users. A sampling of basic vi commands can be found at
Collecting CASE IDs
Now, we need a list of CASE IDs for the next class
Get an HPCC account via faculty Account
How to transfer files between HPCC and your local machine (say, your laptop).
APBS/VMD: Generating the electrostatic surface of the protein
By now you might have realized that electrostatics plays a very important role in docking molecules. The purpose of this optional exercise is to generate and view the electrostatic potential of HIV-1 protease and to see how a ligand interacts with this charged surface. As we found out earlier, PDB structures do not contain charge or radius information for the atoms in the structure. However, knowledge of charges is crucial for electrostatic calculations. Hence, we need to add this information to our PDB file. To do this we will use the Python script pdb2pqr as described below.
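Once a PQR file exists, a quick sanity check is to add up the per-atom charges and confirm that the total is close to an integer. The sketch below assumes the usual whitespace-delimited PQR layout, in which the last two fields of each ATOM/HETATM line are the charge and the radius; it will misread lines where coordinate columns run together. The function name `net_charge` and the sample lines (with made-up charges and radii) are for illustration only.

```python
def net_charge(pqr_lines):
    """Sum the charge field (second to last) over all ATOM/HETATM lines."""
    total = 0.0
    for line in pqr_lines:
        fields = line.split()
        if fields and fields[0] in ("ATOM", "HETATM"):
            total += float(fields[-2])  # the last field is the radius
    return total


# Hypothetical PQR lines; the charges and radii are invented for this example.
sample_pqr = [
    "REMARK   1 example only",
    "ATOM      1  N   SER   305      36.780 -16.046   0.284  0.5000 1.8000",
    "ATOM      2  CA  SER   305      36.130 -14.722   0.061 -0.2500 1.9000",
    "ATOM      3  C   SER   305      35.329 -14.305   1.291  0.7500 1.9000",
]
print(net_charge(sample_pqr))
```

A total far from a whole number usually suggests missing atoms or a residue that was not assigned parameters, which is worth fixing before running APBS.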
VMD has a simple plugin for computing and viewing electrostatic surfaces generated using a program called Adaptive Poisson-Boltzmann Solver (APBS). We will load our new PQR file into VMD and call the plugin extension to call APBS. Alternatively, we can use the APBS webserver at
Running MD simulations
How to perform a simple MD simulation and data analytics (more to come)
Advanced sampling methods and energy-landscape simulations (more to come)
Absolute binding free energy calculations (more to come)