Tutorial:Main Page

Revision as of 23:09, 11 October 2016 by Syang


(Please see the Yang lab website.)

Protein Data Bank (PDB)


The PDB is the most comprehensive repository of information on the 3D structures of biological molecules, including proteins and nucleic acids. The resource is powered by the Protein Data Bank archive: information about the 3D shapes of proteins, nucleic acids, and complex assemblies that helps students and researchers understand all aspects of biomedicine and agriculture, from protein synthesis to health and disease. As a member of the wwPDB, the RCSB PDB curates and annotates PDB data, and builds on the data by creating tools and resources for research and education in molecular biology, structural biology, computational biology, and beyond.

  http://www.rcsb.org    ---  A Structural View of Biology

Each 3D structure is saved as a .pdb file. For example, my favorite system is the estrogen receptor (PDB ID 3ERD), which we have been working on in the lab.


How to download a PDB file

There are quite a few options available.

  Click "Download Files" on the top-right corner  --> PDB format

Then save the file to any folder of your choice. Another simple option is

  Click "Display Files" on the top-right corner  --> PDB format,

then you will see the file content directly.

HEADER    NUCLEAR RECEPTOR                        31-MAR-99   3ERD              
ATOM      1  N   SER A 305      36.780 -16.046   0.284  1.00 75.45           N  
ATOM      2  CA  SER A 305      36.130 -14.722   0.061  1.00 74.80           C  
ATOM      3  C   SER A 305      35.329 -14.305   1.291  1.00 73.93           C  
ATOM      4  O   SER A 305      34.145 -13.980   1.190  1.00 73.69           O  
ATOM      5  CB  SER A 305      35.218 -14.791  -1.161  1.00 74.39           C  
ATOM      6  N   LEU A 306      35.983 -14.315   2.450  1.00 72.78           N  
ATOM      7  CA  LEU A 306      35.333 -13.947   3.703  1.00 70.98           C  
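For scripted work, the same file can also be fetched without the browser. The following is only a minimal sketch: it assumes the RCSB file server address https://files.rcsb.org/download/<ID>.pdb (not mentioned on this page) and a working internet connection, and the helper names pdb_url and download_pdb are hypothetical.

```python
import urllib.request

# Assumed address pattern of the RCSB file server (not taken from this page).
RCSB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id):
    """Build the download address for a 4-character PDB ID (hypothetical helper)."""
    return RCSB_DOWNLOAD_URL.format(pdb_id=pdb_id.upper())


def download_pdb(pdb_id, out_path=None):
    """Download one PDB entry and return the path of the saved file."""
    out_path = out_path or f"{pdb_id.upper()}.pdb"
    urllib.request.urlretrieve(pdb_url(pdb_id), out_path)
    return out_path


# Example call (needs network access):
# download_pdb("3ERD")   # saves 3ERD.pdb in the current folder
```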

How to read the PDB file

Details of the PDB file format can be found in the wwPDB file format documentation.


For example, the 'ATOM' section seen above is defined as follows:

COLUMNS        DATA TYPE     FIELD        DEFINITION
 1 -  6        Record name   "ATOM  "
 7 - 11        Integer       serial       Atom  serial number.
13 - 16        Atom          name         Atom name.
17             Character     altLoc       Alternate location indicator.
18 - 20        Residue name  resName      Residue name.
22             Character     chainID      Chain identifier.
23 - 26        Integer       resSeq       Residue sequence number.
27             AChar         iCode        Code for insertion of residues.
31 - 38        Real(8.3)     x            Orthogonal coordinates for X in Angstroms.
39 - 46        Real(8.3)     y            Orthogonal coordinates for Y in Angstroms.
47 - 54        Real(8.3)     z            Orthogonal coordinates for Z in Angstroms.
55 - 60        Real(6.2)     occupancy    Occupancy.
61 - 66        Real(6.2)     tempFactor   Temperature  factor.
77 - 78        LString(2)    element      Element symbol, right-justified.
79 - 80        LString(2)    charge       Charge  on the atom.
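Because every field sits in fixed columns, an ATOM line can be split by position rather than by whitespace. Below is a minimal Python sketch of that idea; the function name parse_atom_line is made up for illustration, and the example line is the second ATOM record shown earlier. Note that the column numbers in the table start at 1, while Python slices start at 0.

```python
def parse_atom_line(line):
    """Split one ATOM record into named fields using the column table above."""
    line = line.ljust(80)  # pad short lines so every slice is valid
    return {
        "record":     line[0:6].strip(),    # columns  1 -  6
        "serial":     int(line[6:11]),      # columns  7 - 11
        "name":       line[12:16].strip(),  # columns 13 - 16
        "altLoc":     line[16].strip(),     # column  17
        "resName":    line[17:20].strip(),  # columns 18 - 20
        "chainID":    line[21].strip(),     # column  22
        "resSeq":     int(line[22:26]),     # columns 23 - 26
        "iCode":      line[26].strip(),     # column  27
        "x":          float(line[30:38]),   # columns 31 - 38
        "y":          float(line[38:46]),   # columns 39 - 46
        "z":          float(line[46:54]),   # columns 47 - 54
        "occupancy":  float(line[54:60]),   # columns 55 - 60
        "tempFactor": float(line[60:66]),   # columns 61 - 66
        "element":    line[76:78].strip(),  # columns 77 - 78
        "charge":     line[78:80].strip(),  # columns 79 - 80
    }


# The CA atom of SER 305, written in pieces so the columns are easy to check.
example = (
    "ATOM      2  CA  SER A 305    "   # columns  1 - 30
    "  36.130 -14.722   0.061"         # columns 31 - 54
    "  1.00 74.80"                     # columns 55 - 66
    "           C  "                   # columns 67 - 80
)
atom = parse_atom_line(example)
print(atom["name"], atom["resName"], atom["resSeq"], atom["x"], atom["y"], atom["z"])
```

Splitting on whitespace instead would break as soon as two fields touch, which happens for large coordinates or four-character atom names.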

The 3D structure encoded in a PDB file can, thanks to its standardized format, be visualized by many different software packages, such as VMD, discussed below.

VMD: a powerful tool


VMD is designed for modeling, visualization, and analysis of biological systems such as proteins, nucleic acids, lipid bilayer assemblies, etc. It can also be used to view more general molecules, since VMD reads standard PDB files and displays the structure they contain. VMD provides a wide variety of methods for rendering and coloring a molecule: simple points and lines, CPK spheres and cylinders, licorice bonds, backbone tubes and ribbons, cartoon drawings, and others. VMD can be used to animate and analyze the trajectory of a molecular dynamics (MD) simulation. In particular, VMD can act as a graphical front end for an external MD program by displaying and animating a molecule undergoing simulation on a remote computer. More details can be found at the VMD website.


The aim here is to get you familiar enough with VMD, as quickly as possible, to view individual protein structures and the multi-structure trajectories produced by molecular dynamics and other simulation techniques. This document deliberately covers only the most basic features of VMD. Excellent tutorials covering the full range of the program's functionality can be found at the VMD website.

How to install VMD on your local laptop

You can download the file needed for your OS, but for the sake of saving time, you can click the link below. If you use Windows,

 click  win   

If you use macOS (OS X),

 click  mac

The Windows and OS X installations are largely self-explanatory. The small extra effort involved in a Linux installation is documented in the README file distributed with the program. Note that a copy of VMD is already installed on the computer cluster we are going to use later.

How to run VMD

To start the program VMD,

    OS X: Double click on the VMD icon in the Applications directory.
    Linux: Type vmd in a terminal window.
    Windows: Select Start -> Programs -> VMD.

On startup, VMD opens three windows: the Main, OpenGL Display, and Console (or Terminal on OS X) windows. To end a VMD session, go to the Main window and choose File -> Quit. You can also quit VMD by closing the Console or Main window.

How to load a PDB file into VMD

From the menu bar in the Main window,

   select File -> New Molecule

A window, called Molecule File Browser, should now open.

   select Browse --> pick 3ERD.pdb,

which you downloaded earlier.

   select it and click Open.

You should now be back in the Molecule File Browser window (the structure has not been loaded yet). To load the file, you need to

  click Load. 

The structure should now be loaded and shown in the OpenGL Display window.

Beautifying the protein structure

From the menu bar of the Main window,

  Select Graphics -> Representations

A new Graphical Representations window will open, with the current representation highlighted.

Under the "Draw style" tab, there are two main drop-down menus. One is

   changing the Coloring Method by choosing different options from the drop down menu

The other is

   altering the Drawing Method of the graphics used to display the molecule with the drop down menu. 

A good choice (and my favorite) is 'NewCartoon', which highlights the different elements of protein secondary structure.

Adding more Graphical Representations

From VMD Main window,

   Click Graphics --> Representations --> Create Rep

You are encouraged to explore different layers of representations using Atom Selections. For example, under Selected Atoms, enter the following text

 not protein and not water

Of course, you can pick a different color. You should now be able to see any bound ligand.
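To check outside VMD which residues such a selection might pick up, the PDB file can be scanned directly. The Python sketch below is only a rough stand-in for the VMD selection, not its actual implementation: it assumes that non-protein atoms are stored as HETATM records and that water residues are named HOH or WAT. The function name and the three demo lines (including the residue name LIG) are made up for illustration.

```python
# Assumed water residue names; VMD's own "water" keyword may cover more.
WATER_NAMES = {"HOH", "WAT"}


def ligand_residue_names(pdb_lines):
    """Return the sorted residue names of HETATM records that are not water."""
    names = set()
    for line in pdb_lines:
        if line.startswith("HETATM"):
            res_name = line[17:20].strip()  # columns 18 - 20, residue name
            if res_name not in WATER_NAMES:
                names.add(res_name)
    return sorted(names)


# Hypothetical demo lines: one protein atom, one ligand atom, one water atom.
demo = [
    "ATOM      1  N   SER A 305      36.780 -16.046   0.284  1.00 75.45           N",
    "HETATM    2  C1  LIG A 600       1.000   2.000   3.000  1.00 20.00           C",
    "HETATM    3  O   HOH A 701       4.000   5.000   6.000  1.00 30.00           O",
]
print(ligand_residue_names(demo))

# With a real file (assuming 3ERD.pdb is in the current folder):
# with open("3ERD.pdb") as handle:
#     print(ligand_residue_names(handle))
```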

View the protein sequence

To launch the Sequence Viewer, go to the VMD Main window and select

  Extensions --> Analysis --> Sequence Viewer

The different color scales beside the sequence correspond to the B-factor and Secondary structure type (the major ones being Extended (beta) in yellow and Helix in purple).
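The sequence can also be read straight from the coordinates. The Python sketch below collects one letter per residue from the CA atoms of each chain. It is an illustration under simple assumptions: only the 20 standard amino acids are mapped (anything else becomes X), and residues missing from the coordinates do not appear. The function name chain_sequences is hypothetical, and the demo lines are three ATOM records from the excerpt shown earlier.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequences(pdb_lines):
    """Return {chain ID: one-letter sequence} built from CA atoms."""
    sequences = {}
    seen = set()
    for line in pdb_lines:
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain = line[21]
        residue_key = (chain, line[22:27])  # residue number plus insertion code
        if residue_key in seen:             # skip alternate locations
            continue
        seen.add(residue_key)
        letter = THREE_TO_ONE.get(line[17:20].strip(), "X")
        sequences[chain] = sequences.get(chain, "") + letter
    return sequences


demo = [
    "ATOM      1  N   SER A 305      36.780 -16.046   0.284  1.00 75.45           N",
    "ATOM      2  CA  SER A 305      36.130 -14.722   0.061  1.00 74.80           C",
    "ATOM      7  CA  LEU A 306      35.333 -13.947   3.703  1.00 70.98           C",
]
print(chain_sequences(demo))
```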

How to save the figure

Normally, you change the default background color to white by

    Click Graphics --> Colors --> Display --> Background --> White

You can also hide the axes indicator at the bottom of the display by

    Click Display --> Axes --> Off

Finally, you can save the structure view into a file by

    Click File --> Render ... using "Tachyon (internal, in-memory rendering)"

The file should now be saved in the location you selected. Note that Tachyon (internal) is commonly used for publication-quality images.

Additional tutorials

Many advanced topics are available. Here is a simple one:

  VMD Tutorials

Two PDB files used for class demonstration

ER   ERwat

Cygwin: A Linux Simulator

Cygwin is one of the best options for getting functionality similar to a Linux distribution on Windows. Another good option is the NX-based NoMachine, if you would like to give it a try (search for NoMachine to learn more). If you use macOS, just use the built-in Terminal; there is no need for Cygwin.

Install Cygwin


Installation should be straightforward. The default installation does not seem to include SSH (needed to connect to the HPCC cluster we will use later). You can update the installation by running the setup .exe file again. During the installation, search for the package openssh and make sure it is selected (instead of skipped). SSH should work once you finish the rest of the update process. You may need scp as well (it is normally provided by the openssh package).

  include openssh and scp 

Run Cygwin

Start Cygwin just like any other Windows program.

Basic UNIX/LINUX commands

See the Unix primer (HTML, PDF). Below are some commonly used commands:

ls                   # list files/content in current directory
mkdir dirname        # make a new directory/folder
cd dir               # change directory to the directory 'dir'
pwd                  # print the current working directory on the screen
rm file              # delete (remove) 'file'
mv oldfile newfile   # rename (or move) oldfile to newfile
cat file             # print the contents of file to the screen
head filename        # Display the first n lines of a file (the default is 10)
tail filename        # Display the last n lines of a file (the default is 10)
more filename        # print file to the screen with more navigation
less filename        # View the contents of a file one page at a time
man  command         # Display the help information for the specified command
info  command        # Alternative to display the help information

Linux Editors

Just as you would use Microsoft Word on Windows, you can choose an editor under Linux/Cygwin. Pick any one you are familiar with. The default editor that comes with the UNIX operating system is called vi (visual editor). [Alternative editors for UNIX environments include pico and emacs, a product of GNU.]



The UNIX vi editor is a full screen editor and has two modes of operation:

        Command mode, in which commands cause actions to be taken on the file, and
        Insert mode, in which entered text is inserted into the file.

In the command mode, every character typed is a command that does something to the text file being edited; a character typed in the command mode may even cause the vi editor to enter the insert mode. In the insert mode, every character typed is added to the text in the file; pressing the <Esc> (Escape) key turns off the Insert mode.

While there are a number of vi commands, just a handful of these is usually sufficient for beginning vi users. A sampling of basic vi commands can be found at




Collecting CASE IDs

Now, we need a list of CASE IDs for the next class




Get an HPCC account via a faculty account

How to transfer files between HPCC and your local machine (say, your laptop).

How to transfer

APBS/VMD: Generating the electrostatic surface of the protein

By now you might have realized that electrostatics plays a very important role in docking molecules. The purpose of this optional exercise is to generate and view the electrostatic potential of HIV-1 protease and to see how a ligand interacts with this charged surface. As we found out earlier, PDB structures do not contain charge or radius information for the atoms in the structure. However, knowledge of the charges is crucial for electrostatic calculations, so we need to add this information to our PDB file. To do this, we will use the Python script pdb2pqr as described below.

VMD has a simple plugin for computing and viewing electrostatic surfaces generated using a program called Adaptive Poisson-Boltzmann Solver (APBS). We will load our new PQR file into VMD and call the plugin extension to call APBS. Alternatively, we can use the APBS webserver at

Running MD simulations

HowTo: a step-by-step tutorial



How to perform a simple MD simulation and data analytics (more to come)

Advanced sampling methods and energy-landscape simulations (more to come)

Absolute binding free energy calculations (more to come)