PDB Newsletter Number 16 -- Winter 2003
Published quarterly by the Protein Data Bank

Weekly PDB news is available online at http://www.rcsb.org/pdb/latest_news.html.
Links to PDB newsletters are available at http://www.rcsb.org/pdb/newsletter.html.

-----------------------------------------
SNAPSHOT -- January 1, 2003

19,623 released atomic coordinate entries

Molecule Type
17,673  proteins, peptides, and viruses
 1,123  nucleic acids
   809  protein/nucleic acid complexes
    18  carbohydrates

Experimental Technique
16,588  diffraction and other
 8,514  structure factor files
 3,035  NMR
 1,441  NMR restraint files

-----------------------------------------
TABLE OF CONTENTS

Message from the PDB
Data Deposition and Processing
PDB Deposition Statistics for 2002
PDB Focus: Sequence Prerelease
Data Query, Reporting, and Access
New PDB Mirror Site in Germany
Rsync Script for FTP Mirroring
Theoretical Models Search Interface on the PDB Beta Web Site
BioEditor Now Available from the PDB Web Site
PDB Web Site Statistics
PDB Outreach
PDB Paper Published in Bioinformatics
"Banking on Structures": BioIT World Article Looks at the PDB
PDB Art Exhibit Part of CCMB's Silver Jubilee Celebrations and Symposium
PDB Highlighted on New Jersey Network News
PDB Annual Report 2002 Now Available
PDB CD-ROM Set #102 and Subsequent Releases
PDB Focus: Educational Resources at the PDB
PDB Molecules of the Quarter: Dihydrofolate Reductase, Ferritin and Transferrin, and Cytochrome c
PDB Job Listings
Statement of Support
PDB Members

-----------------------------------------
MESSAGE FROM THE PDB

We wish all of our user community a very happy new year!
PDB members will be presenting talks at the Pacific Symposium on Biocomputing (January 3-7, Kauai, Hawaii) and at the Cambridge Healthtech Institute's PEP TALK (Human Proteome Project: plotting the course for proteomics, January 15-16, San Diego, CA). We will be exhibiting at the Biophysical Society's 46th Annual Meeting in San Antonio, TX (March 1-5, 2003) -- we hope you will stop by booth 204 and say hello.

The PDB recently published a paper in the Database Issue of Nucleic Acids Research:

J. Westbrook, Z. Feng, L. Chen, H. Yang, and H.M. Berman (2003): The Protein Data Bank and structural genomics. Nucl. Acids Res. 31, pp. 489-491.

This paper describes some of the resources available from http://www.rcsb.org/pdb/strucgen.html, including the target registration database, TargetDB.

DATA DEPOSITION AND PROCESSING

PDB DEPOSITION STATISTICS FOR 2002

In 2002, 3,381 structures were deposited to the PDB and processed by teams at RCSB-Rutgers, Osaka University, and the European Bioinformatics Institute. Of the structures deposited, 73% were deposited with a release status of "hold until publication"; 18% were released as soon as annotation of the entry was complete; and 9% were held until a particular date. 80% of these entries were determined by X-ray crystallographic methods; 13% were determined by NMR methods.

PDB FOCUS: SEQUENCE PRERELEASE

PDB depositors are given the opportunity to prerelease a sequence in advance of the coordinates. Sequence prerelease is the default setting in ADIT. From the PDB status search at http://www.rcsb.org/pdb/status.html, users may query all available sequences, or query based on criteria such as title or deposition date. This feature was developed in response to requests made to the PDB. It is hoped that the prerelease of sequence data will prevent unintended duplication of effort in structure determination and will allow users to conduct blind tests of structure prediction and modeling techniques.
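
As a rough check on the 2002 deposition statistics reported above, the percentages can be converted into approximate entry counts. The sketch below is illustrative only: the function and variable names are hypothetical, and because the percentages are already rounded to whole numbers, the resulting counts are estimates rather than figures published by the PDB.

```python
def approximate_counts(total, percentages):
    """Convert whole-number percentages of `total` into rounded entry counts."""
    return {label: round(total * pct / 100) for label, pct in percentages.items()}


# Figures quoted in the deposition statistics for 2002.
TOTAL_DEPOSITED_2002 = 3381

RELEASE_STATUS_PCT = {
    "hold until publication": 73,
    "released when annotation was complete": 18,
    "held until a particular date": 9,
}

METHOD_PCT = {
    "X-ray crystallography": 80,
    "NMR": 13,
}

if __name__ == "__main__":
    for title, table in (("Release status", RELEASE_STATUS_PCT),
                         ("Method", METHOD_PCT)):
        print(title)
        for label, count in approximate_counts(TOTAL_DEPOSITED_2002, table).items():
            print(f"  about {count:5d}  {label}")
```

Under these assumptions, roughly 2,470 of the 3,381 depositions were placed on hold until publication, and roughly 2,700 were determined by X-ray crystallography.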

DATA QUERY, REPORTING, AND ACCESS

NEW PDB MIRROR SITE IN GERMANY

A new RCSB PDB mirror site has been established at the Max Delbruck Center for Molecular Medicine in Berlin, Germany. This site is now accessible at http://www.pdb.mdc-berlin.de/. A complete list of the RCSB PDB's eight worldwide mirror sites can be found at http://www.rcsb.org/pdb/mirrors.html.

RSYNC SCRIPT FOR FTP MIRRORING

An rsync script, rsyncPDB.sh, has been made available at ftp://ftp.rcsb.org/pub/pdb/software/. This script assists users in setting up their own local mirrors of the PDB FTP site. Before successfully running it, users will need to set three variables in rsyncPDB.sh to suit their local setup. The rsync script may be preferred to the mirror.pl script that was previously recommended for local mirroring of the FTP site. Questions about the rsync script may be sent to info@rcsb.org.

THEORETICAL MODELS SEARCH INTERFACE ON THE PDB BETA WEB SITE

A simple search interface for theoretical model structures is now available from the PDB Beta Web Site at http://beta.rcsb.org/pdb/cgi/models.cgi. This interface facilitates searches by PDB ID, author, and compound, and is accessible from the top of the SearchLite and SearchFields interfaces. Feedback on this new feature may be sent to info@rcsb.org.

BIOEDITOR NOW AVAILABLE FROM THE PDB WEB SITE

BioEditor, a program for creating and viewing structure presentations, is now accessible from the PDB Web site at http://www.rcsb.org/pdb/education.html#Other and http://www.rcsb.org/pdb/software-list.html#Graphics.

BioEditor (http://bioeditor.sdsc.edu/) is a tool to bridge the gap between printed literature and current Web-based presentation formats for macromolecular structures. It is a standalone Windows application that can be used to prepare and present structure annotations containing formatted text, graphics, sequence data, and interactive molecular views -- all in a single document or set of documents.
BioEditor facilitates the communication of structure data to a diverse audience by allowing users to create and view dynamic content in a uniform format that can be widely distributed through the internet. BioEditor is designed to be used by structural scientists reporting and evaluating their data, as well as by educators and students who are seeking to relate structure to function in biological macromolecules.

The BioEditor application includes features that enable the user to enter data, images, and references. Many features also link directly to resources on the internet. Complete BioEditor documentaries on the PDB structures of a zinc binuclear cluster and the glyceraldehyde 3-phosphate dehydrogenase (GAPDH) enzyme are accessible from the Protein Documentaries section of PDB's Education Page at http://www.rcsb.org/pdb/education.html#Documentaries.

PDB WEB SITE STATISTICS

The PDB is available from several Web and FTP sites located around the world. Users are also invited to preview new features at the PDB beta test site, accessible at http://beta.rcsb.org/pdb/. The access statistics are given below for the main PDB Web site at http://www.pdb.org/.

Access Statistics for www.pdb.org

..........Daily Average................Monthly Totals................
Month.....Hits......Files....Sites.....KBytes......Files......Hits...
Dec 02...159532....116284....78703....123057553....3488526....4785986
Nov 02...192425....147563....96273....169632466....4426891....5772772
Oct 02...187356....143130...101117....149761561....4437049....5808040

PDB Web Mirrors

SDSC/UCSD (US)                                 http://www.pdb.org/
Rutgers (US)                                   http://rutgers.rcsb.org/
CARB/NIST (US)                                 http://nist.rcsb.org/
CCDC (UK)                                      http://pdb.ccdc.cam.ac.uk/
National University of Singapore               http://pdb.bic.nus.edu.sg/
Osaka University (Japan)                       http://pdb.protein.osaka-u.ac.jp/
Universidade Federal de Minas Gerais (Brazil)  http://www.pdb.ufmg.br/
Max Delbruck Center (Germany)                  http://www.pdb.mdc-berlin.de/pdb/

PDB OUTREACH

PDB PAPER PUBLISHED IN BIOINFORMATICS

The September 2002 issue of Bioinformatics contains a paper describing the Ontology Driven Architecture that was developed to increase efficiency in the use of macromolecular structure (MMS) data contained in the PDB. The paper describes the OpenMMS Toolkit, a suite of software tools that conform to the OMG/LSR Corba standard (OMG specification formal/02-05-01). The OpenMMS Toolkit is based on a metamodel architecture and is written entirely in Java. It contains an mmCIF parser, reference and database Corba servers, a relational database loader and a prototype XML formatted file converter. Detailed documentation and all of the source code are available at http://openmms.sdsc.edu/.

D.S. Greer, J.D. Westbrook, P.E. Bourne (2002): An ontology driven architecture for derived representations of macromolecular structure. Bioinformatics, 18(9), pp. 1280-1281. © Oxford University Press.

"BANKING ON STRUCTURES": BioIT WORLD ARTICLE LOOKS AT THE PDB

The November 2002 issue of BioIT World examines all aspects of the PDB in "Banking on Structures" by Tracy Smith Schmidt at http://www.bio-itworld.com/archive/100902/banking.html.

PDB ART EXHIBIT PART OF CCMB'S SILVER JUBILEE CELEBRATIONS AND SYMPOSIUM

The PDB's "Art of Science" exhibit was part of the Centre for Cellular and Molecular Biology's (CCMB) Silver Jubilee Celebrations and Symposium on "The Current Excitement in Biology" in Hyderabad, India. The exhibit will be on display at the CCMB campus's main building from November 15, 2002 through February 28, 2003. The Symposium (http://www.ccmb.res.in/symposium/) was held on November 24-29, 2002.

The PDB would like to see this exhibit travel to other places. If you would be interested in sponsoring this exhibit at your institution, please let us know at info@rcsb.org.

PDB HIGHLIGHTED ON NEW JERSEY NETWORK NEWS

A segment highlighting the PDB was shown as part of the New Jersey Network (NJN) News on Friday, December 27, 2002. The show interviewed PDB Director Helen M. Berman and provided a glimpse of the PDB offices at Rutgers University.

PDB ANNUAL REPORT 2002 NOW AVAILABLE

The PDB Annual Report 2002 is now being distributed. This document features a detailed look at the third full year of the operation of the PDB by Rutgers, the State University of New Jersey; the San Diego Supercomputer Center at the University of California, San Diego; and the Center for Advanced Research in Biotechnology of the National Institute of Standards and Technology -- three members of the RCSB. It highlights accomplishments during this period, from July 1, 2001 through June 30, 2002, and describes future developments of the PDB resource.

Requests for printed copies of the PDB Annual Report can be sent to AnnualReport@rcsb.org. This document is also available on-line in PDF format at http://www.rcsb.org/pdb/annual_report02.pdf.

PDB CD-ROM SET #102 AND SUBSEQUENT RELEASES

Issue 102 of the PDB CD-ROM sets has been shipped. With this release, the coordinate files of 18,796 experimentally determined structures from the PDB FTP site as of October 1, 2002 are contained on a 5 CD-ROM set.
The theoretical structures are also included in this set in a models directory. The experimental data (X-ray structure factors and NMR constraints) are available as a separate CD-ROM set. For additional information and ordering instructions refer to the CD-ROM page at http://www.rcsb.org/pdb/cdrom.html.

Beginning with the 2003 releases, a copy of the full coordinate archive will be sent in January, followed by incremental releases in April, July, and October. These incremental releases will include all files released and updated since the prior CD-ROM set. A list of files that have become obsolete since the last update will be sent with each release so subscribers can update their set of structures. New subscribers will receive the January release and all subsequent updates.

The experimental data (NMR constraints and X-ray structure factors) CD-ROMs will be handled in the same manner as the structures: a complete set in January and incremental updates for the three subsequent quarters.

PDB FOCUS: EDUCATIONAL RESOURCES AT THE PDB

The "Educational Resources" page and the "Molecule of the Month" series are two important resources that the PDB maintains to serve the user community.

Each month, Dr. David S. Goodsell (The Scripps Research Institute) highlights a key biological molecule with illustrations and text designed for a general audience. These features are available from the PDB home page, and are archived at http://www.rcsb.org/pdb/molecules/molecule_list.html.

The PDB Educational Resources page at http://www.rcsb.org/pdb/education.html compiles molecular biology resources for audiences ranging from elementary level students to undergraduates to the general public.
Protein and nucleic acid tutorials, PDB articles and animations, protein documentaries, the World Index of Molecular Visualization Resources, and an illustrated glossary of crystallographic and NMR terminology are a few of the resources linked to from this page. Suggestions for additions to this page are appreciated and can be sent to info@rcsb.org.

PDB MOLECULES OF THE QUARTER: DIHYDROFOLATE REDUCTASE, FERRITIN AND TRANSFERRIN, AND CYTOCHROME C

The Molecule of the Month series explores the functions and significance of selected biological macromolecules for a general audience. These features, written and illustrated by Dr. David S. Goodsell of the Scripps Research Institute, are available at http://www.rcsb.org/pdb/molecules/molecule_list.html. The molecules featured during this past quarter are summarized below:

Dihydrofolate Reductase: A Target in the Fight Against Cancer

October, 2002 -- Dihydrofolate reductase is a small enzyme that plays a supporting but essential role in the building of DNA and other processes. It manages the state of folate, a snaky organic molecule that shuttles carbon atoms to enzymes that need them in their reactions. Of particular importance, the enzyme thymidylate synthase uses these carbon atoms to build thymine bases, an essential component of DNA. After folate has released its carbon atoms, it has to be recycled. This is the job performed by dihydrofolate reductase.

Dihydrofolate reductase, shown in PDB entry 7dfr, juggles two relatively large molecules in its reaction. It has a long groove that binds to folate at one end, and to NADPH at the other end. The protein wraps sidechains around the two molecules, positioning them tightly next to one another. Then, the enzyme transfers hydrogen atoms from NADPH to the folate, converting folate to a useful reduced form.

Enzymes with essential roles are sensitive targets for drug therapy.
Dihydrofolate reductase was the first enzyme to be targeted for cancer chemotherapy. The first drug used for cancer chemotherapy was aminopterin. It binds to dihydrofolate reductase a thousand times more tightly than folate, blocking the action of the enzyme. Today, methotrexate and other variations on aminopterin are used, because of their tighter binding and better clinical characteristics. Since these drugs attack a key step in the production of DNA, they tend to kill cells that are actively growing rather than cells that are not growing. Since cancer cells are often the most rapidly reproducing cells in a patient, the drug will have the strongest effect on the cancer cells. The side effects of chemotherapy, however, result from the drug's effect on other normally growing tissues, such as hair follicles and the lining of the stomach.

Further information about dihydrofolate reductase can be found at http://www.rcsb.org/pdb/molecules/pdb34_2.html.

Ferritin and Transferrin: Iron Storage and Transport

November, 2002 -- Iron is found everywhere on the Earth, so it is no surprise that living cells use iron ions in many ways. We use iron throughout our body, for many tasks. Iron ions bind strongly and specifically to small molecules such as oxygen, making iron an essential tool for manipulating these elusive molecules. Iron ions also cycle easily between the ferrous and ferric forms, providing a handy tool for manipulating individual electrons.

Iron ions, however, pose a great challenge in our modern biological environment. The water filling cells and the oxygen in the air together conspire to convert iron ions to the ferric state, which is highly insoluble, forming rust-like oxides. The cell must somehow shelter iron ions so that they may be stored and delivered in the necessary quantities. This is the job of ferritin and transferrin.

Inside cells, extra iron ions are locked safely in the protein shell of ferritin, shown in PDB entry 1fha.
Ferritin is composed of 24 identical protein subunits that form a hollow shell. After entering the ferritin shell, iron ions are converted into the ferric state, where they form small crystallites along with phosphate and hydroxide ions. There is room to pack about 4500 iron ions inside.

We have about 3.7 grams of iron in our body, painstakingly gathered from iron in our diet. About 2.5 grams are locked inside the hemoglobin in our blood, where they assist in the transport of oxygen. This is a valuable and essential resource, so special mechanisms for the recycling of this iron have been developed. Another few tenths of a gram are found in myoglobin, which also assists in oxygen management. A remarkably small amount--about 0.02 g--is distributed between the many different proteins that transfer electrons, such as the proteins of the electron transport chain that create most of our cellular ATP supplies. The rest, about a gram, is stored inside ferritin to fulfill future needs.

Iron ions are delivered in the blood by the protein transferrin, shown in PDB entry 1h76. Each transferrin molecule can carry two iron ions, with each ion coupled with a carbonate ion. The protein contains an array of amino acids that are perfectly arranged to form four bonds to the iron ion, which locks it in place. Once it finds its iron atoms, transferrin flows through the blood until it finds a transferrin receptor on the surface of a cell. PDB entry 1cx8 contains coordinates for the part of the receptor that is outside the cell. Transferrin binds tightly to the receptor and is drawn into the cell in a small vesicle. The cell then acidifies the inside of this little pocket, which causes transferrin to release its iron. Then, the receptor and empty transferrin are recycled back to the outside of the cell. Triggered by the neutral pH of the blood, the receptor releases the empty transferrin, and it continues its job of gathering iron.

Further information about ferritin and transferrin can be found at http://www.rcsb.org/pdb/molecules/pdb35_2.html.

Cytochrome c: Delivering Electrons

December, 2002 -- Electricity is a common phenomenon in our modern world, powering everything from the lights in your room to the computer in front of you. Electricity is the flow of electrons within a conductive material, such as metal wires. These electrons flow in bulk, meandering from atom to atom along the wire. Cells also use electricity to power many processes, but the electrons move in a very different way. The electrons do not flow smoothly along a cell-sized wire. Instead, electrons are transported one at a time, jumping from protein to protein. In this way, the electrons may be picked up from one particular place and delivered exactly where they are needed.

Cytochrome c, shown in PDB entry 3cyt, is a carrier of electrons. Like many proteins that carry electrons, it contains a special prosthetic group that handles the slippery electrons. Cytochrome c contains a heme group with an iron ion gripped tightly inside. The iron ion readily accepts and releases an electron. The surrounding protein creates the perfect environment for the electron, tuning how tightly it is held. The protein also determines where cytochrome c fits into the overall cellular electronic circuit.

Cytochrome c is an ancient protein, developed early in the evolution of life. Since this essential protein performs a key step in the production of cellular energy, it has changed little in millions of years. So, you can look into yeast cells or plant cells or our own cells and find a very similar form of cytochrome c. If you look around the PDB, however, you can find a diverse collection of other electron carrier molecules. There are many variations on cytochrome c, which use heme and iron to carry electrons, but change the protein surrounding them to perform different jobs.
Other carriers use other prosthetic groups to carry electrons, such as clusters of iron and sulfur (such as ferredoxin), brilliant blue copper ions (such as azurin and plastocyanin) or more exotic metal ions. Like cytochrome c, each of these proteins is a single connection in a cellular electronic circuit, transferring electrons from one point to another.

Further information about cytochrome c can be found at http://www.rcsb.org/pdb/molecules/pdb36_2.html.

PDB JOB LISTINGS

PDB career opportunities are posted at http://www.rcsb.org/pdb/jobs.html. The currently available openings are:

SYSTEMS AND APPLICATIONS PROGRAMMER

The Protein Data Bank at Rutgers University has a position open for an applications programmer to support and develop software for data processing operations at the Protein Data Bank. Programming areas include: macromolecular structure analysis and validation, molecular graphics, web application development, distributed object and relational database applications, and general scientific programming. Experience developing and maintaining object oriented software on UNIX platforms is required. Experience in the following is highly desirable: C/C++, JAVA, and Corba. Please send resume to Dr. Helen Berman at pdbjobs@rcsb.rutgers.edu.

BIOCHEMICAL INFORMATION SPECIALIST

The Protein Data Bank at Rutgers University has a position open for a Biochemical Information Specialist to curate and standardize macromolecular structures for the Protein Data Bank. A background in biological chemistry, as well as some experience with UNIX-based computer systems, is required. Experience in crystallography and/or NMR spectroscopy is a strong advantage. The successful candidate should be well-motivated, able to pay close attention to detail, and able to meet deadlines. This position offers the opportunity to participate in an exciting project with significant impact on the scientific community. Please send resume to Dr. Helen Berman at pdbjobs@rcsb.rutgers.edu.

ADMINISTRATIVE SUPPORT

The Protein Data Bank at Rutgers University has a position open for Administrative Support. General office support including, but not limited to calendar maintenance, filing, phones, typing general correspondence. Proficiency in word processing and database applications as well as usage and creation of spreadsheets utilizing MS Office software. Requires excellent organizational and communication skills. General accounting experience a plus. Please send resume to Dr. Helen M. Berman at pdbjobs@rcsb.rutgers.edu.

-----------------------------------------
STATEMENT OF SUPPORT

The PDB is supported by funds from the National Science Foundation, the Office of Biological and Environmental Research at the Department of Energy, and two units of the National Institutes of Health: The National Institute of General Medical Sciences and the National Library of Medicine, in addition to resources and staff made available by the host institutions.

-----------------------------------------
The PDB is managed by three partner sites of the Research Collaboratory for Structural Bioinformatics:

RUTGERS
Rutgers, The State University of New Jersey
Department of Chemistry and Chemical Biology
610 Taylor Road
Piscataway, NJ 08854-8087

SDSC/UCSD
San Diego Supercomputer Center
University of California, San Diego (UCSD)
9500 Gilman Drive
La Jolla, CA 92093-0537

CARB/NIST
Center for Advanced Research in Biotechnology
National Institute of Standards and Technology
100 Bureau Drive
Gaithersburg, MD 20899-8314

PDB PROJECT TEAM LEADERS

Dr. Helen M. Berman - Director, Rutgers University, berman@rcsb.rutgers.edu
Dr. Philip E. Bourne - Co-Director, SDSC/UCSD, bourne@sdsc.edu
Judith Flippen-Anderson - Production Coordinator, Rutgers University, flippen@rcsb.rutgers.edu
Dr. Gary Gilliland - Co-Director, CARB/NIST, gary.gilliland@nist.gov
Dr. John Westbrook - Co-Director, Rutgers University, jwest@rcsb.rutgers.edu

PDB MEMBERS

RUTGERS
Anthony Adelakun             anthony@rcsb.rutgers.edu
Kyle Burkhardt               kburkhar@rcsb.rutgers.edu
Li Chen                      lchen@rcsb.rutgers.edu
Sharon Cousin                sharon@rcsb.rutgers.edu
Dr. Shuchismita Dutta        sdutta@rcsb.rutgers.edu
Dr. Zukang Feng              zfeng@rcsb.rutgers.edu
Dr. Rachel Kramer Green      kramer@rcsb.rutgers.edu
Dr. Shri Jain                sjain@rcsb.rutgers.edu
Richard Kreuter              richard@rcsb.rutgers.edu
Dr. Rose Oughtred            rose@rcsb.rutgers.edu
Gnanesh Patel                gnanesh@rcsb.rutgers.edu
Irina Persikova              persikova@rcsb.rutgers.edu
Tania Rose Posa              tania@rcsb.rutgers.edu
Suzanne Richman              richman@rcsb.rutgers.edu
Dr. Bohdan Schneider         bohdan@rcsb.rutgers.edu
Christine Zardecki           zardecki@rcsb.rutgers.edu

SDSC/UCSD
David Archbell               dave@sdsc.edu
Dr. Peter Arzberger          parzberg@sdsc.edu
Bryan Banister               bryan@sdsc.edu
Tammy Battistuz              tammyb@sdsc.edu
Dr. Wolfgang Bluhm           wbluhm@sdsc.edu
Dr. Nita Deshpande           nita@sdsc.edu
Dr. Ward Fleri               ward@sdsc.edu
Dr. Douglas S. Greer         dsg@sdsc.edu
Jeff Ott                     jott@sdsc.edu
David Padilla                dpadilla@sdsc.edu
Thomas Solomon               tsolomon@sdsc.edu
Wayne Townsend-Merino        wayne@sdsc.edu
Peggy Wagner                 wagner@sdsc.edu

CARB/NIST
Dr. T.N. Bhat                bhat@nist.gov
Phoebe Fagan                 phoebe.fagan@nist.gov
Jeremy Kohansimeh            jkohansi@nist.gov
Dr. Veerasamy Ravichandran   vravi@nist.gov
Michael Tung                 michael.tung@nist.gov
Dr. Greg Vasquez             gregory.vasquez@nist.gov
Padma Priya Paragi Vedanthi  padmap@nist.gov