Dear all, greetings of the day!
"Bioinfo Consortium - India" is a non-profit student organisation. It helps students of biotechnology, bioinformatics, microbiology and biochemistry find training in institutes and in industry. It also helps students with job hunting and supports those looking for higher studies in India and abroad. This cannot be done by a single person; we need your help. So I request everyone to help me by sending information about training, jobs, education, research opportunities and other useful resources to biomaticsindia@googlegroups.com or biomaticsindia@yahoo.com. The information will be displayed on this site with your name, photo and address, so please send your profile and photo along with it. If you have received this URL, please forward it to your friends in India and abroad.
Thanks in advance for your contributions.
With regards,
Rajesh kumar.R
Founder-Biomatics
India
www.rajeshkumar.tk
Thursday, April 19, 2007
Thursday, December 14, 2006
Learn Bioinformatics for Free
BLAST
BLAST (NCBI)
BLAST Course (NCBI)
Blast 2 Sequences
BLAT Search
BLAST NCBI dbSNP
BLAST (EBI)
PSI/PHI BLAST
BLAST Mouse Genome
Genome MegaBlast
High Throughput BLAST Based on Web Services
BLAST ++ (Batch BLAST)
BioSearches
UCSC Genome Browser
Ensembl Genome Browser
LocusLink
AceView
OMIM
GenBank sequences
GenBank proteins
PubCrawler Results
Medline
NCBI PubMed
Google Scholar
Open J-Gate (Open Access Journals)
Pathways
Kyoto Encyclopedia of Genes and Genomes
ExPASy - Biochemical Pathways
CGAP - Pathways
Links to Pathway and Other Databases
The SEED
H. Pylori metabolic pathways
Pathway-specific SuperArrays
Repair FunMap
Gene Finders
GeneQuiz II
BCM Gene Finder
Proteins
NPS@: Network Protein Sequence @nalysis
BCM Protein secondary structure prediction
BCM General Protein Sequence/Pattern Searches
ProDom
Blocks (protein families)
ExPASy - PROSITE
ModBase (3D protein structure models)
3Motif
The PredictProtein server
PSORT II Protein Domain Prediction
PFAM - Protein FAMilies
DEAMBULUM : Analyse d'une sequence proteique
DRAWHCA
Multiple Alignments
ClustalW (EMBNet)
ClustalW (EBI)
LALIGN (EMBNet)
BCM Search Launcher: Multiple Sequence Alignm...
PipMaker and MultiPipMaker
Vista
GDB
U.S.A.
Australia
Australia 2
Germany
Israel
Netherlands
Taiwan
U.K.
SeWeR
Entrez Map View
Cancer Genome Anatomy
Phase Online
Celera - Human Genome Publication Site
PubGene
Software Tools - TIGR
Genome Project Links
TheScientificWorld
BioBencHelper
LabOnWeb
Promoter Analysis etc.
Metric Map of the Human Genome
UDB, The Unified Database
US Patent Database
NIX - Nucleotide identify X
SAM - Sequence Alignment and Modeling (UCSC)
BCM Search Launcher
DNA Block Alignment
Fasta3 (EBI)
SRS-FastA
Accelrys GCG
DBGET
Human Gene Mutation Database
HOVERGEN
Familial Cancer Risk Counseling & Genetic Testing Information Search Form
GenAtlas
Genetic Maps
GeneCards
Biolinks
Glovar Genome Server
The Catalog of Databases
Molecular Biology Database List
Bioinformatics
MacroShack
Nature Special on Computational Biology
Links to MolBio Tools
Pedro's Research Tools (de)
Pedro's Research Tools (us)
Webcutter 2.0
REBsites
REBASE
Sequence Gazing
The Swiss Institute of Bioinformatics
Bioinformatic tutorials
DISguISE
CCP11 - Biosequences and Structure Analysis
VSNS Biocomputing Division
Hidden Markov models for sequence analysis: extension and analysis of the basic method
SAM: Sequence Alignment and Modeling System
GenomePixelizer
Splice Junction Analysis Service
Individual Information Theory and Walkers
UK-HGMP
Virtual Genome Center
CSC BioBox
GeneWindow
PolyPhen
SLIDER
WebLogo
Energy Normalized Logo
SNPs
dbSNP/NCBI
Frequency Finder 2.0
SNPper
SNP 500
HapMap
Perlegen Genotype Browser
Cohort Consortium SNPs (login & pswd:FOCUS)
HGVBASE (se)
HGVBASE (de)
The Canvas
The Canvas 2
UW-FHCRC Variation Discovery Resource
Environmental SNPs (NIEHS)
JSNPs (Nakamura)
Cytokine Gene Polymorphism
CGAP-GAI SNP Finder (ID: 48196 pw:dcox)
ALFRED
The SNP Consortium Home Page
PerkinElmer Life Sciences > SNP Database
MIT-WI SNP database
Whitehead cSNPs (at WI)
CWRU Hypertension cSNPs
Kidd's lab
SNP genotyping by Amplifluor probes
Fluorescence Polarization
DASH : Dynamic Allele Specific Hybridization
A Resource for Discovering Human DNA Polymorphisms
The Sanger Centre : Polymorphisms
Mouse SNPs (Roche)
MIT-WI mouse SNPs
Applied Biosystems Store
PolyPhen non-syn SNP function
Database of Genomic Variants (CNP)
Scientific Journals
Journals Impact Factors
Web of Science
IARC Journals online
Mulford Library: Instructions to Authors. Wel...
Ovid
Nature
Nature
Nature Genetics
Nature Medicine
Nature Biotechnology
Nature Omics Gateway
Science
Cell
PNAS
New England Journal of Medicine
JNCI
Lancet
Genome Research
BioMed Central
GenomeBiology
The Pharmacogenomics Journal
Bioinformatics
BioTechniques
The Journal of Clinical Investigation
Cancer Research
Cancer Epidemiology Biomarkers & Prevention
The American Journal of Human Genetics
Wiley Personal Home Page for F. Canzian
Human Mutation
Oncogene
The Journal of Clinical Endocrinology & Metabolism
American Journal of Medical Genetics
JAMA
Pharmacogenetics
FEBS Letters
Annals of Improbable Research
European Journal of Endocrinology
Genetics
Genes & Development
FASEB Journal
Nucleic Acids Research
Human Molecular Genetics
Carcinogenesis
Genomics
BioMedNet Library
Journal of Molecular Endocrinology
Molecular Endocrinology
S. KARGER AG, BASEL - Journals INDEX
Elsevier Press
Physiological Genomics
Gastroenterology
Oligos
Oligo Analyzer
Calculating Tm
Genosys Oligo Calculations
Tm Determination
Oligo Calculator (us)
Oligo Calculator (au)
NetPrimer
Primer3
Primer Selection (Text)
DoPrimer
Robotics
Yahoo! Groups : lrig-discussion
ELRIG Home
Matrix Technologies (Tango, Hydra)
ORCA Robot - The Optimized Robot for Chemica...
Zymark Corporation - Genomics
Packard: Liquid Handling
CRS Robotics
Lab Services
MWG Biotech Robots
Robot Manufacturers
Tecan Group Ltd. - Aquarius™
Caliper Life Sciences : Sciclone ALH 3000 Workstation
Perkin Elmer Liquid Handling 96- and 384-Tip
EST & X-pression Genomix
Virtual Northern (example IL1B)
HUGE protein database (Kazusa)
ESTblast
U-Wash
I.M.A.G.E.
NCBI dbEST
UniGene
GeneMap'98
XREF Home Page
TIGR Human cDNA Mapping
DNA Chips
ArrayExpress
Hoheisel's lab
BODYMAP - INDEX
QIAGEN - Oligo Microarray Database
BD Biosciences Clontech - Custom array builder
SuperArray - home of the application specific...
Genome Centers
MIT - WI
CGR/KI
Eesti Geenikeskus
Kazusa Research Institute
TIGR
Sanger
UWash
Stanford
Baylor
LLNL BBRP
GSDB
Genome Institute of Singapore
Cancer Centers
GLOBOCAN 2000
EUCAN
NCI-DCEG
Gan Center
TeleSCAN: European Cancer Organisations
TeleSCAN: Cancer Institutions in Europe
ICRF
Italian National Cancer Institute
Hawaii Cancer Research Center
CancerWEB
Syöpä
Other Centers
Institute for Systems Biology
IAB Institute for Advanced Biosciences Keio...
Former Excoffier Lab
Marshfield Medical Research Foundation
TIGEM
Institut Pasteur
MPIMG - Lehrach
Oxford/Wellcome
EMBL
Genethon
Discovery Institute
Monday, November 27, 2006
PHYLIP programs and documentation
PHYLIP, the PHYLogeny Inference Package, consists of 35 programs. There are documentation files for each program, in the form of web pages in HTML 3.2. There are also documentation web pages for each group of programs, and a main documentation file that is the basic introduction to the package. Before running any of the programs you should read it.
Below you will find a list of the programs and the documentation files. The names of the documentation files are highlighted as links that will take you to those documentation files.
Introduction to PHYLIP
main documentation file
Molecular sequence methods
molecular sequence programs documentation file
protpars
protein parsimony documentation file
dnapars
DNA sequence parsimony documentation file
dnapenny
DNA parsimony branch and bound documentation file
dnamove
interactive DNA parsimony documentation file
dnacomp
DNA compatibility documentation file
dnaml
DNA maximum likelihood documentation file
dnamlk
DNA maximum likelihood with clock documentation file
proml
Protein sequence maximum likelihood documentation file
promlk
Protein sequence maximum likelihood with clock documentation file
dnainvar
DNA invariants documentation file
dnadist
DNA distance documentation file
protdist
Protein sequence distance documentation file
restdist
Restriction sites and fragments distances documentation file
restml
Restriction sites maximum likelihood documentation file
seqboot
Bootstrapping/Jackknifing documentation file
Distance matrix methods
Distance matrix programs documentation file
fitch
Fitch-Margoliash distance matrix method documentation file
kitsch
Fitch-Margoliash distance matrix with clock documentation file
neighbor
Neighbor-Joining and UPGMA method documentation file
Gene frequencies and continuous characters
Continuous characters and gene frequencies documentation file
contml
Maximum likelihood continuous characters and gene frequencies documentation file
contrast
Contrast method documentation file
gendist
Genetic distance documentation file
Discrete characters methods
Discrete characters methods documentation file
pars
Unordered multistate parsimony documentation file
mix
Mixed method parsimony documentation file
penny
Branch and bound mixed method parsimony documentation file
move
Interactive mixed method parsimony documentation file
dollop
Dollo and polymorphism parsimony documentation file
dolpenny
Dollo and polymorphism branch and bound parsimony documentation file
dolmove
Dollo and polymorphism interactive parsimony documentation file
clique
0/1 characters compatibility method documentation file
factor
Character recoding program documentation file
Tree drawing, consensus, tree editing, tree distances
Tree drawing programs documentation file
drawgram
Rooted tree drawing program documentation file
drawtree
Unrooted tree drawing program documentation file
consense
Consensus tree program documentation file
treedist
Tree distance program documentation file
retree
interactive tree rearrangement program documentation file
Tuesday, November 21, 2006
Free Bioinformatics tools
Arka
Description
Arka is a program that serves as a graphical interface for the programs from the GP package and has some interesting functions of its own. The main scope of the program is the manipulation and visualisation of DNA / RNA / protein sequences.
The GP package contains many command-line utilities which fulfill a whole bunch of tasks (from DNA sequence searches to restriction analysis and determining the melting temperature of oligonucleotides). While those programs are convenient to use in batch processing and CGI scripts (which was the purpose of those programs), they lack a nice GUI.
Arka remembers the options for the GP programs and knows what both the programs and the options do. Besides, it has some gadgets of its own. It requires GTK+, but doesn't need GNOME. Also, it is small and quick: look, I write and use my programs on an old 486 laptop. It should run like hot butter on your computer. Unless, of course, it is a 386 SX. The name comes from the "UAG" stop codon, which is traditionally called "arka codon".
Available Downloads
Source + i386 binaries, tar and gzipped: arka-0.11.tgz
RPM, i386 binaries: arka-0.11-1.i386.rpm
RPM, source: arka-0.11-1.src.rpm
Bioperl
Description
The Bioperl server provides an online resource for modules, scripts, and web links for developers of Perl-based software for life science research. They can also provide web, FTP and CVS space for individuals and organizations wishing to distribute or otherwise make freely available standalone scripts & code.
Tutorial
BioPerl 1.4 Tutorial
Pasteur Institute Bioperl Course
BioPerl 1.4 Module Documentation
Available Downloads
Core: bioperl-1.4 (gzip, bz2, zip, ppm.gzip)
Run: bioperl-run-1.4 (gzip, bz2)
Ext: bioperl-ext-1.4 (gzip, bz2)
DB: bioperl-db-1.4 (gzip, zip)
Microarray: bioperl-microarray-0.1 (gzip, bz2, zip)
GUI: bioperl-gui-0.7 (gzip)
Chemtool
Description
Chemtool is a small program for drawing chemical structures on Linux and Unix systems using the GTK toolkit under X11. A short and possibly outdated description of the available functions is available here.
Chemtool relies on transfig by Brian Smith for postscript printing and exporting files in PicTeX and EPS formats. Its companion program, XFig, is recommended for enhancing the output of chemtool, and for creation of 2D diagrams and schematics in general. Both are included with most distributions of Linux, and are available through a number of websites including www.xfig.org.
If you want to import chemtool drawings into word processing programs other than LaTeX, you will probably want to add a preview bitmap to them, as neither StarOffice/OpenOffice nor that software from Redmond seem to be able to display postscript inserts on screen without them. For this purpose, using either ps2epsi, which comes with ghostscript, or epstool, a part of gsview, is recommended. Since chemtool-1.6, this option is supported directly (through the equivalent function offered by recent versions of transfig).
Chemtool was originally written by Thomas Volk, then a student of chemistry and biology at the University of Ulm, Germany. His version, which was described in an article in the German periodical LinuxMagazin, was using plain X11. A more recent review of chemtool appeared in Nachr. Chem. Tech. Lab. 49 (2001) 1310-1314.
Available Downloads
chemtool-1.6.8, source code in tar.gz format
chemtool-1.6.8-1.src.rpm, source code in rpm format
chemtool-1.6.8-1.i586.rpm, SuSE 9.3 rpm package
ClustalW
Description
Clustal W is a general purpose multiple sequence alignment program for DNA or proteins. It produces biologically meaningful multiple sequence alignments of divergent sequences. It calculates the best match for the selected sequences, and lines them up so that the identities, similarities and differences can be seen. Evolutionary relationships can be viewed as cladograms or phylograms.
Multiple alignments of protein sequences are important tools in studying sequences. The basic information they provide is identification of conserved sequence regions. This is very useful in designing experiments to test and modify the function of specific proteins, in predicting the function and structure of proteins, and in identifying new members of protein families.
Sequences can be aligned across their entire length (global alignment) or only in certain regions (local alignment). This is true for pairwise and multiple alignments. Global alignments need to use gaps (representing insertions/deletions) while local alignments can avoid them, aligning regions between gaps. ClustalW is a fully automatic program for global multiple alignment of DNA and protein sequences. The alignment is progressive and considers the sequence redundancy. Trees can also be calculated from multiple alignments. The program has some adjustable parameters with reasonable defaults. EBI provides a version of Clustal W that can be executed over the Internet on their computers. In addition, you can download a copy of the basic software to run on your own computer. Versions exist for UNIX, DOS, Windows XP (command line mode only) and Mac OSX.
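The difference between global and local alignment described above can be illustrated with a small dynamic-programming sketch. This is not ClustalW's algorithm (ClustalW builds a progressive multiple alignment); it only scores a single pair of sequences, and the function name alignment_score and the match, mismatch and gap values are assumptions chosen for illustration.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Best pairwise alignment score by dynamic programming.

    local=False: global alignment over the full length of both sequences.
    local=True:  best-scoring local region (scores are floored at zero).
    The scoring values are illustrative assumptions, not ClustalW defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # A global alignment must pay for gaps at the sequence ends.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)      # a local alignment may restart anywhere
                best = max(best, cell)   # and may end anywhere
            score[i][j] = cell
    return best if local else score[-1][-1]


if __name__ == "__main__":
    print(alignment_score("ACGT", "AGT"))                         # global
    print(alignment_score("TTTACGTTT", "GGACGGG", local=True))    # local
```

With these scores, the global alignment of ACGT and AGT has to introduce one gap, while the local alignment of the second pair simply picks out the shared ACG region and ignores the flanks.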
Tutorials
Nucleotides
Proteins
Available Downloads
DOS
XP
MAC-OSX
UNIX
COALESCE
Description
Metropolis-Hastings Markov Chain Monte Carlo genealogy sampler.
For use in cases without recombination, selection or migration and with constant population size.
This program takes as input a set of aligned DNA or RNA sequences from different individuals in a population and uses them to make a maximum likelihood estimate of the parameter "theta," using the method described in Kuhner et al. (1995). Theta is defined as 4 times the effective population size times the mutation rate in a diploid organism, or 2 times the effective population size times the mutation rate in a haploid. (Note that this is mutation rate per site, not per locus.)
COALESCE assumes that the sampled population is of constant size, and that the loci sampled are not affected by selection or recombination. If these assumptions are violated the results may be erroneous. The algorithm begins with a genealogy for the sequences and sequentially makes modifications in it, accepting or rejecting the modifications based on the sequence data, and sampling the current genealogy at intervals. From the sampled genealogies it constructs a likelihood curve and maximum likelihood estimate for theta. The aim is to preferentially sample those genealogies which can contribute substantial information to the estimate of theta, avoiding the myriads of possible but unlikely and thus uninformative genealogies. If more than one locus is analyzed, likelihoods from all loci are summed to make an overall likelihood curve and estimate of theta. The basic unit of progress of the program is a "step"--one proposed change to the genealogy, which may be accepted or rejected. A continuous series of steps, all using the same parameter values, is a "chain".
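To make the definition of theta and the "step"/"chain" vocabulary concrete, here is a minimal sketch. It is not COALESCE's genealogy sampler: the function names theta and metropolis_hastings are hypothetical, and the chain samples a toy one-dimensional target (a standard normal) instead of genealogies. It only shows the accept-or-reject step that a Metropolis-Hastings chain repeats.

```python
import math
import random


def theta(effective_size, mutation_rate_per_site, diploid=True):
    """theta = 4*N*mu for a diploid organism, 2*N*mu for a haploid one."""
    factor = 4 if diploid else 2
    return factor * effective_size * mutation_rate_per_site


def metropolis_hastings(log_target, start, steps, step_size=1.0, seed=0):
    """Run one chain of `steps` proposals against a toy 1-D target.

    Each loop iteration is one "step": a proposed change that is either
    accepted or rejected. The symmetric Gaussian proposal is an assumption
    made for this illustration.
    """
    rng = random.Random(seed)
    current = start
    current_lp = log_target(current)
    samples = []
    for _ in range(steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_target(proposal)
        # Symmetric proposal: accept with probability min(1, target ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)  # a rejected step repeats the current state
    return samples


if __name__ == "__main__":
    print(theta(10_000, 1e-8))                      # diploid example
    chain = metropolis_hastings(lambda x: -0.5 * x * x, start=0.0, steps=5000)
    print(sum(chain) / len(chain))                  # should be near 0
```

In COALESCE itself the state is a genealogy and the target depends on the sequence data; the loop structure above is the only part intended to carry over.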
Documentation
User's guide
Available Downloads
Unix source code:
UNIX tar.gz encoded [156kb]
UNIX tar.Z encoded [211kb]
PowerMac binary:
HQX encoded CompactPro archive [256kb; binary + documentation]
fastDNAml
Description
fastDNAml is a program for estimating maximum likelihood phylogenetic trees from nucleotide sequences. Much of this program is based on version 3.3 of Joseph Felsenstein's DNAML program.
Reference
G. J. Olsen, H. Matsuda, R. Hagstrom, and R. Overbeek. 1994. fastDNAml: A tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comput. Appl. Biosci. 10: 41-48
Available Downloads
fastDNAml -- The current release of the program.
mpi_fastDNAml and pvm_fastDNAml -- Parallel versions based upon MPI or PVM are available from Indiana University.
fastDNAml_p4 -- A version of the program using the p4 (Portable Programs for Parallel Processing) library. This version is now quite old and is not being supported (see the link above for newer MPI and PVM versions from Indiana University).
FLUCTUATE
Description
FLUCTUATE fits a model of a single population that has been growing (or shrinking) according to an exponential growth law. It estimates 4Nu and g, where N is the effective population size, u is the neutral mutation rate per site, and g is the growth rate of the population. If you have a PowerMac, you will want to fetch the PowerMac binary, or if you have an Intel processor with Windows 95/98/NT/2000/XP you want the exe file.
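As a rough illustration of what an exponential growth law implies for 4Nu, the sketch below evaluates the parameter at a time t before the present. The function name theta_at, the backward-in-time convention (positive g means the population was smaller in the past), and the time scale are assumptions for this example; check the FLUCTUATE documentation for the exact parameterisation it uses.

```python
import math


def theta_at(theta_now, g, t):
    """4Nu at time t before the present under exponential growth.

    Assumes theta(t) = theta_now * exp(-g * t), with t measured backwards
    from the present. g > 0 means growth, g < 0 means shrinkage, and
    g = 0 means constant size.
    """
    return theta_now * math.exp(-g * t)


if __name__ == "__main__":
    for g in (-50.0, 0.0, 50.0):
        print(g, theta_at(0.01, g, t=0.02))
```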
Available Downloads
LINUX tar.gz encoded source and documentation.
LINUX tar.gz encoded binary and documentation.
Self-extracting HQX encoded CompactPro archive [PowerMac binary and documentation]
Self extracting Windows archive [Windows 95/98/NT/2000/XP binary and documentation]
LaTeX version of paper
PostScript version of paper
HMMER
Description
Profile hidden Markov models (profile HMMs) can be used to do sensitive database searching using statistical descriptions of a sequence family's consensus. HMMER is a freely distributable implementation of profile HMM software for protein sequence analysis.
Documentation
User's guide
Available Downloads
All distributions below come with full source code, the User's Guide (PDF format), UNIX man pages, and other documentation. Once you download, uncompress (gunzip), and un-tar (tar xf), see the file INSTALL for quick installation instructions.
HMMER should build cleanly on any UNIX platform, including Mac OS/X. It should also compile on Microsoft Windows platforms, but you would have to work around the GNU configure script and UNIX makefiles. Porting to other non UNIX operating systems such as VAX/VMS should not be difficult. The code is standard ANSI/POSIX C.
Source code
AMD Opteron/Linux
Apple Macintosh PowerPC OS/X
Compaq Alpha Tru64
Compaq Alpha Linux
Hewlett/Packard IA64 (Itanium2), Linux
Hewlett/Packard IA64 (Itanium2), HP/UX
IBM Power4, Linux
IBM Power4, AIX
Intel FreeBSD
Intel GNU/Linux
Intel GNU/Linux as RPM
Intel OpenBSD
Intel Solaris
Silicon Graphics IA64 (Itanium2), Linux
Silicon Graphics MIPS IRIX
Sun Sparc Solaris (not currently available; use source code above)
GeneSplicer
Description
A fast, flexible system for detecting splice sites in the genomic DNA of various eukaryotes. The system has been trained and tested successfully on Plasmodium falciparum (malaria), Arabidopsis thaliana, human, Drosophila, and rice. Training data sets for human and Arabidopsis thaliana are included. Use the GeneSplicer Web Interface to run GeneSplicer directly, or see below for instructions on downloading the complete system including source code. GeneSplicer is released as source code and was tested on Linux RedHat 6.x+, Sun Solaris, and Alpha OSF1, but should work on any Unix system.
Available Downloads
GeneSplicer system
GP
Description
GP is a set of small utilities written in ANSI C to manipulate DNA sequences in a Unix fashion, fit for combining within shell and CGI scripts. I wrote these utilities for myself and found them very useful for my work; they are fast and quite reliable, and playing with large numbers of sequences is much more convenient with a command-line interface than with standard GUI tools. Feel free to mail me bug reports and suggestions. The programs are supposed to compile fine under any ANSI C compiler, but I never tried any platform other than Unix / Linux. You will find more details online on the GP man pages. And here is an example of a site using GP programs in CGI scripts to do promoter searches on-the-fly.
Available Downloads
Source + i386 binaries, tar and gzipped: gp-0.26.tgz
RPM, i386 binaries: gp-0.26-1.i386.rpm
RPM, source: gp-0.26-1.src.rpm
Lucy
Description
A Sequence Cleanup Program. Lucy is a utility that prepares raw DNA sequence fragments for sequence assembly, possibly using the TIGR Assembler. The cleanup process includes quality assessment, confidence reassurance, vector trimming and vector removal. The primary advantage of Lucy over other similar utilities is that it is a fully integrated, stand alone program.
Reference
H. H. Chou and M. H. Holmes. 2001. DNA sequence quality trimming and vector removal. Bioinformatics. 17(12): 1093-1104.
Documentation
Program Requirements
Available Downloads
Lucy [Unix version]
Lucy2 [Hui-Hsien Chou's Windows version]
NUT
Description
NUT is an open-source free nutrition software that records what you eat and analyzes your meals for nutrient levels in terms of the "Daily Value" or DV which is the standard for food labeling in the US. The program uses the free food composition database from the USDA. By experimenting, you can find the optimal level of the various nutrients and how to implement this with foods available to you. NUT can help reconstruct the lost instruction manual to your care and feeding because, when the authorities and crackpots disagree on the proper human diet, you can design an experiment using the food composition tables to discover the truth!
Features of NUT include:
7146 foods and 136 nutrients--the complete, latest USDA database
Foods easy to find and add to daily meals
Configurable for 1-19 meals per day and any dietary plan--including low carb, zone, low fat
Comprehensive meal analysis for any number of consecutive meals
Presents both easy-to-read percentage summaries and in-depth nutrient analysis, including Omega-3 and Omega-6 essential fatty acids
Defaults to ounces or grams based on user input
Suggests foods based on current diet
Can easily create additional databases for other family members
Auto-transfer of successful dietary strategies from analysis screen to configuration settings
Allows recording of recipes and customary meals for fast data entry
Guesses recipes of packaged foods
Creates graphs of nutrient intake showing daily and monthly trends
Sorts foods richest in each of the 136 nutrients
Reveals which foods contribute most to user's nutrition
Runs on Linux, Un*x, Windows (DOS); allows dual-boot systems to share the same data; and has no dependencies on other programs
The price is right--it's free! And you can read and modify the source code.
Documentation
Man page
Installation instructions
Opinions on how to improve your nutrition
A frequently-asked question
How To Use NUT
Find the Optimum: It's Easy as 1, 2, 3!
Find the Optimum: How NUT's Default Polyunsaturated Fat Reference Values Were Derived
Find the Optimum: Which Fats?
Find the Optimum: A Word about Insulin Resistance (Which Carbohydrates?)
Find the Optimum: Notes on Vitamins and Minerals
Read about Feline Nutrition and Consider Its Resonance with Human Nutrition
Available Downloads
latest source archive compressed with gzip: nut-11.1.tar.gz
latest source archive compressed with bzip2: nut-11.1.tar.bz2
latest source archive compressed with zip: nut-11.1.zip
PdbAlign
Description
Given a GCG multiple sequence alignment file (a GCG MSF file) which includes a sequence of known structure, the program pdbalign maps the sequence variability onto the known structure. The central premise is, of course, that for a closely related family of proteins (sequence ID > 40%) the 3-D structures will not be significantly different.
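The idea of relating alignment variability to the residues of one reference sequence can be sketched in Python as follows. The variability measure used here (the number of distinct residue types per column) is an assumption made for illustration; the description above does not say which measure pdbalign uses, and parsing of MSF and PDB files is left out.

```python
from typing import Dict, List


def column_variability(alignment: Dict[str, str], reference: str) -> List[int]:
    """Count distinct residues per alignment column, indexed by reference residue.

    Columns where the reference has a gap ('-' or '.') are skipped, so the
    result has one entry per residue of the reference sequence.
    """
    ref_row = alignment[reference]
    rows = list(alignment.values())
    if any(len(r) != len(ref_row) for r in rows):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for col, ref_res in enumerate(ref_row):
        if ref_res in "-.":
            continue
        residues = {r[col].upper() for r in rows if r[col] not in "-."}
        scores.append(len(residues))
    return scores


# Toy alignment with hypothetical sequence names.
msa = {
    "known_structure": "MK-LV",
    "homolog_a": "MRALV",
    "homolog_b": "MK-II",
}
variability = column_variability(msa, "known_structure")
```

Each entry of the result corresponds to one residue of the reference sequence, so it could be written out per residue of the known structure for display.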
Reference
Roger A. Sayle, Mansoor A. S. Saqi, M. Weir, Andrew Lyall. 1995. PdbAlign, PdbDist and DistAlign: tools to aid in relating sequence variability to structure. Computer Applications in the Biosciences. 11(5): 571-573.
Documentation
README
Available Downloads
UNIX
PHYLIP
Description
PHYLIP (the PHYLogeny Inference Package) is a package of programs for inferring phylogenies (evolutionary trees). It is available free over the Internet, and written to work on as many different kinds of computer systems as possible. The source code is distributed (in C), and executables are also distributed. In particular, already-compiled executables are available for Windows (95/98/NT/2000/me/xp), MacOS 8 and 9, MacOS X, and Linux systems. Complete documentation is available in the documentation files that come with the package.
Methods that are available in the package include parsimony, distance matrix, and likelihood methods, including bootstrapping and consensus trees. Data types that can be handled include molecular sequences, gene frequencies, restriction sites and fragments, distance matrices, and discrete characters.
The programs are controlled through a menu, which asks the users which options they want to set, and allows them to start the computation. The data are read into the program from a text file, which the user can prepare using any word processor or text editor (but it is important that this text file not be in the special format of that word processor -- it should instead be in "flat ASCII" or "Text Only" format). Some sequence analysis programs such as the ClustalW alignment program can write data files in the PHYLIP format. Most of the programs look for the data in a file called "infile" -- if they do not find this file they then ask the user to type in the file name of the data file.
Output is written onto special files with names like "outfile" and "outtree". Trees written onto "outtree" are in the Newick format, an informal standard agreed to in 1986 by authors of a number of major phylogeny packages. At this stage PHYLIP does not have a mouse-and-windows interface.
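As an illustration of the plain-text input described above, the Python sketch below formats a few aligned sequences in the sequential PHYLIP layout: a first line with the number of species and the number of characters, then one line per species with the name padded to 10 characters followed by the sequence. The function name to_phylip and the example sequences are made up for this sketch; the PHYLIP documentation remains the authority on the format.

```python
from typing import Dict


def to_phylip(seqs: Dict[str, str]) -> str:
    """Format aligned sequences as sequential PHYLIP input text."""
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    lines = [f"{len(seqs):5d} {lengths.pop():5d}"]
    for name, seq in seqs.items():
        if len(name) > 10:
            raise ValueError(f"name longer than 10 characters: {name}")
        # Names occupy exactly 10 columns, padded with blanks on the right.
        lines.append(f"{name:<10}{seq}")
    return "\n".join(lines) + "\n"


# Made-up sequences, for illustration only.
text = to_phylip({"Human": "ACGTACGT", "Chimp": "ACGTACGA", "Mouse": "ACGAACGA"})
print(text, end="")
```

Saving this text as a plain file named "infile" in the working directory would let a program pick it up without prompting for a file name.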
Documentation
Overview
PHYLIP programs and documentation
Installation instructions
Available Downloads
Linux or Unix gzip'ed tar archive of C sources and documentation
Windows Documentation and C source code
Windows95/98/NT/2000/me/xp executables, part 1
Windows95/98/NT/2000/me/xp executables, part 2
Mac OS X Documentation, source code and executables
Mac OS 8 or 9 Single Stuffit Documentation and C source code
Mac OS 8 or 9 Multiple Stuffit Documentation and C source code
Macintosh Mac OS 8 or 9 executables, part 1
Macintosh Mac OS 8 or 9 executables, part 2
Macintosh Mac OS 8 or 9 executables, part 3
Please register after downloading
ProFit
Description
ProFit (pronounced Pro-Fit, not profit!) is designed to be the ultimate program for performing least squares fits of two protein structures. It performs a very simple and basic function, but allows as much flexibility as possible in performing this procedure. Thus one can specify subsets of atoms to be considered, specify zones to be fitted by number, sequence, or by sequence alignment. ProFit does not try to address the question of sorting out equivalent atoms for you beyond doing a sequence alignment. There are other programs such as SSAP and GAFIT which address that problem. You must specify which residues and atoms you consider to be equivalent although the program supports internal sequence alignment to set the zones automatically.
Documentation
Full ProFit documentation
Frequently asked questions
Available Downloads
ProFit is freely available for use by not-for-profit organisations and for commercial organisations (providing they inform the author that they are using it). It may not be distributed without the author's permission, but must be obtained from this site. It is supplied as a gzipped tar file of source code and as a Linux binary.
Bernhard Rupp has kindly provided a ZIP file of ProFit compiled for Windows (Win32). This is only available for Version 2.3 of ProFit.
Registration and download
RasMol
Description
RasMol is a molecular graphics program intended for the visualisation of proteins, nucleic acids and small molecules. The program is aimed at display, teaching and generation of publication quality images. The program has been developed at the University of Edinburgh's Biocomputing Research Unit and the Biomolecular Structures Group at Glaxo Research and Development, Greenford, UK.
RasMol reads in molecular co-ordinate files in a number of formats and interactively displays the molecule on the screen in a variety of colour schemes and representations. Currently supported input file formats include Brookhaven Protein Databank (PDB), Tripos' Alchemy and Sybyl Mol2 formats, Molecular Design Limited's (MDL) Mol file format, Minnesota Supercomputer Center's (MSC) XMol XYZ format, CHARMm format, MOPAC format, CIF format and mmCIF format files. If connectivity information and/or secondary structure information is not contained in the file this is calculated automatically.
The loaded molecule may be shown as wireframe, cylinder (Dreiding) stick bonds, alpha-carbon trace, spacefilling (CPK) spheres, macromolecular ribbons (either smooth shaded solid ribbons or parallel strands), hydrogen bonding and dot surface. Atoms may also be labelled with arbitrary text strings. Alternate conformers and multiple NMR models may be specially coloured and identified in atom labels. Different parts of the molecule may be displayed and coloured independently of the rest of the molecule or shown in different representations simultaneously. The space filling spheres can even be shadowed.
The displayed molecule may be rotated, translated, zoomed, z-clipped (slabbed) interactively using either the mouse, the scroll bars, the command line or an attached dials box. RasMol can read a prepared list of commands from a `script' file (or via interprocess communication) to allow a given image or viewpoint to be restored quickly. RasMol can also create a script file containing the commands required to regenerate the current image. Finally the rendered image may be written out in a variety of formats including both raster and vector PostScript, GIF, PPM, BMP, PICT, Sun rasterfile or as a MolScript input script or Kinemage.
RasMol will run on a wide range of architectures and systems including SGI, sun4, sun3, sun386i, DEC, HP and E&S workstations, IBM RS/6000, Cray, Sequent, DEC Alpha (OSF/1, OpenVMS and Windows NT), IBM PC (under Microsoft Windows, Windows NT, OS/2, Linux, BSD386 and *BSD), Apple Macintosh (System 7.0 or later), PowerMac and VAX VMS (under DEC Windows). UNIX and VMS versions require an 8bit, 24bit or 32bit X Windows frame buffer (X11R4 or later). The X Windows version of RasMol provides optional support for a hardware dials box and accelerated shared memory rendering (via the XInput and MIT-SHM extensions) if available.
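To show what reading a PDB co-ordinate file involves, here is a short Python sketch that pulls the alpha-carbon positions (the basis of an alpha-carbon trace) out of fixed-column ATOM records. It is not RasMol code; the function name alpha_carbons is hypothetical and the coordinates in the example are made up.

```python
from typing import List, Tuple

# (residue name, residue number, x, y, z)
Atom = Tuple[str, int, float, float, float]


def alpha_carbons(pdb_lines: List[str]) -> List[Atom]:
    """Extract alpha-carbon positions from fixed-column PDB ATOM records."""
    trace = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            trace.append((
                line[17:20].strip(),  # residue name, columns 18-20
                int(line[22:26]),     # residue sequence number, columns 23-26
                float(line[30:38]),   # x coordinate, columns 31-38
                float(line[38:46]),   # y coordinate, columns 39-46
                float(line[46:54]),   # z coordinate, columns 47-54
            ))
    return trace


# Made-up records laid out in the PDB fixed-column format.
sample = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842",
    "ATOM     10  CA  LYS A   2      24.000  27.500   5.125",
]
ca_trace = alpha_carbons(sample)
```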
Available Downloads
DEC/Compaq/HP
HP
Linux (RedHat 7, i386)
Mac
MS Windows
RS/6000
SGI
SeWeR
Description
SeWeR is an acronym that stands for SEquence analysis using WEb Resources. It offers a single door to all the common web-based services for sequence analysis. And it sews: it sews all these services together. For a refined mind, SeWeR is an integrated portal to common web-based services in bioinformatics. SeWeR is cross-browser DHTML. It is written entirely in JavaScript 1.2, so it will run only in Netscape 4.0 or higher and Internet Explorer 4.0 or higher.
Reference
M. K. Basu. 2001. SeWeR: a customizable and integrated dynamic HTML interface to bioinformatics services. Bioinformatics. 17(6): 577-578.
Available Downloads
SeWeR is feather-light! The whole package is just around 300K. You can even run it from a floppy. The zip archive is available at two locations:
IUBIO
http://www.bioinformatics.org/sewer/sewer.zip
STRIDE
Description
STRIDE is a program to recognize secondary structural elements in proteins from their atomic coordinates. It performs the same task as DSSP by Kabsch and Sander but utilizes both hydrogen bond energy and mainchain dihedral angles rather than hydrogen bonds alone. It relies on database-derived recognition parameters with the crystallographers' secondary structure definitions as a standard-of-truth. Please see Frishman and Argos for detailed description of the algorithm.
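The mainchain dihedral angles mentioned above are each defined by four consecutive atom positions. The Python sketch below computes such an angle with the usual atan2 formulation; it only illustrates the geometric quantity and is not STRIDE's code. The function names are chosen for this example.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3) -> float:
    """Return the dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Four points forming a right-angle twist about the z axis.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

For a protein backbone, phi would use the atoms C(i-1), N(i), CA(i), C(i) and psi the atoms N(i), CA(i), C(i), N(i+1).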
Reference
D. Frishman & P. Argos. 1995. Knowledge-based secondary structure assignment. Proteins. 23: 566-579.
Available Downloads
Executables of STRIDE for several UNIX platforms, VAX/VMS, OpenVMS, DOS and Mac together with documentation and source code are available by anonymous FTP from ftp://ftp.ebi.ac.uk/ (directories /pub/software/unix/stride, /pub/software/dos/stride, /pub/software/vms/stride, /pub/software/mac/stride). Data files with STRIDE secondary structure assignments for the current release of the PDB databank are in the directory /pub/databases/stride of the same site. Atomic coordinate sets can be submitted for secondary structure assignment through electronic mail to stride@embl-heidelberg.de. A mail message containing HELP in the first line will be answered with appropriate instructions. See also WWW page http://www.embl-heidelberg.de/stride/stride_info.html.
XYLEM
Description
XYLEM(1) is a package of tools designed to exploit the Unix environment to enable the user to identify, extract and manipulate data from major databases such as GenBank, EMBL and PIR. SPLITDB splits database files into annotation, sequence, and index files for more efficient searching. Fundamental to the power of these programs is the ability to perform operations on groups of sequences, represented by names or accession numbers which function as virtual database subsets. Keyword searches can be performed by FINDKEY. Hits can be retrieved using FETCH. The most powerful program is FEATURES, which uses the GETOB parser to evaluate GenBank/EMBL/DDBJ Features Table expressions, thereby extracting features (e.g. mRNA, sig_peptide, intron) from lists of entries. Additional programs perform operations such as translation or randomization of datasets, and formatting of multiply-aligned sequences for publication. XYLEM is compatible with the Fristensky Sequence Analysis Package, and the Pearson FASTA programs(2), and can be used from within the Genetic Data Environment (GDE) of Steven Smith(3).
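To make the idea of evaluating a Features Table expression concrete, here is a small Python sketch that resolves a location such as join(3..5,9..11) against a sequence. It is not the GETOB parser and covers only simple a..b ranges, join(...) and complement(...); the function names are hypothetical. Coordinates are 1-based and inclusive, as in the feature table.

```python
import re
from typing import List

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def _split_top_level(text: str) -> List[str]:
    """Split on commas that are not nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            current.append(ch)
    parts.append("".join(current))
    return parts


def extract_feature(sequence: str, location: str) -> str:
    """Return the subsequence described by a simple feature location."""
    location = location.strip()
    if location.startswith("complement(") and location.endswith(")"):
        inner = extract_feature(sequence, location[len("complement("):-1])
        return inner.translate(_COMPLEMENT)[::-1]
    if location.startswith("join(") and location.endswith(")"):
        parts = _split_top_level(location[len("join("):-1])
        return "".join(extract_feature(sequence, p) for p in parts)
    match = re.fullmatch(r"(\d+)\.\.(\d+)", location)
    if not match:
        raise ValueError(f"unsupported location: {location}")
    start, end = int(match.group(1)), int(match.group(2))
    return sequence[start - 1:end]


# Made-up sequence, for illustration only.
dna = "AAATGCCCGTTAGG"
spliced = extract_feature(dna, "join(3..5,9..11)")
```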
Reference
B. Fristensky. 1993. Feature expressions: creating and manipulating sequence datasets. NAR. 21: 5997-6003.
W. R. Pearson and D. J. Lipman. 1988. Improved tools for biological sequence comparison. PNAS. 85: 2444-2448.
S. W. Smith, R. Overbeek, C. R. Woese, W. Gilbert and P. M. Gillevet. 1994. The genetic data environment: an expandable GUI for multiple sequence analysis. Computer Applications in the Biosciences. 10: 671-675.
Available Downloads
Source code and documentation (xylem.1.8.7.tar.Z, 418 k)
Solaris/Sparc binaries (xylem.1.8.7.solaris-bin.tar.Z, 192k)
Linux/Intel binaries (xylem.1.8.7.linux-bin.tar.Z, 179k)
195 Free Online Programming Books
How to Be a Programmer http://samizdat.mines.edu/howto/HowToBeAProgrammer.html
How to Design Programs http://www.htdp.org/2002-09-22/Book/
Practical Theory of Programming http://www.cs.toronto.edu/~hehner/aPToP/
Software Engineering for Internet Applications http://philip.greenspun.com/seia/
Structure and Interpretation of Computer Programs http://mitpress.mit.edu/SICP/
More programming books http://2020ok.com/3839.htm
The Programmers Stone http://www.reciprocality.org/Reciprocality/r0/
Subversion Version Control: Using the Subversion Version Control System in Development Projects http://www.phptr.com/promotions/promotion.asp?promo=1484&redir=1&rl=1
Ada
Ada 95 Rationale http://www.adaic.org/standards/95rat/RATht…5-contents.html
Ada 95 Reference Manual http://www.adahome.com/rm95/
Changes to Ada 1987 - 1995 http://www.oopweb.com/Ada/Documents/Change…lumeFrames.html
Ada 95: The Lovelace Tutorial http://www.adahome.com/Tutorials/Lovelace/master.htm
The Big Online Book of Linux Ada Programming http://www.pegasoft.ca/resources/boblap/book.html
Algorithms
Algorithms and Complexity http://www.cis.upenn.edu/~wilf/AlgComp.html
Programming Algorithms http://2020ok.com/3870.htm
Information Theory, Inference, and Learning Algorithms http://www.inference.phy.cam.ac.uk/mackay/itprnn/book.html
Assembly
Assembly Language Tutorial http://www.oopweb.com/Assembly/Documents/a…lumeFrames.html
Programming From the Ground Up http://download.savannah.gnu.org/releases/pgubook/
Assembly Language Programming http://2020ok.com/3954.htm
Ralph Brown's Interrupt List http://www.oopweb.com/Assembly/Documents/I…lumeFrames.html
The Art of Assembly Language Programming http://www.oopweb.com/Assembly/Documents/A…lumeFrames.html
The Assembly Language Database http://www.oopweb.com/Assembly/Download/NortonGuide.zip
Win32 Programming for x86 Assembly Language Programmers http://www.oopweb.com/Assembly/Documents/Win32ASM/VolumeFrames.html
C
A Tutorial on Pointers and Arrays in C http://www.oopweb.com/CPP/Documents/CPoint…lumeFrames.html
C Programming http://www.oopweb.com/CPP/Documents/CProgr…lumeFrames.html
Object Orientated Programming in ANSI-C http://www.planetpdf.com/developer/article…?contentid=6635
The C Book http://publications.gbdirect.co.uk/c_book/
Writing Bug-Free C Code http://www.duckware.com/bugfreec/index.html
C - Elements of Style http://www.computer-books.us/c_3.php
Learning GNU C http://www.linuxtopia.org/online_books/programming_books/learning_gnu_c/index.html
C++
An Overview Of The C++ Programming Language http://www.oopweb.com/CPP/Download/crc.zip
C++ Annotations http://www.oopweb.com/CPP/Documents/CPPAnn…lumeFrames.html
C++ Annotations http://www.oopweb.com/CPP/Download/cplusplus.zip
C++ Coding Standard http://www.oopweb.com/CPP/Documents/CodeSt…lumeFrames.html
C & C++ http://2020ok.com/3956.htm
C++ Course http://www.oopweb.com/CPP/Download/CPPCourse.zip
C++ How To http://www.oopweb.com/CPP/Documents/CPPHOW…lumeFrames.html
C++ In Action http://www.relisoft.com/book/index.htm
C++: A Dialog http://www.steveheller.com/cppad/cppad.htm
How To Think Like A Computer Scientist with C++ http://www.oopweb.com/CPP/Documents/ThinkC…lumeFrames.html
Introduction To OOP Using C++ http://www.oopweb.com/CPP/Documents/Intro2…lumeFrames.html
Introduction To OOP Using C++ http://www.oopweb.com/CPP/Download/Intro2OOP.zip
Objects First http://www.oopweb.com/CPP/Documents/Object…lumeFrames.html
Optimizing C++ http://www.steveheller.com/opt/
STL Guide http://www.oopweb.com/CPP/Documents/STLGui…lumeFrames.html
STL Guide http://www.oopweb.com/CPP/Download/stl.zip
The Function Pointer Tutorials http://www.oopweb.com/CPP/Documents/Functi…lumeFrames.html
The Standard Template Library Tutorial http://www.oopweb.com/CPP/Documents/STL/VolumeFrames.html
Thinking in C++ http://www.planetpdf.com/developer/article…?ContentID=6634
Thinking in C++, Second Edition (Volumes 1 & 2) http://mindview.net/Books/TICPP/ThinkingInCPP2e.html
An Introduction to C++ Programming http://www.computer-books.us/cpp_1.php
Programming in C++ - Rules and Recommendations http://www.computer-books.us/cpp_6.php
A Beginners C++ Book http://www.uow.edu.au/~nabg/ABC/ABC.html
C++ GUI Programming with Qt 3 http://www.phptr.com/promotion/1484?redir=1
Cross-Platform GUI Programming with wxWidgets http://www.phptr.com/promotion/1484?redir=1
C#
C# in Detail http://www.computer-books.us/csharp_0005.php
C# - The Basics http://www.computer-books.us/csharp_0004.php
C# Language Specification http://www.computer-books.us/csharp_1.php
Data Structures and Algorithms with Object-Oriented Design Patterns in C# http://www.computer-books.us/csharp_2.php
C# Programming http://2020ok.com/697342.htm
Dissecting a C# Application - Inside SharpDevelop http://www.computer-books.us/csharp_3.php
C# tutorial (2 .pdf's) http://www.ssw.uni-linz.ac.at/Teaching/Lectures/CSharp/Tutorial/
CGI
CGI Programming on the World Wide Web http://www.oreilly.com/openbook/cgi/
CGI Programming http://2020ok.com/4025.htm
COBOL
zingCOBOL - A Beginners Guide to COBOL Programming http://www.computer-books.us/cobol_0006.php
Teach Yourself COBOL in 21 Days http://www.computer-books.us/cobol_0005.php
WebSphere Studio COBOL for Windows - Language Reference http://www.computer-books.us/cobol_1.php
COBOL Programming Course http://www.computer-books.us/cobol_2.php
COBOL Programming http://2020ok.com/3969.htm
WebSphere Studio COBOL for Windows - Programming Guide http://www.computer-books.us/cobol_3.php
HP COBOL II/XL Reference Manual http://www.computer-books.us/cobol_4.php
Databases
MySQL Reference Manual http://dev.mysql.com/doc/
Database http://2020ok.com/549646.htm
Oracle 10g Database Book and Documentation Library http://wtcis.wtamu.edu/oracle/
Delphi/Pascal
Delphi 2005 Tutorial for Beginners http://www.xcalibur.co.uk/training/Delphi2005/index.php
Delphi Training http://www.xcalibur.co.uk/training/delphi/oldindex.html
Essential Delphi http://marcocantu.com/edelphi/default.htm
Essential Pascal http://marcocantu.com/epascal/default.htm
Delphi Language Guide - Delphi For The Microsoft .NET Framework http://www.computer-books.us/delphi_2.php
Delphi Database Application Developers Guide http://www.computer-books.us/delphi_1.php
Fortran
Numerical Recipes with Fortran 77 http://www.library.cornell.edu/nr/cbookfpdf.html
Numerical Recipes with Fortran 90 http://www.library.cornell.edu/nr/cbookf90pdf.html
Professional Programmer's Guide to Fortran 77 http://www.computer-books.us/fortran_3.php
User Notes on Fortran Programming (UNFP) http://www.ibiblio.org/pub/languages/fortran/
HTML
HTML 4.01 Specifications http://www.oopweb.com/HTML/Documents/HTML4/VolumeFrames.html
Web Development http://2020ok.com/3510.htm
Writing HTML http://www.oopweb.com/HTML/Documents/Writing%20HTML/VolumeFrames.html
Java
How to Think Like a Computer Scientist with Java http://www.oopweb.com/Java/Documents/Think…lumeFrames.html
Introduction to Programming Using Java http://www.oopweb.com/Java/Documents/Intro…lumeFrames.html
Introduction To Programming Using Java http://www.linuxtopia.org/online_books/pro…ming/index.html
Java Programming Tutorial: Introduction to Computer Science http://www.oopweb.com/Java/Documents/JavaN…lumeFrames.html
Thinking in Java, 3rd Edition http://www.mindview.net/Books/TIJ/
Thinking in Enterprise Java http://www.ibiblio.org/pub/docs/books/eckel/
More Java Books http://kickjava.com/freeBooks.html
Java AWT Reference http://www.oreilly.com/catalog/javawt/book/index.html
Enterprise JavaBeans http://www.computer-books.us/java_1.php
Essentials of the Java Programming Language - Part 1 http://www.computer-books.us/java_2.php
Essentials of the Java Programming Language - Part 2 http://www.computer-books.us/java_3.php
Exploring Java http://www.computer-books.us/java_4.php
Introduction to Computer Science using Java http://www.computer-books.us/java_5.php
Java Development http://2020ok.com/3608.htm
Java Language Reference http://www.computer-books.us/java_8.php
Java Servlet Programming http://www.computer-books.us/java_9.php
Java Web Services Tutorial http://www.computer-books.us/java_10.php
Java Look and Feel Design Guidelines, Second Edition http://java.sun.com/products/jlf/ed2/book/index.html
The Design Patterns: Java Companion http://www.patterndepot.com/put/8/JavaPatterns.htm
1000 Java Tips e-Book http://javaa.com
Apache Jakarta Commons: Reusable Java™ Components http://www.phptr.com/promotion/1484?redir=1
Java™ Application Development on Linux® http://www.phptr.com/promotion/1484?redir=1
Practical Artificial Intelligence Programming in Java http://www.markwatson.com/opencontent/javaai_lic.htm
Javascript
Voodoo's Introduction to Javascript http://www.oopweb.com/JavaScript/Documents…lumeFrames.html
Javascript Programming http://2020ok.com/3617.htm
Linux
Linux Device Drivers, Third Edition http://lwn.net/Kernel/LDD3/
The Linux Development Platform http://www.phptr.com/promotion/1484?redir=1
Understanding the Linux Virtual Memory Manager http://www.phptr.com/promotion/1484?redir=1
Self-Service Linux®: Mastering the Art of Problem Determination http://www.phptr.com/promotion/1484?redir=1
Linux® Quick Fix Notebook http://www.phptr.com/promotion/1484?redir=1
Managing Linux Systems with Webmin: System Administration and Module Development http://www.phptr.com/promotion/1484?redir=1
An Introduction to GCC http://www.linuxtopia.org/online_books/an_…_gcc/index.html
Linux http://2020ok.com/3756.htm
Using the GNU Compiler Collection (GCC) http://www.linuxtopia.org/online_books/pro…tion/index.html
Bash Reference Guide http://www.linuxtopia.org/online_books/bas…uide/index.html
Bash Guide for Beginners http://www.linuxtopia.org/online_books/bas…ners/index.html
Advanced Bash Scripting Guide http://www.linuxtopia.org/online_books/adv…uide/index.html
Linux Kernel Module Programming Guide http://www.linuxtopia.org/online_books/Lin…uide/index.html
Red Hat Linux Developer Tools Guide http://www.linuxtopia.org/online_books/red…uide/index.html
Linux Debugging with gdb Guide http://www.linuxtopia.org/online_books/red…_gdb/index.html
Using cpp, the C Preprocessor Guide http://www.linuxtopia.org/online_books/programming_tool_guides/redhat_using_cpp_c_preprocessor/index.html
Lisp
Loving Lisp - the Savvy Programmer's Secret Weapon http://www.markwatson.com/opencontent/lisp_lic.htm
Lisp Programming http://2020ok.com/3981.htm
Open Source
Rapid Application Development with Mozilla http://www.phptr.com/promotion/1484?redir=1
Creating Applications with Mozilla http://books.mozdev.org/chapters/index.html
Free as in Freedom http://www.oreilly.com/openbook/freedom/index.html
Managing Projects with GNU make, 3rd Edition http://www.oreilly.com/catalog/make3/book/index.csp
OpenSources: Voices from the Open Source Revolution http://www.oreilly.com/catalog/opensources/book/toc.html
Understanding Open Source and Free Software Licensing http://www.oreilly.com/catalog/osfreesoft/book/
Embedded Software Development with eCos http://www.phptr.com/promotion/1484?redir=1
Open Source Security Tools: A Practical Guide to Security Applications http://www.phptr.com/promotion/1484?redir=1
Perl
HTMLified Perl 5 Reference Guide http://www.oopweb.com/Perl/Documents/Perl5…lumeFrames.html
Perl 5 Documentation http://www.oopweb.com/Perl/Documents/PerlD…lumeFrames.html
Perl for Perl Newbies http://www.oopweb.com/Perl/Documents/P4PNe…lumeFrames.html
Perl for Win32 FAQ http://www.oopweb.com/Perl/Documents/PerlW…lumeFrames.html
Picking Up Perl http://www.oopweb.com/Perl/Documents/Picki…lumeFrames.html
Picking Up Perl http://www.linuxtopia.org/online_books/perl/index.html
Perl Programming http://www.2020ok.com/4045.htm
Practical Perl Programming http://www.oopweb.com/Perl/Documents/ppp/VolumeFrames.html
Beginning Perl http://www.perl.org/books/beginning-perl/
Impatient Perl http://www.perl.org/books/impatient-perl/
Extreme Perl http://www.extremeperl.org/bk/home
MacPerl: Power & Ease http://macperl.com/ptf_book/r/MP/i2.html
Embedding Perl in HTML with Mason http://www.masonbook.com/
Perl for the Web http://www.globalspin.com/thebook/
Practical mod_perl (1st edition) http://modperlbook.com/
Web Client Programming with Perl http://www.oreilly.com/openbook/webclient/
Perl 5 By Example http://www.computer-books.us/perl_0010.php
An Introduction to Perl http://www.linuxtopia.org/Perl_Tutorial/index.html
PHP
Practical PHP Programming http://www.hudzilla.org/phpbook/
A Programmer's Introduction to PHP 4.0 http://www.apress.com/free/
PHP 5 Power Programming http://www.computer-books.us/php_2.php
PHP Programming http://2020ok.com/295223.htm
Practical PHP Programming http://www.computer-books.us/php_3.php
Prolog
Adventure in Prolog http://www.amzi.com/AdventureInProlog/
Building Expert Systems in Prolog http://www.amzi.com/ExpertSystemsInProlog/
Prolog programming http://2020ok.com/295223.htm
Prolog Programming A First Course http://computing.unn.ac.uk/staff/cgpb4/prologbook/
Python
Non-Programmers Tutorial for Python http://rupert.honors.montana.edu/~jjc/easy…ut/easytut.html
Official Python Documentation http://www.python.org/doc/current/
Text Processing in Python http://gnosis.cx/TPiP/
Python Reference Manual http://docs.python.org/ref/ref.html
Python Imaging Library Handbook http://www.pythonware.com/library/the-python-imaging-library.htm
How to Think Like a Computer Scientist - Learning with Python http://www.greenteapress.com/thinkpython
Dive Into Python http://diveintopython.org/
Python Programming http://2020ok.com/285856.htm
Thinking in Python http://mindview.net/Books/TIPython
A Byte of Python http://www.ibiblio.org/g2swap/byteofpython/read/
Ruby
Programming Ruby - The Pragmatic Programmer's Guide (First Edition) http://www.ruby-doc.org/docs/ProgrammingRuby/
Why's (Poignant) Guide to Ruby http://poignantguide.net/ruby/ (the funniest programming book I have ever seen!)
Samba
Samba-3 by Example: Practical Exercises to Successful Deployment http://www.phptr.com/promotion/1484?redir=1
Samba-3 by Example: Practical Exercises to Successful Deployment, 2nd Edition http://www.phptr.com/promotion/1484?redir=1
The Official Samba-3 HOWTO and Reference Guide http://www.phptr.com/promotion/1484?redir=1
Implementing CIFS: The Common Internet File System http://www.phptr.com/promotion/1484?redir=1
SQL
Comparison of Different SQL Implementations http://www.computer-books.us/sql_0004.php
SQL - A Practical Introduction http://www.managedtime.com/freesqlbook.php3
Introduction To Structured Query Language http://www.computer-books.us/sql_2.php
Practical PostgreSQL http://www.opendocspublishing.com/ppbook/
UNIX
FreeBSD Handbook http://www.freebsd.org/doc/en_US.ISO8859-1…book/index.html
Unix http://2020ok.com/3778.htm
The UNIX-HATERS Handbook http://research.microsoft.com/~daniel/unix-haters.html
Visual Basic and VB.net
Programming VB.NET - A Guide For Experienced Programmers http://www.apress.com/free/
Upgrading Microsoft Visual Basic 6.0 to Microsoft Visual Basic .NET http://msdn.microsoft.com/vbrun/staythepat…s/upgradingvb6/
Visual Basic http://2020ok.com/3996.htm
Introducing Visual Basic 2005 for Developers http://msdn.microsoft.com/vbrun/staythepath/additionalresources/IntroTo2005/default.aspx
XML
OpenOffice.org XML Essentials http://books.evc-cit.info/
Misc. stuff that is worth reading
FREE Trade Magazine Subscriptions & Technical Document Downloads http://i.nl03.net/ltr0/?_m=01.009i.nv.mfm.nv
The Future does not compute http://www.praxagora.com/stevet/fdnc/toc.html
The Cathedral and the Bazaar http://www.catb.org/~esr/writings/cathedral-bazaar/
How to Design Programs http://www.htdp.org/2002-09-22/Book/
Practical Theory of Programming http://www.cs.toronto.edu/~hehner/aPToP/
Software Engineering for Internet Applications http://philip.greenspun.com/seia/S
tructure and interpretation of computer programs http://mitpress.mit.edu/SICP/
More programming books http://2020ok.com/3839.htm
The Programmers Stone http://www.reciprocality.org/Reciprocality/r0/
Subversion Version Control: Using the Subversion Version Control System in Development Projects http://www.phptr.com/promotions/promotion.asp?promo=1484&redir=1&rl=1
Ada
Ada 95 Rational http://www.adaic.org/standards/95rat/RATht…5-contents.html
Ada 95 Reference Manual http://www.adahome.com/rm95/
Changes to Ada 1987 - 1995 http://www.oopweb.com/Ada/Documents/Change…lumeFrames.html
Ada 95: The Lovelace Tutorial http://www.adahome.com/Tutorials/Lovelace/master.htm
The Big Online Book of Linux Ada Programming http://www.pegasoft.ca/resources/boblap/book.html
Algorithms
Algorithms and Complexity http://www.cis.upenn.edu/~wilf/AlgComp.html
Programming Algorithms http://2020ok.com/3870.htm
Information Theory, Inference, and Learning Algorithms http://www.inference.phy.cam.ac.uk/mackay/itprnn/book.html
Assembly
Assembly Language Tutorial http://www.oopweb.com/Assembly/Documents/a…lumeFrames.html
Programming From the Ground Up http://download.savannah.gnu.org/releases/pgubook/
Assembly Language Programming http://2020ok.com/3954.htm
Ralph Brown's Interrupt List http://www.oopweb.com/Assembly/Documents/I…lumeFrames.html
The Art of Assembly Language Programming http://www.oopweb.com/Assembly/Documents/A…lumeFrames.html
The Assembly Language Database http://www.oopweb.com/Assembly/Download/NortonGuide.zip
Win32 Programming for x86 Assembly Language Programmers http://www.oopweb.com/Assembly/Documents/Win32ASM/VolumeFrames.html
C
A Tutorial on Pointers and Arrays in C http://www.oopweb.com/CPP/Documents/CPoint…lumeFrames.html
C Programming http://www.oopweb.com/CPP/Documents/CProgr…lumeFrames.html
Object Orientated Programming in ANSI-C http://www.planetpdf.com/developer/article…?contentid=6635
The C Book http://publications.gbdirect.co.uk/c_book/
Writing Bug-Free C Code http://www.duckware.com/bugfreec/index.html
C - Elements of Style http://www.computer-books.us/c_3.php
Learning GNU C http://www.linuxtopia.org/online_books/programming_books/
learning_gnu_c/index.html
C++
An Overview Of The C++ Programming Langauge http://www.oopweb.com/CPP/Download/crc.zip
C++ Annotations http://www.oopweb.com/CPP/Documents/CPPAnn…lumeFrames.html
C++ Annotations http://www.oopweb.com/CPP/Download/cplusplus.zip
C++ Coding Standard http://www.oopweb.com/CPP/Documents/CodeSt…lumeFrames.html
C & C++ http://2020ok.com/3956.htm
C++ Course http://www.oopweb.com/CPP/Download/CPPCourse.zip
C++ How To http://www.oopweb.com/CPP/Documents/CPPHOW…lumeFrames.html
C++ In Action http://www.relisoft.com/book/index.htm
C++: A Dialog http://www.steveheller.com/cppad/cppad.htm
How To Think Like A Computer Scientist with C++ http://www.oopweb.com/CPP/Documents/ThinkC…lumeFrames.html
Introduction To OOP Using C++ http://www.oopweb.com/CPP/Documents/Intro2…lumeFrames.html
Introduction To OOP Using C++ http://www.oopweb.com/CPP/Download/Intro2OOP.zip
Objects First http://www.oopweb.com/CPP/Documents/Object…lumeFrames.html
Optimizing C++ http://www.steveheller.com/opt/
STL Guide http://www.oopweb.com/CPP/Documents/
STLGui…lumeFrames.htmlS
TL Guide http://www.oopweb.com/CPP/Download/stl.zip
The Function Pointer Tutorials http://www.oopweb.com/CPP/Documents/Functi…lumeFrames.html
The Standard Template Library Tutorial http://www.oopweb.com/CPP/Documents/STL/VolumeFrames.html
Thinking in C++ http://www.planetpdf.com/developer/article…?ContentID=6634
Thinking in C++, Second Edition (Volumes 1 & 2) http://mindview.net/Books/TICPP/ThinkingInCPP2e.html
An Introduction to C++ Programming http://www.computer-books.us/cpp_1.php
Programming in C++ - Rules and Recommendations http://www.computer-books.us/cpp_6.php
A Beginners C++ Book http://www.uow.edu.au/~nabg/ABC/ABC.html
C++ GUI Programming with Qt 3 http://www.phptr.com/promotion/1484?redir=1
Cross-Platform GUI Programming with wxWidgets http://www.phptr.com/promotion/1484?redir=1
C#
C# in Detail http://www.computer-books.us/csharp_0005.php
C# - The Basics http://www.computer-books.us/csharp_0004.php
C# Language Specification http://www.computer-books.us/csharp_1.php
Data Structures and Algorithms with Object-Oriented Design Patterns in C# http://www.computer-books.us/csharp_2.php
C# Programming http://2020ok.com/697342.htm
Dissecting a C# Application - Inside SharpDevelop http://www.computer-books.us/csharp_3.php
C# tutorial (2 .pdf's) http://www.ssw.uni-linz.ac.at/Teaching/Lectures/CSharp/Tutorial/
CGI
CGI Programming on the World Wide Web http://www.oreilly.com/openbook/cgi/
CGI Programming http://2020ok.com/4025.htm
COBOL
zingCOBOL - A Beginners Guide to COBOL Programming http://www.computer-books.us/cobol_0006.php
Teach Yourself COBOL in 21 Days http://www.computer-books.us/cobol_0005.php
WebSphere Studio COBOL for Windows - Language Reference http://www.computer-books.us/cobol_1.php
COBOL Programming Course http://www.computer-books.us/cobol_2.php
COBOL Programming http://2020ok.com/3969.htm
WebSphere Studio COBOL for Windows - Programming Guide http://www.computer-books.us/cobol_3.php
HP COBOL II/XL Reference Manual http://www.computer-books.us/cobol_4.php
Databases
MySQL Reference Manual http://dev.mysql.com/doc/
Database http://2020ok.com/549646.htm
Oracle 10g Database Book and Documentation Library http://wtcis.wtamu.edu/oracle/
Delphi/Pascal
Delphi 2005 Tutorial for Beginners http://www.xcalibur.co.uk/training/Delphi2005/index.php
Delphi Training http://www.xcalibur.co.uk/training/delphi/oldindex.html
Essential Delphi http://marcocantu.com/edelphi/default.htm
Essential Pascal http://marcocantu.com/epascal/default.htm
Delphi Language Guide - Delphi For The Microsoft .NET Framework http://www.computer-books.us/delphi_2.php
Delphi Database Application Developers Guide http://www.computer-books.us/delphi_1.php
Fortran
Numerical Recipes with Fortran 77 http://www.library.cornell.edu/nr/cbookfpdf.html
Numerical Recipes with Fortran 90 http://www.library.cornell.edu/nr/cbookf90pdf.html
Professional Programmer's Guide to Fortran 77 http://www.computer-books.us/fortran_3.php
User Notes on Fortran Programming (UNFP) http://www.ibiblio.org/pub/languages/fortran/
HTML
HTML 4.01 Specifications http://www.oopweb.com/HTML/Documents/HTML4/VolumeFrames.html
Web Development http://2020ok.com/3510.htm
Writing HTML http://www.oopweb.com/HTML/Documents/Writing%20HTML/VolumeFrames.html
Java
How to Think Like a Computer Scientist with Java http://www.oopweb.com/Java/Documents/Think…lumeFrames.html
Introduction to Programming Using Java http://www.oopweb.com/Java/Documents/Intro…lumeFrames.html
Introduction To Programming Using Java http://www.linuxtopia.org/online_books/pro…ming/index.html
Java Programming Tutorial: Introduction to Computer Science http://www.oopweb.com/Java/Documents/JavaN…lumeFrames.html
Thinking in Java, 3rd Edition http://www.mindview.net/Books/TIJ/
Thinking in Enterprise Java http://www.ibiblio.org/pub/docs/books/eckel/
More Java Books http://kickjava.com/freeBooks.html
Java AWT Reference http://www.oreilly.com/catalog/javawt/book/index.html
Enterprise JavaBeans http://www.computer-books.us/java_1.php
Essentials of the Java Programming Language - Part 1 http://www.computer-books.us/java_2.php
Essentials of the Java Programming Language - Part 2 http://www.computer-books.us/java_3.php
Exploring Java http://www.computer-books.us/java_4.php
Introduction to Computer Science using Java http://www.computer-books.us/java_5.php
Java Development http://2020ok.com/3608.htm
Java Language Reference http://www.computer-books.us/java_8.php
Java Servlet Programming http://www.computer-books.us/java_9.php
Java Web Services Tutorial http://www.computer-books.us/java_10.php
Java Look and Feel Design Guidelines, Second Edition http://java.sun.com/products/jlf/ed2/book/index.html
The Design Patterns: Java Companion http://www.patterndepot.com/put/8/JavaPatterns.htm
1000 Java Tips e-Book http://javaa.com
Apache Jakarta Commons: Reusable Java™ Components http://www.phptr.com/promotion/1484?redir=1
Java™ Application Development on Linux® http://www.phptr.com/promotion/1484?redir=1
Practical Artificial Intelligence Programming in Java http://www.markwatson.com/opencontent/javaai_lic.htm
Javascript
Voodoo's Introduction to Javascript http://www.oopweb.com/JavaScript/Documents…lumeFrames.html
Javascript Programming http://2020ok.com/3617.htm
Linux
Linux Device Drivers, Third Edition http://lwn.net/Kernel/LDD3/
The Linux Development Platform http://www.phptr.com/promotion/1484?redir=1
Understanding the Linux Virtual Memory Manager http://www.phptr.com/promotion/1484?redir=1
Self-Service Linux®: Mastering the Art of Problem Determination http://www.phptr.com/promotion/1484?redir=1
Linux® Quick Fix Notebook http://www.phptr.com/promotion/1484?redir=1
Managing Linux Systems with Webmin: System Administration and Module Development http://www.phptr.com/promotion/1484?redir=1
An Introduction to GCC http://www.linuxtopia.org/online_books/an_…_gcc/index.html
Linux http://2020ok.com/3756.htm
Using the GNU Compiler Collection (GCC) http://www.linuxtopia.org/online_books/pro…tion/index.html
Bash Reference Guide http://www.linuxtopia.org/online_books/bas…uide/index.html
Bash Guide for Beginners http://www.linuxtopia.org/online_books/bas…ners/index.html
Advanced Bash Scripting Guide http://www.linuxtopia.org/online_books/adv…uide/index.html
Linux Kernel Module Programming Guide http://www.linuxtopia.org/online_books/Lin…uide/index.html
Red Hat Linux Developer Tools Guide http://www.linuxtopia.org/online_books/red…uide/index.html
Linux Debugging with gdb Guide http://www.linuxtopia.org/online_books/red…_gdb/index.html
Using cpp, the C Preprocessor Guide http://www.linuxtopia.org/online_books/programming_tool_guides/redhat_using_cpp_c_preprocessor/index.html
Lisp
Loving Lisp - the Savvy Programmer's Secret Weapon http://www.markwatson.com/opencontent/lisp_lic.htm
Lisp Programming http://2020ok.com/3981.htm
Open Source
Rapid Application Development with Mozilla http://www.phptr.com/promotion/1484?redir=1
Creating Applications with Mozilla http://books.mozdev.org/chapters/index.html
Free as in Freedom http://www.oreilly.com/openbook/freedom/index.html
Managing Projects with GNU make, 3rd Edition http://www.oreilly.com/catalog/make3/book/index.csp
OpenSources: Voices from the Open Source Revolution http://www.oreilly.com/catalog/opensources/book/toc.html
Understanding Open Source and Free Software Licensing http://www.oreilly.com/catalog/osfreesoft/book/
Embedded Software Development with eCos http://www.phptr.com/promotion/1484?redir=1
Open Source Security Tools: A Practical Guide to Security Applications http://www.phptr.com/promotion/1484?redir=1
Perl
HTMLified Perl 5 Reference Guide http://www.oopweb.com/Perl/Documents/Perl5…lumeFrames.html
Perl 5 Documentation http://www.oopweb.com/Perl/Documents/PerlD…lumeFrames.html
Perl for Perl Newbies http://www.oopweb.com/Perl/Documents/P4PNe…lumeFrames.html
Perl for Win32 FAQ http://www.oopweb.com/Perl/Documents/PerlW…lumeFrames.html
Picking Up Perl http://www.oopweb.com/Perl/Documents/Picki…lumeFrames.html
Picking Up Perl http://www.linuxtopia.org/online_books/perl/index.html
Perl Programming http://www.2020ok.com/4045.htm
Practical Perl Programming http://www.oopweb.com/Perl/Documents/ppp/VolumeFrames.html
Beginning Perl http://www.perl.org/books/beginning-perl/
Impatient Perl http://www.perl.org/books/impatient-perl/
Extreme Perl http://www.extremeperl.org/bk/home
MacPerl: Power & Ease http://macperl.com/ptf_book/r/MP/i2.html
Embedding Perl in HTML with Mason http://www.masonbook.com/
Perl for the Web http://www.globalspin.com/thebook/
Practical mod_perl (1st edition) http://modperlbook.com/
Web Client Programming with Perl http://www.oreilly.com/openbook/webclient/
Perl 5 By Example http://www.computer-books.us/perl_0010.php
An Introduction to Perl http://www.linuxtopia.org/Perl_Tutorial/index.html
PHP
Practical PHP Programming http://www.hudzilla.org/phpbook/
A Programmer's Introduction to PHP 4.0 http://www.apress.com/free/
PHP 5 Power Programming http://www.computer-books.us/php_2.php
PHP Programming http://2020ok.com/295223.htm
Practical PHP Programming http://www.computer-books.us/php_3.php
Prolog
Adventure in Prolog http://www.amzi.com/AdventureInProlog/
Building Expert Systems in Prolog http://www.amzi.com/ExpertSystemsInProlog/
Prolog programming http://2020ok.com/295223.htm
Prolog Programming A First Course http://computing.unn.ac.uk/staff/cgpb4/prologbook/
Python
Non-Programmers Tutorial for Python http://rupert.honors.montana.edu/~jjc/easy…ut/easytut.html
Official Python Documentation http://www.python.org/doc/current/
Text Processing in Python http://gnosis.cx/TPiP/
Python Reference Manual http://docs.python.org/ref/ref.html
Python Imaging Library Handbook http://www.pythonware.com/library/the-python-imaging-library.htm
How to Think Like a Computer Scientist - Learning with Python http://www.greenteapress.com/thinkpython
Dive Into Python http://diveintopython.org/
Python Programming http://2020ok.com/285856.htm
Thinking in Python http://mindview.net/Books/TIPython
A Byte of Python http://www.ibiblio.org/g2swap/byteofpython/read/
Ruby
Programming Ruby - The Pragmatic Programmer's Guide (First Edition) http://www.ruby-doc.org/docs/ProgrammingRuby/
Why's (Poignant) Guide to Ruby http://poignantguide.net/ruby/ (the funniest programming book I have ever seen!)
Samba
Samba-3 by Example: Practical Exercises to Successful Deployment http://www.phptr.com/promotion/1484?redir=1
Samba-3 by Example: Practical Exercises to Successful Deployment, 2nd Edition http://www.phptr.com/promotion/1484?redir=1
The Official Samba-3 HOWTO and Reference Guide http://www.phptr.com/promotion/1484?redir=1
Implementing CIFS: The Common Internet File System http://www.phptr.com/promotion/1484?redir=1
SQL
Comparison of Different SQL Implementations http://www.computer-books.us/sql_0004.php
SQL - A Practical Introduction http://www.managedtime.com/freesqlbook.php3
Introduction To Structured Query Language http://www.computer-books.us/sql_2.php
Practical PostgreSQL http://www.opendocspublishing.com/ppbook/
UNIX
FreeBSD Handbook http://www.freebsd.org/doc/en_US.ISO8859-1…book/index.html
Unix http://2020ok.com/3778.htm
The UNIX-HATERS Handbook http://research.microsoft.com/~daniel/unix-haters.html
Visual Basic and VB.net
Programming VB.NET - A Guide For Experienced Programmers http://www.apress.com/free/
Upgrading Microsoft Visual Basic 6.0 to Microsoft Visual Basic .NET http://msdn.microsoft.com/vbrun/staythepat…s/upgradingvb6/
Visual Basic http://2020ok.com/3996.htm
Introducing Visual Basic 2005 for Developers http://msdn.microsoft.com/vbrun/staythepath/additionalresources/IntroTo2005/default.aspx
XML
OpenOffice.org XML Essentials http://books.evc-cit.info/
Misc. stuff that is worth reading
FREE Trade Magazine Subscriptions & Technical Document Downloads http://i.nl03.net/ltr0/?_m=01.009i.nv.mfm.nv
The Future does not compute http://www.praxagora.com/stevet/fdnc/toc.html
The Cathedral and the Bazaar http://www.catb.org/~esr/writings/cathedral-bazaar/
OPSIN PROTEIN
Introduction
This tutorial allows you to explore opsins -- the proteins that catch light for our eyes -- and the genes that code for opsins. But the real subject of this exercise is bioinformatics -- the use of computers to search for, explore, and use information about genes, nucleic acids, and proteins. While learning about the human opsins, you will use some of today's most powerful bioinformatics tools. You can follow up this tutorial with a study of opsins from other organisms, or by exploring any class of biomolecules that interest you.
I assume that you are conversant with biochemistry and molecular biology. If you see unfamiliar terms pertaining to the genes, mRNAs, and proteins used as examples here, break out your biochemistry text, head for the index, and review, review, review.
For more information about each database or tool, go to its home page and read, read, read.
If you are a student in my biochemistry course (CHY 563 at the University of Southern Maine), you will find that this tutorial follows closely my classroom demonstration of bioinformatics tools applied to finding desired information in databases.
Cast of Characters
I. The Databases (and their acronyms!)
GenBank, operated by NCBI (National Center for Biotechnology Information): Contains all publicly available sequences of DNA, with annotations. Same DNA sequence content as EMBL (European Molecular Biology Laboratory) and DDBJ (DNA Data Bank of Japan).
Swiss-Prot and TrEMBL, operated by SIB (Swiss Institute of Bioinformatics) and EBI (European Bioinformatics Institute): Contains most of the publicly available sequences of proteins, with annotations.
Protein Data Bank: Contains all publicly available experimentally determined structural models of proteins and nucleic acids (determined by x-ray crystallography and NMR).
Swiss-Model Repository: Contains many theoretical structural models of proteins (determined by automated homology modeling).
Online Mendelian Inheritance in Man: A catalog of human genes and genetic disorders, linked to gene entries in GenBank.
II. The Tools
NCBI Map Viewer: For finding genes and gene products (RNAs and proteins) that interest you.
BLAST: For finding genes or proteins with sequences similar to yours.
ClustalW: For comparing your sequence with others, and lots of sequences with each other.
Phylip: For making phylogenetic trees, which show how sequences are related to each other.
Treeprint: For printing phylogenetic trees.
PSIPRED: For predicting the location of helices, pleated sheets, and transmembrane elements of proteins of unknown structure.
Swiss-Model: For automated building of theoretical structural models of your sequence based on known structures (homology modeling).
Deep View (also known as Swiss-PdbViewer): For seeing and exploring macromolecular models in three dimensions, and for manual and semiautomated homology modeling.
PubMed: For searching ALL the literature of the life sciences.
ExPASy (Expert Protein Analysis System): Not so much a tool as a tool box -- a very complete set of protein analysis tools.
Here We Go.............
Our subject is human opsins, those proteins, found in the cells of your retina, that catch light and begin the process of vision. We will proceed by asking questions about opsins and opsin genes, and then using bioinformatics to answer them.
When I provide a web address, I'll also make it a link -- just click it to go to the site in a new browser window. Then make it a bookmark so you can find it again. This tutorial will still be open in the window behind the new one.
WARNING: Bioinformatics tools evolve rapidly, faster than I can make changes to this tutorial. So if a page does not look exactly like I say it should, or if its title is different, look around and try to do what the tutorial says. You should find the same links, but names may be slightly different. If the differences are so great that you can't proceed, send me email (address at top of this page), and I'll adapt the instructions to the changes as soon as I can.
Where are the opsin genes in the human genome?
Point your browser to http://www.ncbi.nlm.nih.gov/mapview/.
Note that you can look at a genome by clicking on the NAME of the species, not the word BLAST beside it. The species name takes you to a viewer for the genome of that organism. BLAST takes you to a tool for searching that genome (later).
Find Homo sapiens (human), and click on the OLDEST "build" or version of the genome.
NOTE: Why should you use the oldest build in this tutorial? Because sometimes not all links are hooked up to the tools for the newest build. Right now (2006/03/21, 4 PM), for example, the GenomeView button (see the section below entitled Where are all the genes for these other proteins?) appears for searches of Build 35.1, but does not appear for Build 36.
You see a diagram of the human chromosomes, and a search box at the top. Enter "opsin" in the box next to Search for.
Click Find.
You see the diagram again, with red marks at your "hits", the locations of genes whose entries contain "opsin" as a whole or partial word. Below the diagram is a list of the indicated genes. Among them are the rhodopsin gene (RHO), and three cone pigments, short-, medium-, and long-wavelength sensitive opsins (for blue, green, and red light detection). Four hits look like visual pigments, which probably does not surprise you. To the left of each entry is the chromosome number, allowing you to tell which red mark corresponds to each entry. Note that several hits are on the X chromosome, one of the sex-determining chromosomes. You can pursue multiple hits on the same chromosome with the all matches link for that chromosome.
NOTE: In the human genome lists, you will often see duplicates marked "reference" or "Celera", referring to the results from two major efforts to sequence the human genome. At first, these two efforts were separate, but eventually they came together. When you have a choice, choose "reference," so you will be following the same path I followed in setting up the tutorial.
Click all matches next to X.
You see a very complicated display (don't sweat -- we're going to use only a part of this now). On the left is a diagram of the X chromosome, with red marks at the positions of the gene(s) you've followed to this page -- in our case, the two opsins, medium- and long-wave, which are located near the bottom tip of the X chromosome. To the right are various representations of the X chromosome, with listings of annotated areas. The two opsin genes are highlighted in pink. If you pass your cursor over this page without clicking, you will find that some symbols provide brief information, most about regions that are not yet characterized well enough to have a full entry.
As you can see, there is a tremendous amount of information on this page, with links to much more. If you want full information about the meanings of abbreviations and symbols on this page, as well as the kinds of information linked to the page, you can use Map Viewer Help at the top of the page. You will find abundant information about the Map Viewer, explanations of all symbols and links, and even tutorials about how to ask and answer all kinds of questions about the genome.
For now, note the information provided for the first of the two highlighted opsin genes, OPN1LW (this is called the gene symbol). You see that this is the long-wavelength-sensitive (red) opsin, and that it's a gene involved in color blindness (a sex-linked trait -- no surprise).
What do scientists know about the opsins?
Click OPN1LW.
You have entered Entrez Gene, which is a sort of highway interchange with routing to all sorts of information about this gene. Scan down the page. Some of the information is very plain and understandable, while some is very cryptic. One of the most accessible links is to OMIM (for Online Mendelian Inheritance in Man), a catalog of human genes and genetic disorders. Despite the name, the database includes genes of women, too.
Look down the page and find Phenotypes, and notice the links marked MIM. These are links to OMIM entries. Click one of them.
Each OMIM entry tells you about this gene and types of colorblindness, genetic disorders associated with mutations in this gene. Read as much as your interest dictates. Follow links to other information. For more information about OMIM itself, click the OMIM logo at the top of the page. Once you've satisfied your appetite, return to the Entrez Gene page (use the Back button of your browser or your browser's history list -- if you're lost, click HERE).
Next to the Display button, pull down the menu and select PubMed Links. Then click the Display button.
You have entered PubMed, a free database of scientific literature, at a list of articles directly associated with this gene locus. By clicking on the authors of an article, you can see its abstract. If you are on a university campus where there is online access to specific journals, you might also see links to full articles. PubMed is your entry point to a wide variety of scientific literature in the life sciences. On the left side of any PubMed page, you will find links to a description of the database, help, and tutorials on searching. Use the Find tool of your browser to find the name Nathans on this page. Read the abstract of the article by Nathans and co-workers before returning to Entrez Gene.
NB to GR: Add some guided searches in PubMed.
What is the nucleotide sequence of this gene?
Remember that we are looking at the gene for the red-sensitive opsin in human vision, and it is located near the bottom tip of the X chromosome. Scroll down to NCBI Reference Sequences (RefSeq). You see that the following are available:
mRNA Sequence (sequence of nucleotide bases in the messenger RNA)
Source Sequence (sequence of the entire genome fragment that contains this sequence, from GenBank)
Product (sequence of this gene's protein product, the red opsin).
Click the entry number beside mRNA Sequence.
This is a typical GenBank nucleotide file, and a lot of it is hard to read, but a few things are clear. First note, under references, citations to the publication of this sequence in the scientific literature. To see an abstract of the article in which this gene was described, click the PubMed link below the first reference and read it. Or instead, find the word Nathans on the page, and click the PubMed link below the related article. As you see, you've been here before. There are many ways to move from one database to another, which is both a blessing and a curse. You have to keep your eyes open for useful links, and when you find a path that you think you might use again, make a note of it and bookmark the web pages. It is frustrating to know there's an easier way to do something, and not remember how you did it.
NB to GR: point back to this abstract when you get the phylogenetic tree.
Scroll to the bottom of this long page. The last thing, labeled ORIGIN, is the sequence of this messenger RNA. You are seeing the actual list of As, Ts, Gs, and Cs that make up the message for synthesis of this opsin. But wait! You know that RNA contains no T. In most nucleotide databases, U from RNA is represented as T, to make for easy comparison of DNA and RNA sequences. This sequence information is not in the form that is most useful for searching in databases, say, searching for related genes. Let's display this entry in a form more useful for searching.
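If you want to see the T-for-U convention in action, here is a minimal Python sketch. The function names and the short example fragment are my own inventions for illustration; the fragment is not claimed to be the real opsin message.

```python
def to_rna(seq: str) -> str:
    """Spell a database-style sequence (T in place of U) as RNA."""
    return seq.upper().replace("T", "U")


def to_database_form(seq: str) -> str:
    """Spell an RNA sequence the way nucleotide databases store it (U as T)."""
    return seq.upper().replace("U", "T")


# Hypothetical fragment, used only to show the conversion.
example = "ATGGCCCAGCAGTGGAGC"
print(to_rna(example))                      # AUGGCCCAGCAGUGGAGC
print(to_database_form(to_rna(example)))    # ATGGCCCAGCAGTGGAGC
```

Because the two spellings differ only in one letter, the database form can be compared directly with DNA sequences, which is the point of the convention.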
At the top of the page, beside the Display button, pull down the menu that says default (we are looking at the default entry display), and select FASTA (note that several other display options are available). Then click the Display button. You see one descriptive or "comment" line that begins with ">", followed by the nucleotide sequence. This little file is just what you need to search nucleotide databases for similar sequences. Let's keep it for future use.
Click and drag on the web page to select everything from the ">" through the last nucleotide. Be careful not to select anything else. From your browser's Edit menu, select Copy to make a copy of this information on your clipboard, for pasting elsewhere. Now start your favorite word processor, make a new document, and paste. The FASTA comment and sequence should appear. Select all of the text and change the font to Courier or Monaco -- these "typewriter" fonts make it easy to align letters into columns, because all letters are the same width. Save this file, choosing text or plain text as the file type. Call it mrnared.txt. Save it to a convenient location for the files you'll be making later. Click your browser's Back button until you return to the Entrez Gene page for this gene.
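Once you have a FASTA file saved as plain text, a few lines of code can read it back. The following is a minimal Python sketch of a FASTA reader, not the parser any particular tool uses; the function name parse_fasta and the sample record are assumptions made up for illustration.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each '>' comment line (without the '>') to its joined sequence."""
    chunks: Dict[str, List[str]] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:]
            chunks[header] = []
        elif header is None:
            raise ValueError("sequence data found before any '>' comment line")
        else:
            chunks[header].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


# Hypothetical record standing in for the contents of a saved file.
sample = """>example_mRNA hypothetical record for illustration
ATGGCCCAGC
AGTGGAGC
"""
records = parse_fasta(sample.splitlines())
for name, seq in records.items():
    print(name, len(seq))
```

To read a file you saved yourself, such as mrnared.txt, you could pass an open file handle to parse_fasta in place of sample.splitlines(). The check for data before the first ">" mirrors the advice above about selecting nothing but the FASTA record.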
What is the amino-acid sequence of this gene?
Under NCBI Reference Sequences (RefSeq) click the entry number beside Product.
Things look a lot like before, but this is a protein entry (the classical view is that gene products are proteins, but not all of them are), containing the amino-acid sequence in one-letter abbreviations. Just as with the mRNA entry, turn this into a FASTA display, and copy it into a new word-processor document. Save it in text format as protred.txt. Return to Entrez Gene.
What does the neighborhood of this gene look like?
Click the entry number beside Source Sequence.
This entry shows the sequence of the specific DNA clone that contains the opsin gene, along with information about how this clone was produced. This entry thus shows the gene in the slightly larger context of the cloned fragment in which this gene was found. This sequence would allow you to see flanking regions around the gene, and perhaps to design PCR primers for making useful quantities of the nucleotide sequence so you could express this gene in a cloning vector. From this page, you could also find neighboring sequences if you wanted to look farther afield. As before, display this entry in FASTA format. You will get several entries, each a different clone that was found to contain this region of the genome. Save the first FASTA entry (from the ">" to the end of the nucleotide sequence) as a word processor text document entitled GBred.txt. (Why GB? Because the last time I looked [still true 2006/03/21], these entries were called GenBank entries. Things change fast in this business.)
What proteins in humans are similar to the red opsin?
Now return to the NCBI Map Viewer. We're going to search the human genome for sequences similar to that of the red opsin.
Click Blast next to Homo sapiens (human), OLDEST build.
This is the NCBI's BLAST search tool. BLAST is a widely used program for finding sequences similar to a "query" sequence that you're interested in. Pick these options from the various menus:
Database: Build Protein for OLDEST build (look at bottom of the Database menu). (This means that you will search the protein sequences in this build of the database.)
Program: BLASTP (Use the version of BLAST that compares protein sequences, unlike BLASTN, which compares nucleotide sequences.)
Other Parameters, Expect: 10 (The higher the number, the less stringent the matching, and the more hits you'll get)
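To get a feel for why a larger Expect setting lets more hits through, it helps to know that the expected number of chance hits with a raw score of at least S is usually estimated with the Karlin-Altschul formula, E = K * m * n * exp(-lambda * S), where m and n are the query and database lengths. The Python sketch below simply evaluates that formula. The default values of lam and k are illustrative placeholders that I chose, not the parameters of any particular BLAST run (real values depend on the scoring matrix and gap penalties), and the function names and lengths are hypothetical.

```python
import math


def expected_hits(raw_score: float, query_len: int, db_len: int,
                  lam: float = 0.267, k: float = 0.041) -> float:
    """Evaluate E = K * m * n * exp(-lambda * S).

    lam and k are placeholder statistical parameters (assumptions),
    not values taken from the tutorial's BLAST search.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


def is_reported(raw_score: float, query_len: int, db_len: int,
                expect_threshold: float = 10.0) -> bool:
    """A hit is kept when its E value does not exceed the Expect setting."""
    return expected_hits(raw_score, query_len, db_len) <= expect_threshold


# Hypothetical lengths: a 364-residue query against 10 million residues.
for score in (40, 60, 80, 100):
    e = expected_hits(score, 364, 10_000_000)
    print(score, f"{e:.3g}", is_reported(score, 364, 10_000_000))
```

Higher scores give smaller E values, so raising the Expect threshold admits lower-scoring, less convincing matches.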
Next, copy the FASTA data from your file protred.txt to your clipboard, and paste it into the BLAST search box, above which it says, "Enter an accession..." Check to be sure that the first character in the box is the ">" at the beginning of the FASTA data. Then click Begin Search.
The next page is for formatting your search results. Just click that enthusiastic Format! button. When your results are ready, the results of BLAST page appears. Look down the page to the graphical display, a box containing lots of colored lines. Each line represents a hit from your blast search. If you pass your mouse cursor over a red line, the narrow box just above the box gives a brief description of the hit. You'll find that the first hit is your red opsin. That's encouraging, because the best match should be to the query sequence itself, and you got this sequence from that gene entry. The second hit is the green opsin -- remember that the PubMed entry reported that the red and green pigments are the most similar. The third and fourth hits are the blue opsin and the rod-cell pigment rhodopsin. Other hits have lower numbers of matching residues, and are color coded according to a score of matches. If you click on any of the colored lines, you'll skip down to more information about that hit, and you can see how much similarity each one has to the red opsin, your original query sequence. As you go down the list, each succeeding sequence has less in common with red opsin. Each sequence is shown in comparison with red opsin in what is called a pairwise sequence alignment. Later, you'll make multiple sequence alignments from which you can discern relationships among genes.
See what you can figure out about what the scores mean. Identities are residues that are identical in the hit and the query (red opsin), when the two are optimally aligned. Positives are residues that are very similar to each other (see residue number 1 in the blue opsin -- it's threonine in red opsin, and the very similar serine in the blue). Gaps are sometimes introduced into a hit to improve its alignment with the query. The more identities and positives, and the fewer gaps, the higher the score. Note that blue opsin and rhodopsin are only about 45% identical to the red opsin. Other proteins, which are apparently not visual pigments, have even lower scores. Now let's take a look at where all these hits are in the human genome.
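The counting behind identities, positives, and gaps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: BLAST decides which pairs are "positive" from a substitution matrix, whereas the SIMILAR_GROUPS table below is a crude stand-in that I made up, and the two aligned strings are hypothetical, not real opsin sequences.

```python
# Crude, hand-made groups of similar residues (an assumption, not a real
# substitution matrix such as the ones BLAST uses).
SIMILAR_GROUPS = [set("ST"), set("DE"), set("KR"), set("ILVM"), set("FYW"), set("NQ")]


def is_similar(a: str, b: str) -> bool:
    """True when both residues fall in the same hand-made group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def alignment_stats(query: str, hit: str) -> dict:
    """Count identities, positives, and gaps in two aligned strings ('-' = gap)."""
    if len(query) != len(hit) or not query:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    identities = positives = gaps = 0
    for q, h in zip(query.upper(), hit.upper()):
        if q == "-" or h == "-":
            gaps += 1
        elif q == h:
            identities += 1
            positives += 1  # identical residues also count as positives
        elif is_similar(q, h):
            positives += 1
    length = len(query)
    return {
        "length": length,
        "identities": identities,
        "positives": positives,
        "gaps": gaps,
        "percent_identity": 100.0 * identities / length,
    }


# Hypothetical aligned fragments; position 1 pairs threonine with serine.
stats = alignment_stats("TAQQW-SL", "SAQEWGSL")
print(stats)
```

In this toy alignment the threonine/serine pair at position 1 is counted as a positive but not as an identity, just as in the blue opsin example above.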
Where are all the genes for these other proteins?
Click the Genome View button just below the introductory information at the top of this result page. If this button does not appear, go back and make sure you are searching the database for the OLDEST build of the human genome.
You have come full circle. You are back at the human chromosome diagram, and all the hits of your search, in the colors that signify their BLAST scores, are located for you on the diagram. Notice that there are about 100 proteins (discovered so far, that is) that have 40% or more positives in alignment with red opsin. The opsins are members of the very large family of G protein-coupled receptors, key players in signal transduction.
How are the opsin genes related to each other?
Answering this question requires making a multiple sequence alignment and then using it to make a phylogenetic tree. For these tasks, we move to another database where it's a little easier to gather a bunch of sequences into a single FASTA file.
Point your browser to http://us.expasy.org.
You see the home page of ExPASy, the Expert Protein Analysis System. As I said earlier, ExPASy is a complete protein tool box. With ExPASy, you can do almost any imaginable analysis or comparison of protein sequences and structures.
Click Swiss-Prot and TrEMBL under Databases.
Read the introduction to these databases. They are high quality protein sequence databases with abundant annotation, minimal redundancy, and many connections to other databases.
Click Advanced search in the UniProt Knowledgebase.
With advanced searching, you can limit your search to specific genes and organisms, and you can search on descriptive information in the entries.
Set up a search for human opsins, as follows:
Search Swiss-Prot only.
Enter Description: opsin
Organism: Choose "Human" from the pull-down menu
Check "Append and prefix * to query terms". The * is a "wild card". You are searching for all entries that contain "opsin" as a whole or partial word.
Click Submit.
The page Swiss-Prot description is your search result page.
Look over the results. On 2006/03/21, this search gave 15 hits, including the rod pigment rhodopsin (OPSD), along with the three cone pigments (OPSB, OPSG, OPSR). There is also a "visual pigment-like receptor peropsin", OPSX. Sounds mysterious. Let's find out more about it, and in the process, see a typical Swiss-Prot entry.
Next to OPSX_HUMAN, click on the accession number (O14718, beginning with the letter O, not a zero) in the column headed AC.
You see the UniProtKB/Swiss-Prot View of entry O14718. Peruse this entry and try to find out just what this rhodopsin-like protein is thought to do. Under Comments, you'll learn that it's found in the retina (the RPE or retinal pigment epithelium), and that it may detect light, or perhaps monitors levels of retinoids, the general class of compounds that are the actual light absorbers in opsins. Also under Comments - Similarity, you see, as mentioned earlier, that this protein is a member of the large family of G protein-coupled receptors (GPCRs). If you click "G-protein coupled receptor 1 family. Opsin subfamily", you find a list of all purported members of this subfamily in SwissProt. Clicking "View Classification" produces a list of all GPCRs in SwissProt, with a summary at the bottom indicating that the human genome alone contains 757 of them!
Now back up to the UniProtKB/Swiss-Prot entry page for O14718, OPSX_HUMAN.
Under References click the journal citation, "Proc. Natl. Acad. Sci. U.S.A. 94:9893-9898(1997)". From the resulting page, you can read a full article in the Proceedings of the National Academy of Sciences (PNAS) about this protein. Like many journals, PNAS puts full articles online just 6 to 12 months after publication.
Looking further down the page, you find cross-references to the protein or its gene in other databases, predicted structural features of the protein, and last, the sequence. Note also, at the bottom of the page, links to a number of ExPASy tools listed for further analysis of this sequence. Try some of them. For example, I just learned in about ten seconds from Compute pI/MW that the isoelectric pH (or pI) of this protein is 8.78. And I learned in no time at all from ScanProSite that the sequence contains signatures indicating that the protein is probably a G protein-coupled receptor (no surprise, but comforting) and that it has a retinal binding site. ProSite is a tool for finding signatures of function in new sequences. When you finish playing with these powerful tools, return to your SwissProt search results by use of the back button of your browser. If you're lost, go back to ExPASy and do the search again.
Now let's compare the sequences with each other. We'll use the program ClustalW to make a multiple sequence alignment.
Scroll down the result page and check the boxes at the left of these entries
OPSB (blue-sensitive opsin)
OPSD (rhodopsin)
OPSG (green-sensitive opsin)
OPSR (red-sensitive opsin)
OPSX (visual pigment-like receptor peropsin)
At the top of the page, at Send selected sequences to, select Clustal W (multiple alignment) from the menu, and click Submit.
ClustalW has been implemented at many web sites. This one, located at EMBnet.org, automatically receives the FASTA files from the selected entries, allows you to make some settings of the alignment criteria, and then does the alignment. We will just accept the default alignment settings. First, scroll in the Input Sequences box and verify that it contains five FASTA files, one right after the other. To make them easier to identify in subsequent outputs, edit the name of each FASTA comment line (begins with ">") as follows:
Change "sp|P03999|OPSB_HUMAN Blue-sensitive opsin (Blue cone photoreceptor pigment) - Homo sapiens (Human)." to "Blue".
Change "sp|P08100|OPSD_HUMAN Rhodopsin (Opsin 2) - Homo sapiens (Human)." to "Rhodopsin".
Change "sp|P04001|OPSG_HUMAN Green-sensitive opsin (Green cone photoreceptor pigment) - Homo sapiens (Human)." to "Green".
Change "sp|P04000|OPSR_HUMAN Red-sensitive opsin (Red cone photoreceptor pigment) - Homo sapiens (Human)." to "Red".
Change "sp|O14718|OPSX_HUMAN Visual pigment-like receptor peropsin - Homo sapiens (Human)." to "Peropsin".
In all cases, be sure to leave the ">" in the first line of each FASTA entry. To save some work in case something goes wrong, select the edited contents of the Input Sequences box, copy it, and paste it onto an empty word-processor page, and save the file in text format. Name it Opsins.txt.
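If you would rather not edit the comment lines by hand, the renaming can be sketched in a few lines of Python. This is only an illustration: the function name and the mapping are my own, the residues in the example are placeholders rather than real opsin sequences, and the match is made on the Swiss-Prot entry name (such as OPSB_HUMAN), so it does not depend on how the rest of the comment line is punctuated.

```python
# Short labels keyed by the Swiss-Prot entry names used in this tutorial.
SHORT_NAMES = {
    "OPSB_HUMAN": "Blue",
    "OPSD_HUMAN": "Rhodopsin",
    "OPSG_HUMAN": "Green",
    "OPSR_HUMAN": "Red",
    "OPSX_HUMAN": "Peropsin",
}


def rename_fasta_headers(fasta_text: str, names: dict) -> str:
    """Replace each FASTA comment line with '>' plus a short label."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            for entry, label in names.items():
                if entry in line:
                    line = ">" + label   # keep the leading ">"
                    break
        out.append(line)
    return "\n".join(out) + "\n"


# Placeholder residues, not real opsin sequences.
example = (
    ">sp|P03999|OPSB_HUMAN Blue-sensitive opsin\n"
    "MAAA\n"
    "CCCC\n"
    ">sp|O14718|OPSX_HUMAN Visual pigment-like receptor peropsin\n"
    "MGGG\n"
)
renamed = rename_fasta_headers(example, SHORT_NAMES)
print(renamed)
```

Comment lines that match none of the entry names are left untouched, so nothing is lost if a sequence you did not expect turns up in the box.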
Click Run ClustalW.
The resulting page is called ClustalW query receipt, and it contains links to several output files.
Click clustalw (aln).
You see the typical ClustalW alignment file, showing our five protein sequences aligned to maximize identical and similar residues. Below each line of five sequences are symbols to show the extent of similarity among the sequences. An asterisk (*) means that the same residue is always (that is, for all of these sequences) found at that location; for example, the first asterisk marks a location where only N (asparagine) is found. Colon (:) means that all residues at this location are very similar; for example, the first colon is where only F (phenylalanine), I (isoleucine), and L (leucine) -- residues with large, nonpolar sidechains -- occur. Period (.) means somewhat similar residues; for example, at the first period, serine, threonine, and glutamine occur -- all polar, but varied in size. If there is no mark then the residues at that location display no predominant common properties.
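To make the marking rule concrete, here is a rough Python sketch that assigns a symbol to each column of an alignment. The "strong" and "weak" residue groups are written from my memory of the ClustalW documentation, so treat the exact lists as an assumption; your ClustalW output is the authority. The sketch also assumes that a column containing a gap gets no mark.

```python
# Residue groups as I recall them from the ClustalW documentation.
# Treat the exact lists as an assumption, not a definitive reference.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def conservation_symbol(column: str) -> str:
    """Return '*', ':', '.', or ' ' for one alignment column."""
    residues = set(column.upper())
    if "-" in residues:
        return " "                       # assumed: gapped columns get no mark
    if len(residues) == 1:
        return "*"                       # the same residue in every sequence
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"                       # all residues fall in one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."                       # all residues fall in one weak group
    return " "


def conservation_line(aligned: list) -> str:
    """Build the symbol line for a list of equal-length aligned sequences."""
    return "".join(conservation_symbol("".join(col)) for col in zip(*aligned))


# Toy four-column alignment of five sequences, not real opsin data.
toy = ["NFCG", "NICW", "NLSD", "NFAK", "NLAP"]
print(repr(conservation_line(toy)))
```

The second toy column contains only F, I, and L, which is the same situation as the first colon in the opsin alignment.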
Once more, as a safety measure, copy this alignment to your clipboard, and paste it onto an empty word-processor page. Then save the file in text format. Name it OpsMSA.txt. Remember that it is still on your clipboard, for pasting at our next stop. This multiple sequence alignment is one type of input you can use to make a phylogenetic tree.
What does the family tree of human opsins look like?
Point your browser to http://bioweb.pasteur.fr/seqanal/phylogeny/phylip-uk.html
This is one home of the program Phylip, one of the most rigorous tools for constructing phylogenetic trees from aligned sequences.
Under Proteins, next to protdist, click "advanced form."
You are about to run protdist, a program that computes the "distance" of sequences from each other. These so-called distance matrices will be used by Phylip to construct your tree.
Enter your email into the top box.
In the alignment file box, paste your multiple sequence alignment from ClustalW.
Click "Bootstrap Options" and make these settings:
Check the box for "Perform a bootstrap before analysis"
Enter an odd number for a seed
Enter 100 replicates
At the top of the page, click "Run protdist".
protdist constructs distance matrices by a process called "bootstrapping". Bootstrapping is a bias-reducing procedure in which protdist builds an alignment of pseudosequences by picking residue positions at random (the same position can be picked more than once) and stringing the residues at those positions together until the sequence is the same length as the original ClustalW alignment. From this pseudosequence alignment, protdist determines the relative number of sequence differences among the five proteins, as determined from a random sampling of their sequences. The result of the process is called a distance matrix, and you will see it soon. This process is repeated, 100 times in our case, to make 100 distance matrices. The tree we will ultimately produce represents a consensus of the 100 matrices.
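The resampling step is easy to sketch in Python. This is not the code the Pasteur server runs; it is a minimal illustration of the idea, with function names of my own choosing and a toy alignment. It assumes positions are drawn with replacement, which is how bootstrap resampling is usually defined, and it picks the same positions in every sequence so that the columns of the alignment stay intact.

```python
import random


def bootstrap_alignment(aligned: list, rng: random.Random) -> list:
    """Build one pseudosequence alignment by resampling columns."""
    length = len(aligned[0])
    # Draw column positions at random, with replacement, until the
    # pseudosequences are as long as the original alignment.
    picks = [rng.randrange(length) for _ in range(length)]
    # Use the same picks for every sequence so columns stay together.
    return ["".join(seq[i] for i in picks) for seq in aligned]


def bootstrap_replicates(aligned: list, replicates: int = 100, seed: int = 1) -> list:
    """Return a list of resampled alignments, reproducible for a given seed."""
    rng = random.Random(seed)
    return [bootstrap_alignment(aligned, rng) for _ in range(replicates)]


# Toy alignment, not real opsin sequences.
toy = ["MNGTEGPNFY", "MNGTEGLNFY", "MAQQWSLQRL"]
reps = bootstrap_replicates(toy, replicates=100, seed=7)
print(reps[0])
```

The seed plays the same role as the odd number you entered on the form: it makes the "random" choices repeatable.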
There may be a delay of a few minutes before the result page appears. If the server is busy, you may be informed that results are being sent by email. If so, check your email in two or three minutes. You will receive five messages, the first one simply containing a link to your result page. Click the URL, or paste it into your browser and press
On the Phylip: protdist page that results, click outfile to see the output from protdist. The file contains 100 matrices containing numbers that represent the relative number of differences among the five sequences. Each matrix has the sequence names in the first column, and you should imagine that these sequence names are also the headings for the remaining columns. The number at the intersection of the row Blue and the column with the imaginary heading Peropsin gives the relative magnitude of the sequence differences between the blue opsin and peropsin. The matrices have zeros on the diagonal because each pseudosequence is identical to itself.
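To get a feel for what one of these matrices holds, here is a small Python sketch that builds a distance matrix from a toy alignment. Note the swap: protdist applies an evolutionary model to estimate distances, while this sketch uses the plain fraction of differing positions, a simpler stand-in that I chose for illustration. The names and sequences are made up.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(names: list, aligned: list) -> dict:
    """Return a nested dict: matrix[row_name][column_name] -> distance."""
    return {
        n1: {n2: p_distance(s1, s2) for n2, s2 in zip(names, aligned)}
        for n1, s1 in zip(names, aligned)
    }


# Toy aligned fragments, not real opsin sequences.
names = ["Blue", "Red", "Peropsin"]
aligned = ["MNGT", "MNGS", "MAQT"]
matrix = distance_matrix(names, aligned)
for row in names:
    print(row.ljust(10), "  ".join(f"{matrix[row][col]:.2f}" for col in names))
```

As in the protdist output, the printed rows carry the sequence names, the columns follow the same order, and the diagonal is all zeros.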
Click the Back button of your browser to return to the Phylip: protdist page.
On the first pull-down menu of the Phylip: protdist page, pick "neighbor." Read the menu carefully: don't pick "weighbor".
Click "Run the selected program on outfile" to run Phylip with the output file of matrices you just examined. You are running a procedure called "neighbor joining" to construct an evolutionary tree.
On the Phylip: neighbor page that appears next, beside "Distance method?", make sure "Neighbor-joining" is selected.
Click "Bootstrap options" and make these settings:
Check "Analyze multiple data sets (M)"
Enter 100 data sets (same as number of replicates from protdist)
Enter an odd number for a seed
Check "Compute a consensus tree"
Scroll down to "Other options".
This entry area gives you the option of designating an outgroup for the root of your tree. An outgroup is the sequence you think is most distant from the others, possibly the common ancestor of all. We don't know that in this case, so leave the default of 1.
At the top of the page, click "Run neighbor".
The resulting files are
outfile.consense -- your tree, in a text file, and outtree.consense -- your tree in a format used by tree-printing programs.
Click on outfile.consense to see the tree.
Scroll down to the bottom of this file to see the consensus tree. This tree is "unrooted", meaning that we do not know the ancestor of all these sequences. We learn from this tree which sequences are most alike and which are most different. We also learn how often the connections of this tree were made the same way in the 100 trees made from those 100 difference matrices. The numbers on the branches indicate the number of times that partition of the species into the two sets separated by that branch occurred among the 100 trees. For example, the separation of Red and Green from the other three, indicating that Red and Green are more similar to each other than to the other three, occurred in all 100 trees. The separation of Blue and Peropsin from the other three occurred in only 56 of the 100 trees. In the other 44 trees, Rhodopsin and Peropsin were separated from the other three. (Can you extract this information from this file?) In the tree branching shown, the majority rules, and the results of 44 of the trees are discarded.
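The counting behind those branch numbers can be sketched in Python. This is a simplified illustration, not Phylip's consense program: each tree is described only by its nontrivial partitions (splits), and the 56 and 44 hypothetical trees are made up to mirror the numbers quoted above.

```python
from collections import Counter

ALL_TAXA = frozenset(["Blue", "Green", "Red", "Rhodopsin", "Peropsin"])


def canonical_split(group) -> frozenset:
    """Name a partition by its smaller side so both sides count as one split."""
    group = frozenset(group)
    other = ALL_TAXA - group
    return min(group, other, key=lambda side: (len(side), sorted(side)))


def split_support(trees: list) -> Counter:
    """Count in how many trees each partition of the sequences occurs."""
    counts = Counter()
    for tree in trees:
        for split in {canonical_split(group) for group in tree}:
            counts[split] += 1
    return counts


# Hypothetical replicate trees, each listed by its nontrivial partitions.
tree_a = [{"Red", "Green"}, {"Blue", "Peropsin"}]
tree_b = [{"Red", "Green"}, {"Rhodopsin", "Peropsin"}]
trees = [tree_a] * 56 + [tree_b] * 44

support = split_support(trees)
# Majority rule: keep only partitions found in more than half of the trees.
majority = {split for split, n in support.items() if n > len(trees) / 2}
for split, n in support.most_common():
    print(sorted(split), n, "kept" if split in majority else "discarded")
```

The Rhodopsin/Peropsin partition shows up in 44 trees but loses the vote, which is why it does not appear in the consensus tree.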
Because of the random choices made in constructing the tree, the percentages in the paragraph above may vary. I have gotten as high as 82% consensus on the separation of Blue and Peropsin from the other three.
You can save this file by selecting all and pasting it into a word-processor document. Call it outfile_consense.txt.
Return to the Phylip: neighbor page and click on outtree.consense. This is your tree in Newick format, which is widely used by tree-printing programs like Phylodendron. Let's use this program to give us a tree in attractive graphics, rather than text.
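Newick format is just nested parentheses, with names at the tips and optional numbers after colons. As a quick sanity check before pasting a tree anywhere, a few lines of Python can pull out the tip names. The tree string below is hypothetical, written to resemble what you might see; the numbers in your own outtree.consense will differ.

```python
import re


def newick_leaves(newick: str) -> list:
    """Return the tip names of a Newick tree, in the order they appear."""
    # A tip name directly follows "(" or "," and runs until the next
    # ":", ",", ")", ";" or whitespace. Labels that follow ")" belong
    # to internal nodes and are skipped.
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


# Hypothetical tree string, for illustration only.
example = "((Red:10.0,Green:10.0):100.0,Rhodopsin:10.0,(Blue:10.0,Peropsin:10.0):56.0);"
print(newick_leaves(example))
```

If the list does not contain exactly your five names, something went wrong in copying the file.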
Point your browser to http://iubio.bio.indiana.edu/treeapp/treeprint-form.html.
Paste the contents of your outtree.consense file into the Tree Data box. Select Phenogram from among the Tree Styles. From the menu at Extra Options, Output, select GIF Image format for your output file. Give your tree a title, such as "Human Visual Opsins and Opsin-Like Proteins". Finally, click Submit.
Your GIF-format tree appears in your browser window. To keep it, choose Save As ... from the File menu. Call the file OpsinTree.gif. My tree looks like this:
NOTE TO USERS: 2006/03/22 -- revised to here for changes made to databases in the past year. Beyond this point, I have not yet checked for needed changes. Send me email if you find problems.
What is the structure of an opsin?
By now, I'm particularly curious about peropsin, but it's not likely that the structure of a recently discovered protein of unknown function has been determined. But it is likely that all opsins are similar in structure, so let's see if we can find an opsin in the database for macromolecular structures, the Protein Data Bank (PDB). It will give us an idea of what kind of thing an opsin is.
In fact, the PDB does not contain molecular structures at all. It is better to say that it contains models of macromolecules. These models are interpretations of data from one of the two main methods of macromolecular structure determination: x-ray crystallography and NMR spectroscopy. When researchers determine the structure of a macromolecule, they deposit a file containing the three-dimensional coordinates of all the atoms in the model. This coordinate file -- along with an online molecular graphics tool (like **) or a computer graphics program like Deep View -- is all that you need to see and study the molecule on your computer. Next we will retrieve a model from the PDB and view it with an online graphics tool. We'll also visit the home of a topnotch computer graphics program that you can download FREE and use on your home computer.
Point your browser to http://www.rcsb.org/pdb/.
Hoo boy. Since I last worked on this tutorial, the PDB has changed a lot. Until I revise the next little section, you'll have to do just what I will be doing: go to the recently revised PDB (same address), and try to accomplish what I describe below. For the most part, I am finding the new PDB even easier to use, so there is hope.
The PDB home page contains a simple search box under Search the Archive. You can search for models using simple keywords or PDB ID codes. A PDB code has four characters, like 1CYO. How would you ever know a model by its code? When a new structure is published, the authors usually give the PDB code in the last reference of the bibliography. With that code, you can go straight to the model you want to see. But more often, your question, like ours, is more general. For such cases, PDB also provides forms for more sophisticated searches. For now, let's just see if any opsin models are available. Type "opsin" into the search box, click to remove the checkmark from match exact word, make sure full text search is chosen, and click search.
As of 2003/03/30, this search returns 88 models, and you can see from the first one that our search is too broad. Among other things, we're finding netropsin, an antitumor drug. There's also bacteriorhodopsin, and the last time I looked, bacteria had no eyes, so this is not likely to be a visual pigment. Looking over the first two pages of hits, I see one promising sign: some entries for bovine rhodopsin. Some of the hits appear to be fragments of this molecule. So let's use a more precise search tool to see if other bovine rhodopsin models are available. Return to PDB home.
In the Search the Archive box, click SearchFields.
We have gone from the simplest to the most sophisticated search tool. SearchFields is a customizable form that allows many search criteria. The criteria names are links to the definitions of the criteria, providing information on the contents of PDB files and the criteria that will look in specific parts of the files. At the bottom of the form are criteria you can add to the form. Then you can bookmark a form and always find it with the criteria you want. Now let's get serious and see if there are PDB models that are similar to human rhodopsin.
Scroll down to the list of criteria you can add to the form. Check to add these criteria: FASTA search, Ligand and prosthetic groups, and Source. Click New Form. It looks like you have come back to the same page, but now the new search criteria are available. You can now search with a FASTA sequence, you can limit the search to models containing specific nonprotein ligands (like retinal, the prosthetic group of visual pigments), and you can specify the source organism from which the macromolecule is obtained.
Find your FASTA sequence of human rhodopsin and paste it into the FASTA Search box. To limit your search to models containing the visual prosthetic group, type "retinal" into the Ligand and Prosthetic groups box. Click Search.
This search may take a few minutes. The tool is looking for sequence homology among more than 20,000 entries in the PDB. On 2003/03/30, I got only 5 hits for this search. The first one had PDB code 1LN6, and was listed as a model of bovine rhodopsin. If your search produces other hits, find 1LN6 among them.
Beside FASTA result, you see the number 2.9e-159. This is an alignment score meaning that the probability that this entry and the human rhodopsin sequence are similar just by chance is 2.9 x 10^-159; not bloody likely, in other words.
Click alignment. In a new browser window, you see the alignment between the human rhodopsin sequence and that of 1LN6. After alignment, they are over 90% identical. If two proteins are more than about 40% identical, they are almost certain to be practically identical in structure. So this model will show us what human rhodopsin looks like. This display is part of a larger page that contains all PDB entries that show significant homology to the human rhodopsin sequence.
Close the alignment window to reveal again the search results. In the blue bar beside the 1LN6 entry name, click the eye icon (eyecon?).
You are now at the PDB View Structure page, which provides many ways to look at protein structure. The quickest is PDB's own QuickPDB. If your computer has up-to-date Java resources, the QuickPDB button appears at the bottom of the list of viewing options. Click the QuickPDB button.
Just like magic, a window opens with a red alpha-carbon model of 1LN6. Remember that this is bovine rhodopsin, at the moment (2004/02/16) the only opsin of known 3-dimensional structure. Most of the functions of this window are self-explanatory, but here are a few tips.
Click/drag on the image to rotate the structure. You should be able to tell that it has a lot of alpha helix.
Change the Mouse menu to Zoom. Click/drag to adjust the size of the model. Set the mouse menu back to Rotate.
Click the Stereo button to change to a stereo pair. If you cross your eyes to superimpose the two images, you'll see a truly three-dimensional model, and you'll be able to see the shape of the molecule much more clearly. For help in learning to view stereo pairs, click HERE.
Pull down the menu that initially says Secondary Structure, and pick Secondary Structure. Colors change to show alpha helix in red, pleated sheet (beta) strands in blue, and other residues in yellow. You can see that bovine rhodopsin has no pleated sheets.
Choose Charged from the menu that initially says Hydrophobic. Polar residues are shown in red, others blue. Note that polar residues are concentrated at the ends of the molecules (top and bottom in the initial orientation). This protein spans a membrane, and the ends protrude into the aqueous medium on either face of the membrane. The protruding portions contain mostly polar residues, while the residues that lie within the membrane are mostly nonpolar.
The sequence of the protein is shown above the model. Click on any residue to highlight it in the model (cyan by default, but you can change the highlight color with the color menu).
To learn more about QuickPDB, click Help.
Where Do You Go From Here?
Well, there you have a basic introduction to bioinformatics. With the tools you've tried out, you can explore the vast stores of genetic and structural information available on the Internet. Every page we've visited has many more links to other tools. You can figure out a lot just by visiting them and playing around, and there's usually built-in help and tutorials. I hope this tutorial spurs you to learn more about how to use bioinformatics in your explorations. If you want to learn more about using databases in structural investigations of proteins, see the Encore! next.
Test Yourself
Click HERE for a series of questions to answer using the tools of this tutorial. These questions make an appropriate assignment for assessing your (or your students') ability to use the tools of bioinformatics.
Encore! Exploring Protein Structure by Homology Modeling
How would you like to determine the structure of that mysterious peropsin we found among sequences homologous to the human opsins, and to find out whether it contains a binding site for retinal? This is a reasonable goal, if you are willing to learn how to use a more powerful tool for protein viewing and analysis.
QuickPDB is a great structure viewer for beginners, but there are much more sophisticated, powerful, versatile, and free (my favorite price) viewers available. My all-time favorite is DeepView, which is also known as Swiss-PdbViewer. The encore section of this tutorial requires DeepView. We will use it to look more closely at bovine rhodopsin, including exploring its bound retinal molecule. We will also probe more deeply into that mysterious peropsin we noticed among the sequences similar to the human opsins. In fact, we will determine its structure, by threading its sequence onto the three dimensional structure of bovine rhodopsin. (You can't do that with any other free protein viewer!!)
This process is called homology modeling, and it can provide sort of an educated guess about the structure of a protein (the target) from its sequence, as long as one or more structures of homologous proteins (templates) are available in the Protein Data Bank. How good a guess? In short, the more similar are the sequence and function of your target and your template(s), the better the model. So in our case, if we can find a template of similar sequence and function (a homologous retinal-binding opsin, that is), we should be able to get a decent model of peropsin, and see whether it's really feasible to think that it binds retinal. We might even be able to tell whether it binds retinal covalently or noncovalently, depending on whether an appropriate covalent-binding residue is suitably located in the homology model. At the very best, however, you cannot learn fine details of side-chain conformations, as you can from determining protein structure by X-ray crystallography or NMR. But homology models can be useful for preliminary exploration, and might also point to useful target residues for chemical analyses of structure, such as site-directed mutagenesis.
If you already have and know how to use DeepView, just continue with this tutorial. If not, take time to learn DeepView from this Beginner's Tutorial, and then return here.
Exploring Rhodopsin and Peropsin
I'm assuming that you know how to use DeepView, also known as Swiss-PdbViewer, and how to download models (called coordinate files) from the Protein Data Bank. If not, please take sections 1-11 of the DeepView Tutorial and call me in the morning -- er, I mean -- come back to this section. The conventions used in the rest of this tutorial are the same as those in the DeepView Tutorial.
First, let's track down a FASTA file for human peropsin. Go to http://us.expasy.org/, select SwissProt and TrEMBL, and enter OPSX, the protein name for peropsin, in the Search Swiss-Prot and TrEMBL for box. Click Go. On the result page, click OPSX Human. Near the bottom of the resulting NiceProt entry, click on O14718 in FASTA format. Save the resulting text file as peropsin.txt. (I selected and copied the file from the browser display, pasted it into a new word processor file, and saved it in text format).
Start DeepView. Then Cancel the initial dialog box, which is expecting you to load a PDB file. Your peropsin file is in FASTA format, so you have to load it by a different procedure.
SwissModel: Load Raw Sequence to Model..
A reminder about DeepView Tutorial format: this instruction tells you to select the command Load Raw Sequence to Model.. from the SwissModel menu.
The resulting dialog box looks just like one for loading a PDB file, but now DeepView is looking for a sequence file in FASTA format. Navigate to your peropsin.txt file, select it, and click Open.
DeepView displays the sequence of peropsin as an alpha helix. This is a compact way to get the 337 residues onto the screen. First, let's see what ProSite (above) has to say about the nature of this protein. DeepView contains an internal link to ProSite through the command Edit: Search for ProSite Patterns. Because ProSite works on sequence only, we don't need to know anything about structure to see what signatures of protein function ProSite can find.
Edit: Search for ProSite Patterns
If DeepView returns an error message, your DeepView installation may not have included the ProSite data. Go HERE and follow instructions for this installation. You'll probably have to restart DeepView, but not your computer, after installing.
This command elicits a small window listing signatures or patterns found in peropsin. Note the last two entries, indicating that ProSite recognizes patterns indicating that peropsin is a G protein-coupled receptor with a retinal-binding site. Click the black descriptions of sites to highlight the residues that ProSite recognized. Click the last red ProSite entry numbers to download a full description of the ProSite documentation for recognizing a retinal-binding site. The entry contains a full list of sequences from SwissProt/TrEMBL that contain this site, followed by a description of the types of proteins in which this pattern is found, and the specific consensus sequence that ProSite looks for to identify retinal-binding visual pigments.
Now you know you are likely to find a retinal-binding site in peropsin. Let's see if we can "see" it.
SwissModel: Find Appropriate ExPDB Templates
If DeepView returns an error message, you need to choose Preferences: Swiss-Model, and make the following settings:
Modeling Server: http://swissmodel.expasy.org/cgi-bin/sm-submit-request.cgi
Template Server: http://swissmodel.expasy.org/cgi-bin/blastexpdb.cgi
Also specify your name, an email address where you can receive large files, and your preferred browser.
Then repeat the command SwissModel: Find Appropriate ExPDB Templates
DeepView starts up your preferred browser and completes a form containing the peropsin sequence in FASTA format. Click Submit to conduct a search for proteins of known three-dimensional structure that are homologous to peropsin. The Swiss-Model Template Server returns a list of models that should serve as suitable templates for making a model of peropsin. These models are in a special structure database called ExPDB, for excerpts of PDB models. An ExPDB entry is usually one domain from a multidomain protein, or one chain from a PDB model that contains more than one chain.
On 2004/02/17, I got eight possible templates, all of which were various models of bovine rod rhodopsin. As of this date, rhodopsin from the rod cells of good ol' Bos taurus (whence "Come, Bossy!") is the only visual pigment of known structure. The first recommended template, 1f88A, is chain A from PDB entry 1F88, a model determined by X-ray crystallography. The BLAST score for 1f88A indicates that the odds that the sequences of peropsin and model 1f88A are similar by chance is 9 x 10^-38, which implies strong reason for similarity. Biologists would say that the only reasonable explanation for such similarity is evolution: peropsin and bovine rhodopsin evolved from a common ancestor.
Because all the potential templates are the same protein, we will work with just one template, 1f88A. Click its file name in the download ExPDB column of the Template Selection table. Depending on how your browser and DeepView are set up, the file might open automatically in DeepView, or you may have to specify DeepView as the helper application, or you may have to save the file to your desktop. If you've been using DeepView before, you've probably worked out the best way to handle PDB downloads. Use your favorite method, and then open the file in DeepView (by File: Open PDB File, if you saved the file to your desktop).
The model should appear near your large alpha helix of peropsin residues.
Wind: Sequences Alignment
DeepView displays the sequences of both models, left-justified -- that is, aligned at their N-terminal residues. Now for some real magic.
Fit: Magic Fit
In the blink of an eye, it appears that your peropsin helix is gone. But its sequence has been aligned with that of 1f88A, and each of its residues has been superimposed upon the residue with which it aligns sequentially. In short, the peropsin chain has been threaded onto the 3-dimensional model of bovine rhodopsin.
In the Sequences Alignment window, notice that the sequences are no longer left-justified; they are aligned by homology. Straight vertical lines connect residues that are identical in the two models, two dots connect quite similar residues (like valine and leucine, both bulky nonpolar), one dot connects less similar pairs (serine and glutamine, both polar), and dissimilar pairs are unconnected (glycine and isoleucine). To see the full alignment conveniently, click the little document icon at the left end of the Sequences Alignment window. You can save this diagram as a plain text file for printing (File: Save: Sequence Alignment).
Wind: Layer Infos
This window gives information about the display properties of all models currently loaded. With what we are doing, it's a handy way to be more quantitative about similarities. According to the Sel column (far right), 199 residues are currently selected. This is the number of residues in the aligned regions of the two models. Blink to display 1f88A. Select: All. The Sel column tells you that 1f88A contains 346 residues. Blink to display peropsin. How many residues does it contain?
With peropsin remaining on display, Select: aa Identical to ref. Structure. DeepView tells you that 88 residues of peropsin align with identical amino-acid residues of 1f88A. This is 88/346 or only 25% sequence identity. Select: aa Similar to ref. Structure. The percentage of aligned residues that are chemically similar (double dots in the Sequences Alignment window) is 182/346 or 53%. If two proteins show more than about 35% sequence similarity under best alignment, then they are almost certain to be of similar structure.
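The arithmetic behind those two percentages is simple enough to check in Python. The counts (88, 182, and 346) are the ones reported above; the helper function is just my own shorthand.

```python
def percent(part: int, whole: int) -> float:
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


template_length = 346  # residues in 1f88A, from the Sel column
identity = percent(88, template_length)     # residues identical to the template
similarity = percent(182, template_length)  # residues similar to the template

print(f"identity   {identity:.1f}%")
print(f"similarity {similarity:.1f}%")
```

Rounded to whole numbers, these are the 25% and 53% quoted above.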
Select: aa Making Clashes
A large number of residues in the homology model are trying to occupy the same space. This is obviously unrealistic. Make the model more feasible by using Tools: Fix Selected Sidechains: Quick and Dirty. Again, Select: aa Making Clashes to see if DeepView has improved these problems. Press
You can solve all of the problems of this primitive model by sending it off to the Swiss-Model server for optimization.
SwissModel: Submit Modeling Request
A dialog box appears, asking you to save your project file. Navigate to the desktop and create a new folder there (model). Save the file in the new folder with the name peropsin. Two files will appear, peropsin and proj_peropsin, and your browser will appear again, with a new form. Under Your Swiss-Model project file can be found in proj_peropsin, click Browse, navigate into the model folder and click to open proj_peropsin. This is the project file that your browser will send to Swiss-Model for optimization. The little file peropsin is just the web page form you are viewing. Complete the form by checking your email information, select Swiss-PdbViewer mode for the format of your final model, and uncheck the option of getting a WhatCheck report of the final model. Then click Send Request.
ALTERNATIVE REQUEST METHOD: If you have problems sending the project file by way of your browser, you can submit it directly to Swiss-Model. Go to http://swissmodel.expasy.org/. Under Modeling Requests, click Project (optimise) mode. You will see a form similar to the one that DeepView creates. Fill in your email address, name, and a project title. Click Choose File and navigate to your proj_peropsin file, and choose it. Select Swiss-PdbViewer mode for the format of your final model, and uncheck the option of getting a WhatCheck report of the final model. Then click Send Request.
By either submission method, your browser should return a message indicating successful uploading of your project file (385728 bytes for this project), and provide further information. You will receive your optimized model by email. It may take several hours. Once you receive several email files from Swiss-Model, you are ready to resume this tutorial.
On 2004/02/17, my modeling request elicited four email messages from Swiss-Model, the last one containing this subject line: SwissModel-Model-AAAa0_C6G. The AAA number is a Swiss-Model project number (yours will be different, of course), and this is the email that contains your homology model as an attachment (mine was named AAAa0_C6G.pdb). Save your model file to a convenient location, and then start DeepView and open the file.
Blink between the target peropsin and the template 1f88A. By default, the template is displayed as backbone only. Turn on all of its side chains by shift-clicking anywhere in the side column of the Control Panel. With peropsin on display, shift-click any checkmark in the show column to turn off display of all residues, leaving a ribbon model. By default, the ribbon is colored to show the quality of the model. Most of the ribbon is green, while four short segments are red. Green indicates residues that fit well with the template, while red indicates residues that did not match up well with template residues. (The menu command for showing this color scheme is Color: B-Factor, although the term B-Factor does not apply here.)
It is typical in a modeling project like this that scaffold residues, such as those in the seven helices, model well, but surface loops, which define the specific function of the protein, constitute the most significant differences between target and template, and do not model as well. Ironically, you learn mostly what you already know about your target (in this case, that it's a seven-helix bundle), and you learn least about the most interesting parts, the parts that differ most from your template, and that give your target protein a different function from that of your template.
Let's see whether optimization really improved our model noticeably. With the peropsin layer displayed, and the Layer Infos window open, Select: aa Making Clashes. The number of residues selected is 0; all clashes were fixed by optimization. Look for other funny stuff, such as long peptide bonds. They are all gone, and the model is now structurally realistic. To learn more about judging the quality of models -- homology models and others -- visit these two resources:
Principles of Protein Structure, Comparative Protein Modelling and Visualisation, by Nicolas Guex (creator of DeepView) and Manuel C. Peitsch
Judging the Quality of Macromolecular Models, by Gale Rhodes
Now let's see whether it appears that peropsin contains a pocket for retinal binding. Blink to the 1f88A layer. In the Control Panel, scroll down to the bottom and click RET977, to select the retinal molecule in the bovine rhodopsin model. RET977 should turn red when you click it. Press
Select: Neighbors of Selected aa..: Select groups that are within 4.0 A of the picked atom. Click OK and press
Now let's see whether our model of peropsin allows such a pocket, and provides an appropriately placed lysine residue for covalent bonding. Select: Extend to other layers. This selects the residues in model peropsin that are aligned with the displayed residues of 1f88A. Blink to the peropsin layer, press
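The neighbor selection used above (groups within 4.0 A of the selected retinal) is a distance cutoff test. A minimal sketch of that idea, not DeepView's own implementation: the function name, the residue labels, and the coordinates are all invented for illustration and are not taken from 1f88.

```python
import math

def residues_near(ligand_atoms, protein_atoms, cutoff=4.0):
    """Return sorted residue labels that have at least one atom within
    `cutoff` angstroms of any ligand atom.

    ligand_atoms: list of (x, y, z) tuples.
    protein_atoms: list of (residue_label, (x, y, z)) tuples.
    """
    near = set()
    for label, xyz in protein_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            near.add(label)
    return sorted(near)

# Toy data, made up for this example only.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein = [
    ("LYS10", (3.0, 2.0, 0.0)),   # 2.5 A from the second ligand atom
    ("ALA20", (0.0, 3.9, 0.0)),   # 3.9 A from the first ligand atom
    ("GLY30", (10.0, 0.0, 0.0)),  # 8.5 A away at best, so excluded
]
print(residues_near(ligand, protein))  # ['ALA20', 'LYS10']
```

A residue counts as a neighbor as soon as any one of its atoms falls inside the cutoff, which is why whole groups rather than single atoms end up selected.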
So. Does peropsin carry retinal in your eyes? I don't know. This homology model certainly suggests that retinal binding is feasible. Proving that peropsin is a retinal carrier in the eye requires more than just building models. It requires purifying peropsin from retinal tissue, followed by chemical analysis to detect retinal. Finding out if the binding is just like what we are seeing in the model would require determining the structure of the peropsin-retinal complex by X-ray crystallography or NMR. Or a researcher could use the model we've made to select residues to change (by site-directed mutagenesis) and see if the changes affect binding. This just begins the quest to determine what peropsin actually does. To fully understand peropsin will require a conversation between theory (which includes model building) and experiment (chemical analysis, spectroscopy, structure determination, monitoring peropsin gene expression). This powerful dialog is the engine that propels science, and our growing understanding of nature.
PS: You have just seen an example of how well things can go in homology modeling. For a look at what can go wrong, and how to recognize and avoid some of the pitfalls, click HERE.
Test Your Bioinformatics Skills
Exam #3 for USM Biochemistry (CHY463/563), spring semester, 2004. See class schedule for instructions.
The subject of this assignment is homeodomain genes, which are involved in numerous aspects of development, including establishment of body plan and differentiation of various types of stem cells into specialized cells.
If your instructor assigns this test, follow these instructions:
Copy and paste the Questions sections into email.
Fill in answers in the spaces provided.
If the answer is a file, save your file with the name provided. Submit all requested files in a format acceptable to your instructor. For USM Biochemistry students, the following are acceptable:
email attachment
CD (full size only)
floppy disk
USB memory device
Questions
Part 1. Sequence Work
Requires completion of all of this tutorial EXCEPT the Encore! structural work.
Start at the NCBI Map Viewer. How many genes in the human genome contain the term "homeo" in their name? To be sure you find them all, search for "*homeo*". The asterisks are wild cards, which means that you are searching for "homeo" preceded or followed by any other characters. Number found: ______ .
Which chromosome contains the largest number of these genes? How many? Chromosome # ______ ; Number of "homeo" genes on this chromosome: ______ .
Among the genes found in question 1, find one that has a role in insulin action. Name of the gene: _____________________________. Four-character ID: ______ .
What chromosome contains this gene? Chromosome # ______ .
According to OMIM, what is the role of the protein encoded by this gene? Role (limit to 25 words): ________________________________________________ .
Obtain the protein sequence of this gene, in FASTA format. File name: HmPrt.txt (You will use this file in Part 2.)
Go to ExPASy. How many annotated human genes in SwissProt and trEMBL contain the term "homeo"? Note that "*" is automatically used as prefix and suffix unless you specify otherwise. Number found: ______ .
Make a phylogenetic tree of the first 24 of these genes plus the insulin-related gene found in question 3, a total of 25 sequences. Use the insulin-related gene as the "outgroup". (NOTE: ClustalW at EMBnet accepts no more than 30 sequences for alignment.) Hint: After conducting the search, you will need to use, in sequence, ClustalW, Phylip, and Phylodendron to complete this task.
Files:
ClustalW input file name: HmCWIn.txt
ClustalW output file name for Phylip [clustalw (aln) file]: HmCWOut.txt
Phylip output file name [text tree (outfile.consense) file]: HmCons.txt
Phylip input for Phylodendron [Newick (outtree.consense) file]: HmOutTree.txt
Phylodendron output file name: HmTree.gif
According to your tree, what two SwissProt entries in this group are the most similar? SwissProt entry numbers ______ and ______ .
What SwissProt entry is most similar to the insulin-related gene? SwissProt entry number ______ .
What can you find out about the function of this similar gene?
Go to the Protein Data Bank. Search for models of human homeodomain proteins.
How many models do you find? Number: ______ .
What method of structure determination produced the first of these models? PDB ID code: ______ . Method: _______________________ .
View the first model on the list with QuickPDB or your favorite molecular viewer. What are the main secondary structural elements (helix, sheet, coil) in this protein? Secondary structural elements: ___________ .
Give beginning and ending residue numbers of three secondary structural elements.
1. start: ______ end: ______ .
2. start: ______ end: ______ .
3. start: ______ end: ______ .
Find a model of a human homeodomain/DNA complex.
How many models do you find? Number: ______ .
What method of structure determination produced the first of these models? PDB ID code: ______ . Method: _______________________ .
View the first model on the list with QuickPDB or your favorite molecular viewer. What secondary structural element(s) (helix, sheet, coil) interact with DNA? Secondary structural elements: ___________ .
Give beginning and ending residues of the main secondary structural element in contact with DNA. Residue start: ______ end: ______ .
Part 2. Structural Work
Requires completion of the Encore! structure section of this tutorial.
Search the Protein Data Bank for human homeodomain/DNA complexes. View the first model you find with DeepView (Swiss-PdbViewer).
PDB File ID: _________ .
What patterns does ProSite recognize in this protein? List ProSite patterns: ________________________________________________ etc.
What secondary-structural element (helix, sheet, coil) contains most of the residues that ProSite recognizes as a homeodomain? Element: ________________ ; Residue numbers, start: ______ end: ______ .
What secondary-structural element(s) of the homeodomain protein interact(s) directly with DNA?
List three residues involved with hydrogen bonds to DNA. List interactions like this: "LYS89-T6" means a hydrogen bond between lysine-89 of the protein and thymine-6 of the DNA. Interactions: _____________________________________________________________________
Comment on the quality of this model, especially in areas of protein-DNA interaction. Quality criteria (limit to 10 words): ___________________________________________ . Comments (limit to 25 words): _______________________________________________ .
Use DeepView and the file HmPrt.txt, which you saved in Part 1, to make a homology model of the insulin-related human homeodomain protein (the target), using the best available template. Submit your project file returned from Swiss-Model. Project file name: _________________.pdb
How many residues does the target protein contain? Number of residues in file HmPrt.txt: ______ .
How many residues of the target are modeled by the best template? Number of residues in the homology model: ______ .
Why are the remaining residues missing? Reason (limit to 25 words): ___________________________________________________ .
How many residues of the template are identical to corresponding residues of the target? How many are similar? Number identical: ______ . Number similar: ______ .
How many residues of the target are modeled well by the template? Number modeled well: ______ . Criteria (limit to 25 words): __________________________________________________ .
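Residue counts for a FASTA file such as HmPrt.txt can also be checked outside DeepView. A minimal sketch, assuming a single-record FASTA file; the function name and the example sequence are made up for illustration:

```python
def count_residues(fasta_text: str) -> int:
    """Count residue letters in FASTA text, skipping blank lines and
    header lines (those that start with '>')."""
    total = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        total += len(line)
    return total

# Invented two-line sequence, not a real protein record.
example = """>example_protein made-up sequence for illustration
MKTAYIAKQR
QISFVKSHFS
"""
print(count_residues(example))  # 20
```

To apply it to a saved file, read the file's contents into a string first and pass that string to `count_residues`.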
Bioinformatics Algorithms
Dynamic Programming
Particularly good sites...
http://www.cis.upenn.edu/~sahuguet/MSA/
http://www.blc.arizona.edu/courses/bioinformatics/align.html
http://www.cs.monash.edu.au/~lloyd/tildeStrings/Notes/DPA.html
http://www.cs.orst.edu/~schut/cs325/dynamic.htm
http://www.catalase.com/dprog.htm
http://bioweb.ncsa.uiuc.edu/~bioph490/BIOPH2.html#SEQUENCE_COMP
http://www.qucis.queensu.ca/home/cisc365/javascript/dp1/index.html
Other sites...
http://bioweb.ncsa.uiuc.edu/~bioph490/dynamic_programming_demo.html
http://www.qucis.queensu.ca/home/cisc365/365overheads.html
http://www.qucis.queensu.ca/home/cisc365/dp/dp.p01.html
http://www.dgp.toronto.edu/csc270/tut_dp.html
http://queue.ieor.berkeley.edu/~jshu/knapsack/DP/dp.html
http://mat.gsia.cmu.edu/classes/dynamic/dynamic.html
http://www.cs.sandia.gov/~scistra/class_3
http://levine.sscnet.ucla.edu/Econ101/dynamic.htm
http://mat.gsia.cmu.edu/classes/stoch_dynamic/stoch_dynamic.html
http://mat.gsia.cmu.edu/classes/dynamic/node8.html
http://www.maths.mu.oz.au/~moshe/dp/bibl/bibliography.html
http://cartan.gmd.de/PAPER/ismb95/ismb_html.html
http://screwdriver.bu.edu/bibliography/dynamic_programming.htm
http://www.norvig.com/design-patterns/
http://tome.cbs.univ-montp1.fr/htmltxt/Doc/manual/node137.html
http://poem.princeton.edu/~verdu/dynamic.html
http://www.orca1.com/opushelpweb/opusDynamic_Programming.html
http://screwdriver.bu.edu/cn760-lectures/l7/index.htm
http://www.ms.unimelb.edu.au/~moshe/dp/dp.html
http://mat.gsia.cmu.edu/ORCS/0255.html
http://aae.wisc.edu/e703/notes/a13dynpr.htm
http://bioweb.pasteur.fr/docs/modeller/node137.html
http://www2.uwindsor.ca/~lama/my470/ddynamic.htm
http://students.ceid.upatras.gr/~papagel/project/ex5_6_1.htm
http://www.cs.sunysb.edu/~algorith/lectures-good/node12.html
http://www.utdallas.edu/~scniu/documents/7315.htm
http://www.ii.uib.no/~pinar/seminar/larry.html
http://www.deakin.edu.au/~gecole/books.html
http://www.cseg.engr.uark.edu/~wessels/algs/notes/dynamic.html
http://www.csc.liv.ac.uk/~ped/teachadmin/algor/dyprog.html
http://www.eli.sdsu.edu/courses/fall96/cs660/notes/dynamicProg/dynamicProg.html
http://www.cs.indiana.edu/l/www/ftp/techreports/TR514.html
http://www.cs.brandeis.edu/~mairson/poems/node3.html
http://www.cis.tu-graz.ac.at/igi/oaich/animations/Dynamic2.html
http://bioweb.ncsa.uiuc.edu/~workshop/
Smith Waterman
http://genome-www.stanford.edu/Saccharomyces/help/sw_alignment.html
http://genome-www.stanford.edu/Saccharomyces/help/sw_details.html
http://www.stanford.edu/~sntaylor/bioc218/final.htm
http://www.maths.tcd.ie/~lily/pres2/sld009.htm
http://bioweb.ncsa.uiuc.edu/~workshop/Lab_3/Smith-Waterman.htm
http://www.tigem.it/LOCAL/SW/threshold.html
http://sgbcd.weizmann.ac.il/genweb/help/smith-waterman.html
http://cbrg.ethz.ch/ServerBooklet/section2_3_5.html
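As a companion to the Smith-Waterman links above, here is a minimal sketch of local-alignment scoring. It uses a linear gap penalty, and the scoring values (match +2, mismatch -1, gap -1) are illustrative assumptions rather than values taken from any of the linked pages. It returns only the best local score, not the alignment itself.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score of a and b (linear gap penalty)."""
    best = 0
    prev = [0] * (len(b) + 1)          # previous row of the score matrix
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)      # first column stays 0
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below 0, so an alignment
            # may start anywhere in either sequence.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

# The shared segment "ACG" scores 3 matches x 2 = 6.
print(smith_waterman_score("TTTACGTTT", "GGACGGG"))  # 6
```

Only two rows of the matrix are kept because the score alone is wanted; recovering the aligned segment would require storing the full matrix for a traceback.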
Needleman & Wunsch
http://www.maths.tcd.ie/~lily/pres2/sld003.htm
http://acer.gen.tcd.ie/~amclysag/nwswat.html
http://www.nada.kth.se/~erikw/thesis/chapter2_3.html
http://www.irbm.it/irbm-course95/gb/docs/amps/subsection3_6_1.html
http://www.ibc.wustl.edu/~zuker/Bio-5495/align-html/node3.html
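As a companion to the Needleman & Wunsch links above, here is a minimal sketch of global-alignment scoring by dynamic programming. The linear gap penalty and the scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions, and the function returns only the optimal score.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Optimal global alignment score of a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Global alignment: leading gaps are charged, so the borders are
    # multiples of the gap penalty.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]

# ACGT aligned with A-GT: three matches and one gap give 3 - 1 = 2.
print(needleman_wunsch_score("ACGT", "AGT"))  # 2
```

Unlike the local variant, cells here may go negative and the answer is read from the bottom-right corner, since both sequences must be aligned end to end.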
General (NW vs. SW vs. HMM, etc.)
http://www.maths.tcd.ie/~lily/pres2/
http://acer.gen.tcd.ie/~amclysag/nwswat.html
http://laguerre.psc.edu/biomed/TUTORIALS/SEQUENCE/MULTIPLE/tutorial.html
http://www.cse.ucsc.edu/research/compbio/
HMMs
http://www.medmicro.mds.qmw.ac.uk/HMMER/main.html
http://alfredo.wustl.edu/ismb96/abs/p02.html
http://www.cse.ucsc.edu/research/compbio/html_format_papers/hughkrogh96/cabios.html
http://wwwsyseng.anu.edu.au/~jason/hmmlinks.html
http://www.breadfan.com/markov.html
http://cslu.cse.ogi.edu/HLTsurvey/ch1node34.html
http://www.ibc.wustl.edu/service/hmmalign/glocal.html
http://www.cse.ucsc.edu/research/compbio/html_format_papers/ismb94/node5.html
http://www.iscs.nus.edu.sg/~luakt/ic3222/lecture/nlp18new/index.htm
http://www.cse.ucsc.edu/research/compbio/sam.html SAM Software for HMMs
Genetic Algorithms
http://www.staff.uiuc.edu/~carroll/ga.html
http://kal-el.ugr.es/gags.html
http://kal-el.ugr.es/~jmerelo/GAJS.html
Online Lectures on Bioinformatics
Biological preliminaries
Introduction
References
Exercises
Answers
Analysis of individual sequences
Introduction
Helical wheels, amphipathic helices and coiled-coils
References
Exercises
Pairwise sequence comparison
Introduction
Dot plots
Sequence Alignment
References
Exercises
Answers
Algorithms for the comparison of two sequences
Introduction: Type I and type II alignments
Type I alignment
Type II alignment
General gap functions for type II alignments
Minimum distance alignments
Minimum distance alignment score as a metric on sequences
References
Suboptimal Alignments
Introduction
Suboptimal points
Suboptimal Alignments
Stability
Application
References
Variants of the dynamic programming algorithm
Introduction
Free end-gaps
Local Alignments
Suboptimal Alignments
Similarity and Distance
Parametric Alignments
Linear in Space Algorithm
Practical Sections on Pairwise Alignments
Alignment Applet
Alignment Quiz
Multiple sequence alignment
Introduction
References
Exercises
Algorithms for SP-optimal multiple alignments
Definition
Motivation
Score functions
Problem
The exact solution
Saving space
Affine gap-costs
Reduction of search space
Divide-and-Conquer Alignment
References
Phylogenetic Trees and Multiple Alignments
Evolution
Ultrametric Trees
Iterative alignment strategy
Additive trees
Reconstruction of additive trees
Approximating additive metrics
Heuristics
Character-based methods for Phylogeny Construction
Parsimony
Protein Structure
Introduction
Different Levels of Protein Structure
Prediction Methods
Secondary Structure Prediction
Comparative Modelling
References
Related Links
Exercises
Links
Monday, September 11, 2006
About Biomatics
Dear all,
Greetings of the day!
"Bioinfo Consortium -India" is a non-profit student sector. It helps students of Biotechnology, Bioinformatics, Microbiology, and Biochemistry get training in institutes and industry. It also helps students who are job hunting and those who are looking for higher studies in India and abroad. This can't be done by a single person; we need your help. So I request everyone to help me by sending information about training, jobs, education, research opportunities, and other useful information to founder_biomatics@yahoo.co.in. The information will be displayed with your name, photo, and address on this site, so when you send information, include your profile and photo. Those who receive this URL, please forward it to your friends in India and abroad.
Thanks in advance for your contributions.
With regards,
Rajesh kumar.R
Founder-Biomatics
India
MOTTO: To bring awareness of recent trends in bioinformatics/biotechnology, to improve the working knowledge of students, and to help aspiring students who want to make a career in bioinformatics/biotechnology.
Thursday, September 07, 2006
Interactive Biology
Biomatics sees the web as a tool which can make education a more interactive and exploratory process, by making educational resources freely available at a variety of levels and also, even more importantly, by making it increasingly possible to learn by doing. To encourage understanding and use of the web in this way, Biomatics provides annotated lists of links useful for learning/teaching in a variety of areas (see Interactive Physics and Interactive Chemistry). The focus is not on course syllabi or notes but rather on materials from which individuals can learn by themselves, and particularly on those with an interactive component. Suggestions for additions to the list are warmly welcomed and should be sent to founder_biomatics@yahoo.co.in.
Comprehensive Biology Tutorials
The Biology Project ... a wealth of tutorials on many subjects within biology, with many problem sets and a tutorial to accompany each question
BioInteractive ... a neat website that promotes understanding of biology through interactive learning. Includes a great set of computer lab experiments where you can click around and become a virtual scientist. You can "identify deadly pathogens, probe heart patients, dissect a leech, or assay antibodies". Also check out the "click and learn" section. Provided by the Howard Hughes Medical Institute
Topics in General Biology ... having difficulty in bio class? This exhaustive list of topics links to helpful descriptions of each.
Online Biology Book ... an on-line textbook developed originally as a series of lectures in introductory biology by a professor at Estrella Mountain Community College. Lots of great accompanying pictures!
Biology Labs on-Line ... Biology Labs On-Line offers a series of interactive, inquiry-based biology simulations and exercises designed for college and AP high school biology students. Unfortunately, it requires a subscription
Scientific American: Ask the experts ... a list of questions you may have had on biology (e.g., How is bug blood different from our own?) linked to answers from biologists
Life Science Resources ... the laboratory investigations include some very simple, low-budget labs designed to effectively illustrate some of the basic, important concepts of biology
Sites of General Scientific Interest
World Lecture Hall ... this is a listing of course materials available online on a large variety of topics, including the sciences and biology
Smithsonian National Museum of Natural History ... in addition to finding information about the museum's activities and exhibits, you can actually see online editions of some of the exhibits. The "African Voices" exhibit is particularly beautiful and interesting
The Game of Life ... You control the parameters of life in this interactive game. Students select such factors as loneliness and overcrowding (and even which graphics to use) then play the game to see how the population fares. Continue playing to determine if successive generations will survive
Demos at Bonus.com ... Provides many interactive (uses the shockwave plug-in) presentation topics such as "what do cigarettes do to your lungs", "why do leaves change colors in the fall" and "what's in blood". Great for younger children and in-class presentations
Florida Museum of Natural History ... Wealth of biology information including photo galleries, virtual exhibits and publications
Science and Education ... Serendip's Science Education website offers many opportunities to explore the changing views on education in a theoretical and practical sense
Playing with Applets
BioInteractive ... a neat website that promotes understanding of biology through interactive learning. Includes a great set of computer lab experiments where you can click around and become a virtual scientist. You can "identify deadly pathogens, probe heart patients, dissect a leech, or assay antibodies". Also check out the "click and learn" section. Provided by the Howard Hughes Medical Institute
The Game of Life ... You control the parameters of life in this interactive game. Students select such factors as loneliness and overcrowding (and even which graphics to use) then play the game to see how the population fares. Continue playing to determine if successive generations will survive
Online Onion Root Tips ... over the course of this tutorial, one learns how to classify the stage of a cell into one of the five phases of the cell cycle, then sees how much time the cell spends in each phase
The Virtual Cell ... This ambitious site presents an interactive, animated exploration of the cell, along with a good virtual textbook.
Mendel's Pea Experiment ... Students learn the basics of Mendelian genetics through an interactive cross between pea plants. Simple but effective in teaching about recessive and dominant traits.
Interactive Frog Dissection ... this site leads you through the dissection and study of a frog. Perhaps not as effective as the real thing, but certainly a valuable tool
The Last Straw ... after exploring some of the brief tutorials on plant life and growth, this site offers a computer simulation model on plants and water stress that allows you to experiment with various climatic conditions, and then to collect data on number of leaves, stomatal opening, water tension, and more, after the plant's growth
HIV 2000 ... a program that simulates the spread of HIV through a population; you then try to determine the original carriers of the disease
Karyotyping Activity ... learn about karyotyping of chromosomes by matching the chromosomes of a given patient and diagnosing their genetic disease
Nutritional Analysis Tool 2.0 ... enter a number of types of food (e.g., last night's meal) and find out the exact nutritional information, including Calories, protein, fat, carbohydrates, sodium, vitamin A, vitamin C, saturated fat, and cholesterol
Kids Stuff
Life Science Resources ... the laboratory investigations include some very simple, low-budget labs designed to effectively illustrate some of the basic, important concepts of biology
Cool Science for Curious Kids ... this interactive website encourages curious kids to explore biology through games of elementary classification and learning, from the Howard Hughes Medical Institute.
Demos at Bonus.com ... Provides many interactive (uses the shockwave plug-in) presentation topics such as "what do cigarettes do to your lungs", "why do leaves change colors in the fall" and "what's in blood". Great for younger children and in-class presentations
Scientific American: Ask the experts ... a list of questions you may have had on biology (e.g., How is bug blood different from our own?) linked to answers from biologists
Stalking the Mysterious Microbe ... invites kids to join Sam Sleuth as he unravels the mysteries of microbes, learning all about these "invisible companions". Includes sections for news, experiments, and careers
Neuroscience for Kids ... Eric Chudler of the University of Washington presents a clear and easy-to-understand introduction to the field of neuroscience. This searchable site has fun activities, games, ideas for experiments, a question-and-answer page, newsletter subscription, and relevant links. It is a very interesting resource for students and teachers alike.
Brain Connection: Brain and Learning ... This is an excellent site about the brain and learning! You will find articles, brain building activities (for kids and adults), animations of brain processes, a library, a gallery, an anatomy section and more. You will need the free Flash and Shockwave players for the activities and animations
Nature and Wildlife Field Guide ... learn all you want about any of the 4800 species of North American plants and animals described in this field guide. I particularly enjoyed listening to the catalogued bird calls. From enature.com
Virtual Whale Watch ... follow the arrows to see sequential photographs and explanations that detail a humpback-whale watching trip
Cell Biology
Online Onion Root Tips ... over the course of this tutorial, one learns how to classify the stage of a cell into one of the five phases of the cell cycle, then sees how much time the cell spends in each phase
The Virtual Cell ... This ambitious site presents an interactive, animated exploration of the cell, along with a good virtual textbook.
Virtual Cell ... An interactive journey through a plant cell. Read the "About Virtual Cell" before you begin to understand the controls. This site doesn't require any special plug-ins or browsers
CELLS Alive! ... a great site for learning about cells or even just fun cruising. Tutorials under such topics as Cell structure and function, Microbes, the Immune system, and Microscopy
The Wonders of Microbes ... a very visually interesting site that explores the world of the tiny organisms that are the foundation of life on earth, with very interesting connections to the world at large.
Molecular Expressions ... this site offers one of the web's largest collections of color photographs taken through an optical microscope. Many fascinating photo galleries including DNA, amino acids, birthstones, and much more.
Stalking the Mysterious Microbe ... invites kids to join Sam Sleuth as he unravels the mysteries of microbes, learning all about these "invisible companions". Includes sections for news, experiments, and careers
Neurobiology
Neuroscience for Kids ... Eric Chudler of the University of Washington presents a clear and easy-to-understand introduction to the field of neuroscience. This searchable site has fun activities, games, ideas for experiments, a question-and-answer page, newsletter subscription, and relevant links. It is a very interesting resource for students and teachers alike.
Brain Connection: Brain and Learning ... This is an excellent site about the brain and learning! You will find articles, brain building activities (for kids and adults), animations of brain processes, a library, a gallery, an anatomy section and more. You will need the free Flash and Shockwave players for the activities and animations
Time to Think? ... This web exhibit of Serendip's own tries to explain and recreate a scientific experiment conducted in the 1860s. By testing various types of reaction rates, the experiment was to determine whether the act of thinking was one that required time. You can conduct your own experiments on this matter from the convenience of your very own computer, as well as compare your results to those of other exhibit visitors
Brain and Behavior ... Serendip's Brain and Behavior page is designed more to raise questions than to answer them. To this end, there are many interesting interactive exhibits designed to help you draw your own conclusions about the functioning of the brain
Plants/Animals
Virtual Zoo ... This comprehensive index site organizes virtually every worthwhile link related to pets, veterinary medicine, and domesticated animal behavior and biology. If you have any animal-related question or problem, you will find help in this huge and growing site.
Barry's Carnivorous Plants ... a very comprehensive site from a man who loves his carnivorous plants. There's a huge FAQ section that answers questions on the familiar Venus flytrap as well as the more exotic plants. Anyone wishing to grow their own can find many tips as well as a wealth of pictures
Tree of Life ... starting at the broadest category of life forms, choose categories to narrow the scope finally down to a specific organism, with info and references at every step along the way. A good way to see the interconnectedness of earth's many life forms
Mendel's Pea Experiment ... Students learn the basics of Mendelian genetics through an interactive cross between pea plants. Simple but effective in teaching about recessive and dominant traits.
Interactive Frog Dissection ... this site leads you through the dissection and study of a frog. Perhaps not as effective as the real thing, but certainly a valuable tool
Nature and Wildlife Field Guide ... learn all you want about any of the 4800 species of North American plants and animals described in this field guide. I particularly enjoyed listening to the catalogued bird calls. From enature.com
The Last Straw ... after exploring some of the brief tutorials on plant life and growth, this site offers a computer simulation model on plants and water stress that allows you to experiment with various climatic conditions, and then to collect data on number of leaves, stomatal opening, water tension, and more, after the plant's growth
Disease/Disease Fighting
Internet Pathology Library ... This heavily illustrated pathology resource contains some unsettling images; it is not for the squeamish. Medical novices should start with the "Mini-tutorials." The terse, matter-of-fact essays and detailed medical images give a grim reality to the consequences of drinking and smoking, as well as other, less avoidable health problems. "General Pathology" provides a guided tour of what a pathologist sees when examining a body. Feeling expert? Take a stab at diagnosing the "Case of the Week."
Visualizations of Viruses ... The University of Wisconsin's Institute for Molecular Virology has assembled a gallery of computer images created using electron microscopy and X-ray crystallography
CELLS Alive! ... a great site for learning about cells or even just fun cruising. Tutorials under such topics as Cell structure and function, Microbes, the Immune system, and Microscopy
National Cancer Institute: Science Behind the News ... several detailed, engaging tutorials on the nature of cancer, angiogenesis, gene testing, and the immune system
Stalking the Mysterious Microbe ... invites kids to join Sam Sleuth as he unravels the mysteries of microbes, learning all about these "invisible companions". Includes sections for news, experiments, and careers
Blackout Syndrome ... People are suffering from a mystery disease! What is it? Where did it come from? How can it be stopped? Read the interactive mystery and unravel the clues, then e-mail your answer!
Genetics
Mendel's Pea Experiment ... Students learn the basics of Mendelian genetics through an interactive cross between pea plants. Simple but effective in teaching about recessive and dominant traits.
Karyotyping Activity ... learn about karyotyping of chromosomes by matching the chromosomes of a given patient and diagnosing their genetic disease
Genetic Science Learning Center ... Learning genetics is fun at the Genetic Science Learning Center! At this site you can build your own DNA molecule online, discover what makes a firefly glow, and get the recipe for extracting DNA out of any living thing using household items. Check out the online and hands-on activities to build a DNA molecule, find a gene on a chromosome map, learn how proteins work, make a cell model, and more. They also feature sections on genetic disorders and genetics in society
Human Anatomy
Anatomy of the Heart ... the Texas Heart Institute provides this website that gives you a visual tour through the cardiovascular system. Pictures of such things as the circulatory system, the coronary arteries, the conduction system, and the heart valves are accompanied by explanatory text
Gray's Anatomy of the Human Body ... The Bartleby.com edition of Gray's Anatomy of the Human Body (originally published in 1918) features over a thousand engravings, many in color, as well as a subject index with 13,000 entries ranging from the Antrum of Highmore to the Zonule of Zinn. Not very playful, but amazingly complete
Human Anatomy Online ... click on a picture of one of the systems of the body for more detail. Continue clicking to find names and information corresponding to the many parts of the body
Nutrition
Nutritional Analysis Tool 2.0 ... enter a number of types of food (e.g., last night's meal) and find out the exact nutritional information, including Calories, protein, fat, carbohydrates, sodium, vitamin A, vitamin C, saturated fat, and cholesterol
Games/Quizzes
The Biology Project ... a wealth of tutorials on many subjects within biology. Many problem sets, with tutorials that accompany each question
The Game of Life ... You control the parameters of life in this interactive game. Students select such factors as loneliness and overcrowding (and even which graphics to use) then play the game to see how the population fares. Continue playing to determine if successive generations will survive
Quia ... this is a collection of online flashcards, matching, the game of concentration, and word searches on such topics as amino acids and proteins, cell organelles, DNA, heredity, reproduction and development, and simple animals
Miscellaneous
The Game of Life ... You control the parameters of life in this interactive game. Students select such factors as loneliness and overcrowding (and even which graphics to use) then play the game to see how the population fares. Continue playing to determine if successive generations will survive
World Lecture Hall ... this is a listing of course materials available online on a large variety of topics, including the sciences and biology.
Darwin Awards ... you may or may not have heard of the Darwin Awards, which "celebrate the theory of evolution by commemorating the remains of those who improved our gene pool by removing themselves from it in really stupid ways". Either way, you may find this an interesting take on the study of evolution
Additional Resources
Biol 121 Slide List ... a listing of many pictures that are clear and may be of great use in accompanying studying or class discussion
Graphics Gallery ... this gallery is another great visual resource. The Graphics Gallery is a series of labeled diagrams with explanations representing the important processes of biotechnology. Each diagram is followed by a summary of information, providing a context for the process illustrated.
Virtual Library: Biosciences ... the staggeringly complete list of links in the Virtual Library makes it a great place to go to find additional information
------------------------------------------------------
Wednesday, September 06, 2006
Bioinformatics
Article By
Rajesh kumar.R
M.Sc. Biotech, M.Phil. Biotech, PGDBI, PGDIPR
INTRODUCTION
The 'first' convergence of computer and communication technologies in the latter half of the last century resulted in networks in general and the Internet in particular. The product of that first convergence, network technology, is now converging with biomedical and genetic technologies to give rise to a second convergence. Consequently, genetic laboratories all over the world are producing an unprecedented quantity of information. This exponential increase resembles the 'information explosion' after World War II referred to by Bowles (1999, p. 156). The phenomenal quantity of information (or merely of data?) currently being produced as a result of the second convergence can therefore be called the 'second information explosion'. In the light of this second information explosion, and of the coming of age of the information society in many parts of the world, we have witnessed the emergence of bioinformatics over the last three decades.
WHAT IS BIOINFORMATICS?
The question that comes to mind every time the term 'bioinformatics' is uttered is, 'What indeed is bioinformatics?' The term has been in use for more than two decades, but a consensus definition still eludes us, even though a large body of literature spanning many disciplines is being produced in both printed and digital form (Jeevan, 2002). The rise of undergraduate and advanced degree and training programmes in bioinformatics all over the world has been phenomenal (Altschul, 2005). "In this burgeoning field, it is becoming increasingly difficult to establish what are or will be the essentials [of bioinformatics]. The readers of this volume will have a role in defining bioinformatics' future." (Altschul, 2005)
It is not that there have been no attempts to define the emerging subject. Many subjects that originated at the end of the last century and in the 21st century are cross-disciplinary in nature, and bioinformatics does not seem to be an exception. Its cross-disciplinary nature is evident in the words of Critchlow et al. (2000): "Depending on whom you ask, bioinformatics can refer to almost any collaborative effort between biologists or geneticists and computer scientists - from database development, to simulating the chemical reaction between proteins, to automatically identifying tumors in MRI images."
Attwood and Parry-Smith (2001, pp. 2-3) trace the context in which the term evolved as follows: "During the last decade, molecular biology has witnessed an information revolution as a result both of the development of rapid DNA sequencing techniques and of the corresponding progress in computer-based technologies, which are allowing us to cope with this information deluge in increasingly efficient ways. The broad term that was coined in the mid-1980s to encompass computer applications in biological sciences is bioinformatics." The emergence of bioinformatics as a cross-disciplinary subject has been well recognized.
Biologists and geneticists were the first to set out on the path that we today call bioinformatics. People from many other fields, such as computer scientists, mathematicians and statisticians, joined the bandwagon. The initial intended meaning of the term bioinformatics was the application of computers and associated machines to handling, storing and manipulating biological data. As Attwood and Parry-Smith (2001, p. 3) further point out, "the term bioinformatics has been commandeered by several different disciplines to mean rather different things . . .. In the context of genome initiatives, the term was originally applied to the computational manipulation and analysis of biological sequence data (DNA and/or protein)."
Going through the various definitions of the term, it appears that "the term 'Bioinformatics' is not really well-defined" (Weizmann Institute of Science, 200?) and that many definitions are simply descriptions of activities carried out under this name. Some definitions identify subject areas where the term can be used, while others specify activities. The term is used synonymously and interchangeably with computational biology, genetics, genomics, and molecular biology. Altschul (2005) commented on the growth and development of bioinformatics in the Foreword to the collected papers of the National Conference on Bioinformatics Computing, held in March 2005 at Patiala, India.
But it can safely be said, from the above definitions (strict, loose, functional, etc.) and descriptions, that bioinformatics is concerned primarily with all aspects of the lifecycle of genetic and related data and information after its generation, such as its retrieval, dissemination, interpretation, and interrelation with other (largely biological) information, in "converting it into knowledge" (Liebman, 1995). The definitions above reveal a diverse pattern that shows the variety of activities covered by this term.
Therefore, it is like an umbrella term: every activity dealing with genetic and associated information, for whatever purpose, is covered under the bioinformatics umbrella. People from many disciplines are now coming together under this umbrella as tool suppliers. The need of the hour is to evolve a cross-disciplinary definition that incorporates the concerns of all the disciplines that have been contributing to this emerging subject.
INDIAN BIOINFORMATICS
Bioinformatics has the potential to be one of the fastest-growing parts of the Indian economy, considering factors such as biodiversity, human resources, infrastructure facilities and government initiatives. The International Data Corporation (IDC) has reported that pharmaceutical firms and research institutes in India are looking for cost-effective, high-quality and faster research, development and manufacturing of drugs. Moreover, IDC studies indicate that India could capture 8 per cent of the global bioinformatics market by 2008-09. IDC estimated the Indian bioinformatics market at US $15 million in 2001. Growing at a compound annual growth rate (CAGR) of 20 per cent, the market reached around US $22 million in 2003-04.
Indian bioinformatics is in its nascent stage, only 6-7 years old. According to the survey conducted by CII (2005), out of all the present companies, 15.4 per cent are 5 years old, 23.1 per cent are 3 years old and 61.5 per cent are only 2 years old. These companies are well diversified: 22 per cent work in bioinformatics software development and another 22 per cent in molecular sequence analysis and data mining. The next major area is functional genomics, in which 19 per cent of the companies are involved, and 11 per cent of the firms work in all the areas mentioned. The CII survey results also show that only 20 per cent of the companies in the sample work exclusively in bioinformatics; for the other 80 per cent, bioinformatics is just a part of the business.
Pure cost benefits for biotech companies will definitely drive the bioinformatics field in the country. The biotech industry spent an estimated 36 per cent on R&D in 2000. Success for many will mean a drastic reduction in R&D costs. Thus biotech companies will be forced to outsource software rather than develop proprietary software as in the past.
Since the cost of programs for handling this data is extremely high in the West, Indian IT companies have a great business opportunity to offer complete database solutions to major pharmaceutical and genome-based biotech companies around the world. The IT industry can also diversify its business and focus more on genomics through different levels of participation, such as hardware, database products and packages, implementation and customization of software, and functionality enhancement of databases. Abraham Thomas, managing director, IBM India Ltd, says, "the alignment of a vast pool of scientific talent, a world-class IT industry, a vigorous generic pharmaceutical sector and government initiatives in establishment of public sector infrastructure and research labs are positioning India to emerge as a significant participant on the global bioinformatics map." (BioSpectrum 2005)
With the objective of helping to raise the bioinformatics sector onto the world map, the Bioinformatics Society of India (Inbios) has been working since August 2001. Inbios already has over 270 members after a short span of one and a half years. It has become a common informal platform for the younger generation to learn about and contribute to this sunrise field in India.
Alongside these opportunities, it is also necessary to identify a few problems that India will have to solve to gain a real advantage in bioinformatics. The identifiable areas are in computational biology and bioinformatics, where a substantial level of development skill is required to build custom applications that knit together and integrate disparate databases (usually from several global locations), simulations, molecular images, docking programs, etc. Industry people, meanwhile, say that the mushrooming of bioinformatics institutes is making it hard to find talented and trained individuals for this industry. While many graduates have superficial knowledge and a certificate, India lacks true professionals in this area.
Most people who opt for bioinformatics come from the life sciences and have no exposure to the IT side of bioinformatics, which is very important. Another issue is that some companies face a shortage of funds and infrastructure. An average biotech company takes around three to five years to break even. Most venture capitalists and other sources of funding are not very supportive, especially if the company is not part of a larger group venture. It would help if the government took an active role in building infrastructure and funding small and medium entrepreneurs.
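The market figures quoted earlier in this section (US $15 million in 2001, a CAGR of 20 per cent, and about US $22 million in 2003-04) can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function names are hypothetical, and it assumes that two full years of compounding separate the 2001 estimate from the 2003-04 figure, which the text does not state explicitly.

```python
def project_market(start_value, annual_rate, years):
    """Compound start_value forward at annual_rate for the given number of years."""
    return start_value * (1 + annual_rate) ** years


def implied_cagr(start_value, end_value, years):
    """Annual growth rate that turns start_value into end_value over the given years."""
    return (end_value / start_value) ** (1 / years) - 1


size_2001 = 15.0   # US$ million, IDC estimate quoted in the text
cagr = 0.20        # 20 per cent, as quoted in the text
years = 2          # assumption: two compounding periods between 2001 and 2003-04

size_2003_04 = project_market(size_2001, cagr, years)
print(f"Projected 2003-04 market: US$ {size_2003_04:.1f} million")

# Going the other way: the growth rate implied by the rounded US$ 22 million figure.
rate = implied_cagr(size_2001, 22.0, years)
print(f"Implied CAGR from 15 to 22 over {years} years: {rate:.1%}")
```

Under that two-year assumption the projection comes to US $21.6 million, which is consistent with the "around US $22 million" quoted in the text.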
THE BANGALORE AGGLOMERATION
At this point, we want to draw attention to the geographical agglomeration of bioinformatics in India. The most developed area is South India and, in particular, Bangalore and the Karnataka region. We are interested in incorporating culture and ethnicity as independent variables in location decisions in order to explain this regional concentration. Literature on the history and anthropology of India has also been consulted. The location argument involves two sub-questions, which are tested empirically using secondary data from existing interview-based studies. Firstly, some ethnic and cultural groups in India are apparently more inclined towards knowledge-intensive industries because of their higher appreciation of learning. There are diverse culturally rooted attitudes towards education and towards technological as well as economic change. Secondly, the 'regional' culture of the South also seems to be more open in the sense of accommodating entrants from elsewhere, thereby converging initially diverse populations into a 'monocultural' one (cf. Klemm et al., 2005).
There is a general misconception about 'the Hindu culture' and its attitude towards modernization and innovation. Economists arrived at the crude conclusion that in principle it impedes the modernization of the Indian economy (e.g. Akerlof, 1976; Lal, 1988), without acknowledging existing anthropological fieldwork. While this pessimistic view of traditional 'cultures' has been restated in more general fashion in the collection of essays edited by Harrison and Huntington (2000), there are also more nuanced discussions of the interplay between culture and the economic realm (e.g. Rao and Walton, eds., 2004). Recently, there has been more than anecdotal evidence that new Indian enterprises are driven by the formerly priestly Brahmin caste rather than by the Vaishyas, the traditional business caste (Das, 2001).
This might result from the fact that Brahmins have been involved more generally with activities relating to knowledge (Sen, 1997). Earlier, Brahmins had a much more negative attitude towards business, trade and commerce in general (Adams, 2001; Rutten, 2002). With regard to South India there are a few notable deviations. Firstly, there have always been high-caste non-Brahmins belonging to the indigenous population who not only engaged in the learning of their own sacred scripts but 'who were adept in Sanskrit learning as well' (Stein, 1999: 52). Hence, the foundations for a knowledge-based society have long existed in South India and, moreover, have been much more widely diffused throughout society. Secondly, and related to the first point, the population of the South is said to be much more homogeneous than that of the North. For instance, political movements in favor of backward groups started much earlier in South India and led to a more equal pattern compared with the still traditionally dominated, hierarchically oriented North (Jaffrelot 2002). Altogether, the southern part of India seems to exhibit a more distinct regional culture of learning, not only in the sense of the regional development literature (Gertler 1997) but also literally. Apparently, this attitude is a solid foundation for the absorptive capacity needed to adapt to new technologies. Although institutions of higher education have been allocated evenly across the whole country, the South has a more than proportionate share of colleges, especially engineering colleges, and of enrolment (Chalam, 2000) (see Table 1).
Table 1: Number of engineering colleges and enrolment compared to population

Region    Engineering colleges (I)    Enrolment (I)                          Population (II)
          No.    National share       Sanctioned capacity   National share   National share
Central    50      7.54%                  9,470                6.05%           -
East       25      3.77%                  4,812                3.07%           25.8%
North     140     21.12%                 25,449               16.26%           31.3%
West      140     21.12%                 34,165               21.83%           19.6%
South     308     46.46%                 82,597               52.78%           23.2%
Total     663    100.00%                156,493              100.00%          100.00%
I Source: Arora & Athreye (2002) II Source: Dossani (2002)
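The "national share" columns of Table 1 can be recomputed from the raw counts. The short Python sketch below is only an illustration: the figures are copied from the table, while the function names and the "representation ratio" (enrolment share divided by population share) are assumptions introduced here, not measures used by the cited sources.

```python
# Raw counts copied from Table 1 (source given there: Arora & Athreye 2002).
colleges = {"Central": 50, "East": 25, "North": 140, "West": 140, "South": 308}
enrolment = {"Central": 9470, "East": 4812, "North": 25449, "West": 34165, "South": 82597}

# Population shares in percent, copied from Table 1 (source given there: Dossani 2002).
# The table reports no figure for the Central region, so it is left out here.
population_share = {"East": 25.8, "North": 31.3, "West": 19.6, "South": 23.2}


def national_shares(counts):
    """Return each region's percentage of the national total, rounded to 2 places."""
    total = sum(counts.values())
    return {region: round(100 * n / total, 2) for region, n in counts.items()}


def representation_ratio(shares, population):
    """Hypothetical helper: share of enrolment divided by share of population."""
    return {region: shares[region] / population[region] for region in population}


college_share = national_shares(colleges)
enrolment_share = national_shares(enrolment)
ratio = representation_ratio(enrolment_share, population_share)

for region in colleges:
    print(region, college_share[region], enrolment_share[region], ratio.get(region))
```

With these numbers the South's enrolment share comes out at a bit more than twice its population share, which is what "more than proportionate" means in the paragraph above.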
The regional distribution seems to be influenced by historical and geographical factors, at least to a certain extent. Explanations include university-industry linkages with the premier research institutes, the establishment of Science-Technology Parks (STPs) close to the IITs and the IISc, and the historical circumstances that led to the initial localizations. The historical factors rest on the early location of science and technology related research and teaching institutions in Bangalore, which was seen as an ideal place in terms of climate and infrastructure to conduct scientific research in strategic areas like defense and electronics. What is more surprising, however, is the distribution of socio-cultural and ethnic backgrounds. Never in Indian history have there been so many entrepreneurial and managerial Brahmins as are now seen in bioinformatics, and in particular there have been few entrepreneurs from South India (Fromhold-Eisebith 1999; Kapur & Ramamurti 2001). Generally speaking, Brahmins were associated rather with priestly tasks, government jobs, all sorts of administration and landholding (Adams 2001). On the other hand, Brahmins as members of the priestly caste were always connected to scholarly activities related to knowledge, learning and teaching: brahminical education covers mathematics but also other sciences like grammar, geometry and logic (Sen 1997). Hence, it covers many disciplines that are very useful for intellectually challenging professions in the sciences or in research-related fields such as pharmaceuticals, biotechnology or software. Moreover, the combination of subjects emphasized by a brahminical syllabus seems to be especially apt for bioinformatics, which requires not only mathematics but also language. Handing this education down from one generation to the next for decades or even centuries would place descendants in a privileged position regarding such professions and would thus be an example of a regional culture.
Eventually, this has been compounded by land ownership and political power. Deshpande (2000) calls this a cumulative advantage: the upper castes today are in such a strong position that they no longer need the customary inheritance of status in order to retain their privilege. However, a dominant position in administration could have been used to assure a more than proportionate share of Brahmins in high schools and universities (Adams 2001). But even if Brahmins have monopolized learning, there might be a positive impact on the Indian economy in the 'knowledge age' (Das 2001). What is also unexpected is the relative under-representation of Vaishyas, the traditional group of entrepreneurs, although recent studies do not show a significant change in this occupation pattern (Deshpande 2000; Adams 2001). They have always been the entrepreneurial castes of the Hindu population, providing economic services like trading and money lending (Rutten 2002). One explanation resides in the attitude of the traditional merchant and trader class towards risk and quick profits. They often prefer the latter and avoid taking risks, thus forgoing higher profits in the longer term (Frederking 2002). In the same vein, what follows for the ethnic background might simply be an eventual consequence: a path-dependent process that resulted in a lock-in in South India. However, as has been argued above, these Southern states exhibit not only a higher appreciation of learning but also a more hospitable climate towards change, both technological and social.
THE BIOINFORMATIC MANPOWER
Considering that India has a large pool of scientific talent available at reasonable cost, a strong IT-skilled, English-speaking population, huge bio-diversity and a large number of research and development institutes, it should have a big role to play in the sunrise of Bioinformatics. A true Bioinformatician is one who can form biological questions and find answers to them using Bioinformatics as a tool. The objective should be to get real-time solutions. There is enough demand for Bioinformaticians, but the demand is for quality people, who are not available. Software for a particular experiment can be structured into algorithms with combinatorial results fed in, so that the life scientist is able to get a result that might otherwise have taken much longer. But what all this requires is the ability of both parties to communicate with each other. Therefore, there is a need for people from Biology, Chemistry and other pure science backgrounds to work in a team with computer professionals. What is important to note here is that these fields have very different work ethics. While a computer scientist works in a mathematical, fixed language, a biologist will take time to understand a sequence and will be involved in research that may not result in a discovery overnight. This is where the importance of Bioinformatics creeps in. As Bioinformatics involves software development and/or implementation for storage and analysis of a vast amount of biological data, professionals in the field are required to have the following skills:
· In-depth programming
· In-depth knowledge of biology
· Relational database skills
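As a rough illustration of how the three skills listed above meet in practice, the sketch below stores a few DNA sequences in a relational table and computes a simple biological measure (GC content) inside a query. It uses only Python's standard sqlite3 module; the table name, column names and the toy sequences are made up for this example and do not come from any real database.

```python
import sqlite3


def gc_content(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# In-memory database, so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequences (id TEXT PRIMARY KEY, dna TEXT NOT NULL)")

# Toy records, invented for illustration only.
rows = [("demo_1", "ATGCGC"), ("demo_2", "ATATAT"), ("demo_3", "GGCC")]
conn.executemany("INSERT INTO sequences (id, dna) VALUES (?, ?)", rows)

# Register the Python function so that SQL queries can call it as gc(...).
conn.create_function("gc", 1, gc_content)

result = conn.execute(
    "SELECT id, gc(dna) AS gc_fraction FROM sequences ORDER BY gc_fraction DESC"
).fetchall()

for seq_id, fraction in result:
    print(seq_id, round(fraction, 2))

conn.close()
```

The biology lives in gc_content, the storage and sorting in SQL, and the glue in a few lines of general-purpose code.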
Most bioinformatics companies prefer people from a physics, molecular biology, mathematics, statistics or computer science background rather than biologists turned programmers. Otherwise, isolated groups emerge that have little interaction with molecular biologists and biochemists, and these will ultimately fail to achieve the promise of a better understanding of biology. Beyond the good prospects discussed so far, we need to discuss the main deficiencies found in the Bioinformatics manpower. Despite the large available workforce, India is facing a shortage of "skilled" manpower in this field. Part of the shortage is due to the lack of high-end research and a concentration on product/process-based research only. Human capital in India is available in abundance, and this can be a boon if adequate efforts are made to train, preserve and retain the available manpower. Industry strongly feels that the available manpower is quite sufficient in number but seriously lacking in skills. The reason is that most institutes and colleges are exploiting the new hype. While awareness is low, a huge number of inappropriately designed courses have been started. Students who cannot enroll in the leading graduate institutes (like Pune University, Institute of Bioinformatics and Applied Biotechnology, JNU etc.) join courses in sub-standard institutes, which hand out degrees and produce more and more non-employable students, hence creating the wide resource gap in the field. Therefore, the Government needs to ensure that only courses that are carefully designed and include relevant training experience get a license to operate.
THE KLEPPER VIEW AND INDIAN BIOINFORMATICS
Geographical concentration can emerge for several reasons. Usually it is explained by agglomeration economies involving knowledge spillovers across co-located companies, which sustain a persistent regional leadership. A different theory, argued by Klepper with many practical examples, shows that geographical concentration can emerge as a natural consequence of the clustering of early leaders in certain narrow regions, without any agglomeration economies. Klepper's findings also indicate an important role for related industries in producing the initial group of successful firms and for the persistence of the early leaders (for example, the tire industry agglomerated around Akron, a small city in northeastern Ohio with no compelling advantages for tire production, only a tradition in the production of rubber). Geographical concentration and the persistence of leading firms are closely related. So it appears that, even for India, comparative advantage (English speaking, low cost of labor, etc.) is simply a necessary but not sufficient condition for explaining a strong position in the international market. One way to think about this issue is that, in addition to comparative advantage, distinctive sources of competence are necessary to breed successful firms. This interplay between macro and micro, between comparative advantage at the country level and organizational capabilities at the firm level, is the right framework. In Indian bioinformatics, this theory can explain how a significant share of the global market originated in diversification by the Indian IT leaders. In fact, as we saw earlier, bioinformatics competences are very close to IT skills. Building on their strong performance, the software leaders (the most important being TCS, Kshema Technologies, V-Moksha and Wipro) decided to invest in this new field, and they are now among the best companies in this area.
In addition, it is important to underscore the presence of several spin-offs from these leaders, even if only a small percentage of them persist in the market. While the organizational capabilities of the IT leaders and their diversification towards bioinformatics are the primary reason for the concentration of companies in the Bangalore area, and more broadly in Karnataka, it is also relevant to look at the government policies, which help scientific competences to be exchanged smoothly in an already dynamic context.
The Wipro case
Wipro Technologies was ranked 21st worldwide among software services companies, and 86th in terms of best performing technology companies by BusinessWeek in 2002 and 2001. The company's IT business began in 1980, and by 2002 had about 12,000 employees operating out of 27 offices worldwide. With $736 million in revenues in 2001-2002 and a CAGR of 45 percent over the last five years, Wipro was one of the major success stories of the Indian IT industry (http://www.wipro.com/). In April 2002, Wipro CEO Azim Premji created Wipro Health Care and Life Sciences, a wholly-owned subsidiary located in Bangalore. This business addressed the requirements of the "bio-IT market," which Wipro defined to include traditional IT services to hospitals, health insurance companies, and medical and analytical devices companies as well as to players in the drug discovery value chain. Wipro estimated this market to be about $25 billion, growing at over 20 percent annually. Wipro's application of IT to drug discovery (or "pure BI") was still a relatively small part of its business, and most of its efforts were in the traditional IT service areas of outsourcing operations, IT consulting and enterprise package implementation in health care sector companies. Wipro Healthcare and Life Sciences had about 250 employees. Over the course of 2002, the company's plans to attack the pure BI market underwent some changes. The original plan to go after large pharmaceutical companies to offer customized software had evolved into a strategy based on partnering with other firms to integrate domain and IT knowledge. The CEO of the health care and life sciences business had said, "Though we have experience in this area, gained from our association with GE Medical, we would like to focus more on the domain knowledge side, where partners would bring in the necessary strength." The company was interested in concentrating on IT support for clinical trials, data management and statistical analysis.
The partners would help in providing the necessary support services to accelerate the drug discovery process and generate data on chemical compounds. In the drug discovery stage, for example, Wipro wanted to work as a technology partner for large consortia of life science equipment vendors who supplied the equipment that aids the drug discovery process. Wipro would help in synchronizing the data that this equipment churned out, so that customers would not need to custom-design such integration using their in-house IT resources. In the drug approval phase, Wipro intended to capitalize on the FDA's plans to allow electronic filing of Investigational New Drug (IND) reports by pharmaceutical companies. Wipro would use its knowledge of Web solutions, data security and imaging technology to assist customers in the electronic filing process.
CONCLUSIONS
With the convergence of genetics and computers, the increase in the quantity of information being produced by genetic laboratories constitutes a 'second information explosion'. As a result of this convergence, bioinformatics has emerged as a transdisciplinary subject, with literature on it being produced in many established disciplines. We focused our attention on the position of Indian Bioinformatics, which is still taking its first steps. We analyzed the condition of the Bangalore area from an anthropological point of view to explain a positive correlation between South Indian culture and the knowledge economy, and in particular Bioinformatics. Moreover, we explained the rise of a cluster of bioinformatics companies in the same area using Klepper's theory about the use of the specific capabilities of a developed industry (IT), with Wipro as an example.
Bioinformatics Tutorials
Bioinformatics tutorials and articles are collected here. The tutorials are classified to lay a foundation for both science and computer students who aspire to work in bioinformatics. The list also includes tutorials and articles related to various tools.
Foundation Tutorials for Bioinformatics Aspirants
Computer Tutorials for Science Stream Students
Introduction to computer Concepts
oac3.hsc.uth.tmc.edu
java.sun.com/docs/books/tutorial/
developer.java.sun.com/developer/onlineTraining/
javaboutique.internet.com/tutorials/
Perl for Biologists (Weizmann Institute)
Perl for Biologists
savage.net.au/Perl-tutorials.html
Welcome to the Bioperl Project !
XML Tutorials
wdvl.internet.com
php.weblogs.com/sql_tutorial
perl.about.com/cs/beginningsql/
http://www.db.cs.ucdavis.edu/teaching/sqltutorial/
C & C++ Tutorials
dmoz.org/Computers/Programming/Languages/C/Tutorials/
CGI Tutorials
webdesign.about.com/cs/cgi/
http://www.cgi101.com/class/
Visual Basic Tutorials
visualbasic.about.com/
visualbasic.ittoolbox.com/
members.tripod.com/~vkliew/vb.html
webreference.com/programming/unix/
scidiv.bcc.ctc.edu
library.thinkquest.org/12413/
biology-online.org/
Cartoon Guide to Genetics
genomebiology.com/tutorials/
Tutorials in Molecular Biology
BioBook Glossary
Highveld.com - Internet Directory of Biology and Biotechnology
ESG Biology Hypertextbook Home Page
DOE Primer on Molecular Genetics
Introduction to Chemistry
users.rcn.com/bobsalsa/tutorial.htm
lrc-srvr.chemistry.ohio-state.edu
Periodic table of the elements
Interactive periodic table of the elements
Introduction to Biochemistry
xray.bmc.uu.se/Courses/Bke1/Tutorials/Tutorialindex.html
http://www.jonmaber.demon.co.uk/
About DNA
biog-101-104.bio.cornell.edu/BioG101_104/tutorials/recomb_DNA.html
avery.rutgers.edu/WSSP/Tutorials/ (chime plugin required)
DNA tutorial
DNA from the beginning
Central Dogma Glossary
About RNA
zombie.imsb.au.dk/~raybrown/
ndbserver.rutgers.edu/NDB/structure-finder/tutorials/full_ndb.dna.rna.res.html
About Genome
genomebiology.com/tutorials/
anatomy.med.unsw.edu.au/cbl/GENOME/tutorials.htm
rsat.ulb.ac.be/rsat/tutorials/tut_genome-scale-patser.html
home.uchicago.edu/~ebetran/guides.html
Basic Genome Glossary
Limited Genome Glossary
Genome Glossary
The Gene-School Glossary
Glossary of Genetic Terms
Bioinformatics Tutorials
Introduction to Bioinformatics
Bioinformatics (Genomics)
Biocomputing in a Nutshell.
Biologist's Guide to Internet Resources
Computational Molecular Biology Course
Course on Bioinformatics
EMBNet Biocomputing Tutorials
Finding the genes in the genomic sequences
The Genetic Programming Tutorial
Jose R. Valverde's training course documents
Principles of Computational Biology, Steven Salzberg.
Principles of Protein Structure Using the Internet
Practical Course "Bioinformatics: Computer Methods in Molecular Biology"
Sequence analysis course (José R. Valverde, EMBNet/CNB)
Bioinformatics -An excellent review on genetic code and information processing
Molecular Sequence Analysis -Introductory sequence analysis by Andrew S Louka
Homology Modelling - Protein and homology modelling for beginners
Biocompanion - Tutorial for sequence analysis
Bioinformatics and Genomic Analysis - Link to graduate student course at the University of Arizona
EMBnet Biocomputing Tutorials - Introduction
Integrative Bioinformatics: Practical Kinetic Modeling of Biological Systems
Biocomputing For Everyone !
The Biocomputing Glossary
Computational Biology Course, Martin Tompa
Course Distance Learning in Bioinformatics
Functional genomics glossaries
How to become a bioinformatics expert
Internet for biologists
Jose R. Valverde's 'dirty' training course documents
Algorithms in Molecular Biology (University of Washington)
Protein Sequence Analysis in the Genomic Era
Protein sequence and structure analysis: A practical guide.
Topics of Evolutionary Computation
VSNS BioComputing Division
Bioinformatics -Primer on biosequence comparisons
Algorithms in Molecular Biology - Excellent for learning basics about many bioinfo tools
Biocomputing - Biocomputing tutorial at EBI
Bioinformatics Training Resources - Links to an excellent selection of bioinformatics tools training at NYU
DNA composition and Exon prediction -Sequence based measures indicative of protein-coding function in genomic DNA
BCD BioComputing Tutorial
Documentation on Major Sequence Databases
EMBL Nucleic Acid Sequence Database Documentation
SwissProt Protein Sequence Database Documentation
GenBank Nucleic Acid Sequence Database Documentation
Sequence Database Feature Table
CORBA Servers and Services at EBI (Oib99)
Guide to the GCG Package
AGRENET's Unofficial Guide to GCG Software
W2H web interface to the GCG package (help pages)
PDF files for the GCG Wisconsin Package
Wisconsin Package User's guide
Guides to Multiple Alignment
A Gentle Guide to Multiple Alignment
ClustalW documentation
BioComputing Hypertext Coursebook
VSNS BioComputing Division Multiple Alignment, Resource Page
Guides to Phylogenetics
Phylogenetic Analysis of Sequences (MCB416/516)
Glossary of terms used in Phylogeny Reconstruction.
PHYLIP Guide (EMBNet)
PHYLIP Phylogeny Inference Package documentation
PAUP tutorial
Guides on Similarity Searching
An Overview of Sequence Comparison Algorithms in Molecular
Bill Pearson talks about Protein Evolution
Biological Sequences and Information
BLAST HELP MANUAL
BLAST tutorial
Pedestrian guide to analysing sequence databases
Sequence Comparison (Keith Robison)
Database Research at Penn
Biology, E. W. Myers
Bill Pearson talks about Fasta
Bioinformatics: Elementary Sequence Analysis, Brian Golding and Dick Morton
Distant homologies: motifs, patterns, profiles
A Guide to Molecular Sequence Analysis
Exploring Distant Protein Sequence Relationships
Sequence Analysis tutorial
The Sequence and Structure Searching Site
Flexible Information Visualization of Multivariate Data from Biological
Sequence Similarity Searches
Computing Guides & Documentations for Other Programs
Alan Robinson's CORBA Page
Clustalw Algorithm
Ferritin Molecular-Graphics Tutorial
Genome Data Base (GDB) and Online Mendelian Inheritance in Man (OMIM)
Introduction to XML (eXtropia)
XML Workshop at EBI
The MathMol Hypermedia Textbook
Molecular dynamics simulations and CHARMM
Quantitative Genetics Resources
SRS TIPS
Staden Documentation Web Site
Networking, Phylogenetics, Protein Folding etc.
bioWidget Consortium
Computational Genefinding
GAMS: Guide to Available Mathematical Software
Introduction to Perl 5 for Web Developers (eXtropia)
Linkage analysis information - bionet . Gene-linkage FAQ
RasMol tutorial
Review of DNA transcription element searches
VSNS BioComputing Hypertext Coursebook - Alignment,
VSNS BioComputing Division Homepage
Virtual Online Tutorials
Virtual Institute of Bioinformatics National University of Ireland , Ireland
UNIX, GCG, SEQLAB and STADEN Tutorials Oxford Univ , UK
BIOTOOLS96 (Univ of) Nottingham , UK, Virtual school of molecular sciences
The principles of protein structure, using the internet Birkbeck College (Univ of London), UK
Free online bioinformatics courses! s-star.org
Science and technology directory
Weizmann Institute of Science Genome and Bioinformatics
Algorithms for Molecular Biology - Bioinformatics course notes, Tel Aviv University (TAU, Israel)
Certificate Program in Bioinformatics - Stanford
Courses Offered by BU Bioinformatics Program
ISCB Training information
Penn Database Research Group- Classes
VSNS Biocomputing Division
Yale Bioinformatics -- Courses and Lectures
Bioinformatics Online lecture (I)
Bioinformatics Online lecture (II)
MRes Biomolecular Sciences Lecture Notes: 1. The Gene and Bioinformatics
MRes Biomolecular Sciences Lecture Notes: 2. The Gene and Bioinformatics
biocomputing, on internet (Univ of) Bielefeld , Germany Virtual School of Natural Sciences
Sequence comparison (Universite de) Rouen, France
A Guide to Molecular Sequence Analysis National Hospital Univ of Oslo , Norway
Distant homologies: motifs, patterns, profiles International Centre for Genetic Engineering and Biotechnology , Trieste, Italy
Virtual School of Natural Sciences BioComputing Division - Virtual biocomputing course
Algorithms for Computational Biology (Advanced Topics #6, 236606) - Israel Institute of Technology
CSE 590BI - Computational Biology, University of Washington
MBB 447b3 (747b3) Classes - Yale
UCSC School of Engineering- Class Home Pages - University of California at Santa Cruz
Virtual Bioinformatics Distance Learning - Bioinformatics and Functional genomics courses offered by IMC Bioinformatics, University of Tampere
Tutorials using NCBI Bioinformatics Tools
Bioinformatics Articles
What is bioinformatics??? -A brief description about this emerging field
What is Bioinformatics - An introduction article by Mark Gerstein at Yale University.
The powerful world of bioinformatics
Bioinformatics: Key to 21st Century Biology
FAQs - Frequently asked questions on bioinformatics related topics Now you know what it is, find out what it takes to become a bioinformatician . Bioinformatics career questions answered.
The commercialization of bioinformatics by Phillip B.C. Jones
A Curriculum for Bioinformatics: The Time is Ripe This article proposes requirements for a standard bioinformatics curriculum. By Russ Altman.
Human Genome Research A description of the Human Genome Project.
Retooling for Bioinformatics An article from The Scientist
A Programming Course in Bioinformatics A discussion of the task of teaching an introductory bioinformatics course. By Russ Altman and John Koza.
Sequence analysis Keith Robison's guide to the exciting world of biosequence comparison! Useful background information on a variety of computational biology algorithms.
Bioinformatics, Supercomputing, and Complex Genome Analysis , DOE/NIH Human Genome News , 4(5) January 1993.
Medical Informatics Training at Stanford University School of Medicine An article from 1995 by Edward Shortliffe describing our medical informatics training program, the nature of the curriculum, the backgrounds of our students, and the career paths of our graduates.
Bioinformatics review articles published 1993-1996. A collection of off-line references to review papers, compiled by The Irish National Centre for BioInformatics. A great starting point for anyone wanting a general introduction to the field.
Elementary Sequence Analysis - Database Searching by B. Golding Jan 1996. Fasta, blast, blitz, blaze, flash, blocks.
Bioinformatics in Support of Molecular Medicine A description of bioinformatics and its connection to clinical informatics. By Russ Altman.
Biology as a Business Venture and the Rise of Bioinformatics , 1996.
Preface to Molecular Bioinformatics- Sequence Analysis , 1997.
Bioinformatics & Cheminformatics in the Drug Discovery Cycle , 1997.
Bioinformatics in a post-genomics age Sept 1997
Bioinformatics takes charge , Trends in Biotech. , Vol. 16 No. 3 (170) , pp. 104-107, March 1998.
"A Curriculum For Bioinformatics: The Time Is Ripe" An editorial from the journal BIOINFORMATICS-Bioinformatics, Vol 14, Issue 7, pages 549-550 (August 1998)
Bioinformatics, pharma and farmers , Trends in Biotech. , Vol. 17 No. 3 (182) , pp. 85-88. March 1999.
Bioinformatics/Computational Biology Programs (May 1999)
Biocomputing For Everyone An introduction to biocomputing for the layperson, published by the VSNS biocomputing division.
Biocomputing For Schools Another VSNS publication - this time aimed at highschool students, but fun to read for everyone. Includes articles on the application of bioinformatics to BSE research, and a ``Do-It-Yourself'' detailed example of a WWW search.
Understanding the human genome By D. L. Brutlag, in Scientific American: Introduction to Molecular Medicine, P. Leder, D. A. Clayton, E. Rubenstein, Eds. (New York: Scientific American, 1994), pp. 153-168.
Viva bioinformatics, but who survives? , 1999.
Bioinformatics: Playing The Numbers Game June 1999
Mining the Genome Sept 1999
Commercialization of biological information and the rise of bioinformatics - Part I , Nov/Dec 1999, 20/21, pp. 40-47.
Commercialization of biological information and the rise of bioinformatics - Part II , Jan 2000, pp. 51-56.
Turbo-charging bioinformation for drug discovery , Feb 2000, pp. 38-49.
Bioinformatics: low supply, high demand June 2000
The Next Wave of the Genomics Business July 2000
The Bioinformatics Gold Rush July 2000
Bioinformatics . Next Wave feature on careers in bioinformatics. September 2000
A prerequisite for working in the field: love of computers Article from The Scientist (Nov 2000)
Sep/Oct 2000 issue of the MIT Technology Review
How to become a bioinformatics expert, including a listing of European opportunities to study bioinformatics. Compiled by the Virtual School of Natural Sciences BioComputing Division.
Protein Sequence Alignment and Database Scanning Geoff Barton's review.
Pattern matching and motifs
Knowledge-based Analysis of Microarray Gene Expression Data Using Support Vector Machines
VHG Virtual HyperGlossary. Defines terms used in different subfields, currently Glycoscience, Protein Structure.
LASSAP a LArge Scale Sequence compArison Package
Bioinformatics in pre- and post-genomics eras , Trends in Biotech. , Vol. 18 , pp. 133-135, April 2000.
Confluence of Western and Traditional Medicines and Future Prospects - Part I, Mar/Apr 2000, pp. 34-37.
Confluence of Western and Traditional Medicines and Future Prospects - Part II, May/Jun 2000, pp. 66-74.
Computers + Biology = Bioinformatics April 2001
Bioinformatics U., Genome Technology, September, 2001 An article from Genome Technology written by Nat Goodman about bioinformatics curricula.
Training in a Hybrid Discipline, Nature, October 25, 2001 An article from Nature written by Potter Wickware and Paul Smaglik on bioinformatics training programmes in North America.
Bioinformatics Knowledge Vital to Careers - Competition from mathematicians and computer scientists compels biologists to become computational article from The Scientist (Sept 2002)
The babel of bioinformatics By Teresa Attwood, Science , 5491: 471 (2000).
The quiet revolution: Biodiversity informatics and the internet By Frank A. Bisby, Science , 289: 2309 (2000).
Are you ready for the revolution? By Declan Butler, Nature , 409: 758-760 (2001).
Beyond the genome: Biotech's next holy grail By Ellen Licking, Business Week , April 10, 2000.
The genome explained By Ellen Licking, Business Week , June 12, 2000.
Why bioinformatics is hot career By Stacey Wells, San Francisco Chronicle , March 4, 2001.
Proteomics: Beyond the genome Edited by Patricia O'Connell, Business Week Online, June 7, 2001.
A true believer dismisses indifference to bioinformatics By Terence Chea, Washington Post , March 14, 2002.
Genome map on a grain of rice By Kristen Philipkoski, Wired News , March 29, 2002.
Informatics moves to the head of the class By Beth Schachter, Bio-IT World , June 12, 2002.
The proteomics odyssey By Malorye Branca, Bio-IT World , Aug. 13, 2002.
Dell goes nuts for clusters By Michael Kanellos, CNET News.com , Sept. 2, 2002.
IBM teams with TurboGenomics By Salvatore Salamone, Bio-IT World , Sept. 5, 2002.
The new, new pharmacogenomics By Malorye Branca, Bio-IT World , Sept. 9, 2002.
RLX introduces a biocluster in a box By Salvatore Salamone, Bio-IT World , Sept. 17, 2002.
CombinatorX gets $40 million to look for drug synergies By Salvatore Salamone, Bio-IT World , Sept. 24, 2002.
Calculating with DNA By Salvatore Salamone, Bio-IT World , October 2002.
Hitachi Soft develops low-cost human genome DNA chip By Kuriko Miyake, ITworld.com , Oct. 8, 2002.
IBM chooses Linux for 'Blue Gene' supercomputer By Lisa Gill, NewsFactor Network , Oct. 24, 2002.
The international 'HapMap' project cbsnews.com , Oct. 29, 2002.
US stem cell policy deters investors By Steve Mitchell, UPI , Nov. 2, 2002.
Biochip sprouts DNA strands By Kimberly Patch, Technology Research News , Nov. 13, 2002
Genetic code of mouse published By Justin Gillis, Washington Post , Dec. 5, 2002
Genomics consolidation — no pain, no gain By Malorye Branca, Bio-IT World , Dec. 10, 2002.
Data stored in multiplying bacteria By Natasha McDowell, NewScientist.com , Jan. 3, 2003.
Cowabunga! Scientists to start Bovine Genome Project Science Daily , March 5, 2003.
Computational Analysis of Complexity in Gene Expression Arrays
Building your own Bioinformatics Supercomputer for Cheap Using Grid Technology
Career Advice for Computational Biology By Amjad-Ali Khoja at U. Texas.
Proceedings of the National Academy of Sciences
A Practical Guide to protein sequence and
Foundation Tutorials for Bioinformatics Aspiriants Computer Tutorials for Science Stream Students
Introduction to computer Concepts
oac3.hsc.uth.tmc.edu
java.sun.com/docs/books/tutorial/
developer.java.sun.com/developer/onlineTraining/
javaboutique.internet.com/tutorials/
Perl for Biologists (Weizmann Institute)
Perl for Biologists
savage.net.au/Perl-tutorials.html
Welcome to the Bioperl Project !
XML Tutorials
wdvl.internet.com
php.weblogs.com/sql_tutorial
perl.about.com/cs/beginningsql/
http://www.db.cs.ucdavis.edu/teaching/sqltutorial/
C & C++ Tutorials
dmoz.org/Computers/Programming/Languages/C/Tutorials/
CGI Tutorials
webdesign.about.com/cs/cgi/
http://www.cgi101.com/class/
Visual Basic Tutorials
visualbasic.about.com/
visualbasic.ittoolbox.com/
members.tripod.com/~vkliew/vb.html
webreference.com/programming/unix/
scidiv.bcc.ctc.edu
library.thinkquest.org/12413/
biology-online.org/
Cartoon Guide to Genetics
genomebiology.com/tutorials/
Tutorials in Molecular Biology
BioBook Glossary
Highveld.com - Internet Directory of Biology and Biotechnology
ESG Biology Hypertextbook Home Page
DOE Primer on Molecular Genetics
Introduction to Chemistry
users.rcn.com/bobsalsa/tutorial.htm
lrc-srvr.chemistry.ohio-state.edu
Periodic table of the elements
Interactive periodic table of the elements
Introduction to Biochemistry
xray.bmc.uu.se/Courses/Bke1/Tutorials/ Tutorialindex.html
http://www.jonmaber.demon.co.uk/
About DNA
biog-101-104.bio.cornell.edu/BioG101_104/ tutorials/recomb_DNA.html
avery.rutgers.edu/WSSP/Tutorials/ (chime plugin required)
DNA tutorial
DNA from the beginning
Central Dogma Glossary
About RNA
zombie.imsb.au.dk/~raybrown/
ndbserver.rutgers.edu/NDB/structure-finder/ tutorials/full_ndb.dna.rna.res.html
About Genome
genomebiology.com/tutorials/
anatomy.med.unsw.edu.au/cbl/GENOME/tutorials.htm
rsat.ulb.ac.be/rsat/tutorials/ tut_genome-scale-patser.html
home.uchicago.edu/~ebetran/guides.html
Basic Genome Glossary
Limited Genome Glossary
Genome Glossary
The Gene-School Glossary
Glossary of Genetic Terms
Bioinformatics Tutorials
Introduction to Bioinformatics
Bioinformatics (Genomics)
Biocomputing in a Nutshell.
Biologist's Guide to Internet Resources
Computational Molecular Biology Course
Course on Bioinformatics
EMBNet Biocomputing Tutorials
Finding the genes in the genomic sequences
The Genetic Programming Tutorial
Jose R. Valverde's training course documents
Principles of Computational Biology, Steven Salzberg.
Principles of Protein Structure Using the Internet
Practical Course "Bioinformatics: Computer Methods in Molecular Biology"
Sequence analysis course (José R. Valverde, EMBNet/CNB)
Bioinformatics - An excellent review on genetic code and information processing
Molecular Sequence Analysis - Introductory sequence analysis by Andrew S Louka
Homology Modelling - Protein and homology modelling for beginners
Biocompanion - Tutorial for sequence analysis
Bioinformatics and Genomic Analysis - Link to graduate student course at the University of Arizona
EMBnet Biocomputing Tutorials - Introduction
Integrative Bioinformatics: Practical Kinetic Modeling of Biological Systems
Biocomputing For Everyone!
The Biocomputing Glossary
Computational Biology Course, Martin Tompa
Course Distance Learning in Bioinformatics
Functional genomics glossaries
How to become a bioinformatics expert
Internet for biologists
Jose R. Valverde's 'dirty' training course documents
Algorithms in Molecular Biology (University of Washington)
Protein Sequence Analysis in the Genomic Era
Protein sequence and structure analysis: A practical guide.
Topics of Evolutionary Computation
VSNS BioComputing Division
Bioinformatics - Primer on biosequence comparisons
Algorithms in Molecular Biology - Excellent for learning basics about many bioinfo tools
Biocomputing - Biocomputing tutorial at EBI
Bioinformatics Training Resources - Links to an excellent selection of bioinformatics tools training at NYU
DNA composition and Exon prediction - Sequence-based measures indicative of protein-coding function in genomic DNA
BCD BioComputing Tutorial
Documentation on Major Sequence Databases
EMBL Nucleic Acid Sequence Database Documentation
SwissProt Protein Sequence Database Documentation
GenBank Nucleic Acid Sequence Database Documentation
Sequence Database Feature Table
CORBA Servers and Services at EBI (Oib99)
Guide to the GCG Package
AGRENET's Unofficial Guide to GCG Software
W2H web interface to the GCG package (help pages)
PDF files for the GCG Wisconsin Package
Wisconsin Package User's guide
Guides to Multiple Alignment
A Gentle Guide to Multiple Alignment
ClustalW documentation
BioComputing Hypertext Coursebook
VSNS BioComputing Division Multiple Alignment Resource Page
Guides to Phylogenetics
Phylogenetic Analysis of Sequences (MCB416/516)
Glossary of terms used in Phylogeny Reconstruction.
PHYLIP Guide (EMBNet)
PHYLIP Phylogeny Inference Package documentation
PAUP tutorial
Guides on Similarity Searching
An Overview of Sequence Comparison Algorithms in Molecular Biology, E. W. Myers
Bill Pearson talks about Protein Evolution
Biological Sequences and Information
BLAST HELP MANUAL
BLAST tutorial
Pedestrian guide to analysing sequence databases
Sequence Comparison (Keith Robison)
Database Research at Penn
Bill Pearson talks about Fasta
Bioinformatics: Elementary Sequence Analysis, Brian Golding and Dick Morton
Distant homologies: motifs, patterns, profiles
A Guide to Molecular Sequence Analysis
Exploring Distant Protein Sequence Relationships
Sequence Analysis tutorial
The Sequence and Structure Searching Site
Flexible Information Visualization of Multivariate Data from Biological Sequence Similarity Searches
Computing Guides & Documentations for Other Programs
Alan Robinson's CORBA Page
Clustalw Algorithm
Ferritin Molecular-Graphics Tutorial
Genome Data Base (GDB) and Online Mendelian Inheritance in Man (OMIM)
Introduction to XML (eXtropia)
XML Workshop at EBI
The MathMol Hypermedia Textbook
Molecular dynamics simulations and CHARMM
Quantitative Genetics Resources
SRS TIPS
Staden Documentation Web Site
Networking, Phylogenetics, Protein Folding etc.
bioWidget Consortium
Computational Genefinding
GAMS: Guide to Available Mathematical Software
Introduction to Perl 5 for Web Developers (eXtropia)
Linkage analysis information - bionet . Gene-linkage FAQ
RasMol tutorial
Review of DNA transcription element searches
VSNS BioComputing Hypertext Coursebook - Alignment,
VSNS BioComputing Division Homepage
Virtual Online Tutorials
Virtual Institute of Bioinformatics, National University of Ireland, Ireland
UNIX, GCG, SEQLAB and STADEN Tutorials, Oxford Univ, UK
BIOTOOLS96, (Univ of) Nottingham, UK, Virtual School of Molecular Sciences
The Principles of Protein Structure Using the Internet, Birkbeck College (Univ of London), UK
Free online bioinformatics courses! s-star.org
Science and technology directory
Weizmann Institute of Science Genome and Bioinformatics
Algorithms for Molecular Biology - Bioinformatics course notes, Tel Aviv University (TAU, Israel)
Certificate Program in Bioinformatics - Stanford
Courses Offered by BU Bioinformatics Program
ISCB Training information
Penn Database Research Group- Classes
VSNS Biocomputing Division
Yale Bioinformatics -- Courses and Lectures
Bioinformatics Online lecture (I)
Bioinformatics Online lecture (II)
MRes Biomolecular Sciences Lecture Notes: 1. The Gene and Bioinformatics
MRes Biomolecular Sciences Lecture Notes: 2. The Gene and Bioinformatics
Biocomputing on the Internet, (Univ of) Bielefeld, Germany, Virtual School of Natural Sciences
Sequence comparison, (Université de) Rouen, France
A Guide to Molecular Sequence Analysis, National Hospital, Univ of Oslo, Norway
Distant homologies: motifs, patterns, profiles, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
Virtual School of Natural Sciences BioComputing Division - Virtual biocomputing course
Algorithms for Computational Biology (Advanced Topics #6, 236606) - Israel Institute of Technology
CSE 590BI - Computational Biology, University of Washington
MBB 447b3 (747b3) Classes - Yale
UCSC School of Engineering- Class Home Pages - University of California at Santa Cruz
Virtual Bioinformatics Distance Learning - Bioinformatics and Functional genomics courses offered by IMC Bioinformatics, University of Tampere
Tutorials using NCBI Bioinformatics Tools
Bioinformatics Articles
What is bioinformatics? - A brief description of this emerging field
What is Bioinformatics - An introduction article by Mark Gerstein at Yale University.
The powerful world of bioinformatics
Bioinformatics: Key to 21st Century Biology
FAQs - Frequently asked questions on bioinformatics-related topics. Now you know what it is, find out what it takes to become a bioinformatician. Bioinformatics career questions answered.
The commercialization of bioinformatics by Phillip B.C. Jones
A Curriculum for Bioinformatics: The Time is Ripe This article proposes requirements for a standard bioinformatics curriculum. By Russ Altman.
Human Genome Research A description of the Human Genome Project.
Retooling for Bioinformatics An article from The Scientist
A Programming Course in Bioinformatics A discussion of the task of teaching an introductory bioinformatics course. By Russ Altman and John Koza.
Sequence analysis Keith Robison's guide to the exciting world of biosequence comparison! Useful background information on a variety of computational biology algorithms.
Bioinformatics, Supercomputing, and Complex Genome Analysis, DOE/NIH Human Genome News, 4(5) January 1993.
Medical Informatics Training at Stanford University School of Medicine An article from 1995 by Edward Shortliffe describing the medical informatics training program, the nature of the curriculum, the backgrounds of its students, and the career paths of its graduates.
Bioinformatics review articles published 1993-1996. A collection of off-line references to review papers, compiled by The Irish National Centre for BioInformatics. A great starting point for anyone wanting a general introduction to the field.
Elementary Sequence Analysis - Database Searching by B. Golding Jan 1996. Fasta, blast, blitz, blaze, flash, blocks.
Bioinformatics in Support of Molecular Medicine A description of bioinformatics and its connection to clinical informatics. By Russ Altman.
Biology as a Business Venture and the Rise of Bioinformatics, 1996.
Preface to Molecular Bioinformatics - Sequence Analysis, 1997.
Bioinformatics & Cheminformatics in the Drug Discovery Cycle, 1997.
Bioinformatics in a post-genomics age Sept 1997
Bioinformatics takes charge, Trends in Biotech., Vol. 16 No. 3 (170), pp. 104-107, March 1998.
"A Curriculum For Bioinformatics: The Time Is Ripe" An editorial from the journal Bioinformatics, Vol. 14, Issue 7, pages 549-550 (August 1998)
Bioinformatics, pharma and farmers, Trends in Biotech., Vol. 17 No. 3 (182), pp. 85-88, March 1999.
Bioinformatics/Computational Biology Programs (May 1999)
Biocomputing For Everyone An introduction to biocomputing for the layperson, published by the VSNS biocomputing division.
Biocomputing For Schools Another VSNS publication - this time aimed at high school students, but fun to read for everyone. Includes articles on the application of bioinformatics to BSE research, and a "Do-It-Yourself" detailed example of a WWW search.
Understanding the human genome By D. L. Brutlag, in Scientific American: Introduction to Molecular Medicine, P. Leder, D. A. Clayton, E. Rubenstein, Eds. (New York: Scientific American, 1994), pp. 153-168.
Viva bioinformatics, but who survives?, 1999.
Bioinformatics: Playing The Numbers Game June 1999
Mining the Genome Sept 1999
Commercialization of biological information and the rise of bioinformatics - Part I, Nov/Dec 1999, 20/21, pp. 40-47.
Commercialization of biological information and the rise of bioinformatics - Part II, Jan 2000, pp. 51-56.
Turbo-charging bioinformation for drug discovery, Feb 2000, pp. 38-49.
Bioinformatics: low supply, high demand June 2000
The Next Wave of the Genomics Business July 2000
The Bioinformatics Gold Rush July 2000
Bioinformatics. Next Wave feature on careers in bioinformatics. September 2000
A prerequisite for working in this field: love of computers. Article from The Scientist (Nov 2000)
Sep/Oct 2000 issue of the MIT Technology Review
How to become a bioinformatics expert, including a listing of European opportunities to study bioinformatics. Compiled by the Virtual School of Natural Sciences BioComputing Division.
Protein Sequence Alignment and Database Scanning Geoff Barton's review.
Pattern matching and motifs
Knowledge-based Analysis of Microarray Gene Expression Data Using Support Vector Machines
VHG Virtual HyperGlossary. Defines terms used in different subfields, currently Glycoscience, Protein Structure.
LASSAP a LArge Scale Sequence compArison Package
Bioinformatics in pre- and post-genomics eras, Trends in Biotech., Vol. 18, pp. 133-135, April 2000.
Confluence of Western and Traditional Medicines and Future Prospects - Part I, Mar/Apr 2000, pp. 34-37.
Confluence of Western and Traditional Medicines and Future Prospects - Part II, May/Jun 2000, pp. 66-74.
Computers + Biology = Bioinformatics April 2001
Bioinformatics U., Genome Technology, September, 2001 An article from Genome Technology written by Nat Goodman about bioinformatics curricula.
Training in a Hybrid Discipline, Nature, October 25, 2001 An article from Nature written by Potter Wickware and Paul Smaglik on bioinformatics training programmes in North America.
Bioinformatics Knowledge Vital to Careers - Competition from mathematicians and computer scientists compels biologists to become computational. Article from The Scientist (Sept 2002)
The babel of bioinformatics By Teresa Attwood, Science, 5491: 471 (2000).
The quiet revolution: Biodiversity informatics and the internet By Frank A. Bisby, Science, 289: 2309 (2000).
Are you ready for the revolution? By Declan Butler, Nature, 409: 758-760 (2001).
Beyond the genome: Biotech's next holy grail By Ellen Licking, Business Week, April 10, 2000.
The genome explained By Ellen Licking, Business Week, June 12, 2000.
Why bioinformatics is hot career By Stacey Wells, San Francisco Chronicle, March 4, 2001.
Proteomics: Beyond the genome Edited by Patricia O'Connell, Business Week Online, June 7, 2001.
A true believer dismisses indifference to bioinformatics By Terence Chea, Washington Post, March 14, 2002.
Genome map on a grain of rice By Kristen Philipkoski, Wired News, March 29, 2002.
Informatics moves to the head of the class By Beth Schachter, Bio-IT World, June 12, 2002.
The proteomics odyssey By Malorye Branca, Bio-IT World, Aug. 13, 2002.
Dell goes nuts for clusters By Michael Kanellos, CNET News.com, Sept. 2, 2002.
IBM teams with TurboGenomics By Salvatore Salamone, Bio-IT World, Sept. 5, 2002.
The new, new pharmacogenomics By Malorye Branca, Bio-IT World, Sept. 9, 2002.
RLX introduces a biocluster in a box By Salvatore Salamone, Bio-IT World, Sept. 17, 2002.
CombinatorX gets $40 million to look for drug synergies By Salvatore Salamone, Bio-IT World, Sept. 24, 2002.
Calculating with DNA By Salvatore Salamone, Bio-IT World, October 2002.
Hitachi Soft develops low-cost human genome DNA chip By Kuriko Miyake, ITworld.com, Oct. 8, 2002.
IBM chooses Linux for 'Blue Gene' supercomputer By Lisa Gill, NewsFactor Network, Oct. 24, 2002.
The international 'HapMap' project cbsnews.com, Oct. 29, 2002.
US stem cell policy deters investors By Steve Mitchell, UPI, Nov. 2, 2002.
Biochip sprouts DNA strands By Kimberly Patch, Technology Research News, Nov. 13, 2002
Genetic code of mouse published By Justin Gillis, Washington Post, Dec. 5, 2002
Genomics consolidation - no pain, no gain By Malorye Branca, Bio-IT World, Dec. 10, 2002.
Data stored in multiplying bacteria By Natasha McDowell, NewScientist.com, Jan. 3, 2003.
Cowabunga! Scientists to start Bovine Genome Project Science Daily, March 5, 2003.
Computational Analysis of Complexity in Gene Expression Arrays
Building your own Bioinformatics Supercomputer for Cheap Using Grid Technology
Career Advice for Computational Biology By Amjad-Ali Khoja at U. Texas.
Proceedings of the National Academy of Sciences
A Practical Guide to protein sequence and
Bioinformatics Protocols
GCG/SeqWeb
- Program Manual 1
- Program Manual 2
- GCG User's Guide
- GCG Command Line Summary
- SeqLab User's Guide
- SeqLab Tutorial
EMBOSS
- Tutorials 1
- Tutorials 2
- EMBOSS Homepage
- User Documentation
- EMBOSS/GCG tools comparison chart
Microarray Suite/GCOS
- Microarray Suite User Manual
- GCOS User Manual
- Data Analysis Fundamentals
GeneSifter.net
- User Manual
- Batch Upload Guide
- Manual Column Detection for Affymetrix Users
GeneSpring
- User Manual
- Quickstart Guide
- Analysis Guides
VectorNTI
- Quickstart Guide (PC)
- User Manual (PC)
VectorExpression
- User Manual (PC)
PathwayAssist
- User Manual
EisenLab Software
- ScanAlyze Manual
- Cluster Manual
- Treeview Manual
Tigr TM4 Microarray Package
- MIDAS Manual
- Spotfinder Manual
- MeV Manual
ABI Sequence (.ab1) Conversion Tool for Mac OS 8.5-9.2 Users
BASIC PROTOCOL USING CLUSTALW AND CLUSTALX TO DO MULTIPLE ALIGNMENTS