James Tisdall has worked as a musician, as a programmer and member of technical staff at Bell Labs (where he
programmed for speech research and discovered a formal language for musical rhythm), as a programmer and systems
manager at the Human Genome Project in the Computational Biology and Informatics Laboratory (where he began using
Perl for bioinformatics in 1991 with his program DNA WorkBench), as a computational biologist at Mercator Genetics
in Menlo Park, California (where his Perl programs helped discover the gene involved in the common hereditary disease
hemochromatosis), as manager of Bioinformatics at the Fox Chase Cancer Center in Philadelphia, and most recently
as a consultant for Biocomputing Associates of Kimberton, Pennsylvania, and the Burke Research Institute affiliated
with Cornell University, working on neurodegenerative diseases such as Alzheimer's and Parkinson's.
Review
"Newcomers to Perl who understand biological information will find James Tisdall's Beginning Perl for
Bioinformatics to be an excellent compendium of examples. Teachers of Perl will likewise find the text to be
filled with fresh programming illustrations of growing scientific importance."
-- Peter Leopold, amazon.com
"Put simply, programming is becoming a critical skill for more and more biologists. James Tisdall's timely
Beginning Perl for Bioinformatics will help them gain the specific programming skills they'll need in their day-to-day
work... If you're a working biologist, or working on becoming one, Beginning Perl for Bioinformatics will be an invaluable
resource -- and we've seen nothing like it."
-- Bill Camarda, Barnesandnoble.com, Dec 2001
O'Reilly Publishing Web Site, August, 2002
Summary
With its highly developed capacity to detect patterns in data, Perl has
become one of the most popular languages for biological data analysis.
But if you're a biologist with little or no programming experience,
starting out in Perl can be a challenge.
Table of Contents
Preface
What Is Bioinformatics?
About This Book
Who This Book Is For
Why Should I Learn to Program?
Structure of This Book
Conventions Used in This Book
Comments and Questions
Acknowledgments
1. Biology and Computer Science
1.1 The Organization of DNA
1.2 The Organization of Proteins
1.3 In Silico
1.4 Limits to Computation
2. Getting Started with Perl
2.1 A Low and Long Learning Curve
2.2 Perl's Benefits
2.3 Installing Perl on Your Computer
2.4 How to Run Perl Programs
2.5 Text Editors
2.6 Finding Help
3. The Art of Programming
3.1 Individual Approaches to Programming
3.2 Edit-Run-Revise (and Save)
3.3 An Environment of Programs
3.4 Programming Strategies
3.5 The Programming Process
4. Sequences and Strings
4.1 Representing Sequence Data
4.2 A Program to Store a DNA Sequence
4.3 Concatenating DNA Fragments
4.4 Transcription: DNA to RNA
4.5 Using the Perl Documentation
4.6 Calculating the Reverse Complement in Perl
4.7 Proteins, Files, and Arrays
4.8 Reading Proteins in Files
4.9 Arrays
4.10 Scalar and List Context
4.11 Exercises
5. Motifs and Loops
5.1 Flow Control
5.2 Code Layout
5.3 Finding Motifs
5.4 Counting Nucleotides
5.5 Exploding Strings into Arrays
5.6 Operating on Strings
5.7 Writing to Files
5.8 Exercises
6. Subroutines and Bugs
6.1 Subroutines
6.2 Scoping and Subroutines
6.3 Command-Line Arguments and Arrays
6.4 Passing Data to Subroutines
6.5 Modules and Libraries of Subroutines
6.6 Fixing Bugs in Your Code
6.7 Exercises
7. Mutations and Randomization
7.1 Random Number Generators
7.2 A Program Using Randomization
7.3 A Program to Simulate DNA Mutation
7.4 Generating Random DNA
7.5 Analyzing DNA
7.6 Exercises
8. The Genetic Code
8.1 Hashes
8.2 Data Structures and Algorithms for Biology
8.3 The Genetic Code
8.4 Translating DNA into Proteins
8.5 Reading DNA from Files in FASTA Format
8.6 Reading Frames
8.7 Exercises
9. Restriction Maps and Regular Expressions
9.1 Regular Expressions
9.2 Restriction Maps and Restriction Enzymes
9.3 Perl Operations
9.4 Exercises
13. Further Topics
13.1 The Art of Program Design
13.2 Web Programming
13.3 Algorithms and Sequence Alignment
13.4 Object-Oriented Programming
13.5 Perl Modules
13.6 Complex Data Structures
13.7 Relational Databases
13.8 Microarrays and XML
13.9 Graphics Programming
13.10 Modeling Networks
13.11 DNA Computers
A. Resources
A.1 Perl
A.2 Computer Science
A.3 Linux
A.4 Bioinformatics
A.5 Molecular Biology
B. Perl Summary
B.1 Command Interpretation
B.2 Comments
B.3 Scalar Values and Scalar Variables
B.4 Assignment
B.5 Statements and Blocks
B.6 Arrays
B.7 Hashes
B.8 Operators
B.9 Operator Precedence
B.10 Basic Operators
B.11 Conditionals and Logical Operators
B.12 Binding Operators
B.13 Loops
B.14 Input/Output
B.15 Regular Expressions
B.16 Scalar and List Context
B.17 Subroutines and Modules
B.18 Built-in Functions