Computer Skills


COMPUTER SKILLS

Contents

Preface

Audience for This Book

Structure of This Book

Our Approach to Bioinformatics

URLs Referenced in This Book

Conventions Used in This Book

Comments and Questions

Acknowledgments

Chapter 1. Biology in the Computer Age

1.1 How Is Computing Changing Biology?

1.2 Isn't Bioinformatics Just About Building Databases?

1.3 What Does Informatics Mean to Biologists?

1.4 What Challenges Does Biology Offer Computer Scientists?

1.5 What Skills Should a Bioinformatician Have?

1.6 Why Should Biologists Use Computers?

1.7 How Can I Configure a PC to Do Bioinformatics Research?

1.8 What Information and Software Are Available?

1.9 Can I Learn a Programming Language Without Classes?

1.10 How Can I Use Web Information?

1.11 How Do I Understand Sequence Alignment Data?

1.12 How Do I Write a Program to Align Two Biological Sequences?

1.13 How Do I Predict Protein Structure from Sequence?

1.14 What Questions Can Bioinformatics Answer?

Chapter 2. Computational Approaches to Biological Questions

2.1 Molecular Biology's Central Dogma

2.2 What Biologists Model

2.3 Why Biologists Model

2.4 Computational Methods Covered in This Book

2.5 A Computational Biology Experiment

Chapter 3. Setting Up Your Workstation

3.1 Working on a Unix System

3.2 Setting Up a Linux Workstation

3.3 How to Get Software Working

3.4 What Software Is Needed?

Chapter 4. Files and Directories in Unix

4.1 Filesystem Basics

4.2 Commands for Working with Directories and Files

4.3 Working in a Multiuser Environment

Chapter 5. Working on a Unix System

5.1 The Unix Shell

5.2 Issuing Commands on a Unix System

5.3 Viewing and Editing Files

5.4 Transformations and Filters

5.5 File Statistics and Comparisons

5.6 The Language of Regular Expressions

5.7 Unix Shell Scripts

5.8 Communicating with Other Computers

5.9 Playing Nicely with Others in a Shared Environment

Chapter 6. Biological Research on the Web

6.1 Using Search Engines

6.2 Finding Scientific Articles

6.3 The Public Biological Databases

6.4 Searching Biological Databases

6.5 Depositing Data into the Public Databases

6.6 Finding Software

6.7 Judging the Quality of Information

Chapter 7. Sequence Analysis, Pairwise Alignment, and Database Searching

7.1 Chemical Composition of Biomolecules

7.2 Composition of DNA and RNA

7.3 Watson and Crick Solve the Structure of DNA

7.4 Development of DNA Sequencing Methods

7.5 Genefinders and Feature Detection in DNA

7.6 DNA Translation

7.7 Pairwise Sequence Comparison

7.8 Sequence Queries Against Biological Databases

7.9 Multifunctional Tools for Sequence Analysis

Chapter 8. Multiple Sequence Alignments, Trees, and Profiles

8.1 The Morphological to the Molecular

8.2 Multiple Sequence Alignment

8.3 Phylogenetic Analysis

8.4 Profiles and Motifs

Chapter 9. Visualizing Protein Structures and Computing Structural Properties

9.1 A Word About Protein Structure Data

9.2 The Chemistry of Proteins

9.3 Web-Based Protein Structure Tools

9.4 Structure Visualization

9.5 Structure Classification

9.6 Structural Alignment

9.7 Structure Analysis

9.8 Solvent Accessibility and Interactions

9.9 Computing Physicochemical Properties

9.10 Structure Optimization

9.11 Protein Resource Databases

9.12 Putting It All Together

Chapter 10. Predicting Protein Structure and Function from Sequence

10.1 Determining the Structures of Proteins

10.2 Predicting the Structures of Proteins

10.3 From 3D to 1D

10.4 Feature Detection in Protein Sequences

10.5 Secondary Structure Prediction

10.6 Predicting 3D Structure

10.7 Putting It All Together: A Protein Modeling Project

10.8 Summary

Chapter 11. Tools for Genomics and Proteomics

11.1 From Sequencing Genes to Sequencing Genomes

11.2 Sequence Assembly

11.3 Accessing Genome Information on the Web

11.4 Annotating and Analyzing Whole Genome Sequences

11.5 Functional Genomics: New Data Analysis Challenges

11.6 Proteomics

11.7 Biochemical Pathway Databases

11.8 Modeling Kinetics and Physiology

11.9 Summary

Chapter 12. Automating Data Analysis with Perl

12.1 Why Perl?

12.2 Perl Basics

12.3 Pattern Matching and Regular Expressions

12.4 Parsing BLAST Output Using Perl

12.5 Applying Perl to Bioinformatics

Chapter 13. Building Biological Databases

13.1 Types of Databases

13.2 Database Software

13.3 Introduction to SQL

13.4 Installing the MySQL DBMS

13.5 Database Design

13.6 Developing Web-Based Software That Interacts with Databases

Chapter 14. Visualization and Data Mining

14.1 Preparing Your Data

14.2 Viewing Graphics

14.3 Sequence Data Visualization

14.4 Networks and Pathway Visualization

14.5 Working with Numerical Data

14.6 Visualization: Summary

14.7 Data Mining and Biological Information

Biblio.1 Unix

Biblio.2 SysAdmin

Biblio.3 Perl

Biblio.4 General Reference

Biblio.5 Bioinformatics Reference

Biblio.6 Molecular Biology/Biology Reference

Biblio.7 Protein Structure and Biophysics

Biblio.8 Genomics

Biblio.9 Biotechnology

Biblio.10 Databases

Biblio.11 Visualization

Biblio.12 Data Mining

Colophon