Description
This book covers data mining tools for integrated genomic databases. It surveys the tools and databases available and explains how to use them. Written specifically for biology and bioinformatics students and researchers, it is designed to help them understand the different tools and use them effectively.
The recent explosive growth of biological data has led to a rapid increase in the number of molecular biology databases. Because these databases are held in many different locations and often use varying interfaces and non-standard data formats, integrating and comparing data from them can be difficult and time-consuming. This book provides an overview of the key tools currently available for large-scale comparisons of gene sequences and annotations, focusing on the databases and tools from the University of California, Santa Cruz (UCSC), Ensembl, and the National Center for Biotechnology Information (NCBI). Written specifically for biology and bioinformatics students and researchers, it aims to give an appreciation of the methods by which the browsers and their databases are constructed, enabling readers to determine which tool is most appropriate for their requirements. Each chapter contains a summary and exercises to aid understanding and promote effective use of these important tools.