Mining the Web: Analysis of Hypertext and Semi Structured Data
Author: Soumen Chakrabarti
List Price: $54.95
ISBN: 1558607544
Publisher: Morgan Kaufmann (15 August, 2002)
Edition: Hardcover
Sales Rank: 35,620
Average Customer Rating: 5 out of 5
Customer Reviews

Rating: 5 out of 5
Excellent, comprehensive, readable book on mining the Web

Executive summary: This is a fabulous book, written with care and precision, easy to read yet covering in detail a wide variety of the most beautiful and promising developments in data mining and machine learning as it relates to the World Wide Web, including a prescient vision of where the field is headed in the future.

More detail: There are science authors who are clear experts in their field, yet have trouble communicating their knowledge. Then there are science authors who write with clarity, but achieve it by dumbing down technical details to cater to a broad readership. Finally, there are authors who are experts and leaders in their field, who are actively contributing to the forefront of research, who are excellent writers, and who can communicate complex concepts to a diverse audience with acumen, without glossing over important details. Soumen Chakrabarti is one such author.

"Mining the Web" is a stunning achievement. It is an excellent summary of the past decade or so of research in the area, covering nearly all of the important bases, including the machinery of Web crawling, Web information retrieval (i.e., search engines), clustering, automated classification, semi-supervised approaches, social network analysis, and focused crawling. Though Chakrabarti himself has contributed prominently to the field, this book is not at all the vehicle for self-promotion that other specialist texts sometimes feel like. The book should be valuable to newcomers, students, and experts alike, and could certainly serve as an excellent course textbook. High-level concepts can be grasped with little mathematical background, yet more technically sophisticated readers will not be disappointed: most topics do include rigorous coverage. The text is well organized, well written, and well conceived.
Its design, including generous and illuminating figures and illustrations, possesses an artist's touch, perhaps not surprising given that Chakrabarti designs his own font libraries in his (apparently scant) spare time. It's hard to imagine where Chakrabarti found the time to write such a comprehensive and thoughtful book, but I'm not asking any questions: I'm thrilled with the outcome. The book is a must-have reference for anyone working in -- or aspiring to work in -- the crossroads of Web algorithmics, data mining, and machine learning.

David M. Pennock, Senior Research Scientist, Overture Services, Inc.

Rating: 5 out of 5
The Best Web Data Mining Text

This book is simply the best web data mining text available. It is simultaneously broad and deep, covering a wide array of topics yet delving into the meatiest parts of Web data mining. Topics covered include classic information retrieval, graph-theoretic approaches, Web measurements, and even machine learning methods such as clustering and text classification. One of the reasons why the book succeeds is that Chakrabarti is himself a major contributor to the field. His writing is always clear and precise, probably because he frequently lectures on these topics. If you buy one book about data mining on the Web, this should be that book.

Rating: 5 out of 5
Much needed book on Web mining

This book is an excellent introduction to a number of techniques in information retrieval, machine learning, data mining, and network analysis, and to the application of such techniques to the Web. It discusses many research issues and provides practical insights into constructing Web mining tools and systems. Chakrabarti has brought the wisdom of researchers in the area of Web mining to a wider audience. I think the book will prompt the development of new courses for graduate as well as senior undergraduate students.
The first part of the book deals with interesting practical and theoretical issues related to designing large-scale Web crawlers and search engines. Chapters 4 and 5 are a good introduction to various unsupervised and supervised learning methods. However, a proper understanding of advanced methods like LSI is possible only with an adequate foundation in linear algebra (you can get only a flavor of the technique in the book). Part III of the book is my personal favorite. It has detailed descriptions of various social network analysis methods, some of which have been applied by modern search engines like Google. Focused crawling, an area that the author has personally shaped, is also explained well. The book ends with a brief peek into the future of Web mining. The comprehensive yet easy-to-read nature of the book makes it a valuable addition to my shelf. It is hard to find a comparable book in the area of Web mining.
Similar Products
· Managing Gigabytes: Compressing and Indexing Documents and Images
· Modeling the Internet and the Web: Probabilistic Methods and Algorithms
· Modern Information Retrieval
· Natural Language Processing for Online Applications: Text Retrieval, Extraction, and Categorization (Natural Language Processing, 5)