Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from your Data

Apress, Nov 30, 2016 - Computers - 385 pages

Derive useful insights from your data using Python. You will learn both basic and advanced concepts, including text and language syntax, structure, and semantics. You will focus on algorithms and techniques, such as text classification, clustering, topic modeling, and text summarization.

Text Analytics with Python teaches you the techniques related to natural language processing and text analytics, and you will gain the skills to know which technique is best suited to solve a particular problem. You will look at each technique and algorithm from both a bird's-eye view, to understand how it can be used, and a microscopic view, to understand the mathematical concepts and implement them to solve your own problems.

What You Will Learn:

  • Understand the major concepts and techniques of natural language processing (NLP) and text analytics, including syntax and structure
  • Build a text classification system to categorize news articles, analyze app or game reviews using topic modeling and text summarization, cluster popular movie synopses, and analyze the sentiment of movie reviews
  • Use Python and popular open-source libraries for NLP and text analytics, such as the Natural Language Toolkit (nltk), gensim, scikit-learn, spaCy, and Pattern


Who This Book Is For:
IT professionals, analysts, developers, linguistic experts, data scientists, and anyone with a keen interest in linguistics, analytics, and generating insights from textual data
 


Contents

  • Natural Language Basics
  • Python Refresher
  • Processing and Understanding Text
  • Text Classification
  • Text Summarization
  • Text Similarity and Clustering
  • Semantic and Sentiment Analysis
  • Index
  • Copyright

About the author (2016)

Dipanjan Sarkar is a Data Scientist at Intel, the world's largest silicon company, which is on a mission to make the world more connected and productive. He primarily works on analytics, business intelligence, application development, and building large-scale intelligent systems. He received his master's degree in Information Technology from the International Institute of Information Technology, Bangalore, with a focus on data science and software engineering. He is also an avid supporter of self-learning, especially Massive Open Online Courses, and holds a Data Science Specialization from Johns Hopkins University on Coursera.

He has been an analytics practitioner for over four years, specializing in statistical, predictive, and text analytics. He has also authored a couple of books on R and machine learning, occasionally reviews technical books, and acts as a course beta tester for Coursera. Dipanjan's interests include learning about new technology, financial markets, disruptive start-ups, data science, and, more recently, artificial intelligence and deep learning. In his spare time he loves reading, gaming, and watching popular sitcoms and football.
