Python 3 Text Processing with NLTK 3 Cookbook

This book will show you the essential techniques of text and language processing. Starting with tokenization, stemming, and the WordNet dictionary, you'll progress to part-of-speech tagging, phrase chunking, and named entity recognition. You'll learn how various text corpora are organized, as well as how to create your own custom corpus. Then, you'll move on to text classification with a focus on sentiment analysis. And because NLP can be computationally expensive on large bodies of text, you'll try a few methods for distributed text processing. Finally, you'll be introduced to a number of other small but complementary Python libraries for text analysis, cleaning, and parsing.
This cookbook provides simple, straightforward examples so you can quickly learn text processing with Python and NLTK.

387 print pages

Year of publication: 2014
Publisher: Packt Publishing

Quotes

niodeya quoted 4 years ago
Most of the time, the default sentence tokenizer will be sufficient
niodeya quoted 4 years ago
Once you have a custom sentence tokenizer, you can use it for your own corpora
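
As a rough illustration of the quote above, a custom sentence tokenizer can be handed to an NLTK corpus reader through the `sent_tokenizer` argument of `PlaintextCorpusReader`. This is only a sketch: the sample text, the file name `sample.txt`, and the helper `read_sentences` are invented here and do not come from the book.

```python
import os
import tempfile

from nltk.corpus.reader import PlaintextCorpusReader
from nltk.tokenize.punkt import PunktSentenceTokenizer

# Invented sample text, used only for illustration.
SAMPLE = (
    "The archive holds many letters. Each letter is stored as plain text. "
    "Some letters are short. Others run to several pages."
)


def read_sentences(text):
    """Write text to a temporary corpus directory and read it back as sentences.

    Hypothetical helper: it exists only to keep the example self-contained.
    """
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "sample.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)

        # Train a Punkt tokenizer on the raw text, then pass it to the reader
        # in place of the reader's default sentence tokenizer.
        custom_tokenizer = PunktSentenceTokenizer(text)
        reader = PlaintextCorpusReader(
            root, r".*\.txt", sent_tokenizer=custom_tokenizer
        )

        # Materialize inside the `with` block: the reader loads files lazily.
        return [list(sentence) for sentence in reader.sents()]


corpus_sentences = read_sentences(SAMPLE)
for words in corpus_sentences:
    print(words)
```

Each item returned by `sents()` is a list of word tokens, so the custom tokenizer decides only where one sentence ends and the next begins.
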
niodeya quoted 4 years ago
The PunktSentenceTokenizer class uses an unsupervised learning algorithm to learn what constitutes a sentence break. It is unsupervised because you don't have to give it any labeled training data, just raw text
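
The unsupervised training described in the quote above can be sketched in a few lines. Passing raw text to the `PunktSentenceTokenizer` constructor trains it on that text, with no labeled sentence boundaries. The training and test strings below are invented for illustration, and a text this small is far too little for useful training in practice.

```python
from nltk.tokenize.punkt import PunktSentenceTokenizer

# Invented sample text, used only for illustration.
TRAINING_TEXT = (
    "The meeting starts at noon. Please bring your notes. "
    "We will review the draft together. Questions are welcome at the end. "
    "The room is on the second floor. Coffee will be provided."
)

# Passing raw text to the constructor trains the tokenizer on it;
# no labeled sentence boundaries are supplied.
tokenizer = PunktSentenceTokenizer(TRAINING_TEXT)

NEW_TEXT = "The draft is ready. Send your comments by Friday."
sentences = tokenizer.tokenize(NEW_TEXT)

for sentence in sentences:
    print(sentence)
```
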

On the shelves

Paweł Owczarek
Computers, Science
- 174
- 20
