Text Processing in Python

Addison-Wesley Professional, 2003 - Computers - 520 pages
Text Processing in Python describes techniques for manipulating text using the Python programming language. At the broadest level, text processing is simply taking textual information and doing something with it. This might be restructuring or reformatting it, extracting smaller bits of information from it, or performing calculations that depend on the text. Text processing is arguably what most programmers spend most of their time doing. Because Python is clear, expressive, and object-oriented, it is a perfect language for text processing, even better than Perl. As the amount of data everywhere continues to increase, processing it is more and more of a challenge for programmers. This book is not a tutorial on Python. It has two other goals: helping the programmer get the job done pragmatically and efficiently, and giving the reader an understanding, both theoretical and conceptual, of why what works works and what doesn't work doesn't work. Mertz provides practical pointers and tips that emphasize efficient, flexible, and maintainable approaches to the text processing tasks that working programmers face daily.
 

What people are saying

LibraryThing Review

User Review  - rajanb - LibraryThing

Very detailed book, not suitable for a Python beginner, but if you take your time to work through it you'll gain a clear understanding of complex topics such as regular expressions and will marvel at David Mertz's fluency in both English and Python.

User Review

This book is intermediate- to advanced-level reading: too difficult for me now, but it should be great for improving my understanding of Python later. It also provides many practical techniques.

About the author (2003)

David Mertz came to writing about programming via the unlikely route of first being a humanities professor. Along the way, he was a senior software developer, and now runs his own development company, Gnosis Software ("We know stuff!"). David writes regular columns and articles for IBM developerWorks, Intel Developer Network, O'Reilly ONLamp, and other publications.
