Semisupervised Learning for Computational Linguistics
The rapid advancement in the theoretical understanding of statistical and machine learning methods for semisupervised learning has made it difficult for nonspecialists to keep up to date in the field. Providing a broad, accessible treatment of the theory as well as linguistic applications, Semisupervised Learning for Computational Linguistics offers self-contained coverage of semisupervised methods that includes background material on supervised and unsupervised learning.
The book presents a brief history of semisupervised learning and its place in the spectrum of learning methods before moving on to discuss well-known natural language processing methods, such as self-training and co-training. It then centers on machine learning techniques, including the boundary-oriented methods of perceptrons, boosting, support vector machines (SVMs), and the null-category noise model. In addition, the book covers clustering, the expectation-maximization (EM) algorithm, related generative methods, and agreement methods. It concludes with the graph-based method of label propagation as well as a detailed discussion of spectral methods.
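Of the methods named above, self-training is perhaps the simplest to illustrate: a seed classifier is trained on the labeled data, its most confident predictions on unlabeled data are added to the training set as if they were true labels, and the classifier is retrained. The sketch below is illustrative only, not taken from the book; the toy nearest-centroid classifier, the 1-D data, and the margin-based confidence score are all assumptions chosen to keep the loop self-contained.

```python
# A minimal self-training sketch (hypothetical example, not from the book).
# Classifier: nearest class centroid on 1-D points; confidence: the margin
# between the nearest and second-nearest centroid.

def centroid(points):
    # Mean of a list of 1-D points.
    return sum(points) / len(points)

def self_train(labeled, unlabeled, rounds=3):
    """labeled: list of (x, label) pairs; unlabeled: list of x values."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        # Train: compute one centroid per class from the current labeled set.
        groups = {}
        for x, y in labeled:
            groups.setdefault(y, []).append(x)
        cents = {y: centroid(xs) for y, xs in groups.items()}
        # Predict each unlabeled point and score its confidence.
        scored = []
        for x in pool:
            dists = sorted((abs(x - c), y) for y, c in cents.items())
            label = dists[0][1]
            margin = dists[1][0] - dists[0][0] if len(dists) > 1 else dists[0][0]
            scored.append((margin, x, label))
        # Add only the single most confident prediction, then retrain.
        scored.sort(reverse=True)
        _, x, label = scored[0]
        labeled.append((x, label))
        pool.remove(x)
    return labeled

# Usage: two classes seeded with one labeled point each; the three
# unlabeled points are absorbed one per round.
grown = self_train([(0.0, "a"), (10.0, "b")], [1.0, 2.0, 9.0])
```

Co-training follows the same loop but maintains two classifiers over two independent views of each instance, each labeling data for the other; the book treats both in detail.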
Taking an intuitive approach to the material, this lucid book facilitates the application of semisupervised learning methods to natural language processing and provides the framework and motivation for a more systematic study of machine learning.
Self-Training and Co-Training
Applications of Self-Training and Co-Training
Mathematics for Boundary-Oriented Methods