Title Syntactic Wordclass Tagging [electronic resource] / edited by Hans van Halteren
Imprint Dordrecht : Springer Netherlands : Imprint: Springer, 1999
Connect to http://dx.doi.org/10.1007/978-94-015-9273-4
Descript XVII, 334 p. online resource

SUMMARY

In both the linguistic and the language engineering community, the creation and use of annotated text collections (or annotated corpora) is currently a hot topic. Annotated texts are of interest for research as well as for the development of natural language processing (NLP) applications. Unfortunately, the annotation of text material, especially more interesting linguistic annotation, is as yet a difficult task and can entail a substantial amount of human involvement. All over the world, work is being done to replace as much as possible of this human effort by computer processing. At the frontier of what can already be done (mostly) automatically we find syntactic wordclass tagging, the annotation of the individual words in a text with an indication of their morphosyntactic classification. This book describes the state of the art in syntactic wordclass tagging. As an attempt to give an overall view of the field, this book is of interest to (at least) two, possibly very different, types of reader. The first type consists of those people who are using, or are planning to use, tagged material and taggers. They will want to know what the possibilities and impossibilities of tagging are, but are not necessarily interested in the internal working of automatic taggers. This, on the other hand, is the main interest of our second type of reader, the builders of automatic taggers and other natural language processing software.


CONTENT

I The User's View -- 1 Orientation -- 2 A Short History of Tagging -- 3 The Use of Tagging -- 4 Tagsets -- 5 Standards for Tagsets -- 6 Performance of Taggers -- 7 Selection and Operation of Taggers -- II The Implementer's View -- 8 Automatic Taggers: An Introduction -- 9 Tokenization -- 10 Lexicons for Tagging -- 11 Standardization in the Lexicon -- 12 Morphological Analysis -- 13 Tagging Unknown Words -- 14 Hand-Crafted Rules -- 15 Corpus-Based Rules -- 16 Hidden Markov Models -- 17 Machine Learning Approaches -- Appendix A: Example tagsets -- A.1 The Brown Corpus tagset -- A.2 The Penn Treebank tagset -- A.3 The EngCG tagset -- References


SUBJECT

  1. Linguistics
  2. Information storage and retrieval
  3. Artificial intelligence
  4. Probabilities
  5. Computational linguistics
  6. Linguistics
  7. Computational Linguistics
  8. Information Storage and Retrieval
  9. Artificial Intelligence (incl. Robotics)
  10. Probability Theory and Stochastic Processes