Neural Network Methods for Natural Language Processing 1st edition by Yoav Goldberg – Ebook PDF Instant Download/Delivery. ISBN: 9783031021657, 3031021657
Full download of Neural Network Methods for Natural Language Processing 1st edition is available after payment.
Product details:
ISBN 10: 3031021657
ISBN 13: 9783031021657
Author: Yoav Goldberg
Neural networks are a family of powerful machine learning models. This book focuses on the application of neural network models to natural language data. The first half of the book (Parts I and II) covers the basics of supervised machine learning and feed-forward neural networks, the basics of working with machine learning over language data, and the use of vector-based rather than symbolic representations for words. It also covers the computation-graph abstraction, which makes it easy to define and train arbitrary neural networks and is the basis for the design of contemporary neural network software libraries. The second half of the book (Parts III and IV) introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models. These architectures and techniques are the driving force behind state-of-the-art algorithms for machine translation, syntactic parsing, and many other applications. Finally, it also discusses tree-shaped networks, structured prediction, and the prospects of multi-task learning.
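As a taste of the computation-graph style the description refers to, the short sketch below (not taken from the book; it assumes the PyTorch library and arbitrary toy dimensions) defines a small feed-forward bag-of-words classifier, runs a forward pass, and lets automatic differentiation carry out the backward computation and a single SGD update:

# Minimal sketch, assuming PyTorch; sizes and data are toy placeholders.
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim, num_classes = 10000, 100, 64, 5

model = nn.Sequential(
    nn.EmbeddingBag(vocab_size, embed_dim),   # continuous bag-of-words over word ids
    nn.Linear(embed_dim, hidden_dim),
    nn.Tanh(),                                # a common nonlinearity
    nn.Linear(hidden_dim, num_classes),       # class scores fed to the loss
)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

tokens = torch.randint(0, vocab_size, (8, 20))   # toy batch: 8 "documents" of 20 word ids
labels = torch.randint(0, num_classes, (8,))

logits = model(tokens)          # forward computation over the graph
loss = loss_fn(logits, labels)  # loss function (training as optimization)
optimizer.zero_grad()
loss.backward()                 # backward computation (derivatives, backprop)
optimizer.step()                # one stochastic gradient descent update

Whatever library is used, the pattern is the same: declare the graph, compute a loss, and backpropagate through it, which is exactly the abstraction the book builds on.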
Neural Network Methods for Natural Language Processing 1st edition Table of Contents:
The Challenges of Natural Language Processing
Neural Networks and Deep Learning
Deep Learning in NLP
Success Stories
Coverage and Organization
What’s not Covered
A Note on Terminology
Mathematical Notation
Supervised Classification and Feed-forward Neural Networks
Learning Basics and Linear Models
Supervised Learning and Parameterized Functions
Train, Test, and Validation Sets
Linear Models
Binary Classification
Log-linear Binary Classification
Multi-class Classification
Representations
One-Hot and Dense Vector Representations
Log-linear Multi-class Classification
Training as Optimization
Loss Functions
Regularization
Gradient-based Optimization
Stochastic Gradient Descent
Worked-out Example
Beyond SGD
From Linear Models to Multi-layer Perceptrons
Limitations of Linear Models: The XOR Problem
Nonlinear Input Transformations
Kernel Methods
Trainable Mapping Functions
Feed-forward Neural Networks
A Brain-inspired Metaphor
In Mathematical Notation
Representation Power
Common Nonlinearities
Loss Functions
Regularization and Dropout
Similarity and Distance Layers
Embedding Layers
Neural Network Training
The Computation Graph Abstraction
Forward Computation
Backward Computation (Derivatives, Backprop)
Software
Implementation Recipe
Network Composition
Practicalities
Choice of Optimization Algorithm
Initialization
Restarts and Ensembles
Vanishing and Exploding Gradients
Saturation and Dead Neurons
Shuffling
Learning Rate
Minibatches
Working with Natural Language Data
Features for Textual Data
Typology of NLP Classification Problems
Features for NLP Problems
Directly Observable Properties
Inferred Linguistic Properties
Core Features vs. Combination Features
Ngram Features
Distributional Features
Case Studies of NLP Features
Document Classification: Language Identification
Document Classification: Topic Classification
Document Classification: Authorship Attribution
Word-in-context: Part of Speech Tagging
Word-in-context: Named Entity Recognition
Word-in-context, Linguistic Features: Preposition Sense Disambiguation
Relation Between Words in Context: Arc-Factored Parsing
From Textual Features to Inputs
Encoding Categorical Features
One-hot Encodings
Dense Encodings (Feature Embeddings)
Dense Vectors vs. One-hot Representations
Combining Dense Vectors
Window-based Features
Variable Number of Features: Continuous Bag of Words
Relation Between One-hot and Dense Vectors
Odds and Ends
Distance and Position Features
Padding, Unknown Words, and Word Dropout
Feature Combinations
Vector Sharing
Dimensionality
Embeddings Vocabulary
Network’s Output
Example: Part-of-Speech Tagging
Example: Arc-factored Parsing
Pre-trained Word Representations
Language Modeling
The Language Modeling Task
Evaluating Language Models: Perplexity
Traditional Approaches to Language Modeling
Further Reading
Limitations of Traditional Language Models
Neural Language Models
Using Language Models for Generation
Byproduct: Word Representations
Pre-trained Word Representations
Random Initialization
Supervised Task-specific Pre-training
Unsupervised Pre-training
Using Pre-trained Embeddings
Word Embedding Algorithms
Distributional Hypothesis and Word Representations
From Neural Language Models to Distributed Representations
Connecting the Worlds
Other Algorithms
The Choice of Contexts
Window Approach
Sentences, Paragraphs, or Documents
Syntactic Window
Multilingual
Character-based and Sub-word Representations
Dealing with Multi-word Units and Word Inflections
Limitations of Distributional Methods
Using Word Embeddings
Obtaining Word Vectors
Word Similarity
Word Clustering
Finding Similar Words
Similarity to a Group of Words
Odd-one Out
Short Document Similarity
Word Analogies
Retrofitting and Projections
Practicalities and Pitfalls
Case Study: A Feed-forward Architecture for Sentence Meaning Inference
Natural Language Inference and the SNLI Dataset
A Textual Similarity Network
Specialized Architectures
Ngram Detectors: Convolutional Neural Networks
Basic Convolution + Pooling
1D Convolutions Over Text
Vector Pooling
Variations
Alternative: Feature Hashing
Hierarchical Convolutions
Recurrent Neural Networks: Modeling Sequences and Stacks
The RNN Abstraction
RNN Training
Common RNN Usage-patterns
Acceptor
Encoder
Transducer
Bidirectional RNNs (biRNN)
Multi-layer (stacked) RNNs
RNNs for Representing Stacks
A Note on Reading the Literature
Concrete Recurrent Neural Network Architectures
CBOW as an RNN
Simple RNN
Gated Architectures
LSTM
GRU
Other Variants
Dropout in RNNs
Modeling with Recurrent Networks
Acceptors
Sentiment Classification
Subject-verb Agreement Grammaticality
RNNs as Feature Extractors
Part-of-speech Tagging
RNN–CNN Document Classification
Arc-factored Dependency Parsing
Conditioned Generation
RNN Generators
Training Generators
Conditioned Generation (Encoder-Decoder)
Sequence to Sequence Models
Applications
Other Conditioning Contexts
Unsupervised Sentence Similarity
Conditioned Generation with Attention
Computational Complexity
Interpretability
Attention-based Models in NLP
Machine Translation
Morphological Inflection
Syntactic Parsing
Additional Topics
Modeling Trees with Recursive Neural Networks
Formal Definition
Extensions and Variations
Training Recursive Neural Networks
A Simple Alternative–Linearized Trees
Outlook
Structured Output Prediction
Search-based Structured Prediction
Structured Prediction with Linear Models
Nonlinear Structured Prediction
Probabilistic Objective (CRF)
Approximate Search
Reranking
See Also
Greedy Structured Prediction
Conditional Generation as Structured Output Prediction
Examples
Search-based Structured Prediction: First-order Dependency Parsing
Neural-CRF for Named Entity Recognition
Approximate NER-CRF With Beam-Search
Cascaded, Multi-task and Semi-supervised Learning
Model Cascading
Multi-task Learning
Training in a Multi-task Setup
Selective Sharing
Word-embeddings Pre-training as Multi-task Learning
Multi-task Learning in Conditioned Generation
Multi-task Learning as Regularization
Caveats
Semi-supervised Learning
Examples
Gaze-prediction and Sentence Compression
Arc Labeling and Syntactic Parsing
Preposition Sense Disambiguation and Preposition Translation Prediction
Conditioned Generation: Multilingual Machine Translation, Parsing, and Image Captioning
Outlook
Conclusion
What Have We Seen?
The Challenges Ahead
Bibliography
Author’s Biography
People also search for Neural Network Methods for Natural Language Processing 1st edition:
natural language processing techniques
neural networks for natural language processing
natural language method
yoav goldberg neural network