Neural Network Methods for Natural Language Processing, 1st Edition, by Yoav Goldberg – Ebook PDF, Instant Download/Delivery
Product details:
ISBN-10: 3031021657
ISBN-13: 9783031021657
Author: Yoav Goldberg
Neural Network Methods for Natural Language Processing, 1st Edition – Table of Contents:
- The Challenges of Natural Language Processing
- Neural Networks and Deep Learning
- Deep Learning in NLP
- Success Stories
- Coverage and Organization
- What’s not Covered
- A Note on Terminology
- Mathematical Notation
- Supervised Classification and Feed-forward Neural Networks
- Learning Basics and Linear Models
- Supervised Learning and Parameterized Functions
- Train, Test, and Validation Sets
- Linear Models
- Binary Classification
- Log-linear Binary Classification
- Multi-class Classification
- Representations
- One-Hot and Dense Vector Representations
- Log-linear Multi-class Classification
- Training as Optimization
- Loss Functions
- Regularization
- Gradient-based Optimization
- Stochastic Gradient Descent
- Worked-out Example
- Beyond SGD
- From Linear Models to Multi-layer Perceptrons
- Limitations of Linear Models: The XOR Problem
- Nonlinear Input Transformations
- Kernel Methods
- Trainable Mapping Functions
- Feed-forward Neural Networks
- A Brain-inspired Metaphor
- In Mathematical Notation
- Representation Power
- Common Nonlinearities
- Loss Functions
- Regularization and Dropout
- Similarity and Distance Layers
- Embedding Layers
- Neural Network Training
- The Computation Graph Abstraction
- Forward Computation
- Backward Computation (Derivatives, Backprop)
- Software
- Implementation Recipe
- Network Composition
- Practicalities
- Choice of Optimization Algorithm
- Initialization
- Restarts and Ensembles
- Vanishing and Exploding Gradients
- Saturation and Dead Neurons
- Shuffling
- Learning Rate
- Minibatches
- Working with Natural Language Data
- Features for Textual Data
- Typology of NLP Classification Problems
- Features for NLP Problems
- Directly Observable Properties
- Inferred Linguistic Properties
- Core Features vs. Combination Features
- Ngram Features
- Distributional Features
- Case Studies of NLP Features
- Document Classification: Language Identification
- Document Classification: Topic Classification
- Document Classification: Authorship Attribution
- Word-in-context: Part of Speech Tagging
- Word-in-context: Named Entity Recognition
- Word in Context, Linguistic Features: Preposition Sense Disambiguation
- Relation Between Words in Context: Arc-Factored Parsing
- From Textual Features to Inputs
- Encoding Categorical Features
- One-hot Encodings
- Dense Encodings (Feature Embeddings)
- Dense Vectors vs. One-hot Representations
- Combining Dense Vectors
- Window-based Features
- Variable Number of Features: Continuous Bag of Words
- Relation Between One-hot and Dense Vectors
- Odds and Ends
- Distance and Position Features
- Padding, Unknown Words, and Word Dropout
- Feature Combinations
- Vector Sharing
- Dimensionality
- Embeddings Vocabulary
- Network’s Output
- Example: Part-of-Speech Tagging
- Example: Arc-factored Parsing
- Language Modeling
- The Language Modeling Task
- Evaluating Language Models: Perplexity
- Traditional Approaches to Language Modeling
- Further Reading
- Limitations of Traditional Language Models
- Neural Language Models
- Using Language Models for Generation
- Byproduct: Word Representations
- Pre-trained Word Representations
- Random Initialization
- Supervised Task-specific Pre-training
- Unsupervised Pre-training
- Using Pre-trained Embeddings
- Word Embedding Algorithms
- Distributional Hypothesis and Word Representations
- From Neural Language Models to Distributed Representations
- Connecting the Worlds
- Other Algorithms
- The Choice of Contexts
- Window Approach
- Sentences, Paragraphs, or Documents
- Syntactic Window
- Multilingual
- Character-based and Sub-word Representations
- Dealing with Multi-word Units and Word Inflections
- Limitations of Distributional Methods
- Using Word Embeddings
- Obtaining Word Vectors
- Word Similarity
- Word Clustering
- Finding Similar Words
- Similarity to a Group of Words
- Odd-one Out
- Short Document Similarity
- Word Analogies
- Retrofitting and Projections
- Practicalities and Pitfalls
- Case Study: A Feed-forward Architecture for Sentence Meaning Inference
- Natural Language Inference and the SNLI Dataset
- A Textual Similarity Network
- Specialized Architectures
- Ngram Detectors: Convolutional Neural Networks
- Basic Convolution + Pooling
- 1D Convolutions Over Text
- Vector Pooling
- Variations
- Alternative: Feature Hashing
- Hierarchical Convolutions
- Recurrent Neural Networks: Modeling Sequences and Stacks
- The RNN Abstraction
- RNN Training
- Common RNN Usage-patterns
- Acceptor
- Encoder
- Transducer
- Bidirectional RNNs (biRNN)
- Multi-layer (stacked) RNNs
- RNNs for Representing Stacks
- A Note on Reading the Literature
- Concrete Recurrent Neural Network Architectures
- CBOW as an RNN
- Simple RNN
- Gated Architectures
- LSTM
- GRU
- Other Variants
- Dropout in RNNs
- Modeling with Recurrent Networks
- Acceptors
- Sentiment Classification
- Subject-verb Agreement Grammaticality
- RNNs as Feature Extractors
- Part-of-speech Tagging
- RNN–CNN Document Classification
- Arc-factored Dependency Parsing
- Conditioned Generation
- RNN Generators
- Training Generators
- Conditioned Generation (Encoder-Decoder)
- Sequence to Sequence Models
- Applications
- Other Conditioning Contexts
- Unsupervised Sentence Similarity
- Conditioned Generation with Attention
- Computational Complexity
- Interpretability
- Attention-based Models in NLP
- Machine Translation
- Morphological Inflection
- Syntactic Parsing
- Additional Topics
- Modeling Trees with Recursive Neural Networks
- Formal Definition
- Extensions and Variations
- Training Recursive Neural Networks
- A Simple Alternative–Linearized Trees
- Outlook
- Structured Output Prediction
- Search-based Structured Prediction
- Structured Prediction with Linear Models
- Nonlinear Structured Prediction
- Probabilistic Objective (CRF)
- Approximate Search
- Reranking
- See Also
- Greedy Structured Prediction
- Conditional Generation as Structured Output Prediction
- Examples
- Search-based Structured Prediction: First-order Dependency Parsing
- Neural-CRF for Named Entity Recognition
- Approximate NER-CRF With Beam-Search
- Cascaded, Multi-task and Semi-supervised Learning
- Model Cascading
- Multi-task Learning
- Training in a Multi-task Setup
- Selective Sharing
- Word-embeddings Pre-training as Multi-task Learning
- Multi-task Learning in Conditioned Generation
- Multi-task Learning as Regularization
- Caveats
- Semi-supervised Learning
- Examples
- Gaze-prediction and Sentence Compression
- Arc Labeling and Syntactic Parsing
- Preposition Sense Disambiguation and Preposition Translation Prediction
- Conditioned Generation: Multilingual Machine Translation, Parsing, and Image Captioning
- Outlook
- Conclusion
- What Have We Seen?
- The Challenges Ahead