Inductive Databases and Constraint-Based Data Mining 1st Edition by Saso Dzeroski, Bart Goethals, Pance Panov
Product details:
ISBN 10: 1441977384
ISBN 13: 9781441977380
Author: Saso Dzeroski, Bart Goethals, Pance Panov
This book is about inductive databases and constraint-based data mining, emerging research topics lying at the intersection of data mining and database research. The aim of the book is to provide an overview of the state of the art in this novel and exciting research area. Of special interest are the recent methods for constraint-based mining of global models for prediction and clustering, the unification of pattern mining approaches through constraint programming, the clarification of the relationship between mining local patterns and global models, and the proposed integrative frameworks and approaches for inductive databases. On the application side, applications to practically relevant problems from bioinformatics are presented.
Inductive databases (IDBs) represent a database view on data mining and knowledge discovery. IDBs contain not only data, but also generalizations (patterns and models) valid in the data. In an IDB, ordinary queries can be used to access and manipulate data, while inductive queries can be used to generate (mine), manipulate, and apply patterns and models. In the IDB framework, patterns and models become “first-class citizens” and KDD becomes an extended querying process in which both the data and the patterns/models that hold in the data are queried.
Table of contents:
Part I Introduction
Chapter 1 Inductive Databases and Constraint-based Data Mining: Introduction and Overview
1.1 Inductive Databases
1.1.1 Inductive Databases and Queries: An Example
1.1.2 Inductive Queries and Constraints
1.1.3 The Promise of Inductive Databases
1.2 Constraint-based Data Mining
1.2.1 Basic Data Mining Entities
1.2.2 The Task(s) of (Constraint-Based) Data Mining
1.3 Types of Constraints
1.3.1 Primitive and Composite Constraints
1.3.2 Language and Evaluation Constraints
1.3.3 Hard, Soft and Optimization Constraints
1.4 Functions Used in Constraints
1.4.1 Language Cost Functions
1.4.2 Evaluation Functions
1.4.3 Monotonicity and Closedness
1.5 KDD Scenarios
1.6 A Brief Review of Literature Resources
1.7 The IQ (Inductive Queries for Mining Patterns and Models) Project
1.7.1 Background (The cInQ project)
1.7.2 IQ Project Consortium and Structure
1.7.3 Major Results of the IQ project
1.8 What’s in this Book
1.8.1 Introduction
1.8.2 Constraint-based Data Mining: Selected Techniques
1.8.3 Inductive Databases: Integration Approaches
1.8.4 Applications
References
Chapter 2 Representing Entities in the OntoDM Data Mining Ontology
2.1 Introduction
2.2 Design Principles for the OntoDM ontology
2.2.1 Motivation
2.2.2 OntoDM design principles
2.2.3 Ontologies for representing scientific investigations
2.3 OntoDM Structure and Implementation
2.3.1 Upper level is-a hierarchy
2.3.2 Ontological relations
2.3.3 Modularity: Specification, implementation, application
2.4 Identification of Data Mining Entities
2.4.1 A general framework for data mining: Basic principles
2.4.2 Data
2.4.3 Generalizations
2.4.4 Data mining task
2.4.5 Data mining algorithms
2.4.6 OntoDM modeling issues
2.5 Representing Data Mining Entities in OntoDM
2.5.1 Specification entities in OntoDM
2.5.2 Implementation and application entities in OntoDM
2.6 Related Work
2.7 Conclusion
References
Chapter 3 A Practical Comparative Study Of Data Mining Query Languages
3.1 Introduction
3.2 Data Mining Tasks
3.3 Comparison of Data Mining Query Languages
3.3.1 DMQL
3.3.2 MSQL
3.3.3 MINE RULE
3.3.4 SIQL
3.3.5 SPQL
3.3.6 DMX
3.4 Summary of the Results
3.5 Conclusions
References
Chapter 4 A Theory of Inductive Query Answering
4.1 Introduction
4.2 Boolean Inductive Queries
4.2.1 Predicates
4.2.2 Illustrations of Inductive Querying
4.2.3 A General Framework
4.3 Generalized Version Spaces
4.4 Query Decomposition
4.4.1 Query Plans
4.4.2 Canonical Decomposition
4.5 Normal Forms
4.6 Conclusions
References
Part II Constraint-based Mining: Selected Techniques
Chapter 5 Generalizing Itemset Mining in a Constraint Programming Setting
5.1 Introduction
5.2 General Concepts
5.3 Specialized Approaches
5.4 A Generalized Algorithm
5.5 A Dedicated Solver
5.5.1 Principles
5.5.2 Case study on formal concepts and fault-tolerant patterns
5.6 Using Constraint Programming Systems
5.6.1 Principles
5.6.2 Case study on formal concepts and fault-tolerant patterns
5.7 Conclusions
References
Chapter 6 From Local Patterns to Classification Models
6.1 Introduction
6.2 Preliminaries
6.3 Correlated Patterns
6.3.1 Upper Bound
6.3.2 Top-k Correlated Pattern Mining
6.3.3 Correlation Measures
6.3.4 Type I Errors
6.3.5 Closed and Free Pattern Mining
6.4 Finding Pattern Sets
6.4.1 Constrained Pattern Set Mining
6.4.2 The Chosen Few
6.4.3 Turning Pattern Sets Maximally Informative by Post-Processing
6.5 Direct Predictions from Patterns
6.5.1
6.5.2
6.5.3
6.6 Integrated Pattern Mining
6.6.1
6.6.2 Mining Maximally Informative Pattern Sets Directly
6.6.3
6.6.4
6.7 Conclusions
References
Chapter 7 Constrained Predictive Clustering
7.1 Introduction
7.2 Predictive Clustering Trees
7.2.1 Clustering and Intra-cluster Variance
7.2.2 Clustering Trees
7.2.3 Predictive Clustering and Predictive Clustering Trees
7.2.4 Learning (Predictive) Clustering Trees
7.2.5 Instantiations of (Predictive) Clustering Trees
7.3 Constrained Predictive Clustering Trees and Constraint Types
7.3.1 Cluster Level Constraints
7.3.2 Constraints on Clusterings
7.3.3 Constraints on Clustering Models
7.3.4 Hard Versus Soft Constrained Clustering
7.4 A Search Space of (Predictive) Clustering Trees
7.5 Algorithms for Enforcing Constraints
7.5.1 Post Pruning
7.5.2 Beam Search
7.5.3 Instance Level Constraints
7.6 Conclusion
References
Chapter 8 Finding Segmentations of Sequences
8.1 Introduction
8.2 Efficient Algorithms for Segmentation
8.3 Dimensionality Reduction
8.4 Recurrent Models
8.5 Unimodal Segmentation
8.6 Rearranging the Input Data Points
8.7 Aggregate Segmentation
8.8 Evaluating the Quality of a Segmentation: Randomization
8.9 Model Selection by BIC and Cross-validation
8.10 Bursty Sequences
8.11 Conclusion
References
Chapter 9 Mining Constrained Cross-Graph Cliques in Dynamic Networks
9.1 Introduction
9.2 Problem Setting
9.3 DATA-PEELER
9.3.1 Traversing the Search Space
9.3.2 Piecewise (Anti)-Monotone Constraints
9.4 Extracting Contiguous Closed Sets
9.4.1 A Piecewise (Anti)-Monotone Constraint...
9.4.2 ... Partially Handled in Another Way
9.4.3 Enforcing the Closedness
9.5 Constraining the Enumeration to Extract Cliques
9.5.1 A Piecewise (Anti)-Monotone Constraint...
9.5.2 ... Better Handled in Another Way
9.5.3 Constraining the Enumeration
9.5.4 Contraposition of the Enumeration Constraints
9.5.5 Enforcing the Symmetric Closedness
9.6 Experimental Results
9.6.1 Presentation of the Vélo’v Dataset
9.6.2 Extracting Cliques Via Enumeration Constraints
9.6.3 Extraction of Contiguous Closed 3-Cliques
9.6.4 Qualitative Validation
9.7 Related Work
9.8 Conclusion
References
Chapter 10 Probabilistic Inductive Querying Using ProbLog
10.1 Introduction
10.2 ProbLog: Probabilistic Prolog
10.3 Probabilistic Inference
10.3.1 Exact Inference
10.3.2 Bounded Approximation
10.3.3 K-Best
10.3.4 Monte Carlo
10.4 Implementation
10.4.1 Source-to-source transformation
10.4.2 Tries
10.4.3 Binary Decision Diagrams
10.4.4 Monte Carlo
10.5 Probabilistic Explanation Based Learning
10.6 Local Pattern Mining
10.7 Theory Compression
10.8 Parameter Estimation
10.9 Application
10.10 Related Work in Statistical Relational Learning
10.11 Conclusions
References
Part III Inductive Databases: Integration Approaches
Chapter 11 Inductive Querying with Virtual Mining Views
11.1 Introduction
11.2 The Mining Views Framework
11.2.1 The Mining View Concepts
11.2.2 Representing Patterns and Models as Sets of Concepts
11.2.3 Putting It All Together
11.2.4 Mining Views vs. Data Mining Tasks
11.2.5 Conclusions
11.3 An Illustrative Scenario
11.3.1 Implementation
11.3.2 Scenario
11.4 Conclusions and Future Work
References
Chapter 12 SINDBAD and SiQL: Overview, Applications and Future Developments
12.1 Introduction
12.2 SiQL
12.2.1 Preliminaries
12.2.2 Main Ideas
12.2.3 The Query
12.2.4 The Query
12.2.5 Parsing and Executing SiQL Queries
12.3 Example Applications
12.3.1 Gene Expression Analysis
12.3.2 Gene Regulation Prediction
12.3.3 Structure-Activity Relationships
12.4 A Web Service Interface for SINDBAD
12.4.1 Web Services
12.4.2 Motivation
12.4.3 Features
12.5 Future Developments
12.5.1 Types and Signatures
12.5.2 Integration of Mining Views
12.5.3 String Mining
12.6 Conclusion
References
Chapter 13 Patterns on Queries
13.1 Introduction
13.2 Preliminaries
13.2.1 Data
13.2.2 Models
13.2.3 Algorithms
13.3 Frequent Item Set Mining
13.3.1 Selection
13.3.2 Project
13.3.3 EquiJoin
13.3.4 Discussion
13.4 Transforming
13.4.1 Model Approximation
13.4.2 Transforming
13.4.3 The Experiments
13.4.4 The Results
13.4.5 Discussion
13.5 Comparing the two Approaches
13.6 Conclusions and Prospects for Further Research
References
Chapter 14 Experiment Databases
14.1 Introduction
14.2 Motivation
14.2.1 Reproducibility and Reuse
14.2.2 Generalizability and Interpretation
14.2.3 Experiment Databases
14.2.4 Overview of Benefits
14.3 Related Work
14.3.1 e-Sciences
14.3.2 Extension to Machine Learning
14.4 A Pilot Experiment Database
14.4.1 Conceptual Framework
14.4.2 Using the Database
14.4.3 Populating the Database
14.5 Learning from the Past
14.5.1 Model-level Analysis
14.5.2 Data-level Analysis
14.5.3 Method-level Analysis
14.6 Conclusions
References
Part IV Applications
Chapter 15 Predicting Gene Function using Predictive Clustering Trees
15.1 Introduction
15.2 Related Work
15.3 Predictive Clustering Tree Approaches for HMC
15.3.1 Formal Task Description
15.3.2 Clus-HMC: An HMC Decision Tree Learner
15.3.3 Clus-SC: Learning a Separate Tree for Each Class
15.3.4 Clus-HSC: Learning a Separate Tree for Each Hierarchy Edge
15.3.5 Ensembles of Predictive Clustering Trees
15.4 Evaluation Measure
15.5 Datasets
15.5.1 Saccharomyces cerevisiae datasets
15.5.2 Arabidopsis thaliana datasets
15.6 Comparison of Clus-HMC/SC/HSC
15.7 Comparison of (Ensembles of) CLUS-HMC to State-of-the-art Methods
15.7.1 Comparison of CLUS-HMC to Decision Tree based Approaches
15.7.2 Comparison of Ensembles of CLUS-HMC to an SVM based Approach
15.8 Conclusions
References
Chapter 16 Analyzing Gene Expression Data with Predictive Clustering Trees
16.1 Introduction
16.2 Datasets
16.2.1 Liver cancer dataset
16.2.2 Huntington’s disease dataset
16.2.3 Neuroblastoma dataset
16.2.4 Yeast time series expression data
16.3 Predicting Multiple Clinical Parameters
16.3.1 Huntington disease progress
16.3.2 Neuroblastoma recurrence
16.4 Evaluating Gene Importance with Ensembles of PCTs
16.4.1 Feature ranking with multi-target Random Forests
16.4.2 Gene importance in Neuroblastoma
16.5 Constrained Clustering of Gene Expression Data
16.5.1 Predictive clustering of gene expression profiles
16.5.2 Itemset constrained clustering
16.6 Clustering gene expression time series data
16.6.1 PCTs for clustering short time-series
16.6.2 Explained groups of yeast time-course gene expression profiles
16.7 Conclusions
References
Chapter 17 Using a Solver Over the String Pattern Domain to Analyze Gene Promoter Sequences
17.1 Introduction
17.2 A Promoter Sequence Analysis Scenario
17.2.1 A generic scenario
17.2.2 Instantiation of the abstract scenario
17.3 The Solver
17.4 Tuning the Extraction Parameters
17.5 An Objective Interestingness Measure
17.6 Execution of the Scenario
17.6.1 Data preparation
17.6.2 Parameter tuning
17.6.3 Post-processing and biological pattern discovery
17.7 Conclusion
References
Chapter 18 Inductive Queries for a Drug Designing Robot Scientist
18.1 Introduction
18.2 The Robot Scientist Eve
18.2.1 Eve’s Robotics
18.2.2 Compound Library and Screening
18.2.3 QSAR Learning
18.3 Representations of Molecular Data
18.3.1 Traditional Representations
18.3.2 Graph Mining
18.3.3 Inductive Logic Programming
18.4 Selecting Compounds for a Drug Screening Library
18.5 Active learning
18.5.1 Selection strategies
18.5.2 Effects of properties of experimental equipment
18.6 Conclusions