Classification and Multivariate Analysis for Complex Data Structures, 1st Edition, by Bernard Fichet, Domenico Piccolo, Rosanna Verde. ISBN 3642133126, 9783642133121
Product details:
ISBN 10: 3642133126
ISBN 13: 9783642133121
Authors: Bernard Fichet, Domenico Piccolo, Rosanna Verde
The growing capability to generate and collect data has created an urgent need for new techniques and tools to analyze, classify and summarize statistical information, to discover and characterize trends, and to automatically flag anomalies. This volume presents the latest advances in data analysis methods for multidimensional data that can have a complex structure. The book offers a selection of papers presented at the first Joint Meeting of the Société Francophone de Classification and the Classification and Data Analysis Group of the Italian Statistical Society. Special attention is paid to new methodological contributions, from both the theoretical and the applied points of view, in the fields of Clustering, Classification, Time Series Analysis, Multidimensional Data Analysis, Knowledge Discovery from Large Datasets, and Spatial Statistics.
Classification and Multivariate Analysis for Complex Data Structures, 1st Edition: Table of contents
Part I Key Notes
Principal Component Analysis for Categorical Histogram Data: Some Open Directions of Research
Edwin Diday
1 Introduction
2 The Categorical Histogram Data Table
3 Building “Metabins” by Scoring the Bins in Case of Nominal Histogram Variables
3.1 What Are “Metabins”?
3.2 Metabins Quality Criteria, Correlation or Copulas Based
4 PCA for Histogram Data Table Using Copulas
4.1 The Standard PCA on Histogram Data
4.2 The Copular PCA
5 Data Tables Derived from the Initial Categorical Histogram Data Table
5.1 The Use of the Derived Data Tables
5.2 Representation of Supplementary Individuals, Variables and Metabins
5.3 Representation of the Variation of the Individuals According to the Bins, the Metabins and the
5.4 “Histogram Stars” Associated to the Bins, the Metabins or the Variables, as Output of the PCA
5.5 Representation of the Symbolic Variables in a Hypercube
6 Conclusion
References
Factorial Conjoint Analysis Based Methodologies
Giuseppe Giordano, Carlo N. Lauro, and Germana Scepi
1 Introduction
2 The Metric Approach to Conjoint Analysis
3 The Factorial Conjoint Analysis
4 The FCA with Two Informative Structures
5 Cluster Based Factorial Conjoint Analysis
6 Multi Criteria Factorial Conjoint Analysis
7 Some Conclusions
References
Ordering and Scaling Objects in Multivariate Data Under Nonlinear Transformations of Variables
Jacqueline J. Meulman, Lawrence J. Hubert, and Phipps Arabie
1 Introduction
2 Our Data Analytic Problem
2.1 The Three Subtasks and Corresponding Optimization Methods
3 An Empirical Illustration
4 Concluding Remarks
References
Statistical Models to Predict Academic Churn Risk
Paolo Giudici and Emanuele Dequarti
1 Introduction
2 Data Set
3 Methodological Proposal
4 Empirical Evidence
References
The Poisson Processes in Cluster Analysis
André Hardy
1 Introduction
2 A Statistical Model for Cluster Analysis Based on the Homogeneous Poisson Process
2.1 The Homogeneous Poisson Process
2.2 Starting Problem: The Estimation of a Convex Set
2.3 The Hypervolumes Clustering Method
3 The Statistical Model Based on the Non-stationary Poisson Process: The Generalized Hypervolumes C
3.1 The Non-stationary Poisson Process
3.2 The Generalized Hypervolumes Clustering Method
3.3 Estimation of the Intensity of the Non-stationary Poisson Process
4 Statistical Tests for the Number of Clusters Based on the Homogeneous Poisson Point Process
4.1 The Hypervolumes Test
4.2 The Gap Test
5 Monothetic Divisive Clustering Methods Based on the Poisson Processes
5.1 The Model
5.2 The Splitting Process
5.3 The Pruning Method
5.4 The Merging Process
6 Conclusion
References
TWO-CLASS Trees for Non-Parametric Regression Analysis
Roberta Siciliano and Massimo Aria
1 Previous Work
2 TWO-CLASS Trees Methodology
3 Comparative Study
4 Concluding Remarks
References
Part II Classification and Discrimination
Efficient Incorporation of Additional Information to Classification Rules
Miguel Fernández, Cristina Rueda, and Bonifacio Salvador
1 Introduction
2 Discrimination Rules That Incorporate Additional Information
2.1 Restricted Parameter Estimation
2.2 Latent Space Rules
2.3 New Rules Combining Both Approaches
3 Simulations
4 Example. Pima Indians Diabetes Database
5 Conclusions
References
The Choice of the Parameter Values in a Multivariate Model of a Second Order Surface with Heterosced
Umberto Magagnoli and Gabriele Cantaluppi
1 Introduction
2 The Procedure
3 Some Results Obtained by Simulation
4 Conclusions
References
Mixed Mode Data Clustering: An Approach Based on Tetrachoric Correlations
Isabella Morlini
1 Introduction
2 From Binary Variables to Continuous Variables
3 Main Results on a Simulation Study and on a Real Data Set
4 Concluding Remarks
References
Optimal Scaling Trees for Three-Way Data
Valerio A. Tutore
1 Introduction
2 The Data and the Two-Stage Splitting Criterion
3 The Method
4 The Analysis
5 Conclusions
References
Part III Data Mining
A Study on Text Modelling via Dirichlet Compound Multinomial
Concetto Elvio Bonafede and Paola Cerchiello
1 Introduction
2 Background: The Dirichlet Compound Multinomial
3 Application
3.1 Performance of the Dirichlet Compound Multinomial
4 Conclusion
References
Automatic Multilevel Thresholding Based on a Fuzzy Entropy Measure
D. Bruzzese and U. Giani
1 Introduction
2 Fuzzy Set Theory and Histogram Thresholding
3 Automatic Multilevel Fuzzy Histogram Thresholding
4 Experimental Results
5 Concluding Remarks
References
Some Developments in Forward Search Clustering
Daniela G. Calò
1 Introduction
2 A Model-Based Formulation of FS Clustering
2.1 An Illustrative Example
3 An Additional Advantage of Mixture-Based Forward Search
4 Concluding Remarks
References
Spectral Graph Theory Tools for Social Network Comparison
Domenico De Stefano
1 Introduction
2 Network Comparison in Social Network Analysis
3 The Graph Embedding Approach for Social Networks Comparison
4 A Simulated Applicative Example
5 Concluding Remarks
References
Improving the MHIST-p Algorithm for Multivariate Histograms of Continuous Data
Mauro Iacono and Antonio Irpino
1 Introduction
2 Motivation
3 Histograms and Related Works
4 Improving MHIST-p
5 An Application on Real and Artificial Datasets
6 Conclusions and Perspectives
References
On Building and Visualizing Proximity Graphs for Large Data Sets with Artificial Ants
Julien Lavergne, Hanane Azzag, Christiane Guinot, and Gilles Venturini
1 Introduction
2 Initial Bio-Inspired Algorithm
3 Hierarchical Approach and Visualization
4 Results
5 Conclusions
References
Including Empirical Prior Information in Test Administration
Mariagiulia Matteucci and Bernard P. Veldkamp
1 Introduction
2 Joint Modelling of Measurement Model and Prior Information
3 Simulation Study
4 Real Data Application
5 Concluding Remarks
References
Part IV Robustness and Classification
Italian Firms’ Geographical Location in High-tech Industries: A Robust Analysis
Matilde Bini and Margherita Velucchi
1 Introduction
2 The Model: A Robust Approach to Detect Outliers
3 Data Set Description and Some Results
4 Conclusive Remarks
References
Robust Tests for Pareto Density Estimation
Aldo Corbellini and Lisa Crosato
1 Introduction
2 Robust Stepwise Fitting of the Pareto II Distribution
3 Forward Chi-Square Test
4 Concluding Remarks
References
Bootstrap and Nonparametric Predictors to Impute Missing Data
Agostino Di Ciaccio
1 Introduction
2 Multiple Imputation
3 Bootstrap and Missing Data Imputation
4 Bagged Trees to Predict Missing Data
References
On the Use of Boosting Procedures to Predict the Risk of Default
Giovanna Menardi, Federico Tedeschi and Nicola Torelli
1 Introduction
2 Boosting Overview
2.1 Boosting in Presence of Unbalanced Classes
3 The ROSEBoost Approach for Dealing with Class Imbalance
4 Some Real Data Applications
5 Discussion and Concluding Remarks
References
Part V Categorical Data and Latent Class Approach
Assessing Similarity of Rating Distributions by Kullback-Leibler Divergence
Marcella Corduas
1 Introduction
2 The CUB Model
3 Assessing Similarity of CUB Models
4 An Application
5 Final Remarks
References
Sector Classification in Stock Markets: A Latent Class Approach
Michele Costa and Luca De Angelis
1 Introduction
2 Methodology
3 Model Estimation
4 The New Stock’s Classification
5 Conclusions
References
Partitioning the Geometric Variability in Multivariate Analysis and Contingency Tables
Carles M. Cuadras and Daniel Cuadras
1 Introduction
2 Geometric Variability
2.1 Finite Set
2.2 Random Vector
2.3 Mixtures
3 Distance-Based Analysis of Variance
4 Contingency Tables
4.1 General Approach
4.2 Correspondence Analysis (Centered = Uncentered)
4.3 Hellinger Distance Analysis (Centered and Uncentered)
4.4 Nonsymmetrical CA (Centered = Uncentered)
4.5 Log-Ratio Analysis (Centered and Uncentered)
4.6 Double Centered Log-Ratio Analysis
References
One-Dimensional Preference Data Imputation Through Transition Rules
Luigi Fabbris
1 Preference Elicitation in Surveys
2 One-Dimensional Preference Rating Method
3 An Application
4 Conclusions
References
About a Type of Quasi Linear Estimating Equation Approach
Giulio D’Epifanio
1 Introduction
2 The Reference Model in GLMM
3 Quasi-Linearization and the Main Note
4 Estimating
4.1 A General Quasi-Linear Estimating System
4.2 A Decomposable Estimating System
4.3 Special Weighting and Approximate Asymptotic Formulas
5 A Comparative Study
6 Concluding Remarks
References
Causal Inference Through Principal Stratification: A Special Type of Latent Class Modelling
Leonardo Grilli
1 Introduction
2 Principal Stratification: Basic Ideas and an Application
3 Principal Stratification and Latent Class Modelling
References
Scaling the Latent Variable Cultural Capital via Item Response Models and Latent Class Analysis
Isabella Sulis, Mariano Porcu, and Marco Pitzalis
1 Introduction
2 The Survey
3 Scaling the Cultural Capital via LCA
4 Assessing the Difficulty Level of the Survey Questionnaire Using a Bidimensional IRT
5 Some Final Remarks
References
Assessment of Latent Class Detection in PLS Path Modeling: a Simulation Study to Evaluate the Group
Laura Trinchera
1 Introduction
2 A New Index to Assess Group Separation in PLS-PM: The Group Quality Index
3 Simulation Study
3.1 Design of the Numerical Example and Data Simulation
3.2 Simulation Study Results
4 Discussion and Conclusions
References
Part VI Latent Variables and Related Methods
Non-Linear Relationships in SEM with Latent Variables: Some Theoretical Remarks and a Case Study
Giuseppe Boari, Gabriele Cantaluppi, and Stefano Bertelli
1 Introduction
2 Presence of Non-Linearity in the Inner Model
3 Presence of Non-Linearity in the Outer Model
3.1 Scaling Problems
3.2 Kano Model Relationships
3.3 An Application and Concluding Remarks
References
Multidimensional Scaling Versus Multiple Correspondence Analysis When Analyzing Categorization Data
Marine Cadoret, Sébastien Lê, and Jérôme Pagès
1 Introduction
2 Methods
2.1 Multidimensional Scaling
2.2 Multiple Correspondence Analysis
2.3 Elements of Comparison Between the Two Methods
3 Application
3.1 Data
3.2 Case of an Object Isolated by All the Subjects
4 Conclusion
References
Multidimensional Scaling as Visualization Tool of Web Sequence Rules
Antonio D’Ambrosio and Marcello Pecoraro
1 Introduction
2 The Idea
3 The Data
4 The MDS Solution
5 Direct and Indirect Sequence Rules
6 Conclusions
References
Partial Compliance, Effect of Treatment on the Treated and Instrumental Variables
Antonio Forcina
1 Introduction
2 A Latent Class Model for Partial Compliance
2.1 Notation
3 The Effect of Treatment on the Treated
3.1 The Instrumental Variable Estimand
4 Discussion
5 Application
References
Method of Quantification for Qualitative Variables and their Use in the Structural Equations Models
C. Lauro, D. Nappo, M.G. Grassia, and R. Miele
1 Introduction
2 Different Ways to Quantify the Qualitative Variables
3 Alternating Least Squares Algorithm
3.1 Alternating Least Squares Algorithm: The Model for AVSI
4 Conclusion
References
Monitoring Panel Performance Within and Between Sensory Experiments by Multi-Way Analysis
Rosaria Romano, Jannie S. Vestergaard, Mohsen Kompany-Zareh, and Wender L.P. Bredie
1 Introduction and Data Description
2 Methods
2.1 Modeling Assessors’ Performance by PARAFAC
2.2 Modeling Panel Predictive Ability by N-PLS
2.3 Modeling Panel Performance Between Experiments
3 Results
4 Conclusions
References
A Proposal for Handling Categorical Predictors in PLS Regression Framework
Giorgio Russolillo and Carlo Natale Lauro
1 Introduction
2 The Univariate Response Case
2.1 PLS1 Algorithm Backgrounds
2.2 A Quantification Criterion for Categorical Predictors in Univariate PLS Regression
3 The Multivariate Response Case
3.1 The PLS2 Algorithm
3.2 Quantifying the Categorical Predictors in the Multivariate Case: The PLS-CAP Algorithm
4 An Application to Real Data: The “Cars” Dataset
5 Conclusions
References
Part VII Symbolic, Multivalued and Conceptual Data Analysis
On the Use of Archetypes and Interval Coding in Sensory Analysis
Maria Rosaria D’Esposito, Francesco Palumbo, and Giancarlo Ragozini
1 Introduction
2 Archetypal Analysis
3 Interval Coding in Sensory Analysis
4 Archetypes for Interval Coded Data
5 An Illustrative Example
References
From Histogram Data to Model Data Analysis
Marina Marino and Simona Signoriello
1 Introduction
2 Histogram Data and Model Data
2.1 Histogram Data
2.2 Model Data
3 Histogram Approximation by B-spline
4 Histogram Transformation Process: An Example
5 Conclusion and Future Work
References
Use of Genetic Algorithms When Computing Variance of Interval Data
Jaromír Antoch and Raffaele Miele
1 Introduction
2 Specific Problem
3 Genetic Algorithm
3.1 General Remarks
3.2 Specification for Our Specific Problem
4 Example
5 Conclusions and Perspectives
References
Spatial Visualization of Conceptual Data
Michel Soto, Bénédicte Le Grand, and Marie-Aude Aufaure
1 Introduction
2 Context
2.1 Formal Concept Analysis and Galois Lattices
2.2 Galois Lattices’ Interpretation
2.3 Related Work
3 Galois Lattice’s Pixel-Oriented Visualization
4 Galois Lattice’s Tree-Based Visualization
4.1 Tree Extraction Algorithm
4.2 Clusters Analysis
5 Conclusion
References
Part VIII Spatial, Temporal, Streaming and Functional Data Analysis
A Test of LBO Firms’ Acquisition Rationale: The French Case
R. Abdesselam, S. Cieply and A.L. Le Nadant
1 Introduction
2 Theoretical Predictions
3 Sample Selection and Methodology
4 Empirical Results
5 Discussion and Conclusion
References
Kernel Intensity for Space-Time Point Processes with Application to Seismological Problems
Giada Adelfio and Marcello Chiodi
1 Introduction
2 Intensity Function and Predictive Likelihood
3 Applications to Seismological Field
3.1 Evaluation of Seismic Gap
4 Conclusive Remarks
References
Summarizing and Mining Streaming Data via a Functional Data Approach
Antonio Balzanella, Elvira Romano, and Rosanna Verde
1 Introduction
2 A Functional Approach for Dealing with Streaming Data
3 Main Results
4 Conclusions
References
Clustering Complex Time Series Databases
Francesco Giordano, Michele La Rocca, and Maria Lucia Parrella
1 Introduction
2 The Clustering Algorithm
3 An Application to a Real Temporal Data Base
References
Use of a Flexible Weight Matrix in a Local Spatial Statistic
Massimo Mucciardi
1 Introduction
2 Local Measure of Spatial Autocorrelation with S-DSMA Procedure
3 Application and Conclusion
References
Constrained Variable Clustering and the Best Basis Problem in Functional Data Analysis
Fabrice Rossi and Yves Lechevallier
1 Introduction
2 Best Basis for Functional Data
3 Best Basis via Constrained Clustering
3.1 From Best Basis to Constrained Clustering
3.2 Dynamic Programming
3.3 Extensions
4 Experiments
References
Part IX Bio and Health Science
Plaid Model for Microarray Data: an Enhancement of the Pruning Step
Luigi Augugliaro and Angelo M. Mineo
1 Introduction
2 The Plaid Model
3 The Proposed Enhancement
4 Simulation Studies
5 Conclusions
References
Classification of the Human Papilloma Viruses
Abdoulaye Baniré Diallo, Dunarel Badescu, Mathieu Blanchette, and Vladimir Makarenkov
1 Introduction
2 Inferring the History of Evolutionary Events
3 Finding Relationships Between the Two Types of Cancer and the Indel/Conservation Distributions in
4 Conclusion
References
Toward the Discovery of Itemsets with Significant Variations in Gene Expression Matrices
Mehdi Kaytoue, Sébastien Duplessis, and Amedeo Napoli
1 Introduction and Motivations
2 Gene Expression Profile Representation
3 Itemset Search
4 Minimal Variation Constraints
5 Experiments
6 Conclusion