Advances in Knowledge Discovery and Data Mining: 8th Pacific-Asia Conference, PAKDD 2004, Sydney, Australia, May 26-28, 2004, Proceedings, Volume 8

Springer Science & Business Media, May 11, 2004 - Business & Economics - 713 pages
The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) has been held every year since 1997. This year, the eighth in the series (PAKDD 2004) was held at Carlton Crest Hotel, Sydney, Australia, 26–28 May 2004. PAKDD is a leading international conference in the area of data mining. It provides an international forum for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all KDD-related areas including data mining, data warehousing, machine learning, databases, statistics, knowledge acquisition and automatic scientific discovery, data visualization, causal induction, and knowledge-based systems. The selection process this year was extremely competitive. We received 238 research papers from 23 countries, which is the highest in the history of PAKDD, and reflects the recognition of and interest in this conference. Each submitted research paper was reviewed by three members of the program committee. Following this independent review, there were discussions among the reviewers, and when necessary, additional reviews from other experts were requested. A total of 50 papers were selected as full papers (21%), and another 31 were selected as short papers (13%), yielding a combined acceptance rate of approximately 34%. The conference accommodated both research papers presenting original investigation results and industrial papers reporting real data mining applications and system development experience. The conference also included three tutorials on key technologies of knowledge discovery and data mining, and one workshop focusing on specific new challenges and emerging issues of knowledge discovery and data mining. The PAKDD 2004 program was further enhanced with keynote speeches by two outstanding researchers in the area of knowledge discovery and data mining: Philip Yu, Manager of Software Tools and Techniques, IBM T.J.
 


Contents

Mining of Evolving Data Streams with Privacy Preservation
1
Data Mining Grand Challenges
2
Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms
3
Spectral Energy Minimization for Semi-supervised Learning
13
Discriminative Methods for Multi-labeled Classification
22
Subspace Clustering of High Dimensional Spatial Data with Noises
31
Constraint-Based Graph Clustering through Node Sequencing and Partitioning
41
Mining Expressive Process Models by Clustering Workflow Traces
52
Exploring Potential of Leave-One-Out Estimator for Calibration of SVM in Text Mining
361
Classifying Text Streams in the Presence of Concept Drifts
373
Using Cluster-Based Sampling to Select Initial Training Set for Active Learning in Text Classification
384
Spectral Analysis of Text Collection for Similarity-Based Clustering
389
Clustering Multi-represented Objects with Noise
394
Providing Diversity in K-Nearest Neighbor Query Results
404
Cluster Structure of K-means Clustering via Principal Component Analysis
414
A Novel and Efficient Technique
419

Mining Both Closed and Maximal Frequent Subtrees
63
Secure Association Rule Sharing
74
Self-Similar Mining of Time Association Rules
86
An Efficient Parallel Implementation of the DualMiner Algorithm
96
A Novel Distributed Collaborative Filtering Algorithm and Its Implementation on P2P Overlay Network
106
An Efficient Algorithm for Dense Regions Discovery from Large-Scale Data Streams
116
Blind Data Linkage Using n-gram Similarity Comparisons
121
Condensed Representation of Emerging Patterns
127
Discovery of Maximally Frequent Tag Tree Patterns with Contractible Variables from Semistructured Documents
133
Mining Term Association Rules for Heuristic Query Construction
145
The Art of Growing and Pruning Small FP-Trees
155
Mining Negative Rules Using GRD
161
Applying Association Rules for Interesting Recommendations Using Rule Templates
166
Feature Extraction and Classification System for Nonlinear and Online Data
171
A Metric Approach to Building Decision Trees Based on Goodman-Kruskal Association Index
181
Mining Classification Rules with Help of SVM
191
A New Data Mining Method Using Organizational Coevolutionary Mechanism
196
Noise Tolerant Classification by Chi Emerging Patterns
201
The Application of Emerging Patterns for Improving the Quality of Rare-Class Classification
207
Finding Negative Event-Oriented Patterns in Long Temporal Sequences
212
Outlier by Example
222
Temporal Sequence Associations for Rare Events
235
Summarization of Spacecraft Telemetry Data by Extracting Significant Temporal Patterns
240
An Extended Negative Selection Algorithm for Anomaly Detection
245
Adaptive Clustering for Network Intrusion Detection
255
Ensembling MML Causal Discovery
260
Logistic Regression and Boosting for Labeled Bags of Instances
272
Fast and Light Boosting for Adaptive Mining of Data Streams
282
Compact Dual Ensembles for Active Learning
293
On the Size of Training Set and the Benefit from Ensemble
298
Identifying Markov Blankets Using Lasso Estimation
308
Selective Augmented Bayesian Network Classifiers Based on Rough Set Theory
319
Using Self-Consistent Naive-Bayes to Detect Masquerades
329
Database Approach to Graph Mining
341
Finding Frequent Structural Features among Words in Tree-Structured Documents
351
An Alternative Methodology for Mining Seasonal Pattern Using Self-Organizing Map
424
Item Selection for Marketing with Cross-Selling Considerations
431
Efficient Pattern-Growth Methods for Frequent Tree Pattern Mining
441
Mining Association Rules from Structural Deltas of Historical XML Documents
452
Serving Large Number of Users for Efficient Frequent Itemset Mining
458
Formal Approach and Automated Tool for Translating ER Schemata into OWL Ontologies
464
Separating Structure from Interestingness
476
Exploiting Recurring Usage Patterns to Enhance Filesystem and Memory Subsystem Performance
486
Automatic Text Extraction for Content-Based Image Indexing
497
Peculiarity Oriented Analysis in Multi-people Tracking Images
508
Fast and Scalable Discovery of Hidden Variables in Stream and Multimedia Databases
519
A Method of Document Copy Detection
529
Extracting Citation Metadata from Online Publication Lists Using BLAST
539
Mining of Web-Page Visiting Patterns with Continuous-Time Markov Models
549
Discovering Ordered Tree Patterns from XML Queries
559
Predicting Web Requests Efficiently Using a Probability Model
564
Efficient Mining of Confidence-Closed Correlated Patterns
569
A Conditional Probability Distribution-Based Dissimilarity Measure for Categorial Data
580
Learning Hidden Markov Model Topology Based on KL Divergence for Information Extraction
590
A Nonparametric Wavelet Feature Extractor for Time Series Classification
595
Rules Discovery from Cross-Sectional Short-Length Time Series
604
Constraint-Based Mining of Formal Concepts in Transactional Data
615
Towards Optimizing Conjunctive Inductive Queries
625
Febrl - A Parallel Open Source Data Linkage System
638
A General Coding Method for Error-Correcting Output Codes
648
Discovering Partial Periodic Patterns in Discrete Data Sequences
653
Conceptual Mining of Large Administrative Health Data
659
A Semi-automatic System for Tagging Specialized Corpora
670
A Tree-Based Approach to the Discovery of Diagnostic Biomarkers for Ovarian Cancer
682
A Novel Parameter-Less Clustering Method for Mining Gene Expression Data
692
Extracting and Explaining Biological Knowledge in Microarray Data
699
Further Applications of a Particle Visualization Framework
704
Author Index
711
Copyright
