## Advances in Knowledge Discovery and Data Mining: 7th Pacific-Asia Conference, PAKDD 2003, Seoul, Korea, April 30 - May 2, 2003, Proceedings, Volume 7

The 7th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) was held from April 30 to May 2, 2003 in the Convention and Exhibition Center (COEX), Seoul, Korea. The PAKDD conference is a major forum for academic researchers and industry practitioners in the Pacific Asia region to share original research results and development experiences from different KDD-related areas such as data mining, data warehousing, machine learning, databases, statistics, knowledge acquisition and discovery, data visualization, and knowledge-based systems. The conference was organized by the Advanced Information Technology Research Center (AITrc) at KAIST and the Statistical Research Center for Complex Systems (SRCCS) at Seoul National University. It was sponsored by the Korean Datamining Society (KDMS), the Korea Information Science Society (KISS), the United States Air Force Office of Scientific Research, the Asian Office of Aerospace Research & Development, and KAIST. It was held with cooperation from ACM's Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD).

Data Mining as an Automated Service | 1 |

Trends and Challenges in the Industrial Applications of KDD | 14 |

Finding Event-Oriented Patterns in Long Temporal Sequences | 15 |

Mining Frequent Episodes for Relating Financial Events and Stock Trends | 27 |

An Efficient Algorithm of Frequent Connected Subgraph Extraction | 40 |

Classifier Construction by Graph-Based Induction for Graph-Structured Data | 52 |

Comparison of the Performance of Center-Based Clustering Algorithms | 63 |

Automatic Extraction of Clusters from Hierarchical Clustering Representations | 75 |

Evolutionary Approach for Mining Association Rules on Dynamic Databases | 325 |

Position Coded Pre-order Linked WAP-Tree for Web Log Sequential Pattern Mining | 337 |

An Integrated System of Mining HTML Texts and Filtering Structured Documents | 350 |

A New Sequential Mining Approach to XML Document Similarity Computation | 356 |

Optimization of Fuzzy Rules for Classification Using Genetic Algorithm | 363 |

Fast Pattern Selection for Support Vector Classifiers | 376 |

A Noise-Robust Ensemble Method | 388 |

Improving Performance of Decision Tree Algorithms with Multi-edited Nearest Neighbor Rule | 394 |

Large Scale Unstructured Document Classification Using Unlabeled Data and Syntactic Information | 88 |

Extracting Shared Topics of Multiple Documents | 100 |

An Empirical Study on Dimensionality Optimization in Text Mining for Linguistic Knowledge Acquisition | 111 |

A Semi-supervised Algorithm for Pattern Discovery in Information Extraction from Textual Data | 117 |

Mining Patterns of Dyspepsia Symptoms Across Time Points Using Constraint Association Rules | 124 |

Predicting Protein Structural Class from Closed Protein Sequences | 136 |

Learning Rules to Extract Protein Interactions from Biomedical Text | 148 |

Predicting Protein Interactions in Human by Homologous Interactions in Yeast | 159 |

Mining the Customer's Up-To-Moment Preferences for E-commerce Recommendation | 166 |

A Graph-Based Optimization Algorithm for Website Topology Using Interesting Association Rules | 178 |

A Markovian Approach for Web User Profiling and Clustering | 191 |

Extracting User Interests from Bookmarks on the Web | 203 |

Mining Frequent Instances on Workflows | 209 |

Real Time Video Data Mining for Surveillance Video Streams | 222 |

Distinguishing Causal and Acausal Temporal Relations | 234 |

Online Bayes Point Machines | 241 |

Exploiting Hierarchical Domain Values for Bayesian Learning | 253 |

A New Restricted Bayesian Network Classifier | 265 |

An Efficient Algorithm for Clustering Large High-Dimensional Datasets | 271 |

Multilevel Clustering and Reasoning about Its Clusters Using Region Connection Calculus | 283 |

An Efficient Cell-Based Clustering Method for Handling Large High-Dimensional Data | 295 |

Enhancing SWF for Incremental Association Mining by Itemset Maintenance | 301 |

Reducing Rule Covers with Deterministic Error Bounds | 313 |

Hypergraph-Based Outlier Test for Categorical Data | 399 |

A Method for Aggregating Partitions, Applications in KDD | 411 |

Efficiently Computing Iceberg Cubes with Complex Constraints through Bounding | 423 |

Extraction of Tag Tree Patterns with Contractible Variables from Irregular Semistructured Data | 430 |

A More Efficient Alternative for Polynomial Multiple Linear Regression in Stream Cube | 437 |

An Efficient Method for Time-Constraint Mining | 449 |

Mining Open Source Software (OSS) Data Using Association Rules Network | 461 |

Parallel FP-Growth on PC Cluster | 467 |

Active Feature Selection Using Classes | 474 |

Electricity Based External Similarity of Categorical Attributes | 486 |

Weighted Proportional k-Interval Discretization for Naive-Bayes Classifiers | 501 |

An Indiscernibility Based Approach | 513 |

Considering Correlation between Variables to Improve Spatiotemporal Forecasting | 519 |

A Filter-and-Refine Approach | 532 |

When to Update the Sequential Patterns of Stream Data | 545 |

A New Clustering Algorithm for Transaction Data via Caucus | 551 |

A Density-Based Spatial Clustering Method with Random Sampling | 563 |

Optimized Clustering for Anomaly Intrusion Detection | 576 |

Finding Frequent Subgraphs from Graph Structured Data with Geometric Information and Its Application to Lossless Compression | 582 |

Upgrading ILP Rules to First-Order Bayesian Networks | 595 |

A Clustering Validity Assessment Index | 602 |

