## Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference
The annual Neural Information Processing Systems (NIPS) conference is the flagship meeting on neural computation and machine learning. It draws a diverse group of attendees -- physicists, neuroscientists, mathematicians, statisticians, and computer scientists -- interested in theoretical and applied aspects of modeling, simulating, and building neural-like or intelligent systems. The presentations are interdisciplinary, with contributions in algorithms, learning theory, cognitive science, neuroscience, brain imaging, vision, speech and signal processing, reinforcement learning, and applications. Only twenty-five percent of the papers submitted are accepted for presentation at NIPS, so the quality is exceptionally high. This volume contains the papers presented at the December 2006 meeting, held in Vancouver.

### Contents

An Application of Reinforcement Learning to Aerobatic Helicopter Flight | 1 |

Tighter PAC-Bayes Bounds | 9 |

Online Classification for Complex Problems Using Simultaneous Projections | 17 |

Learning on Graph with Laplacian Regularization | 25 |

Multi-Task Feature Learning | 41 |

Logarithmic Online Regret Bounds for Undiscounted Reinforcement Learning | 49 |

Efficient Methods for Privacy Preserving Face Detection | 57 |

Active learning for misspecified generalized linear models | 65 |

Uncertainty, phase and oscillatory hippocampal recall | 833 |

Blind Motion Deblurring Using Image Statistics | 841 |

Speakers optimize information density through syntactic reduction | 849 |

Real-time adaptive information-theoretic optimization of neurophysiology experiments | 857 |

Ordinal Regression by Extended Binary Classification | 865 |

A Sketch-based Sampling Technique for Sparse Data | 873 |

Generalized Regularized Least-Squares Learning with Predefined Features in a Hilbert Space | 881 |

Learnability and the Doubling Dimension | 889 |

Subordinate class recognition using relational object models | 73 |

Unified Inference for Variational Bayesian Linear Gaussian State-Space Models | 81 |

A Novel Gaussian Sum Smoother for Approximate Inference in Switching Linear Dynamical Systems | 89 |

Sample complexity of policy search with known dynamics | 97 |

AdaBoost is Consistent | 105 |

A selective attention multi-chip system with dynamic synapses and spiking neurons | 113 |

Temporal and Cross-Subject Probabilistic Models for fMRI Prediction Tasks | 121 |

Convergence of Laplacian Eigenmaps | 129 |

Analysis of Representations for Domain Adaptation | 137 |

An Approach to Bounded Rationality | 145 |

Greedy Layer-Wise Training of Deep Networks | 153 |

Dirichlet-Enhanced Spam Filtering based on Biased Samples | 161 |

Detecting Humans via Their Pose | 169 |

Similarity by Composition | 177 |

Denoising and Dimension Reduction in Feature Space | 185 |

Learning to Rank with Nonsmooth Cost Functions | 193 |

Conditional mean field | 201 |

Sparse Multinomial Logistic Regression via Bayesian L1 Regularisation | 209 |

Branch and Bound for Semi-Supervised Support Vector Machines | 217 |

Automated Hierarchy Discovery for Planning in Partially Observable Environments | 225 |

Max-margin classification of incomplete data | 233 |

Modeling General and Specific Aspects of Documents with a Probabilistic Topic Model | 241 |

Implicit Online Learning with Kernels | 249 |

Context dependent amplification of both rate and event-correlation in a VLSI network of spiking neurons | 257 |

Bayesian Ensemble Learning | 265 |

Implicit Surfaces with Globally Regularised and Compactly Supported Basis Functions | 273 |

Map-Reduce for Machine Learning on Multicore | 281 |

Relational Learning with Gaussian Processes | 289 |

Recursive Attribute Factoring | 297 |

On Transductive Regression | 305 |

Balanced Graph Matching | 313 |

Learning from Multiple Sources | 321 |

Kernels on Structured Objects Through Nested Histograms | 329 |

Differential Entropic Clustering of Multivariate Gaussians | 337 |

Support Vector Machines on a Budget | 345 |

A Theory of Retinal Population Coding | 353 |

Learning to Traverse Image Manifolds | 361 |

Using Combinatorial Optimization within Max-Product Belief Propagation | 369 |

Optimal Single-Class Classification Strategies | 377 |

A Small World Threshold for Economic Network Formation | 385 |

learning the number of clusters in data | 393 |

Clustering Under Prior Knowledge with Application to Image Segmentation | 401 |

Multi-dynamic Bayesian Networks | 409 |

Image Retrieval and Classification Using Local Distance Functions | 417 |

Multiple Instance Learning for Computer Aided Diagnosis | 425 |

Distributed Inference in Dynamical Systems | 433 |

Eligibility Traces and Convergence Analysis | 441 |

A PAC-Bayes Risk Bound for General Loss Functions | 449 |

Bayesian Policy Gradient Algorithms | 457 |

Data Integration for Classification Problems Employing Gaussian Process Priors | 465 |

Approximate inference using planar graph decomposition | 473 |

Near-Uniform Sampling of Combinatorial Spaces Using XOR Constraints | 481 |

No-regret Algorithms for Online Convex Programs | 489 |

Large Margin Multi-channel Analog-to-Digital Conversion with Applications to Neural Prosthesis | 497 |

Approximate Correspondences in High Dimensions | 505 |

A Kernel Method for the Two-Sample-Problem | 513 |

Learning Nonparametric Models for Probabilistic Imitation | 521 |

Training Conditional Random Fields for Maximum Labelwise Accuracy | 529 |

Adaptive Spatial Filters with predefined Region of Interest for EEG based Brain-Computer-Interfaces | 537 |

Graph-Based Visual Saliency | 545 |

Detecting Mixed Density and Dimensionality in High Dimensional Point Clouds | 553 |

Manifold Denoising | 561 |

A Bayesian Skill Rating System | 569 |

Prediction on a Graph with a Perceptron | 577 |

Geometric entropy minimization (GEM) for anomaly detection and localization | 585 |

Single Channel Speech Separation Using Factorial Dynamics | 593 |

Correcting Sample Selection Bias by Unlabeled Data | 601 |

Sparse Representation for Signal Classification | 609 |

In-Network PCA and Anomaly Detection | 617 |

Learning Time-Intensity Profiles of Human Activity using Non-Parametric Bayesian Models | 625 |

Kernel Maximum Entropy Data Transformation and an Enhanced Spectral Clustering Algorithm | 633 |

A Framework for Specifying Compositional Nonparametric Bayesian Models | 641 |

A Humanlike Predictor of Facial Attractiveness | 649 |

Clustering appearance and shape by learning jigsaws | 657 |

A Kernel Subspace Method by Stochastic Realization for Learning Nonlinear Dynamical Systems | 665 |

An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models | 673 |

Combining causal and similarity-based reasoning | 681 |

A Nonparametric Approach to Bottom-Up Visual Saliency | 689 |

Hierarchical Dirichlet Processes with Random Effects | 697 |

An Information Theoretic Framework for Eukaryotic Gradient Sensing | 705 |

Information Bottleneck Optimization and Independent Component Extraction with Spiking Neurons | 713 |

Predicting spike times from subthreshold dynamics of a neuron | 721 |

Gaussian and Wishart Hyperkernels | 729 |

Causal inference in sensorimotor integration | 737 |

Multiple timescales and uncertainty in motor adaptation | 745 |

A Clustering Approach | 753 |

Accelerated Variational Dirichlet Process Mixtures | 761 |

PAC-Bayes Bounds for the Risk of the Majority Vote and the Variance of the Gibbs Classifier | 769 |

Inducing Metric Violations in Human Similarity Judgements | 777 |

Modelling transcriptional regulation using Gaussian processes | 785 |

Semi-Supervised Discriminative Random Fields | 793 |

Efficient sparse coding algorithms | 801 |

A Bayesian Approach to Diffusion Models of Decision-Making and Response Time | 809 |

Efficient Structure Learning of Markov Networks using L1-Regularization | 817 |

Application to Single Trial EEG | 825 |

Emergence of conjunctive visual features by quadratic independent component analysis | 897 |

Bayesian Detection of Infrequent Differences in Sets of Time Series with Shared Structure | 905 |

Analysis of Contour Motions | 913 |

Attribute-efficient learning of decision lists and linear threshold functions under unconcentrated distributions | 921 |

Dynamic Foreground/Background Extraction from Images and Videos using Random Patches | 929 |

Effects of Stress and Genotype on Metaparameter Dynamics in Reinforcement Learning | 937 |

Statistical Modeling of Images with Fields of Gaussian Scale Mixtures | 945 |

An EM Algorithm for Localizing Multiple Sound Sources in Reverberant Environments | 953 |

Isotonic Conditional Random Fields and Local Sentiment Flow | 961 |

Part-based Probabilistic Point Matching using Equivalence Constraints | 969 |

Modeling Dyadic Data with Binary Latent Factors | 977 |

Fast Discriminative Visual Codebooks using Randomized Clustering Forests | 985 |

An Investigation of Four Probabilistic Models | 993 |

Approximating the Set of Subgame Perfect Equilibria in General-Sum Stochastic Games | 1001 |

Coherent Point Drift | 1009 |

Fundamental Limitations of Spectral Clustering | 1017 |

On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts | 1025 |

A Nonparametric Bayesian Method for Inferring Features From Similarity Judgments | 1033 |

Temporal dynamics of information content carried by neurons in the primary visual cortex | 1041 |

Blind source separation for overdetermined delayed mixtures | 1049 |

The Neurodynamics of Belief Propagation on Binary Markov Random Fields | 1057 |

Handling Advertisements of Unknown Quality in Search Advertising | 1065 |

Bayesian Model Scoring in Markov Random Fields | 1073 |

Game theoretic algorithms for Protein-DNA binding | 1081 |

Bayesian Image Super-resolution, Continued | 1089 |

Parameter Expanded Variational Bayesian Methods | 1097 |

Inferring Network Structure from Co-Occurrences | 1105 |

Unsupervised Regression with Applications to Nonlinear System Identification | 1113 |

Stability of K-Means Clustering | 1121 |

Learning to parse images of articulated bodies | 1129 |

Efficient Learning of Sparse Representations with an Energy-Based Model | 1137 |

Learning to be Bayesian without Supervision | 1145 |

Boosting Structured Prediction for Imitation Learning | 1153 |

Large Scale Hidden Semi-Markov SVMs | 1161 |

Natural Actor-Critic for Road Traffic Optimisation | 1169 |

Computation of Similarity Measures for Sequential Data using Generalized Suffix Trees | 1177 |

Learning annotated hierarchies from relational data | 1185 |

Shifting, One-Inclusion Mistake Bounds and Tight Multiclass Expected Risk Bounds | 1193 |

Neurophysiological Evidence of Cooperative Mechanisms for Stereo Computation | 1201 |

Robotic Grasping of Novel Objects | 1209 |

Theory and Dynamics of Perceptual Bistability | 1217 |

Fast Iterative Kernel PCA | 1225 |

Cross-Validation Optimization for Large Scale Hierarchical Classification Kernel Methods | 1233 |

Information Bottleneck for Non Co-Occurrence Data | 1241 |

Large Margin Hidden Markov Models for Automatic Speech Recognition | 1249 |

Nonlinear physically-based models for decoding motor-cortical population activity | 1257 |

Convex Repeated Games and Fenchel Duality | 1265 |

Recursive ICA | 1273 |

Chained Boosting | 1281 |

A recipe for optimizing a time-histogram | 1289 |

Mutagenetic tree Fisher kernel improves prediction of HIV drug resistance from viral genotype | 1297 |

Modeling Genetic Recombination in Open Ancestral Space | 1305 |

Learning Dense 3D Correspondence | 1313 |

An Oracle Inequality for Clipped Regularized Risk Minimizers | 1321 |

Learning Structural Equation Models for fMRI | 1329 |

Mixture Regression for Covariate Shift | 1337 |

Modeling Human Motion Using Binary Latent Variables | 1345 |

A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation | 1353 |

Towards a general independent subspace analysis | 1361 |

Linearly-solvable Markov decision problems | 1369 |

Logistic Regression for Single Trial EEG Classification | 1377 |

Large Margin Component Analysis | 1385 |

Learning Motion Style Synthesis from Perceptual Observations | 1393 |

Large-Scale Sparsified Manifold Regularization | 1401 |

Scalable Discriminative Learning for Natural Language Parsing and Translation | 1409 |

Generalized Maximum Margin Clustering and Unsupervised Kernel Learning | 1417 |

A Complexity-Distortion Approach to Joint Pattern Alignment | 1425 |

Online Clustering of Moving Hyperplanes | 1433 |

Comparative Gene Prediction using Conditional Random Fields | 1441 |

Fast Computation of Graph Kernels | 1449 |

Temporal Coding using the Response Properties of Spiking Neurons | 1457 |

High-Dimensional Graphical Model Selection Using l1-Regularized Logistic Regression | 1465 |

Attentional Processing on a Spike-Based VLSI Neural Network | 1473 |

Randomized PCA Algorithms with Regret Bounds that are Logarithmic in the Dimension | 1481 |

Graph Laplacian Regularization for Large-Scale Semidefinite Programming | 1489 |

A Switched Gaussian Process for Estimating Disparity and Segmentation in Binocular Stereo | 1497 |

Analysis of Empirical Bayesian Methods for Neuroelectromagnetic Source Localization | 1505 |

Particle Filtering for Nonparametric Bayesian Matrix Factorization | 1513 |

A Scalable Machine Learning Approach to Go | 1521 |

A Local Learning Approach for Clustering | 1529 |

The Robustness-Performance Tradeoff in Markov Decision Processes | 1537 |

Optimal Change-Detection and Spiking Neurons | 1545 |

Stochastic Relational Models for Discriminative Link Prediction | 1553 |

Nonnegative Sparse PCA | 1561 |

Doubly Stochastic Normalization for Spectral Clustering | 1569 |

Simplifying Mixture Models through Function Approximation | 1577 |

Hyperparameter Learning for Graph Based Semi-supervised Learning Algorithms | 1585 |

Modified Locally Linear Embedding Using Multiple Weights | 1593 |

Clustering, Classification, and Embedding | 1601 |

Multi-Instance Multi-Label Learning with Application to Scene Classification | 1609 |

Unsupervised Learning of a Probabilistic Grammar for Object Detection and Parsing | 1617 |

A Probabilistic Algorithm Integrating Source Localization and Noise Suppression of MEG and EEG Data | 1625 |
