This tutorial deals with problems in which the data we work with is a set of rankings or permutations. Examples of this kind of data appear in the output of a query submitted to several search engines, the rankings of films or songs given by the users of online music or movie services, etc. One way to tackle this kind of data is by means of probabilistic modeling, i.e., learning a probability distribution over the space of rankings and then carrying out inference tasks to extract the required information: finding the consensus ranking, computing the probability that movie A is ranked higher than movie B by a particular user, etc. Unfortunately, dealing with ranking data from a probabilistic point of view is a daunting task, as the number of permutations of n objects grows factorially with n. Furthermore, in many real scenarios the number of possible objects (such as the number of web pages returned by a query to a search engine) can be considered unbounded. Recently, numerous proposals have been made in the machine learning and statistics communities to create models that can tackle this kind of problem. These approaches can be broadly classified into (i) parametric approaches, (ii) low-order-statistics approaches and (iii) non-parametric approaches. Parametric approaches to permutation data have a long history in statistics, where they have been used for more than thirty years. These approaches assume a functional form for the probability distribution over the set of permutations, defined by a set of parameters. The differences between the parametric models can be traced to the kind of data that motivated their discovery.
For instance, the Thurstone model is motivated by rankings produced from score functions. The Plackett-Luce model, on the other hand, arises from a multi-stage ranking process, and the Mallows-Bradley-Terry model is based on paired comparisons. Apart from the previously mentioned models, the most common models are those based on distances between permutations, particularly the Mallows model. The Mallows model is an exponential model over permutations that plays a role analogous to that of the Gaussian distribution over real vectors. All these models have received increasing attention in the machine learning community over the last five years. Researchers have extended them to deal not only with complete rankings, but also with partial rankings, incomplete instances, etc. Furthermore, the learning and inference problems for these models have been thoroughly analyzed, and many new results have been obtained. Moreover, Bayesian approaches have been proposed for these models that allow the incorporation of a priori knowledge.
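As a concrete illustration, the Mallows model can be written as P(σ) ∝ exp(−θ · d(σ, σ₀)), where σ₀ is the central ranking, θ ≥ 0 a spread parameter, and d a distance between permutations such as the Kendall tau distance. The following minimal Python sketch builds this distribution by brute-force enumeration; the function names are ours for illustration, and the exhaustive sum over n! permutations is of course only feasible for toy sizes:

```python
import math
from itertools import permutations

def kendall_tau(sigma, pi):
    """Kendall tau distance: number of pairwise disagreements between two rankings."""
    pos = {item: i for i, item in enumerate(pi)}
    n = len(sigma)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if pos[sigma[i]] > pos[sigma[j]]  # pair ordered one way in sigma, the other in pi
    )

def mallows_pmf(n, sigma0, theta):
    """Exact Mallows distribution P(sigma) ~ exp(-theta * d(sigma, sigma0)) over S_n.

    Brute-force enumeration of all n! permutations -- toy sizes only.
    """
    weights = {
        sigma: math.exp(-theta * kendall_tau(sigma, sigma0))
        for sigma in permutations(range(n))
    }
    z = sum(weights.values())  # normalization constant (partition function)
    return {sigma: w / z for sigma, w in weights.items()}

pmf = mallows_pmf(4, (0, 1, 2, 3), theta=1.0)
# The central ranking sigma0 is the mode; probability decays with distance from it.
assert max(pmf, key=pmf.get) == (0, 1, 2, 3)
```

In practice the partition function of the Mallows model under the Kendall distance factorizes in closed form, so this enumeration is only meant to make the definition concrete.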
Low-order-statistics approaches are based on learning sparse models of the probability distribution. These sparse models essentially keep the information contained in low-order marginals. For instance, the first-order marginals give the probability that an object i is ranked in the j-th position. A notion of independence or conditional independence is essential in this approach. Non-parametric approaches consider mainly two kinds of models: models based on kernels, and models based on the Fourier transform over groups. The former are appropriate for dealing with partial rankings and rankings over many objects, while the latter allow inference tasks to be carried out efficiently. In order to illustrate
the introduced probabilistic models, we will give several
application examples in different fields and present the most
common benchmark datasets and the available software in this
area.

Relevance of the topic for the Machine Learning Community. There are several aspects that reveal the importance of the topic for the ACML community. The first is the range of hot application fields, from recommender systems to identity tracking and information retrieval. A second indicator of the value of the topic for
the community is the increasing number of papers published on it in the highest-impact machine learning journals, such as the Journal of Machine Learning Research, the Machine Learning journal, etc. Furthermore, the most relevant machine learning conferences have also received an increasing number of submissions on the topic. The bibliography section at the end of this document collects some of the most important references in the field and, as can be seen, most of them were published in those journals and conferences.
Finally, I would like to mention the workshops recently organized around the topic: the Workshop on Learning with Orderings (NIPS 2009), the Workshop on Advances in Ranking (NIPS 2009), and the numerous workshops on “Learning to Rank”, which, although not exactly the same field, include papers closely related to the topics of this tutorial.

Previous tutorials. The same tutorial has been accepted at the ECML/PKDD 2012 conference as a three-hour tutorial. In addition, the presenters have long experience in teaching tutorials. In particular, we would like to emphasize that Prof. Jose A. Lozano gave a tutorial at the ACML 2010 conference titled “Honest Evaluation of Classification Models”.