
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval

Fullname: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Editors: Sung-Hyon Myaeng; Douglas W. Oard; Fabrizio Sebastiani; Tat-Seng Chua; Mun-Kew Leong
Location: Singapore, Singapore
Dates: 2008-Jul-20 to 2008-Jul-24
Standard No: ISBN: 1-60558-164-X, 978-1-60558-164-4; ACM Order Number: 606081; ACM DL: Table of Contents; hcibib: IR08
  1. Keynotes
  2. User interaction models
  3. Web search: 1
  4. Evaluation: 1
  5. Collaborative filtering
  6. Learning to rank: 1
  7. High-performance & high dimensional indexing
  8. User adaptation & personalization
  9. Clustering: 1
  10. Multilingual & crosslingual retrieval
  11. Relevance feedback
  12. Learning to rank: 2
  13. Summarization
  14. Exploratory search & filtering
  15. Web-search: 2
  16. Multimedia retrieval
  17. Query analysis & models: 1
  18. Non-topicality
  19. Probabilistic models
  20. Analysis of social networks
  21. Question-answering
  22. Query analysis & models: 2
  23. Social tagging
  24. Clustering: 2
  25. Content analysis
  26. Learning models for IR
  27. Text classification
  28. Evaluation: 2
  29. Posters group 1: evaluation, text collections and user/personalized IR
  30. Posters group 2: blog, tagging, opinion analysis and web IR
  31. Posters group 3: multimedia and domain specific IR
  32. Posters group 4: theory and IR models
  33. Posters group 5: structured IR, ranking, classification and filtering
  34. Demonstrations
  35. Doctoral consortium

Keynotes

Delighting Chinese users: the Google China experience BIBAFull-Text 1
  Kai-Fu Lee
Google entered the China market as a latecomer in late 2005, with no local employees, an inadequate product line, and a small market share. This talk will discuss Google China's efforts over the past 2.5 years to build up a team, learn about local user needs, apply its global innovation model, and win over users.
   This talk will cover the results of our user studies and our key findings about how Chinese users search and use the Internet. It will discuss how these findings were applied to our products, how those products gained traction in the marketplace, and Google's progress in Chinese search relevance, search user experience, and key technology areas where we innovated.
   This talk will also discuss the process of internationalization -- how Google hired locally and applied its global 20% project approach to encourage truly relevant local innovations. It will present several examples of these innovations -- from product innovations like the weather map, the input method editor, and SMS greetings search, to research innovations like parallel SVM/SVD.
   Google China's progress dispelled the myth that multinational Internet companies cannot succeed in China. The key ingredients, like in any other success story, are: focus on the customer, embrace the corporate culture, empower local flexibility, and of course, innovate, innovate, innovate.
Guilt by association as a search principle BIBAFull-Text 2
  Limsoon Wong
The exploitation of fundamental invariants is among the most elegant solutions to many computational problems in a wide variety of domains. One of the more powerful approaches to exploit invariants is the principle of "guilt by association". In particular, the principle of guilt by association is the foundation of remote homolog detection, protein function prediction, disease subtype diagnosis, treatment plan prognosis, and other challenges in computational biology. The principle suggests that two entities are in a specific relationship if they exhibit invariant properties underlying that relationship. For example, a protein is predicted to have a particular biological function if it exhibits the underlying invariant properties of that functional group -- viz., guilty by association to other members of that functional group through the shared invariant properties.
   In my talk, I plan to present several facets of guilt by association in the computational prediction of protein function and draw parallels of these facets in information retrieval. Specifically, I plan to touch on the following facets: (a) the issue of chance associations; (b) novel generalizable forms of association; (c) fusion of multiple heterogeneous sources of evidence; (d) the dichotomy of knowing to a high degree of reliability that two entities are in some relationship and yet not knowing what that relationship is. I hope this talk will be, for the information retrieval community, a window to the opportunities in computational biology that may benefit from the depth and variety of solutions information retrieval has to offer.

User interaction models

On iterative intelligent medical search BIBAFull-Text 3-10
  Gang Luo; Chunqiang Tang
Searching for medical information on the Web has become highly popular, but it remains a challenging task because searchers are often uncertain about their exact medical situations and unfamiliar with medical terminology. To address this challenge, we have built an intelligent medical Web search engine called iMed, which uses medical knowledge and an interactive questionnaire to help searchers form queries. This paper focuses on iMed's iterative search advisor, which integrates medical and linguistic knowledge to help searchers improve search results iteratively. Such an iterative process is common for general Web search, and especially crucial for medical Web search, because searchers often miss desired search results due to their limited medical knowledge and the task's inherent difficulty. iMed's iterative search advisor helps the searcher in several ways. First, relevant symptoms and signs are automatically suggested based on the searcher's description of his situation. Second, instead of taking for granted the searcher's answers to the questions, iMed ranks and recommends alternative answers according to their likelihoods of being the correct answers. Third, related MeSH medical phrases are suggested to help the searcher refine his situation description. We demonstrate the effectiveness of iMed's iterative search advisor by evaluating it using real medical case records and USMLE medical exam questions.
Effective and efficient user interaction for long queries BIBAFull-Text 11-18
  Giridhar Kumaran; James Allan
Handling long queries can involve either pruning the query to retain only the important terms (reduction), or expanding the query to include related concepts (expansion). While automatic techniques to do so exist, roughly 25% performance improvements in terms of MAP have been realized in past work through interactive variants. We show that selectively reducing or expanding a query leads to an average improvement of 51% in MAP over the baseline for standard TREC test collections. We demonstrate how user interaction can be used to achieve this improvement. Most interaction techniques present users with a fixed number of options for all queries. We achieve improvements by interacting less with the user, i.e., we present techniques to identify the optimal number of options to present to users, resulting in an interface with an average of 70% fewer options to consider. Previous algorithms supporting interactive reduction and expansion are exponential in nature. To extend their utility to operational environments, we present techniques to make the complexity of the algorithms polynomial. We finally present an analysis of long queries that continue to exhibit poor performance in spite of our new techniques.
How do users find things with PubMed?: towards automatic utility evaluation with user simulations BIBAFull-Text 19-26
  Jimmy Lin; Mark D. Smucker
In the context of document retrieval in the biomedical domain, this paper explores the complex relationship between the quality of initial query results and the overall utility of an interactive retrieval system. We demonstrate that a content-similarity browsing tool can compensate for poor retrieval results, and that the relationship between retrieval performance and overall utility is non-linear. Arguments are advanced with user simulations, which characterize the relevance of documents that a user might encounter with different browsing strategies. With broader implications to IR, this work provides a case study of how user simulations can be exploited as a formative tool for automatic utility evaluation. Simulation-based studies provide researchers with an additional evaluation tool to complement interactive and Cranfield-style experiments.

Web search: 1

Towards breaking the quality curse.: a web-querying approach to web people search BIBAFull-Text 27-34
  Dmitri V. Kalashnikov; Rabia Nuray-Turan; Sharad Mehrotra
Searching for people on the Web is one of the most common query types submitted to web search engines today. However, when a person name is queried, the returned webpages often contain documents related to several distinct namesakes who have the queried name. The task of disambiguating and finding the webpages related to the specific person of interest is left to the user. Many Web People Search (WePS) approaches have been developed recently that attempt to automate this disambiguation process. Nevertheless, the disambiguation quality of these techniques leaves major room for improvement. This paper presents a new server-side WePS approach. It is based on collecting co-occurrence information from the Web, and thus it uses the Web as an external data source. A skyline-based classification technique is developed for classifying the collected co-occurrence information in order to make clustering decisions. The clustering technique is specifically designed to (a) handle the dominance that exists in the data and (b) adapt to a given clustering quality measure. These properties give the framework a major advantage in terms of result quality over all the latest WePS techniques we are aware of, including all 18 methods covered in the recent WePS competition [2].
An unsupervised framework for extracting and normalizing product attributes from multiple web sites BIBAFull-Text 35-42
  Tak-Lam Wong; Wai Lam; Tik-Shun Wong
We have developed an unsupervised framework for simultaneously extracting and normalizing attributes of products from multiple Web pages originating from different sites. Our framework is designed based on a probabilistic graphical model that can model the page-independent content information and the page-dependent layout information of the text fragments in Web pages. One characteristic of our framework is that previously unseen attributes can be discovered from clues contained in the layout format of the text fragments. Our framework tackles both the extraction and normalization tasks by jointly considering the relationship between the content and layout information. A Dirichlet process prior is employed, leading to another advantage: the number of discovered product attributes is not bounded in advance. An unsupervised inference algorithm based on a variational method is presented. The semantics of the normalized attributes can be visualized by examining the term weights in the model. Our framework can be applied to a wide range of Web mining applications such as product matching and retrieval. We have conducted extensive experiments on four different domains, comprising over 300 Web pages from over 150 different Web sites, demonstrating the robustness and effectiveness of our framework.
Enhancing web search by promoting multiple search engine use BIBAFull-Text 43-50
  Ryen W. White; Matthew Richardson; Mikhail Bilenko; Allison P. Heath
Any given Web search engine may provide higher quality results than others for certain queries. Therefore, it is in users' best interest to utilize multiple search engines. In this paper, we propose and evaluate a framework that maximizes users' search effectiveness by directing them to the engine that yields the best results for the current query. In contrast to prior work on meta-search, we do not advocate replacing multiple engines with an aggregate one, but rather facilitate simultaneous use of individual engines. We describe a machine learning approach to supporting switching between search engines and demonstrate its viability at tolerable interruption levels. Our findings have implications for fluid competition between search engines.

Evaluation: 1

Score standardization for inter-collection comparison of retrieval systems BIBAFull-Text 51-58
  William Webber; Alistair Moffat; Justin Zobel
The goal of system evaluation in information retrieval has always been to determine which of a set of systems is superior on a given collection. The tool used to determine system ordering is an evaluation metric such as average precision, which computes relative, collection-specific scores. We argue that a broader goal is achievable. In this paper we demonstrate that, by use of standardization, scores can be substantially independent of a particular collection, allowing systems to be compared even when they have been tested on different collections. Compared to current methods, our techniques provide richer information about system performance, improved clarity in outcome reporting, and greater simplicity in reviewing results from disparate sources.
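The standardization idea can be illustrated with a small sketch (a simplified illustration, not necessarily the authors' exact formulation; the reference scores below are hypothetical): each per-topic score is expressed relative to the mean and standard deviation of the scores that a reference set of systems achieved on that topic.

```python
import statistics

def standardize(score, reference_scores):
    """Standardize a per-topic score against the scores a reference
    set of runs achieved on the same topic, making scores comparable
    across topics and collections."""
    mean = statistics.mean(reference_scores)
    stdev = statistics.pstdev(reference_scores)
    if stdev == 0:
        return 0.0
    return (score - mean) / stdev

# Hypothetical per-topic AP scores from four reference systems:
reference = [0.20, 0.35, 0.25, 0.40]
print(round(standardize(0.40, reference), 3))  # → 1.265
```

A standardized score of zero then means "average for this topic", so runs tested on different collections can be placed on a common scale.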
The good and the bad system: does the test collection predict users' effectiveness? BIBAFull-Text 59-66
  Azzah Al-Maskari; Mark Sanderson; Paul Clough; Eija Airio
Test collections are extensively used in the evaluation of information retrieval systems. Crucial to their use is the degree to which results from them predict user effectiveness. Early studies did not substantiate a relationship between system and user effectiveness; more recently, however, correlations have begun to emerge. The results of this paper strengthen and extend those findings. We introduce a novel methodology for investigating the relationship, which succeeds in establishing a significant correlation between system and user effectiveness. It is shown that users behave differently and discern differences between pairs of systems even when the systems have a very small absolute difference in test collection effectiveness. Our results strengthen the use of test collections in IR evaluation, confirming that users' effectiveness can be predicted successfully.
Retrieval sensitivity under training using different measures BIBAFull-Text 67-74
  Ben He; Craig Macdonald; Iadh Ounis
Various measures, such as binary preference (bpref), inferred average precision (infAP), and binary normalised discounted cumulative gain (nDCG) have been proposed as alternatives to mean average precision (MAP) because they are less sensitive to the completeness of the relevance judgements. Since the primary aim of system building is to train the system to respond to user queries in a robust and stable manner, in this paper we investigate the importance of the choice of evaluation measure for training, under different levels of evaluation incompleteness. We simulate evaluation incompleteness by sampling from the relevance assessments. Through large-scale experiments on two standard TREC test collections, we examine retrieval sensitivity when training -- i.e. whether a training process based on any of the four discussed measures has an impact on the final retrieval performance. Experimental results show that training by bpref, infAP and nDCG provides significantly better retrieval performance than training by MAP when relevance judgement completeness is extremely low. As completeness increases, the measures behave more similarly.
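For reference, one of the measures compared here, bpref, can be computed as follows (a minimal sketch of the standard definition; document IDs and judgments are hypothetical):

```python
def bpref(ranking, relevant, nonrelevant):
    """Binary preference: each retrieved relevant document is penalized
    by the fraction of judged nonrelevant documents ranked above it.
    Unjudged documents are simply ignored, which is what makes the
    measure robust to incomplete relevance judgements."""
    R, N = len(relevant), len(nonrelevant)
    if R == 0:
        return 0.0
    denom = min(R, N)
    nonrel_seen = 0
    total = 0.0
    for doc in ranking:
        if doc in nonrelevant:
            nonrel_seen += 1
        elif doc in relevant:
            total += 1.0 if denom == 0 else 1.0 - min(nonrel_seen, R) / denom
    return total / R

# The relevant doc at rank 1 counts fully; the one below a judged
# nonrelevant doc is penalized:
print(bpref(["d1", "d2", "d3", "d4"], {"d1", "d3"}, {"d2", "d4"}))  # → 0.75
```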

Collaborative filtering

Attack resistant collaborative filtering BIBAFull-Text 75-82
  Bhaskar Mehta; Wolfgang Nejdl
The widespread deployment of recommender systems has led to user feedback of varying quality. While some users faithfully express their true opinion, many provide noisy ratings which can be detrimental to the quality of the generated recommendations. The presence of noise can violate modeling assumptions and may thus lead to instabilities in estimation and prediction. Even worse, malicious users can deliberately insert attack profiles in an attempt to bias the recommender system to their benefit.
   While previous research has attempted to study the robustness of various existing Collaborative Filtering (CF) approaches, this remains an unsolved problem. Approaches such as Neighbor Selection algorithms, Association Rules and Robust Matrix Factorization have produced unsatisfactory results. This work describes a new collaborative algorithm based on SVD which is accurate as well as highly stable to shilling. This algorithm exploits previously established SVD-based shilling detection algorithms and combines them with SVD-based CF. Experimental results show a much diminished effect of all kinds of shilling attacks. This work also offers significant improvement over previous Robust Collaborative Filtering frameworks.
EigenRank: a ranking-oriented approach to collaborative filtering BIBAFull-Text 83-90
  Nathan N. Liu; Qiang Yang
A recommender system must be able to suggest items that are likely to be preferred by the user. In most systems, the degree of preference is represented by a rating score. Given a database of users' past ratings on a set of items, traditional collaborative filtering algorithms are based on predicting the potential ratings that a user would assign to the unrated items so that they can be ranked by the predicted ratings to produce a list of recommended items. In this paper, we propose a collaborative filtering approach that addresses the item ranking problem directly by modeling user preferences derived from the ratings. We measure the similarity between users based on the correlation between their rankings of the items rather than the rating values and propose new collaborative filtering algorithms for ranking items based on the preferences of similar users. Experimental results on real world movie rating data sets show that the proposed approach outperforms traditional collaborative filtering algorithms significantly on the NDCG measure for evaluating ranked results.
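The shift from rating values to rating order can be sketched in a few lines (hypothetical ratings; not necessarily the paper's exact similarity function): two users are compared by how often they order their co-rated item pairs the same way, a Kendall-tau-style rank correlation.

```python
from itertools import combinations

def kendall_tau_similarity(ratings_u, ratings_v):
    """User similarity based on agreement in the *ordering* of co-rated
    item pairs, rather than on the rating values themselves."""
    common = set(ratings_u) & set(ratings_v)
    concordant = discordant = 0
    for i, j in combinations(sorted(common), 2):
        du = ratings_u[i] - ratings_u[j]
        dv = ratings_v[i] - ratings_v[j]
        if du * dv > 0:      # same relative order
            concordant += 1
        elif du * dv < 0:    # opposite relative order
            discordant += 1
    pairs = concordant + discordant
    return 0.0 if pairs == 0 else (concordant - discordant) / pairs

u = {"a": 5, "b": 3, "c": 1}
v = {"a": 4, "b": 2, "c": 5}
print(round(kendall_tau_similarity(u, v), 3))  # → -0.333
```

Note that u and v give similar *values* to items a and b, yet order item c oppositely, so a value-based measure such as Pearson correlation and this rank-based measure can disagree.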
Personalized active learning for collaborative filtering BIBAFull-Text 91-98
  Abhay S. Harpale; Yiming Yang
Collaborative Filtering (CF) requires user-rated training examples for statistical inference about the preferences of new users. Active learning strategies identify the most informative set of training examples through minimum interaction with the users. Current active learning approaches in CF make the implicit and unrealistic assumption that a user can provide a rating for any queried item. This paper introduces a new approach to the problem which does not make such an assumption. We personalize active learning for the user, and query only for those items which the user can provide a rating for. We propose an extended form of Bayesian active learning and use the Aspect Model for CF to illustrate and examine the idea. A comparative evaluation on benchmark datasets shows statistically significant improvements of our method over a well-established baseline method that is representative of existing approaches which do not take personalization into account.

Learning to rank: 1

A boosting algorithm for learning bipartite ranking functions with partially labeled data BIBAFull-Text 99-106
  Massih Reza Amini; Tuong Vinh Truong; Cyril Goutte
This paper presents a boosting-based algorithm for learning a bipartite ranking function (BRF) with partially labeled data. Until now, different attempts have been made to build a BRF in a transductive setting, in which the test points are given to the methods in advance as unlabeled data. The proposed approach is a semi-supervised inductive ranking algorithm which, as opposed to transductive algorithms, is able to infer an ordering on new examples that were not used for its training. We evaluate our approach using the TREC-9 Ohsumed and the Reuters-21578 data collections, comparing against two semi-supervised classification algorithms on ROC Area (AUC), uninterpolated average precision (AUP), mean precision@50 (TP) and Precision-Recall (PR) curves. In the most interesting cases, where irrelevant examples heavily outnumber relevant ones, we show that our method produces statistically significant improvements with respect to these ranking measures.
Directly optimizing evaluation measures in learning to rank BIBAFull-Text 107-114
  Jun Xu; Tie-Yan Liu; Min Lu; Hang Li; Wei-Ying Ma
One of the central issues in learning to rank for information retrieval is to develop algorithms that construct ranking models by directly optimizing evaluation measures used in information retrieval such as Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG). Several such algorithms including SVMmap and AdaRank have been proposed and their effectiveness has been verified. However, the relationships between the algorithms are not clear, and furthermore no comparisons have been conducted between them. In this paper, we conduct a study on the approach of directly optimizing evaluation measures in learning to rank for Information Retrieval (IR). We focus on the methods that minimize loss functions upper bounding the basic loss function defined on the IR measures. We first provide a general framework for the study and analyze the existing algorithms of SVMmap and AdaRank within the framework. The framework is based on upper bound analysis and two types of upper bounds are discussed. Moreover, we show that we can derive new algorithms on the basis of this analysis and create one example algorithm called PermuRank. We have also conducted comparisons between SVMmap, AdaRank, PermuRank, and conventional methods of Ranking SVM and RankBoost, using benchmark datasets. Experimental results show that the methods based on direct optimization of evaluation measures can always outperform conventional methods of Ranking SVM and RankBoost. However, no significant difference exists among the performances of the direct optimization methods themselves.
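For reference, one of the measures being directly optimized, NDCG, is computed as follows (a common textbook formulation using a log2 position discount and the graded label itself as the gain; the judgments below are hypothetical):

```python
import math

def dcg(gains, k):
    """Discounted cumulative gain at cutoff k with a log2 discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg(gains, k):
    """NDCG@k: DCG of the ranking divided by the DCG of the ideal
    (descending-gain) ordering of the same judgments."""
    ideal = dcg(sorted(gains, reverse=True), k)
    return 0.0 if ideal == 0 else dcg(gains, k) / ideal

# Hypothetical graded relevance labels of a ranked list (3 = highly relevant):
print(round(ndcg([3, 2, 3, 0, 1], 5), 3))  # → 0.972
```

Because NDCG (like MAP) depends on the sorted order of documents, it is non-smooth in the model parameters, which is exactly why the surrogate upper-bound losses discussed in this line of work are needed.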
Query dependent ranking using K-nearest neighbor BIBAFull-Text 115-122
  Xiubo Geng; Tie-Yan Liu; Tao Qin; Andrew Arnold; Hang Li; Heung-Yeung Shum
Many ranking models have been proposed in information retrieval, and recently machine learning techniques have also been applied to ranking model construction. Most of the existing methods do not take into consideration the fact that significant differences exist between queries, and resort to a single function for ranking documents. In this paper, we argue that it is necessary to employ different ranking models for different queries and conduct what we call query-dependent ranking. As the first such attempt, we propose a K-Nearest Neighbor (KNN) method for query-dependent ranking. We first consider an online method which creates a ranking model for a given query by using the labeled neighbors of the query in the query feature space, and then ranks the documents with respect to the query using the created model. Next, we give two offline approximations of the method, which create the ranking models in advance to enhance the efficiency of ranking. We also prove a theorem indicating that the approximations are accurate in terms of the difference in prediction loss, provided the learning algorithm used is stable with respect to minor changes in training examples. Our experimental results show that the proposed online and offline methods both outperform the baseline method of using a single ranking function.
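The neighbor-selection step of the online method can be sketched minimally (the query names, feature vectors, and Euclidean distance are illustrative assumptions; a query-dependent ranking model would then be trained on just the returned neighbors):

```python
import math

def k_nearest_queries(query_vec, training_queries, k):
    """Return the names of the k training queries closest to a new
    query in the query feature space (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(training_queries, key=lambda q: dist(query_vec, q[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical labeled training queries as (name, feature vector) pairs:
train = [("q1", (0.0, 0.0)), ("q2", (1.0, 1.0)), ("q3", (5.0, 5.0))]
print(k_nearest_queries((0.9, 1.1), train, 2))  # → ['q2', 'q1']
```

The offline approximations described in the abstract avoid running this search and a model-training step at query time by precomputing models for neighborhoods of the training queries.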

High-performance & high dimensional indexing

Asymmetric distance estimation with sketches for similarity search in high-dimensional spaces BIBAFull-Text 123-130
  Wei Dong; Moses Charikar; Kai Li
Efficient similarity search in high-dimensional spaces is important to content-based retrieval systems. Recent studies have shown that sketches can effectively approximate L1 distance in high-dimensional spaces, and that filtering with sketches can speed up similarity search by an order of magnitude. It is a challenge to further reduce the size of sketches, which are already compact, without compromising accuracy of distance estimation.
   This paper presents an efficient sketch algorithm for similarity search with L2 distances and a novel asymmetric distance estimation technique. Our new asymmetric estimator takes advantage of the original feature vector of the query to boost the distance estimation accuracy. We also apply this asymmetric method to existing sketches for cosine similarity and L1 distance. Evaluations with datasets extracted from images and telephone records show that our L2 sketch outperforms existing methods, and the asymmetric estimators consistently improve the accuracy of different sketch methods. To achieve the same search quality, asymmetric estimators can reduce the sketch size by 10% to 40%.
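The kind of bit sketch involved can be illustrated with a minimal example (thresholded random projections are a common construction for such sketches, not necessarily the paper's exact scheme; the projections and vectors below are fixed for determinism). The symmetric estimator compares two bit sketches by Hamming distance; the asymmetric estimator described in the abstract would additionally keep the query's raw projection values instead of its bits, which is what sharpens the estimate.

```python
def make_sketch(vec, projections, thresholds):
    """One bit per projection: set when the projected value of the
    vector exceeds the corresponding threshold."""
    return [1 if sum(v * r for v, r in zip(vec, proj)) > t else 0
            for proj, t in zip(projections, thresholds)]

def symmetric_estimate(sketch_a, sketch_b):
    """Symmetric estimator: approximate the distance between the
    original vectors by the Hamming distance of their sketches."""
    return sum(x != y for x, y in zip(sketch_a, sketch_b))

# Fixed (hypothetical) projections and thresholds:
projs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ts = [0.5, 0.5, 1.0]
a = make_sketch((1.0, 0.0), projs, ts)
b = make_sketch((0.0, 1.0), projs, ts)
print(symmetric_estimate(a, b))  # → 2
```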
ResIn: a combination of results caching and index pruning for high-performance web search engines BIBAFull-Text 131-138
  Gleb Skobeltsyn; Flavio Junqueira; Vassilis Plachouras; Ricardo Baeza-Yates
Results caching is an efficient technique for reducing the query processing load, hence it is commonly used in real search engines. This technique, however, bounds the maximum hit rate due to the large fraction of singleton queries, which is an important limitation. In this paper we propose ResIn -- an architecture that uses a combination of results caching and index pruning to overcome this limitation.
   We argue that results caching is an inexpensive and efficient way to reduce the query processing load and show that it is cheaper to implement compared to a pruned index. At the same time, we show that index pruning performance is fundamentally affected by the changes in the query traffic that the results cache induces. We experiment with real query logs and a large document collection, and show that the combination of both techniques enables efficient reduction of the query processing costs and thus is practical to use in Web search engines.
Reorganizing compressed text BIBAFull-Text 139-146
  Nieves R. Brisaboa; Antonio Fariña; Susana Ladra; Gonzalo Navarro
Recent research has demonstrated beyond doubt the benefits of compressing natural language texts using word-based statistical semistatic compression. Not only does it achieve extremely competitive compression rates, but direct search on the compressed text can also be carried out faster than on the original text; indexing based on inverted lists benefits from compression as well.
   Such compression methods assign a variable-length codeword to each different text word. Some coding methods (Plain Huffman and Restricted Prefix Byte Codes) do not clearly mark codeword boundaries, and hence cannot be accessed at random positions nor searched with the fastest text search algorithms. Other coding methods (Tagged Huffman, End-Tagged Dense Code, or (s, c)-Dense Code) do mark codeword boundaries, achieving a self-synchronization property that enables fast search and random access, in exchange for some loss in compression effectiveness.
   In this paper, we show that by just performing a simple reordering of the target symbols in the compressed text (more precisely, reorganizing the bytes into a wavelet-tree-like shape) and using little additional space, searching capabilities are greatly improved without a drastic impact on compression and decompression times. With this approach, all the codes achieve synchronism and can be searched fast and accessed at arbitrary points. Moreover, the reordered compressed text becomes an implicitly indexed representation of the text, which can be searched for words in time independent of the text length. That is, we achieve not only fast sequential search time, but indexed search time, for almost no extra space cost.
   We experiment with three well-known word-based compression techniques with different characteristics (Plain Huffman, End-Tagged Dense Code and Restricted Prefix Byte Codes), and show the searching capabilities achieved by reordering the compressed representation on several corpora. We show that the reordered versions are not only much more efficient than their classical counterparts, but also more efficient than explicit inverted indexes built on the collection, when using the same amount of space.

User adaptation & personalization

User adaptation: good results from poor systems BIBAFull-Text 147-154
  Catherine L. Smith; Paul B. Kantor
Several recent studies have found only a weak relationship between the performance of a retrieval system and the "success" achievable by human searchers. We hypothesize that searchers are successful precisely because they alter their behavior. To explore the possible causal relation between system performance and search behavior, we control system performance, hoping to elicit adaptive search behaviors. 36 subjects each completed 12 searches using either a standard system or one of two degraded systems. Using a general linear model, we isolate the main effect of system performance, by measuring and removing main effects due to searcher variation, topic difficulty, and the position of each search in the time series. We find that searchers using our degraded systems are as successful as those using the standard system, but that, in achieving this success, they alter their behavior in ways that could be measured, in real time, by a suitably instrumented system. Our findings suggest, quite generally, that some aspects of behavioral dynamics may provide unobtrusive indicators of system performance.
Exploring folksonomy for personalized search BIBAFull-Text 155-162
  Shengliang Xu; Shenghua Bao; Ben Fei; Zhong Su; Yong Yu
As a social service in Web 2.0, folksonomy provides users with the ability to save and organize their bookmarks online with "social annotations" or "tags". Social annotations are high quality descriptors of the web pages' topics as well as good indicators of web users' interests. We propose a personalized search framework to utilize folksonomy for personalized search. Specifically, three properties of folksonomy, namely the categorization, keyword, and structure properties, are explored. In the framework, the rank of a web page is decided not only by the term matching between the query and the web page's content but also by the topic matching between the user's interests and the web page's topics. In the evaluation, we propose an automatic evaluation framework based on folksonomy data, which helps reduce the typically high cost of evaluating personalized search. A series of experiments are conducted using two heterogeneous data sets, one crawled from Del.icio.us and the other from Dogear. Extensive experimental results show that our personalized search approach can significantly improve the search quality.
To personalize or not to personalize: modeling queries with variation in user intent BIBAFull-Text 163-170
  Jaime Teevan; Susan T. Dumais; Daniel J. Liebling
In most previous work on personalized search algorithms, the results for all queries are personalized in the same manner. However, as we show in this paper, there is a lot of variation across queries in the benefits that can be achieved through personalization. For some queries, everyone who issues the query is looking for the same thing. For other queries, different people want very different results even though they express their need in the same way. We examine variability in user intent using both explicit relevance judgments and large-scale log analysis of user behavior patterns. While variation in user behavior is correlated with variation in explicit relevance judgments for the same query, there are many other factors, such as result entropy, result quality, and task, that can also affect the variation in behavior. We characterize queries using a variety of features of the query, the results returned for the query, and people's interaction history with the query. Using these features we build predictive models to identify queries that can benefit from personalization.
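One of the behavioral signals mentioned, variability in what users click for a query, is commonly summarized as click entropy. A sketch with hypothetical click counts (the paper's exact feature set may differ):

```python
import math

def click_entropy(click_counts):
    """Entropy of the distribution of clicks over results for a query.
    Higher entropy suggests users want different things from the same
    query, hence more potential benefit from personalization."""
    total = sum(click_counts)
    probs = [c / total for c in click_counts if c > 0]
    return sum(-p * math.log2(p) for p in probs)

print(click_entropy([50, 50]))  # clicks split evenly over two results → 1.0
print(click_entropy([100]))     # everyone clicks the same result → 0.0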

Clustering: 1

The opposite of smoothing: a language model approach to ranking query-specific document clusters BIBAFull-Text 171-178
  Oren Kurland
Exploiting information induced from (query-specific) clustering of top-retrieved documents has long been proposed as a means for improving precision at the very top ranks of the returned results. We present a novel language model approach to ranking query-specific clusters by the presumed percentage of relevant documents that they contain. While most previous cluster ranking approaches focus on the cluster as a whole, our model also exploits information induced from documents associated with the cluster. Our model substantially outperforms previous approaches for identifying clusters containing a high relevant-document percentage. Furthermore, using the model to produce a document ranking yields precision-at-top-ranks performance that is consistently better than that of the initial ranking upon which clustering is performed; the performance also compares favorably with that of a state-of-the-art pseudo-feedback retrieval method.
Enhancing text clustering by leveraging Wikipedia semantics BIBAFull-Text 179-186
  Jian Hu; Lujun Fang; Yang Cao; Hua-Jun Zeng; Hua Li; Qiang Yang; Zheng Chen
Most traditional text clustering methods are based on the "bag of words" (BOW) representation, which relies on frequency statistics over a set of documents. BOW, however, ignores important information about the semantic relationships between key terms. To overcome this problem, several methods have been proposed to enrich text representation with external resources such as WordNet. However, many of these approaches suffer from limitations: 1) WordNet has limited coverage and lacks effective word-sense disambiguation ability; 2) most text representation enrichment strategies, which append or replace document terms with their hypernyms and synonyms, are overly simple. In this paper, to overcome these deficiencies, we first propose a way to build a concept thesaurus based on the semantic relations (synonym, hypernym, and associative relation) extracted from Wikipedia. Then, we develop a unified framework that leverages these semantic relations to enhance the traditional content similarity measure for text clustering. The experimental results on the Reuters and OHSUMED datasets show that, with the help of the Wikipedia thesaurus, the clustering performance of our method is improved over previous methods. In addition, with the weights for hypernym, synonym, and associative concepts tuned with the help of a small amount of user-provided labeled data, the clustering performance can be further improved.
Knowledge transformation from word space to document space BIBAFull-Text 187-194
  Tao Li; Chris Ding; Yi Zhang; Bo Shao
In most IR clustering problems, we directly cluster the documents, working in the document space, using cosine similarity between documents as the similarity measure. In many real-world applications, however, we usually have knowledge on the word side and wish to transform this knowledge to the document (concept) side. In this paper, we provide a mechanism for this knowledge transformation. To the best of our knowledge, this is the first model for this type of knowledge transformation. The model uses a nonnegative matrix factorization X = FSG^T, where X is the word-document semantic matrix, F is the posterior probability of a word belonging to a word cluster and represents knowledge in the word space, G is the posterior probability of a document belonging to a document cluster and represents knowledge in the document space, and S is a scaled matrix factor which provides a condensed view of X. We show how knowledge on words can improve document clustering, i.e., how knowledge in the word space is transformed into the document space. We perform extensive experiments to validate our approach.

Multilingual & crosslingual retrieval

A study of learning a merge model for multilingual information retrieval BIBAFull-Text 195-202
  Ming-Feng Tsai; Yu-Ting Wang; Hsin-Hsi Chen
This paper proposes a learning approach for the merging process in multilingual information retrieval (MLIR). To support the learning approach, we also present a large number of features that may influence the MLIR merging process; these features are mainly extracted at three levels: query, document, and translation. After the feature extraction, we then use the FRank ranking algorithm to construct a merge model; to our knowledge, this is the first attempt to use a learning-based ranking algorithm to construct a merge model for MLIR merging. In our experiments, three test collections for the crosslingual information retrieval (CLIR) task in NTCIR 3, 4, and 5 are employed to assess the performance of our proposed method; moreover, several merging methods are also carried out for comparison, including traditional merging methods, the 2-step merging strategy, and the merging method based on logistic regression. The experimental results show that our method can significantly improve merging quality on two different types of datasets. In addition to this effectiveness, the merge model generated by FRank allows our method to identify key factors that influence the merging process; this information can provide more insight into, and a better understanding of, MLIR merging.
Bilingual topic aspect classification with a few training examples BIBAFull-Text 203-210
  Yejun Wu; Douglas W. Oard
This paper explores topic aspect (i.e., subtopic or facet) classification for English and Chinese collections. The evaluation model assumes a bilingual user who has found documents on a topic and identified a few passages in each language on aspects of that topic. Additional passages are then automatically labeled using a k-Nearest-Neighbor classifier and local (i.e., result set) Latent Semantic Analysis. Experiments show that when few training examples are available in either language, classification using training examples from both languages can often achieve higher effectiveness than using training examples from just one language. When the total number of training examples is held constant, classification effectiveness correlates positively with the fraction of same-language training examples in the training set. These results suggest that supervised classification can benefit from hand-annotating a few same-language examples, and that when performing classification in bilingual collections it is useful to label some examples in each language.
Crosslingual location search BIBAFull-Text 211-218
  Tanuja Joshi; Joseph Joy; Tobias Kellner; Udayan Khurana; A. Kumaran; Vibhuti Sengar
Address geocoding, the process of finding the map location for a structured postal address, is a relatively well-studied problem. In this paper we consider the more general problem of crosslingual location search, where the queries are not limited to postal addresses, and the language and script used in the search query is different from the one in which the underlying data is stored. To the best of our knowledge, our system is the first crosslingual location search system that is able to geocode complex addresses. We use a statistical machine transliteration system to convert location names from the script of the query to that of the stored data. However, we show that it is not sufficient to simply feed the resulting transliterations into a monolingual geocoding system, as the ambiguity inherent in the conversion drastically expands the location search space and significantly lowers the quality of results. The strength of our approach lies in its integrated, end-to-end nature: we use abstraction and fuzzy search (in the text domain) to achieve maximum coverage despite transliteration ambiguities, while applying spatial constraints (in the geographic domain) to focus only on viable interpretations of the query. Our experiments with structured and unstructured queries in a set of diverse languages and scripts (Arabic, English, Hindi, and Japanese), searching for locations in different regions of the world, show full crosslingual location search accuracy at levels comparable to those of commercial monolingual systems. We achieve these levels of performance using techniques that may be applied to crosslingual searches in any language/script and over arbitrary spatial data.

Relevance feedback

A study of methods for negative relevance feedback BIBAFull-Text 219-226
  Xuanhui Wang; Hui Fang; ChengXiang Zhai
Negative relevance feedback is a special case of relevance feedback where we do not have any positive examples; this often happens when the topic is difficult and the search results are poor. Although in principle any standard relevance feedback technique can be applied to negative relevance feedback, it may not perform well due to the lack of positive examples. In this paper, we conduct a systematic study of methods for negative relevance feedback. We compare a set of representative negative feedback methods, covering vector-space models and language models, as well as several special heuristics for negative feedback. Evaluating negative feedback methods requires a test set with a sufficient number of difficult topics, but there are not many naturally difficult topics in existing test collections. We use two sampling strategies to adapt a test collection with easy topics to evaluate negative feedback. Experimental results on several TREC collections show that language model based negative feedback methods are generally more effective than those based on vector-space models, and that using multiple negative models is an effective heuristic for negative feedback. Our results also show that it is feasible to adapt test collections with easy topics for evaluating negative feedback methods through sampling.
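The idea of a language-model-based negative feedback method, as studied in the paper above, can be sketched as scoring a document under the query model and subtracting a penalty for its closeness to a model estimated from known non-relevant documents. This is a minimal illustrative sketch under assumed names and a Dirichlet-smoothing choice, not the paper's specific algorithm; the weight `beta` is an assumption.

```python
import math
from collections import Counter

def lm_score(doc_tokens, model, collection, mu=2000.0):
    """Score a document under a unigram model (query model or negative
    model) using Dirichlet smoothing against collection statistics."""
    tf = Counter(doc_tokens)
    dlen = len(doc_tokens)
    csize = sum(collection.values())
    score = 0.0
    for w, p_w in model.items():
        p_c = collection.get(w, 0.5) / csize        # collection back-off
        p_d = (tf.get(w, 0) + mu * p_c) / (dlen + mu)
        score += p_w * math.log(p_d)                # cross-entropy term
    return score

def negative_feedback_score(doc_tokens, query_model, neg_model,
                            collection, beta=0.5):
    """Penalize documents close to a negative model: a hypothetical
    single-negative-model heuristic with an assumed weight beta."""
    return (lm_score(doc_tokens, query_model, collection)
            - beta * lm_score(doc_tokens, neg_model, collection))
```

A document matching the negative model's terms receives a lower final score than one that matches only the query model, which is the core intuition behind language-model negative feedback.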
A bayesian logistic regression model for active relevance feedback BIBAFull-Text 227-234
  Zuobing Xu; Ram Akella
Relevance feedback, which traditionally uses the terms in the relevant documents to enrich the user's initial query, is an effective method for improving retrieval performance. Traditional relevance feedback algorithms lead to overfitting because of the limited amount of training data and the large term space. This paper introduces an online Bayesian logistic regression algorithm to incorporate relevance feedback information. The new approach addresses the overfitting problem by projecting the original feature space onto a more compact set which retains the necessary information. The new set of features consists of the original retrieval score, the distance to the relevant documents, and the distance to the non-relevant documents. To reduce the human evaluation effort in ascertaining relevance, we introduce a new active learning algorithm based on variance reduction to actively select documents for user evaluation. The new active learning algorithm aims to select feedback documents that reduce the model variance. The variance reduction approach captures the relevance, diversity, and uncertainty of the unlabeled documents in a principled manner; these are the critical factors of active learning identified in previous literature. Experiments with several TREC datasets demonstrate the effectiveness of the proposed approach.
A cluster-based resampling method for pseudo-relevance feedback BIBAFull-Text 235-242
  Kyung Soon Lee; W. Bruce Croft; James Allan
Typical pseudo-relevance feedback methods assume the top-retrieved documents are relevant and use these pseudo-relevant documents to expand terms. The initial retrieval set can, however, contain a great deal of noise. In this paper, we present a cluster-based resampling method to select better pseudo-relevant documents based on the relevance model. The main idea is to use document clusters to find dominant documents for the initial retrieval set, and to repeatedly feed the documents to emphasize the core topics of a query. Experimental results on large-scale web TREC collections show significant improvements over the relevance model. For justification of the resampling approach, we examine relevance density of feedback documents. A higher relevance density will result in greater retrieval accuracy, ultimately approaching true relevance feedback. The resampling approach shows higher relevance density than the baseline relevance model on all collections, resulting in better retrieval accuracy in pseudo-relevance feedback. This result indicates that the proposed method is effective for pseudo-relevance feedback.
Selecting good expansion terms for pseudo-relevance feedback BIBAFull-Text 243-250
  Guihong Cao; Jian-Yun Nie; Jianfeng Gao; Stephen Robertson
Pseudo-relevance feedback assumes that the most frequent terms in the pseudo-feedback documents are useful for retrieval. In this study, we re-examine this assumption and show that it does not hold in reality -- many expansion terms identified by traditional approaches are in fact unrelated to the query and harmful to retrieval. We also show that good expansion terms cannot be distinguished from bad ones merely by their distributions in the feedback documents and in the whole collection. We then propose to integrate a term classification process to predict the usefulness of expansion terms. Multiple additional features can be integrated in this process. Our experiments on three TREC collections show that retrieval effectiveness can be much improved when term classification is used. In addition, we also demonstrate that good terms should be identified directly according to their possible impact on retrieval effectiveness, i.e., using supervised learning instead of unsupervised learning.

Learning to rank: 2

Learning to rank with partially-labeled data BIBAFull-Text 251-258
  Kevin Duh; Katrin Kirchhoff
Ranking algorithms, whose goal is to appropriately order a set of objects/documents, are an important component of information retrieval systems. Previous work on ranking algorithms has focused on cases where only labeled data is available for training (i.e. supervised learning). In this paper, we consider the question whether unlabeled (test) data can be exploited to improve ranking performance. We present a framework for transductive learning of ranking functions and show that the answer is affirmative. Our framework is based on generating better features from the test data (via KernelPCA) and incorporating such features via Boosting, thus learning different ranking functions adapted to the individual test queries. We evaluate this method on the LETOR (TREC, OHSUMED) dataset and demonstrate significant improvements.
Learning to rank with SoftRank and Gaussian processes BIBAFull-Text 259-266
  John Guiver; Edward Snelson
In this paper we address the issue of learning to rank for document retrieval using Thurstonian models based on sparse Gaussian processes. Thurstonian models represent each document for a given query as a probability distribution in a score space; these distributions over scores naturally give rise to distributions over document rankings. However, in general we do not have observed rankings with which to train the model; instead, each document in the training set is judged to have a particular relevance level: for example "Bad", "Fair", "Good", or "Excellent". The performance of the model is then evaluated using information retrieval (IR) metrics such as Normalised Discounted Cumulative Gain (NDCG). Recently Taylor et al. presented a method called SoftRank which allows the direct gradient optimisation of a smoothed version of NDCG using a Thurstonian model. In this approach, document scores are represented by the outputs of a neural network, and score distributions are created artificially by adding random noise to the scores. The SoftRank mechanism is a general one; it can be applied to different IR metrics and can make use of different underlying models. In this paper we extend the SoftRank framework to make use of the score uncertainties which are naturally provided by a Gaussian process (GP), a probabilistic non-linear regression model. We further develop the model by using sparse Gaussian process techniques, which give improved performance and efficiency, and show competitive results against baseline methods when tested on the publicly available LETOR OHSUMED data set. We also explore how the available uncertainty information can be used in prediction and how it affects model performance.
Learning to rank at query-time using association rules BIBAFull-Text 267-274
  Adriano A. Veloso; Humberto M. Almeida; Marcos A. Gonçalves; Wagner Meira, Jr.
Some applications have to present their results in the form of ranked lists. This is the case for many information retrieval applications, in which documents must be sorted according to their relevance to a given query. This has led to interest within the information retrieval community in methods that automatically learn effective ranking functions. In this paper we propose a novel method which uncovers patterns (or rules) in the training data associating features of the document with its relevance to the query, and then uses the discovered rules to rank documents. To address typical problems that are inherent to the utilization of association rules (such as missing rules and rule explosion), the proposed method generates rules on a demand-driven basis, at query-time. The result is an extremely fast and effective ranking method. We conducted a systematic evaluation of the proposed method using the LETOR benchmark collections. We show that generating rules on a demand-driven basis can boost ranking performance, providing gains ranging from 12% to 123% and outperforming state-of-the-art methods that learn to rank, with no need for time-consuming and laborious pre-processing. As a highlight, we also show that additional information, such as query terms, can make the generated rules more discriminative, further improving ranking performance.
Learning to rank with ties BIBAFull-Text 275-282
  Ke Zhou; Gui-Rong Xue; Hongyuan Zha; Yong Yu
Designing effective ranking functions is a core problem for information retrieval and Web search, since the ranking functions directly impact the relevance of the search results. The problem has been the focus of much of the research at the intersection of Web search and machine learning, and learning ranking functions from preference data in particular has recently attracted much interest. The objective of this paper is to empirically examine several objective functions that can be used for learning ranking functions from preference data. Specifically, we investigate the role of ties in the learning process. By ties, we mean preference judgments that two documents have an equal degree of relevance with respect to a query. This type of data has largely been ignored or not properly modeled in the past. In this paper, we analyze the properties of ties and develop novel learning frameworks which combine ties and preference data using statistical paired comparison models to improve the performance of learned ranking functions. The resulting optimization problems, which explicitly incorporate ties and preference data, are solved using gradient boosting methods. Experimental studies are conducted using three publicly available data sets, and they demonstrate the effectiveness of the proposed new methods.

Summarization

Query-sensitive mutual reinforcement chain and its application in query-oriented multi-document summarization BIBAFull-Text 283-290
  Furu Wei; Wenjie Li; Qin Lu; Yanxiang He
Sentence ranking is the issue of most concern in document summarization. Earlier researchers presented the mutual reinforcement principle (MR) between sentence and term for simultaneous key-phrase and salient-sentence extraction in generic single-document summarization. In this work, we extend MR to a mutual reinforcement chain (MRC) over three text granularities, i.e., document, sentence, and term. The aim is to provide a general reinforcement framework and a formal mathematical model for the MRC. Going one step further, we incorporate query influence into the MRC to meet the need for query-oriented multi-document summarization. While previous summarization approaches often calculate similarity regardless of the query, we develop a query-sensitive similarity to measure the affinity between a pair of texts. When evaluated on the DUC 2005 dataset, the experimental results suggest that the proposed query-sensitive MRC (Qs-MRC) is a promising approach to summarization.
Comments-oriented document summarization: understanding documents with readers' feedback BIBAFull-Text 291-298
  Meishan Hu; Aixin Sun; Ee-Peng Lim
Comments left by readers on Web documents contain valuable information that can be utilized in different information retrieval tasks, including document search, visualization, and summarization. In this paper, we study the problem of comments-oriented document summarization and aim to summarize a Web document (e.g., a blog post) by considering not only its content but also the comments left by its readers. We identify three relations (namely, topic, quotation, and mention) by which comments can be linked to one another, and model these relations as three graphs. The importance of each comment is then scored by: (i) a graph-based method, where the three graphs are merged into a multi-relation graph; or (ii) a tensor-based method, where the three graphs are used to construct a 3rd-order tensor. To generate a comments-oriented summary, we extract sentences from the given Web document using either a feature-biased approach or a uniform-document approach. The former biases sentence scores toward keywords derived from the comments, while the latter scores sentences uniformly together with the comments. In our experiments using a set of blog posts with manually labeled sentences, our proposed summarization methods utilizing comments showed significant improvement over those not using comments. The methods using the feature-biased sentence extraction approach were observed to outperform those using the uniform-document approach.
Multi-document summarization using cluster-based link analysis BIBAFull-Text 299-306
  Xiaojun Wan; Jianwu Yang
The Markov Random Walk model has recently been exploited for multi-document summarization by making use of the link relationships between sentences in the document set, under the assumption that all the sentences are indistinguishable from one another. However, a given document set usually covers a few topic themes, with each theme represented by a cluster of sentences. The topic themes are usually not equally important, and the sentences in an important theme cluster are deemed more salient than the sentences in a trivial theme cluster. This paper proposes the Cluster-based Conditional Markov Random Walk Model (ClusterCMRW) and the Cluster-based HITS Model (ClusterHITS) to fully leverage the cluster-level information. Experimental results on the DUC2001 and DUC2002 datasets demonstrate the effectiveness of our proposed summarization models. The results also demonstrate that the ClusterCMRW model is more robust than the ClusterHITS model with respect to different cluster numbers.
Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization BIBAFull-Text 307-314
  Dingding Wang; Tao Li; Shenghuo Zhu; Chris Ding
Multi-document summarization aims to create a compressed summary while retaining the main characteristics of the original set of documents. Many approaches use statistics and machine learning techniques to extract sentences from documents. In this paper, we propose a new multi-document summarization framework based on sentence-level semantic analysis and symmetric non-negative matrix factorization. We first calculate sentence-sentence similarities using semantic analysis and construct the similarity matrix. Then symmetric matrix factorization, which has been shown to be equivalent to normalized spectral clustering, is used to group sentences into clusters. Finally, the most informative sentences are selected from each group to form the summary. Experimental results on the DUC2005 and DUC2006 data sets demonstrate the improvement of our proposed framework over existing summarization systems that we implemented. A further study of the factors that contribute to the high performance is also conducted.

Exploratory search & filtering

Algorithmic mediation for collaborative exploratory search BIBAFull-Text 315-322
  Jeremy Pickens; Gene Golovchinsky; Chirag Shah; Pernilla Qvarfordt; Maribeth Back
We describe a new approach to information retrieval: algorithmic mediation for intentional, synchronous collaborative exploratory search. Using our system, two or more users with a common information need search together, simultaneously. The collaborative system provides tools, user interfaces and, most importantly, algorithmically-mediated retrieval to focus, enhance and augment the team's search and communication activities. Collaborative search outperformed post hoc merging of similarly instrumented single-user runs. Algorithmic mediation improved both collaborative search (allowing a team of searchers to find relevant information more efficiently and effectively) and exploratory search (allowing the searchers to find relevant information that cannot be found while working individually).
Exploiting correlated keywords to improve approximate information filtering BIBAFull-Text 323-330
  Christian Zimmer; Christos Tryfonopoulos; Gerhard Weikum
Information filtering, also referred to as publish/subscribe, complements one-time searching, since users are able to subscribe to information sources and be notified whenever new documents of interest are published. In approximate information filtering, only selected information sources that are likely to publish documents relevant to the user's interests in the future are monitored. To achieve this functionality, a subscriber exploits statistical metadata to identify promising publishers and indexes its continuous query only in those publishers. The statistics are maintained in a directory, usually on a per-keyword basis, thus disregarding possible correlations among keywords. With this coarse information, poor publisher selection may lead to poor filtering performance and thus the loss of interesting documents.
   Based on the above observation, this work extends query routing techniques from the domain of distributed information retrieval in peer-to-peer (P2P) networks, and provides new algorithms for exploiting the correlation among keywords in a filtering setting. We develop and evaluate two algorithms based on single-key and multi-key statistics and utilize two different synopses (Hash Sketches and KMV synopses) to compactly represent publishers. Our experimental evaluation using two real-life corpora with web and blog data demonstrates the filtering effectiveness of both approaches and highlights the different tradeoffs.

Web-search: 2

A user browsing model to predict search engine click data from past observations BIBAFull-Text 331-338
  Georges E. Dupret; Benjamin Piwowarski
Search engine click logs provide an invaluable source of relevance information, but this information is biased because we do not know which documents from the result list the users actually saw before and after they clicked. Otherwise, we could estimate document relevance by simple counting. In this paper, we propose a set of assumptions on user browsing behavior that allows the estimation of the probability that a document is seen, thereby providing an unbiased estimate of document relevance. To train, test, and compare our model to the best alternatives described in the literature, we gather a large set of real data and perform an extensive cross-validation experiment. Our solution significantly outperforms all previous models. As a side effect, we gain insight into the browsing behavior of users, and we can compare it to the conclusions of the eye-tracking experiments by Joachims et al. [12]. In particular, our findings confirm that a user almost always sees the document directly after a clicked document. They also explain why documents situated just after a very relevant document are clicked more often.
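The debiasing idea in the abstract above, dividing observed clicks by the probability that a document was actually examined, can be illustrated with the simplest position-based examination model. This is a minimal sketch of the general click-model family, not the authors' browsing model; the function name and the `gamma` examination probabilities per rank are assumptions.

```python
from collections import defaultdict

def debiased_relevance(click_log, gamma):
    """Position-model sketch: P(click) = P(examined at rank) * relevance,
    so relevance ~ clicks / sum of examination probabilities.
    click_log is a list of (doc, rank, clicked) tuples; gamma[rank] is
    an assumed per-rank examination probability."""
    clicks = defaultdict(float)
    exams = defaultdict(float)
    for doc, rank, clicked in click_log:
        exams[doc] += gamma[rank]        # expected examinations
        clicks[doc] += 1.0 if clicked else 0.0
    return {d: clicks[d] / exams[d] for d in exams}
```

A document shown low on the page (small gamma) gets its few clicks scaled up, correcting the position bias that raw click-through rates suffer from.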
Learning query intent from regularized click graphs BIBAFull-Text 339-346
  Xiao Li; Ye-Yi Wang; Alex Acero
This work presents the use of click graphs to improve query intent classifiers, which are critical if vertical search and general-purpose search services are to be offered in a unified user interface. Previous work on query classification has primarily focused on improving the feature representation of queries, e.g., by augmenting queries with search engine results. In this work, we investigate a completely orthogonal approach -- instead of enriching the feature representation, we aim at drastically increasing the amount of training data by semi-supervised learning with click graphs. Specifically, we infer class memberships of unlabeled queries from those of labeled ones according to their proximities in a click graph. Moreover, we regularize the learning with click graphs by content-based classification to avoid propagating erroneous labels. We demonstrate the effectiveness of our algorithms in two different applications, product intent and job intent classification. In both cases, we expand the training data with automatically labeled queries by over two orders of magnitude, leading to significant improvements in classification performance. An additional finding is that with a large amount of training data obtained in this fashion, classifiers using only query words/phrases as features can work remarkably well.
Retrieval and feedback models for blog feed search BIBAFull-Text 347-354
  Jonathan L. Elsas; Jaime Arguello; Jamie Callan; Jaime G. Carbonell
Blog feed search poses interesting challenges that differ from those of traditional ad hoc document retrieval. The units of retrieval, the blogs, are collections of documents, the blog posts. In this work we adapt a state-of-the-art federated search model to the feed retrieval task, showing a significant improvement over algorithms based on the best-performing submissions to the TREC 2007 Blog Distillation task [12]. We also show that typical query expansion techniques, such as pseudo-relevance feedback using the blog corpus, do not provide any significant performance improvement and in many cases dramatically hurt performance. We perform an in-depth analysis of the behavior of pseudo-relevance feedback for this task and develop a novel query expansion technique using the link structure in Wikipedia. This query expansion technique provides significant and consistent performance improvements for this task, yielding 22% and 14% improvements in MAP over the unexpanded query for our baseline and federated algorithms, respectively.

Multimedia retrieval

Learning to reduce the semantic gap in web image retrieval and annotation BIBAFull-Text 355-362
  Changhu Wang; Lei Zhang; Hong-Jiang Zhang
We study in this paper the problem of bridging the semantic gap between low-level image features and high-level semantic concepts, which is the key hindrance in content-based image retrieval. Guided by the rich textual information of Web images, the proposed framework tries to learn a new distance measure in the visual space, which can be used to retrieve more semantically relevant images for any unseen query image. The framework differs from traditional distance metric learning methods in the following ways. 1) A ranking-based distance metric learning method is proposed for the image retrieval problem, optimizing the leave-one-out retrieval performance on the training data. 2) To be scalable, millions of images together with rich textual information have been crawled from the Web to learn the similarity measure, and the learning framework particularly considers the indexing problem to ensure retrieval efficiency. 3) To alleviate the noise in the unbalanced labels of images and fully utilize the textual information, a Latent Dirichlet Allocation based topic-level text model is introduced to define pairwise semantic similarity between any two images. The learnt distance measure can be directly applied to applications such as content-based image retrieval and search-based image annotation. Experimental results on the two applications, over a database of two million Web images, show both the effectiveness and efficiency of the proposed framework.
A lattice-based approach to query-by-example spoken document retrieval BIBAFull-Text 363-370
  Tee Kiah Chia; Khe Chai Sim; Haizhou Li; Hwee Tou Ng
Recent efforts on the task of spoken document retrieval (SDR) have made use of speech lattices: speech lattices contain information about alternative speech transcription hypotheses other than the 1-best transcripts, and this information can improve retrieval accuracy by overcoming recognition errors present in the 1-best transcription. In this paper, we look at using lattices for the query-by-example spoken document retrieval task -- retrieving documents from a speech corpus, where the queries are themselves in the form of complete spoken documents (query exemplars). We extend a previously proposed method for SDR with short queries to the query-by-example task. Specifically, we use a retrieval method based on statistical modeling: we compute expected word counts from document and query lattices, estimate statistical models from these counts, and compute relevance scores as divergences between these models. Experimental results on a speech corpus of conversational English show that the use of statistics from lattices for both documents and query exemplars results in better retrieval accuracy than using only 1-best transcripts for either documents, or queries, or both. In addition, we investigate the effect of stop word removal which further improves retrieval accuracy. To our knowledge, our work is the first to have used a lattice-based approach to query-by-example spoken document retrieval.
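The statistical ranking described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the (expected) word counts have already been read off the document and query lattices, and it uses standard Dirichlet smoothing with a negative KL-divergence score.

```python
import math

def lm_estimate(counts, collection_probs, mu=1000.0):
    """Dirichlet-smoothed unigram model from (expected) word counts.

    In the lattice setting, `counts` would be expectations over
    transcription hypotheses rather than 1-best counts.
    """
    total = sum(counts.values())
    return {w: (counts.get(w, 0.0) + mu * collection_probs[w]) / (total + mu)
            for w in collection_probs}

def neg_kl_score(query_model, doc_model):
    """Rank documents by -KL(query || document); higher is better."""
    return -sum(p * math.log(p / doc_model[w])
                for w, p in query_model.items() if p > 0.0)
```

In the query-by-example task, the query model would itself be estimated from the exemplar document's lattice in the same way as the document models.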

Query analysis & models: 1

A few examples go a long way: constructing query models from elaborate query formulations BIBAFull-Text 371-378
  Krisztian Balog; Wouter Weerkamp; Maarten de Rijke
We address a specific enterprise document search scenario, where the information need is expressed in an elaborate manner. In our scenario, information needs are expressed using a short query (of a few keywords) together with examples of key reference pages. Given this setup, we investigate how the examples can be utilized to improve the end-to-end performance on the document retrieval task. Our approach is based on a language modeling framework, where the query model is modified to resemble the example pages. We compare several methods for sampling expansion terms from the example pages to support query-dependent and query-independent query expansion; the latter is motivated by the wish to increase "aspect recall", and attempts to uncover aspects of the information need not captured by the query.
   For evaluation purposes we use the CSIRO data set created for the TREC 2007 Enterprise track. The best performance is achieved by query models based on query-independent sampling of expansion terms from the example documents.
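A minimal sketch of the query-independent variant, under the assumption that expansion simply interpolates the query's maximum-likelihood model with the most frequent terms of the example pages; the function and parameter names are illustrative, not the paper's.

```python
from collections import Counter

def expand_query_model(query_terms, example_docs, n_terms=10, lam=0.5):
    """Mix the MLE query model with terms sampled (query-independently)
    from the example documents, via simple linear interpolation."""
    q = Counter(query_terms)
    q_model = {w: c / len(query_terms) for w, c in q.items()}
    # query-independent sampling: most frequent terms across the examples
    ex = Counter(w for doc in example_docs for w in doc)
    top = ex.most_common(n_terms)
    total = sum(c for _, c in top)
    ex_model = {w: c / total for w, c in top}
    return {w: lam * q_model.get(w, 0.0) + (1 - lam) * ex_model.get(w, 0.0)
            for w in set(q_model) | set(ex_model)}
```

The expanded model adds terms never seen in the short query, which is how such sampling can raise "aspect recall".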
A unified and discriminative model for query refinement BIBAFull-Text 379-386
  Jiafeng Guo; Gu Xu; Hang Li; Xueqi Cheng
This paper addresses the issue of query refinement, which involves reformulating ill-formed search queries in order to enhance relevance of search results. Query refinement typically includes a number of tasks such as spelling error correction, word splitting, word merging, phrase segmentation, word stemming, and acronym expansion. In previous research, such tasks were addressed separately or through employing generative models. This paper proposes employing a unified and discriminative model for query refinement. Specifically, it proposes a Conditional Random Field (CRF) model suitable for the problem, referred to as Conditional Random Field for Query Refinement (CRF-QR). Given a sequence of query words, CRF-QR predicts a sequence of refined query words as well as corresponding refinement operations. In that sense, CRF-QR differs greatly from conventional CRF models. Two types of CRF-QR models, namely a basic model and an extended model are introduced. One merit of employing CRF-QR is that different refinement tasks can be performed simultaneously and thus the accuracy of refinement can be enhanced. Furthermore, the advantages of discriminative models over generative models can be fully leveraged. Experimental results demonstrate that CRF-QR can significantly outperform baseline methods. Furthermore, when CRF-QR is used in web search, a significant improvement of relevance can be obtained.
Query expansion using gaze-based feedback on the subdocument level BIBAFull-Text 387-394
  Georg Buscher; Andreas Dengel; Ludger van Elst
We examine the effect of incorporating gaze-based attention feedback from the user on personalizing the search process. Employing eye tracking data, we keep track of document parts the user read in some way. We use this information on the subdocument level as implicit feedback for query expansion and reranking.
   We evaluated three different variants incorporating gaze data on the subdocument level and compared them against a baseline based on context at the document level. Our results show that using reading behavior as feedback yields substantial improvements in search result accuracy of ca. 32% in the general case. However, the extent of the improvement varies depending on the internal structure of the viewed documents and the type of the current information need.


Affective feedback: an investigation into the role of emotions in the information seeking process BIBAFull-Text 395-402
  Ioannis Arapakis; Joemon M. Jose; Philip D. Gray
User feedback is considered to be a critical element in the information seeking process, especially in relation to relevance assessment. Current feedback techniques determine content relevance with respect to the cognitive and situational levels of interaction between the user and the retrieval system. However, beyond real-life problems and information objects, users bring intentions, motivations and feelings to the interaction, which are critical aspects of cognition and decision-making. The study presented in this paper serves as a starting point for exploring the role of emotions in the information seeking process. Results show that emotions not only interweave with different physiological, psychological and cognitive processes, but also form distinctive patterns that vary with the specific task and the specific user.
Optimizing relevance and revenue in ad search: a query substitution approach BIBAFull-Text 403-410
  Filip Radlinski; Andrei Broder; Peter Ciccolo; Evgeniy Gabrilovich; Vanja Josifovski; Lance Riedel
The primary business model behind Web search is based on textual advertising, where contextually relevant ads are displayed alongside search results. We address the problem of selecting these ads so that they are both relevant to the queries and profitable to the search engine, showing that optimizing ad relevance and revenue is not equivalent. Selecting the best ads that satisfy these constraints also naturally incurs high computational costs, and time constraints can lead to reduced relevance and profitability. We propose a novel two-stage approach, which conducts most of the analysis ahead of time. An offline preprocessing phase leverages additional knowledge that is impractical to use in real time, and rewrites frequent queries in a way that subsequently facilitates fast and accurate online matching. Empirical evaluation shows that our method optimized for relevance matches a state-of-the-art method while improving expected revenue. When optimizing for revenue, we see even more substantial improvements in expected revenue.
A generation model to unify topic relevance and lexicon-based sentiment for opinion retrieval BIBAFull-Text 411-418
  Min Zhang; Xingyao Ye
Opinion retrieval, the task of finding documents that are both relevant to a user's query and opinionated, is of growing interest in social life and academic research. One of the key issues is how to combine a document's opinion score (the ranking score of to what extent it is subjective or objective) and its topic relevance score. Current solutions to document ranking in opinion retrieval are generally ad hoc linear combinations, which lack theoretical foundation and careful analysis. In this paper, we focus on lexicon-based opinion retrieval. A novel generation model that unifies topic relevance and opinion generation by a quadratic combination is proposed. With this model, the relevance-based ranking serves as the weighting factor of the lexicon-based sentiment ranking function, which is essentially different from the popular heuristic linear combination approaches. The effect of different sentiment dictionaries is also discussed. Experimental results on TREC blog datasets show the significant effectiveness of the proposed unified model: improvements of 28.1% and 40.3% have been obtained in terms of MAP and P@10 respectively. The conclusion is not limited to the blog environment. Besides the unified generation model, another contribution is that our work demonstrates that in the opinion retrieval task, a Bayesian approach to combining multiple ranking functions is superior to a linear combination. It is also applicable to other result re-ranking applications in similar scenarios.
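A toy contrast between the two combination schemes described above; `lexicon_opinion_score` is a hypothetical stand-in for a real lexicon-based sentiment function, not the paper's estimator.

```python
def lexicon_opinion_score(doc_terms, sentiment_lexicon):
    """Toy lexicon-based sentiment score: fraction of opinion-bearing terms."""
    hits = sum(1 for w in doc_terms if w in sentiment_lexicon)
    return hits / len(doc_terms) if doc_terms else 0.0

def linear_combine(relevance, opinion, alpha=0.5):
    """The common heuristic combination the paper argues against."""
    return alpha * relevance + (1 - alpha) * opinion

def unified_score(relevance, opinion):
    """Quadratic (multiplicative) combination: relevance acts as the
    weighting factor of the sentiment score, so an irrelevant but
    highly opinionated document cannot rank high."""
    return relevance * opinion
```

Note how the multiplicative form zeroes out irrelevant documents, whereas the linear form still awards them credit for opinionatedness.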

Probabilistic models

Discriminative probabilistic models for passage based retrieval BIBAFull-Text 419-426
  Mengqiu Wang; Luo Si
The approach of using passage-level evidence for document retrieval has shown mixed results when it is applied to a variety of test beds with different characteristics. One main reason for the inconsistent performance is that there exists no unified framework to model the evidence of individual passages within a document. This paper proposes two probabilistic models to formally model the evidence of a set of top ranked passages in a document. The first probabilistic model follows the retrieval criterion that a document is relevant if any passage in the document is relevant, and models each passage independently. The second probabilistic model goes a step further and incorporates the similarity correlations among the passages. Both models are trained in a discriminative manner. Furthermore, we present a combination approach to combine the ranked lists of document retrieval and passage-based retrieval.
   An extensive set of experiments have been conducted on four different TREC test beds to show the effectiveness of the proposed discriminative probabilistic models for passage-based retrieval. The proposed algorithms are compared with a state-of-the-art document retrieval algorithm and a language model approach for passage-based retrieval. Furthermore, our combined approach has been shown to provide better results than both document retrieval and passage-based retrieval approaches.
A new probabilistic retrieval model based on the dirichlet compound multinomial distribution BIBAFull-Text 427-434
  Zuobing Xu; Ram Akella
The classical probabilistic models attempt to capture the ad hoc information retrieval problem within a rigorous probabilistic framework. It has long been recognized that the primary obstacle to effective performance of the probabilistic models is the need to estimate a relevance model. The Dirichlet compound multinomial (DCM) distribution, which relies on hierarchical Bayesian modeling techniques, or the Polya urn scheme, is a more appropriate generative model than the traditional multinomial distribution for text documents. We explore a new probabilistic model based on the DCM distribution, which enables efficient retrieval and accurate ranking. Because the DCM distribution captures the dependency of repetitive word occurrences, the new probabilistic model is able to model the concavity of the score function more effectively. To avoid the empirical tuning of retrieval parameters, we design several parameter estimation algorithms to automatically set model parameters. Additionally, we propose a pseudo-relevance feedback algorithm based on the latent mixture modeling of the Dirichlet compound multinomial distribution to further improve retrieval accuracy. Finally, our experiments show that both the baseline probabilistic retrieval algorithm based on the DCM distribution and the corresponding pseudo-relevance feedback algorithm outperform the existing language modeling systems on several TREC retrieval tasks.
TF-IDF uncovered: a study of theories and probabilities BIBAFull-Text 435-442
  Thomas Roelleke; Jun Wang
Interpretations of TF-IDF are based on binary independence retrieval, Poisson, information theory, and language modelling. This paper contributes a review of existing interpretations, and then, TF-IDF is systematically related to the probabilities P(q|d) and P(d|q). Two approaches are explored: a space of independent, and a space of disjoint terms. For independent terms, an "extreme" query/non-query term assumption uncovers TF-IDF, and an analogy of P(d|q) and the probabilistic odds O(r|d, q) mirrors relevance feedback. For disjoint terms, a relationship between probability theory and TF-IDF is established through the integral ∫ 1/x dx = log x. This study uncovers components such as divergence from randomness and pivoted document length to be inherent parts of a document-query independence (DQI) measure, and interestingly, an integral of the DQI over the term occurrence probability leads to TF-IDF.

Analysis of social networks

Separate and inequal: preserving heterogeneity in topical authority flows BIBAFull-Text 443-450
  Lan Nie; Brian D. Davison
Web pages, like people, are often known by others in a variety of contexts. When those contexts are sufficiently distinct, a page's importance may be better represented by multiple domains of authority, rather than by one that indiscriminately mixes reputations. In this work we determine domains of authority by examining the contexts in which a page is cited. However, we find that it is not enough to determine separate domains of authority; our model additionally determines the local flow of authority based upon the relative similarity of the source and target authority domains. In this way, we differentiate both incoming and outgoing hyperlinks by topicality and importance rather than treating them indiscriminately. We find that this approach compares favorably to other topical ranking methods on two real-world datasets and produces an approximately 10% improvement in precision and quality of the top ten results over PageRank.
BrowseRank: letting web users vote for page importance BIBAFull-Text 451-458
  Yuting Liu; Bin Gao; Tie-Yan Liu; Ying Zhang; Zhiming Ma; Shuyuan He; Hang Li
This paper proposes a new method for computing page importance, referred to as BrowseRank. The conventional approach to compute page importance is to exploit the link graph of the web and to build a model based on that graph. For instance, PageRank is such an algorithm, which employs a discrete-time Markov process as the model. Unfortunately, the link graph might be incomplete and inaccurate with respect to data for determining page importance, because links can be easily added and deleted by web content creators. In this paper, we propose computing page importance by using a 'user browsing graph' created from user behavior data. In this graph, vertices represent pages and directed edges represent transitions between pages in the users' web browsing history. Furthermore, the lengths of staying time spent on the pages by users are also included. The user browsing graph is more reliable than the link graph for inferring page importance. This paper further proposes using the continuous-time Markov process on the user browsing graph as a model and computing the stationary probability distribution of the process as page importance. An efficient algorithm for this computation has also been devised. In this way, we can leverage hundreds of millions of users' implicit voting on page importance. Experimental results show that BrowseRank indeed outperforms the baseline methods such as PageRank and TrustRank in several tasks.
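The stationary computation can be sketched as follows, under the simplifying assumption that the continuous-time process is summarized by page-transition counts plus mean staying times; the real system handles dangling pages, scale, and estimation quite differently.

```python
def browserank_sketch(transition_counts, mean_stay_time, iters=100):
    """Stationary distribution of a continuous-time Markov process:
    power-iterate the embedded jump chain built from browsing
    transitions, then reweight each page by its mean staying time."""
    pages = sorted(mean_stay_time)
    # row-normalize transition counts into jump probabilities
    P = {}
    for u in pages:
        row = transition_counts.get(u, {})
        total = sum(row.values())
        P[u] = ({v: c / total for v, c in row.items()} if total else
                {v: 1.0 / len(pages) for v in pages})  # dangling: uniform
    pi = {u: 1.0 / len(pages) for u in pages}
    for _ in range(iters):
        nxt = {u: 0.0 for u in pages}
        for u in pages:
            for v, p in P[u].items():
                nxt[v] += pi[u] * p
        pi = nxt
    # CTMC stationary distribution: weight visits by mean staying time
    w = {u: pi[u] * mean_stay_time[u] for u in pages}
    z = sum(w.values())
    return {u: w[u] / z for u in pages}
```

Intuitively, a page visited as often as another but dwelt on three times as long receives three times the importance.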
Exploring traversal strategy for web forum crawling BIBAFull-Text 459-466
  Yida Wang; Jiang-Ming Yang; Wei Lai; Rui Cai; Lei Zhang; Wei-Ying Ma
In this paper, we study the problem of Web forum crawling. Web forums have now become an important data source for many Web applications, while forum crawling is still a challenging task due to the complex in-site link structures and login controls of most forum sites. Without carefully selecting the traversal path, a generic crawler usually downloads many duplicate and invalid pages from forums, and thus wastes both precious bandwidth and limited storage space. To crawl forum data more effectively and efficiently, in this paper, we propose an automatic approach to exploring an appropriate traversal strategy to direct the crawling of a given target forum. In detail, the traversal strategy consists of the identification of the skeleton links and the detection of the page-flipping links. The skeleton links instruct the crawler to crawl only valuable pages while avoiding duplicate and uninformative ones; and the page-flipping links tell the crawler how to completely download a long discussion thread, which is usually shown in multiple pages in Web forums. The extensive experimental results on several forums show encouraging performance of our approach. Following the discovered traversal strategy, our forum crawler can archive more informative pages in comparison with previous related work and a commercial generic crawler.


Finding question-answer pairs from online forums BIBAFull-Text 467-474
  Gao Cong; Long Wang; Chin-Yew Lin; Young-In Song; Yueheng Sun
Online forums contain a huge amount of valuable user generated content. In this paper we address the problem of extracting question-answer pairs from forums. Question-answer pairs extracted from forums can be used to help Question Answering services (e.g. Yahoo! Answers) among other applications. We propose a sequential patterns based classification method to detect questions in a forum thread, and a graph based propagation method to detect answers for questions in the same thread. Experimental results show that our techniques are very promising.
Retrieval models for question and answer archives BIBAFull-Text 475-482
  Xiaobing Xue; Jiwoon Jeon; W. Bruce Croft
Retrieval in a question and answer archive involves finding good answers for a user's question. In contrast to typical document retrieval, a retrieval model for this task can exploit question similarity as well as ranking the associated answers. In this paper, we propose a retrieval model that combines a translation-based language model for the question part with a query likelihood approach for the answer part. The proposed model incorporates word-to-word translation probabilities learned through exploiting different sources of information. Experiments show that the proposed translation based language model for the question part outperforms baseline methods significantly. By combining with the query likelihood language model for the answer part, substantial additional effectiveness improvements are obtained.
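The question-part scoring can be sketched as a smoothed mixture over word-to-word translation probabilities. The structure below is the standard translation language model, not the paper's exact estimator, and the translation table here is purely illustrative.

```python
import math

def translation_lm_score(query, archived_q, p_trans, coll_prob, lam=0.2):
    """log P(query | archived question), where p_trans[w][t] = P(w | t)
    are word-to-word translation probabilities, smoothed with a
    collection model via Jelinek-Mercer interpolation."""
    counts = {}
    for t in archived_q:
        counts[t] = counts.get(t, 0) + 1
    n = len(archived_q)
    score = 0.0
    for w in query:
        p = sum(p_trans.get(w, {}).get(t, 0.0) * c / n
                for t, c in counts.items())
        score += math.log((1 - lam) * p + lam * coll_prob.get(w, 1e-9))
    return score
```

Because translation probabilities bridge the lexical gap, an archived question phrased with "automobile" can still score well for a query about "car".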
Predicting information seeker satisfaction in community question answering BIBAFull-Text 483-490
  Yandong Liu; Jiang Bian; Eugene Agichtein
Question answering communities such as Naver and Yahoo! Answers have emerged as popular, and often effective, means of information seeking on the web. By posting questions for other participants to answer, information seekers can obtain specific answers to their questions. Users of popular portals such as Yahoo! Answers already have submitted millions of questions and received hundreds of millions of answers from other participants. However, it may also take hours -- and sometimes days -- until a satisfactory answer is posted. In this paper we introduce the problem of predicting information seeker satisfaction in collaborative question answering communities, where we attempt to predict whether a question author will be satisfied with the answers submitted by the community participants. We present a general prediction model, and develop a variety of content, structure, and community-focused features for this task. Our experimental results, obtained from a large-scale evaluation over thousands of real questions and user ratings, demonstrate the feasibility of modeling and predicting asker satisfaction. We complement our results with a thorough investigation of the interactions and information seeking patterns in question answering communities that correlate with information seeker satisfaction. Our models and predictions could be useful for a variety of applications such as user intent inference, answer ranking, interface design, and query suggestion and routing.

Query analysis & models: 2

Discovering key concepts in verbose queries BIBAFull-Text 491-498
  Michael Bendersky; W. Bruce Croft
Current search engines do not, in general, perform well with longer, more verbose queries. One of the main issues in processing these queries is identifying the key concepts that will have the most impact on effectiveness. In this paper, we develop and evaluate a technique that uses query-dependent, corpus-dependent, and corpus-independent features for automatic extraction of key concepts from verbose queries. We show that our method achieves higher accuracy in the identification of key concepts than standard weighting methods such as inverse document frequency. Finally, we propose a probabilistic model for integrating the weighted key concepts identified by our method into a query, and demonstrate that this integration significantly improves retrieval effectiveness for a large set of natural language description queries derived from TREC topics on several newswire and web collections.
Ambiguous queries: test collections need more sense BIBAFull-Text 499-506
  Mark Sanderson
Although there are many papers examining ambiguity in Information Retrieval, this paper shows that there is a whole class of ambiguous words that past research has barely explored. It is shown that the class is more ambiguous than other word types and is commonly used in queries. The lack of test collections containing ambiguous queries is highlighted and a method for creating collections from existing resources is described. Tests using the new collection show the impact of query ambiguity on an IR system: it is shown that conventional systems are incapable of dealing effectively with such queries and that current assumptions about how to improve search effectiveness do not hold when searching on this common query type.
Automatically identifying localizable queries BIBAFull-Text 507-514
  Michael J. Welch; Junghoo Cho
Personalization of web search results as a technique for improving user satisfaction has received notable attention in the research community over the past decade. Much of this work focuses on modeling and establishing a profile for each user to aid in personalization. Our work takes a more query-centric approach. In this paper, we present a method for efficient, automatic identification of a class of queries we define as localizable from a web search engine query log. We determine a set of relevant features and use conventional machine learning techniques to classify queries. Our experiments find that our technique is able to identify localizable queries with 94% accuracy.

Social tagging

Real-time automatic tag recommendation BIBAFull-Text 515-522
  Yang Song; Ziming Zhuang; Huajing Li; Qiankun Zhao; Jia Li; Wang-Chien Lee; C. Lee Giles
Tags are user-generated labels for entities. Existing research on tag recommendation either focuses on improving its accuracy or on automating the process, while ignoring the efficiency issue. We propose a highly-automated novel framework for real-time tag recommendation. The tagged training documents are treated as triplets of (words, docs, tags), and represented in two bipartite graphs, which are partitioned into clusters by Spectral Recursive Embedding (SRE). Tags in each topical cluster are ranked by our novel ranking algorithm. A two-way Poisson Mixture Model (PMM) is proposed to model the document distribution into mixture components within each cluster and aggregate words into word clusters simultaneously. A new document is classified by the mixture model based on its posterior probabilities so that tags are recommended according to their ranks. Experiments on large-scale tagging datasets of scientific documents (CiteULike) and web pages (del.icio.us) indicate that our framework is capable of making tag recommendation efficiently and effectively. The average tagging time for testing a document is around 1 second, with over 88% of test documents correctly labeled with the top nine tags we suggested.
Efficient top-k querying over social-tagging networks BIBAFull-Text 523-530
  Ralf Schenkel; Tom Crecelius; Mouna Kacimi; Sebastian Michel; Thomas Neumann; Josiane X. Parreira; Gerhard Weikum
Online communities have become popular for publishing and searching content, as well as for finding and connecting to other users. User-generated content includes, for example, personal blogs, bookmarks, and digital photos. These items can be annotated and rated by different users, and these social tags and derived user-specific scores can be leveraged for searching relevant content and discovering subjectively interesting items. Moreover, the relationships among users can also be taken into consideration for ranking search results, the intuition being that you trust the recommendations of your close friends more than those of your casual acquaintances.
   Queries for tag or keyword combinations that compute and rank the top-k results thus face a large variety of options that complicate the query processing and pose efficiency challenges. This paper addresses these issues by developing an incremental top-k algorithm with two-dimensional expansions: social expansion considers the strength of relations among users, and semantic expansion considers the relatedness of different tags. It presents a new algorithm, based on principles of threshold algorithms, by folding friends and related tags into the search space in an incremental on-demand manner. The excellent performance of the method is demonstrated by an experimental evaluation on three real-world datasets, crawled from del.icio.us, Flickr, and LibraryThing.
Social tag prediction BIBAFull-Text 531-538
  Paul Heymann; Daniel Ramage; Hector Garcia-Molina
In this paper, we look at the "social tag prediction" problem. Given a set of objects, and a set of tags applied to those objects by users, can we predict whether a given tag could/should be applied to a particular object? We investigated this question using one of the largest crawls of the social bookmarking system del.icio.us gathered to date. For URLs in del.icio.us, we predicted tags based on page text, anchor text, surrounding hosts, and other tags applied to the URL. We found an entropy-based metric which captures the generality of a particular tag and informs an analysis of how well that tag can be predicted. We also found that tag-based association rules can produce very high-precision predictions as well as giving deeper understanding into the relationships between tags. Our results have implications for both the study of tagging systems as potential information retrieval tools, and for the design of such systems.
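One plausible instantiation of an entropy-based generality measure (the paper's exact definition may differ): compute the entropy of the distribution of tags co-occurring with a given tag, so that tags used across many distinct contexts score higher.

```python
import math

def tag_entropy(tag, annotations):
    """Entropy of the co-occurrence distribution of `tag`, where
    `annotations` is a list of tag sets, one per bookmarked object.
    Higher entropy suggests a more general (harder-to-predict) tag."""
    cooc = {}
    for tags in annotations:
        if tag in tags:
            for t in tags:
                if t != tag:
                    cooc[t] = cooc.get(t, 0) + 1
    total = sum(cooc.values())
    if not total:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in cooc.values())
```

A broad tag like "web" co-occurs with many different tags and thus has high entropy, while a niche tag concentrates its co-occurrences and scores low.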

Clustering: 2

Spectral geometry for simultaneously clustering and ranking query search results BIBAFull-Text 539-546
  Ying Liu; Wenyuan Li; Yongjing Lin; Liping Jing
How best to present query search results is an important problem in search engines and information retrieval systems. When a single query retrieves many results, simply showing them as a long list provides users with a poor overview. Ranking and clustering of query search results have so far been two useful but separate post-processing techniques for organizing retrieved documents. In this paper, we propose a spectral analysis method based on content similarity networks that integrates the clustering and ranking techniques to improve literature search. The new approach organizes all the search results into categories intelligently and simultaneously ranks the results in each category. A variety of theoretical and empirical studies demonstrate that the presented method performs well in real applications, especially in biomedical literature retrieval. Moreover, any free-text information can be analyzed with the new method, i.e., the proposed approach can be applied to various information systems, such as Web search engines and literature search services.
A rank-aggregation approach to searching for optimal query-specific clusters BIBAFull-Text 547-554
  Oren Kurland; Carmel Domshlak
To improve the precision at the very top ranks of a document list presented in response to a query, researchers suggested to exploit information induced from clustering of documents highly ranked by some initial search. We propose a novel model for ranking such (query-specific) clusters by the presumed percentage of relevant documents that they contain. The model is based on (i) proposing a palette of "witness" cluster properties that purportedly correlate with this percentage, (ii) devising concrete quantitative measures for these properties, and (iii) ordering the clusters via aggregation of rankings induced by these individual measures. Empirical evaluation shows that our model is consistently more effective than previously suggested methods in detecting clusters containing a high relevant-document percentage. Furthermore, the precision-at-top-ranks performance of this model transcends that of standard document-based retrieval, and competes with that of a state-of-the-art document-based retrieval approach.
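Step (iii), aggregating the rankings induced by the individual property measures, could for instance use a Borda count, one standard rank-aggregation scheme; the paper's precise aggregation rule may differ.

```python
def borda_aggregate(rankings):
    """Aggregate several cluster rankings (best first) by Borda count:
    each ranking awards a cluster (n - position) points, and clusters
    are returned ordered by total points, best first."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for pos, cluster in enumerate(ranking):
            scores[cluster] = scores.get(cluster, 0) + (n - pos)
    return sorted(scores, key=scores.get, reverse=True)
```

Each "witness" property contributes one input ranking, so a cluster that most measures place near the top wins the aggregate.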
A comparative evaluation of different link types on enhancing document clustering BIBAFull-Text 555-562
  Xiaodan Zhang; Xiaohua Hu; Xiaohua Zhou
With a growing number of works utilizing link information in enhancing document clustering, it becomes necessary to make a comparative evaluation of the impacts of different link types on document clustering. Various types of links between text documents, including explicit links such as citation links and hyperlinks, implicit links such as co-authorship links, and pseudo links such as content similarity links, convey topic similarity or topic transferring patterns, which is very useful for document clustering. In this study, we adopt a Relaxation Labeling (RL)-based clustering algorithm, which employs both content and linkage information, to evaluate the effectiveness of the aforementioned types of links for document clustering on eight datasets. The experimental results show that linkage is quite effective in improving content-based document clustering. Furthermore, a series of interesting findings regarding the impacts of different link types on document clustering are discovered through our experiments.

Content analysis

SpotSigs: robust and efficient near duplicate detection in large web collections BIBAFull-Text 563-570
  Martin Theobald; Jonathan Siddharth; Andreas Paepcke
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching signatures for near duplicate detection in large Web crawls. Our spot signatures are designed to favor natural-language portions of Web pages over advertisements and navigational bars.
   The contributions of SpotSigs are twofold: 1) by combining stopword antecedents with short chains of adjacent content terms, we create robust document signatures with a natural ability to filter out noisy components of Web pages that would otherwise distract pure n-gram-based approaches such as Shingling; 2) we provide an exact and efficient, self-tuning matching algorithm that exploits a novel combination of collection partitioning and inverted index pruning for high-dimensional similarity search. Experiments confirm an increase in combined precision and recall of more than 24 percent over state-of-the-art approaches such as Shingling or I-Match and up to a factor of 3 faster execution times than Locality Sensitive Hashing (LSH), over a demonstrative "Gold Set" of manually assessed near-duplicate news articles as well as the TREC WT10g Web collection.
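A simplified extractor in the spirit of spot signatures, anchoring at stopword antecedents and chaining the following content terms, with document resemblance measured by Jaccard similarity over signature sets; the stopword list and chain length here are illustrative, not the paper's tuned settings.

```python
STOPWORDS = {'the', 'a', 'an', 'is', 'of', 'to'}  # illustrative antecedents

def spot_signatures(tokens, chain_len=2):
    """Anchor at each stopword and chain the next `chain_len` content
    terms, skipping intervening stopwords: a simplified SpotSigs
    extractor that ignores ads/navigation text lacking stopwords."""
    sigs = set()
    for i, tok in enumerate(tokens):
        if tok in STOPWORDS:
            chain, j = [], i + 1
            while j < len(tokens) and len(chain) < chain_len:
                if tokens[j] not in STOPWORDS:
                    chain.append(tokens[j])
                j += 1
            if len(chain) == chain_len:
                sigs.add((tok,) + tuple(chain))
    return sigs

def jaccard(a, b):
    """Resemblance of two signature sets."""
    return len(a & b) / len(a | b) if a | b else 0.0
```

Because boilerplate such as navigation bars tends to be stopword-poor, it yields few signatures, which is what gives the approach its robustness over plain n-gram shingling.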
Local text reuse detection BIBAFull-Text 571-578
  Jangwon Seo; W. Bruce Croft
Text reuse occurs in many different types of documents and for many different reasons. One form of reuse, duplicate or near-duplicate documents, has been a focus of researchers because of its importance in Web search. Local text reuse occurs when sentences, facts or passages, rather than whole documents, are reused and modified. Detecting this type of reuse can be the basis of new tools for text analysis. In this paper, we introduce a new approach to detecting local text reuse and compare it to other approaches. This comparison involves a study of the amount and type of reuse that occurs in real documents, including TREC newswire and blog collections.
TSCAN: a novel method for topic summarization and content anatomy BIBAFull-Text 579-586
  Chien Chin Chen; Meng Chang Chen
A topic is defined as a seminal event or activity along with all directly related events and activities. It is represented as a chronological sequence of documents published on the Internet by different authors. In this paper, we define a task called topic anatomy, which summarizes and associates the core parts of a topic graphically so that readers can understand the content easily. The proposed topic anatomy model, called TSCAN, derives the major themes of a topic from the eigenvectors of a temporal block association matrix. Then, the significant events of the themes and their summaries are extracted by examining the constitution of the eigenvectors. Finally, the extracted events are associated through their temporal closeness and context similarity to form the evolution graph of the topic. Experiments based on the official TDT4 corpus demonstrate that the generated evolution graphs comprehensibly describe the storylines of topics. Moreover, in terms of content coverage and consistency, the produced summaries are superior to those of other summarization methods when evaluated against human-composed reference summaries.

Learning models for IR

A new rank correlation coefficient for information retrieval BIBAFull-Text 587-594
  Emine Yilmaz; Javed A. Aslam; Stephen Robertson
In the field of information retrieval, one is often faced with the problem of computing the correlation between two ranked lists. The most commonly used statistic that quantifies this correlation is Kendall's τ. Often, in the information retrieval community, discrepancies among items with high rankings are more important than those among items with low rankings. The Kendall's τ statistic, however, makes no such distinction and penalizes errors at high and low rankings equally.
   In this paper, we propose a new rank correlation coefficient, AP correlation (τap), that is based on average precision and has a probabilistic interpretation. We show that the proposed statistic gives more weight to the errors at high rankings and has nice mathematical properties which make it easy to interpret. We further validate the applicability of the statistic using experimental data.
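The contrast the abstract draws between Kendall's τ and τap can be sketched as follows, assuming two complete rankings of the same items with no ties. The τap computation here follows the standard definition: average, over each position i > 1 of the estimated ranking, the fraction of items placed above position i that the reference ranking also places above item i, then rescale to [-1, 1].

```python
from itertools import combinations

def kendall_tau(true_rank, est_rank):
    # true_rank / est_rank: dicts mapping item -> rank position (0 = best).
    items = list(true_rank)
    concordant = sum(
        1 for a, b in combinations(items, 2)
        if (true_rank[a] - true_rank[b]) * (est_rank[a] - est_rank[b]) > 0
    )
    n_pairs = len(items) * (len(items) - 1) // 2
    return 2.0 * concordant / n_pairs - 1.0

def tau_ap(true_rank, est_rank):
    order = sorted(est_rank, key=est_rank.get)   # items by estimated rank
    total = 0.0
    for i in range(1, len(order)):
        above = order[:i]
        # fraction of items above position i that the reference also ranks above item i
        correct = sum(1 for x in above if true_rank[x] < true_rank[order[i]])
        total += correct / i
    return 2.0 * total / (len(order) - 1) - 1.0
```

Swapping the top two items and swapping the bottom two items yield the same Kendall's τ, but τap penalizes the top swap more heavily, which is exactly the top-weighting property the abstract claims.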
Learning from labeled features using generalized expectation criteria BIBAFull-Text 595-602
  Gregory Druck; Gideon Mann; Andrew McCallum
It is difficult to apply machine learning to new domains because often we lack labeled problem instances. In this paper, we provide a solution to this problem that leverages domain knowledge in the form of affinities between input features and classes. For example, in a baseball vs. hockey text classification problem, even without any labeled data, we know that the presence of the word puck is a strong indicator of hockey. We refer to this type of domain knowledge as a labeled feature. In this paper, we propose a method for training discriminative probabilistic models with labeled features and unlabeled instances. Unlike previous approaches that use labeled features to create labeled pseudo-instances, we use labeled features directly to constrain the model's predictions on unlabeled instances. We express these soft constraints using generalized expectation (GE) criteria -- terms in a parameter estimation objective function that express preferences on values of a model expectation. In this paper we train multinomial logistic regression models using GE criteria, but the method we develop is applicable to other discriminative probabilistic models. The complete objective function also includes a Gaussian prior on parameters, which encourages generalization by spreading parameter weight to unlabeled features. Experimental results on text classification data sets show that this method outperforms heuristic approaches to training classifiers with labeled features. Experiments with human annotators show that it is more beneficial to spend limited annotation time labeling features rather than labeling instances. For example, after only one minute of labeling features, we can achieve 80% accuracy on the ibm vs. mac text classification problem using GE-FL, whereas ten minutes labeling documents results in an accuracy of only 77%.
A simple and efficient sampling method for estimating AP and NDCG BIBAFull-Text 603-610
  Emine Yilmaz; Evangelos Kanoulas; Javed A. Aslam
We consider the problem of large-scale retrieval evaluation. Recently, two methods based on random sampling were proposed as a solution to the extensive effort required to judge tens of thousands of documents. While the first method, proposed by Aslam et al. [1], is quite accurate and efficient, it is overly complex, making it difficult for the community to use; the second method, infAP, proposed by Yilmaz et al. [14], is relatively simple but less efficient than the former, since it employs uniform random sampling from the set of complete judgments. Further, neither method provides confidence intervals on the estimated values.
   The contribution of this paper is threefold: (1) we derive confidence intervals for infAP, (2) we extend infAP to incorporate nonrandom relevance judgments by employing stratified random sampling, hence combining the efficiency of stratification with the simplicity of random sampling, (3) we describe how this approach can be utilized to estimate nDCG from incomplete judgments. We validate the proposed methods using TREC data and demonstrate that these new methods can be used to incorporate nonrandom samples, as were available in TREC Terabyte track '06.
A general optimization framework for smoothing language models on graph structures BIBAFull-Text 611-618
  Qiaozhu Mei; Duo Zhang; ChengXiang Zhai
Recent work on language models for information retrieval has shown that smoothing language models is crucial for achieving good retrieval performance. Many different effective smoothing methods have been proposed, which mostly implement various heuristics to exploit corpus structures. In this paper, we propose a general and unified optimization framework for smoothing language models on graph structures. This framework not only provides a unified formulation of the existing smoothing heuristics, but also serves as a road map for systematically exploring smoothing methods for language models. We follow this road map and derive several different instantiations of the framework. Some of the instantiations lead to novel smoothing methods. Empirical results show that all such instantiations are effective, with some outperforming state-of-the-art smoothing methods.

Text classification

Deep classification in large-scale text hierarchies BIBAFull-Text 619-626
  Gui-Rong Xue; Dikan Xing; Qiang Yang; Yong Yu
Most classification algorithms are best at categorizing Web documents into a few categories, such as the top two levels in the Open Directory Project. Such a classification method does not give very detailed topic-related class information to the user, because the first two levels are often too coarse. However, classification over a large-scale hierarchy is known to be intractable when there are many target categories with cross-link relationships among them. In this paper, we propose a novel deep-classification approach to categorize Web documents into categories in a large-scale taxonomy. The approach consists of two stages: a search stage and a classification stage. In the first stage, a category-search algorithm is used to acquire the category candidates for a given document. Based on the category candidates, we prune the large-scale hierarchy to focus our classification effort on a small subset of the original hierarchy. As a result, the classification model is trained on the small subset before being applied to assign the category for a new document. Since the category candidates are sufficiently close to each other in the hierarchy, a statistical-language-model based classifier using n-gram features is exploited. Furthermore, the structure of the taxonomy can be utilized in this stage to improve the performance of classification. We demonstrate the performance of our proposed algorithms on the Open Directory Project with over 130,000 categories. Experimental results show that our proposed approach can reach 51.8% on the measure of Mi-F1 at the 5th level, which is a 77.7% improvement over top-down SVM-based classification algorithms.
Topic-bridged PLSA for cross-domain text classification BIBAFull-Text 627-634
  Gui-Rong Xue; Wenyuan Dai; Qiang Yang; Yong Yu
In many Web applications, such as blog classification and newsgroup classification, labeled data are in short supply. It often happens that obtaining labeled data in a new domain is expensive and time consuming, while there may be plenty of labeled data in a related but different domain. Traditional text classification approaches are not able to cope well with learning across different domains. In this paper, we propose a novel cross-domain text classification algorithm which extends the traditional probabilistic latent semantic analysis (PLSA) algorithm to integrate labeled and unlabeled data, which come from different but related domains, into a unified probabilistic model. We call this new model Topic-bridged PLSA, or TPLSA. By exploiting the common topics between two domains, we transfer knowledge across different domains through a topic-bridge to help the text classification in the target domain. A unique advantage of our method is its ability to maximally mine knowledge that can be transferred between domains, resulting in superior performance when compared to other state-of-the-art text classification approaches. Experimental evaluation on different kinds of datasets shows that our proposed algorithm can improve the performance of cross-domain text classification significantly.
Non-greedy active learning for text categorization using convex transductive experimental design BIBAFull-Text 635-642
  Kai Yu; Shenghuo Zhu; Wei Xu; Yihong Gong
In this paper we propose a non-greedy active learning method for text categorization using least-squares support vector machines (LSSVM). Our work is based on transductive experimental design (TED), an active learning formulation that effectively explores the information of unlabeled data. Despite its appealing properties, the optimization problem is however NP-hard and thus -- like most of other active learning methods -- a greedy sequential strategy to select one data example after another was suggested to find a suboptimum. In this paper we formulate the problem into a continuous optimization problem and prove its convexity, meaning that a set of data examples can be selected with a guarantee of global optimum. We also develop an iterative algorithm to efficiently solve the optimization problem, which turns out to be very easy-to-implement. Our text categorization experiments on two text corpora empirically demonstrated that the new active learning algorithm outperforms the sequential greedy algorithm, and is promising for active text categorization applications.
Classifiers without borders: incorporating fielded text from neighboring web pages BIBAFull-Text 643-650
  Xiaoguang Qi; Brian D. Davison
Accurate web page classification often depends crucially on information gained from neighboring pages in the local web graph. Prior work has exploited the class labels of nearby pages to improve performance. In contrast, in this work we utilize a weighted combination of the contents of neighbors to generate a better virtual document for classification. In addition, we break pages into fields, finding that a weighted combination of text from the target and fields of neighboring pages is able to reduce classification error by more than a third. We demonstrate performance on a large dataset of pages from the Open Directory Project and validate the approach using pages from a crawl from the Stanford WebBase. Interestingly, we find no value in anchor text and unexpected value in page titles (and especially titles of parent pages) in the virtual document.

Evaluation: 2

Evaluation over thousands of queries BIBAFull-Text 651-658
  Ben Carterette; Virgil Pavlu; Evangelos Kanoulas; Javed A. Aslam; James Allan
Information retrieval evaluation has typically been performed over several dozen queries, each judged to near-completeness. There has been a great deal of recent work on evaluation over much smaller judgment sets: how to select the best set of documents to judge and how to estimate evaluation measures when few judgments are available. In light of this, it should be possible to evaluate over many more queries without much more total judging effort. The Million Query Track at TREC 2007 used two document selection algorithms to acquire relevance judgments for more than 1,800 queries. We present results of the track, along with deeper analysis: investigating tradeoffs between the number of queries and number of judgments shows that, up to a point, evaluation over more queries with fewer judgments is more cost-effective and as reliable as fewer queries with more judgments. Total assessor effort can be reduced by 95% with no appreciable increase in evaluation errors.
Novelty and diversity in information retrieval evaluation BIBAFull-Text 659-666
  Charles L. A. Clarke; Maheedhar Kolla; Gordon V. Cormack; Olga Vechtomova; Azin Ashkan; Stefan Büttcher; Ian MacKinnon
Evaluation measures act as objective functions to be optimized by information retrieval systems. Such objective functions must accurately reflect user requirements, particularly when tuning IR systems and learning ranking functions. Ambiguity in queries and redundancy in retrieved documents are poorly reflected by current evaluation measures. In this paper, we present a framework for evaluation that systematically rewards novelty and diversity. We develop this framework into a specific evaluation measure, based on cumulative gain. We demonstrate the feasibility of our approach using a test collection based on the TREC question answering track.
Relevance assessment: are judges exchangeable and does it matter BIBAFull-Text 667-674
  Peter Bailey; Nick Craswell; Ian Soboroff; Paul Thomas; Arjen P. de Vries; Emine Yilmaz
We investigate to what extent people making relevance judgements for a reusable IR test collection are exchangeable. We consider three classes of judge: "gold standard" judges, who are topic originators and are experts in a particular information seeking task; "silver standard" judges, who are task experts but did not create topics; and "bronze standard" judges, who are those who did not define topics and are not experts in the task.
   Analysis shows low levels of agreement in relevance judgements between these three groups. We report on experiments to determine if this is sufficient to invalidate the use of a test collection for measuring system performance when relevance assessments have been created by silver standard or bronze standard judges. We find that both system scores and system rankings are subject to consistent but small differences across the three assessment sets. It appears that test collections are not completely robust to changes of judge when these judges vary widely in task and topic expertise. Bronze standard judges may not be able to substitute for topic and task experts, due to changes in the relative performance of assessed systems, and gold standard judges are preferred.
Intuition-supporting visualization of user's performance based on explicit negative higher-order relevance BIBAFull-Text 675-682
  Heikki Keskustalo; Kalervo Järvelin; Ari Pirkola; Jaana Kekäläinen
Modeling the beyond-topical aspects of relevance is currently gaining popularity in IR evaluation. For example, the discounted cumulated gain (DCG) measure implicitly models some aspects of higher-order relevance by diminishing the value of relevant documents seen later during retrieval (e.g., due to information already accumulated, redundancy, and effort). In this paper, we focus on the concept of negative higher-order relevance (NHOR) made explicit via negative gain values in IR evaluation. We extend the computation of DCG to allow negative gain values, perform an experiment in a laboratory setting, and demonstrate the characteristics of NHOR in evaluation. The approach leads to intuitively reasonable performance curves emphasizing, from the user's point of view, the progression of retrieval towards success or failure. We discuss normalization issues when both positive and negative gain values are allowed, and conclude by discussing the usage of NHOR to characterize test collections.

Posters group 1: evaluation, text collections and user/personalized IR

Relevance judgments between TREC and Non-TREC assessors BIBAFull-Text 683-684
  Azzah Al-Maskari; Mark Sanderson; Paul Clough
This paper investigates the agreement of relevance assessments between official TREC judgments and those generated from an interactive IR experiment. Results show that 63% of documents judged relevant by our users matched official TREC judgments. Several factors contributed to differences in the agreements: the number of retrieved relevant documents; the number of relevant documents judged; system effectiveness per topic and the ranking of relevant documents.
Evaluation measures for preference judgments BIBAFull-Text 685-686
  Ben Carterette; Paul N. Bennett
There has been recent interest in collecting user or assessor preferences, rather than absolute judgments of relevance, for the evaluation or learning of ranking algorithms. Since measures like precision, recall, and DCG are defined over absolute judgments, evaluation over preferences will require new evaluation measures that explicitly model them. We describe a class of such measures and compare absolute and preference measures over a large TREC collection.
Exploring evaluation metrics: GMAP versus MAP BIBAFull-Text 687-688
  Sri Devi Ravana; Alistair Moffat
In retrieval experiments, an effectiveness metric is used to generate a score for each system-topic pair being tested. It is then usual to average the system-topic scores to obtain a system score, which is used for the purpose of system comparison. In this paper we explore the ramifications of using the geometric mean (GMAP), rather than the arithmetic mean (MAP), when computing an aggregate system score from a set of system-topic scores. We find that GMAP does indeed handle variability in topic difficulty more consistently than the usual MAP aggregation method.
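The two aggregation methods compared above can be sketched directly: MAP is the arithmetic mean of per-topic average-precision scores, and GMAP is their geometric mean, conventionally computed with a small epsilon to guard against zero scores (the epsilon value below is an assumption).

```python
import math

def map_score(ap_scores):
    # Arithmetic mean of per-topic AP scores.
    return sum(ap_scores) / len(ap_scores)

def gmap_score(ap_scores, eps=1e-5):
    # Geometric mean of per-topic AP scores; eps keeps log() defined at zero.
    mean_log = sum(math.log(ap + eps) for ap in ap_scores) / len(ap_scores)
    return math.exp(mean_log) - eps
```

Because the geometric mean is dragged down by near-zero topics, two systems with identical MAP can have very different GMAP scores; GMAP thus rewards consistency on hard topics, which is the behavior the paper examines.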
A new interpretation of average precision BIBAFull-Text 689-690
  Stephen Robertson
We consider the question of whether Average Precision, as a measure of retrieval effectiveness, can be regarded as deriving from a model of user searching behaviour. It turns out that indeed it can be so regarded, under a very simple stochastic model of user behaviour.
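The user-model reading described above rests on the standard computation of average precision: the mean of the precision values at the ranks of the relevant documents, which equals the expected precision seen by a user who stops at a relevant document chosen uniformly at random. A minimal sketch of that textbook computation (not the paper's own derivation):

```python
def average_precision(rels):
    # rels: binary relevance of a ranked list, top first, e.g. [1, 0, 1]
    hits, total = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank   # precision at this relevant document's rank
    return total / sum(rels)
```

Under the stochastic reading, each relevant document is an equally likely stopping point, so AP is simply the average of the precision values a stopping user would observe.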
Comparing metrics across TREC and NTCIR: the robustness to pool depth bias BIBKFull-Text 691-692
  Tetsuya Sakai
Keywords: evaluation metrics, graded relevance, test collection
Relevance thresholds in system evaluations BIBAFull-Text 693-694
  Falk Scholer; Andrew Turpin
We introduce and explore the concept of an individual's relevance threshold as a way of reconciling differences in outcomes between batch and user experiments.
Precision-at-ten considered redundant BIBAFull-Text 695-696
  William Webber; Alistair Moffat; Justin Zobel; Tetsuya Sakai
Information retrieval systems are compared using evaluation metrics, with researchers commonly reporting results for simple metrics such as precision-at-10 or reciprocal rank together with more complex ones such as average precision or discounted cumulative gain. In this paper, we demonstrate that complex metrics are as good as or better than simple metrics at predicting the performance of the simple metrics on other topics. Therefore, reporting of results from simple metrics alongside complex ones is redundant.
Structuring collections with Scatter/Gather extensions BIBAFull-Text 697-698
  Omar Alonso; Justin Talbot
A major component of sense-making is organizing -- grouping, labeling, and summarizing -- the data at hand in order to form a useful mental model, a necessary precursor to identifying missing information and to reasoning about the data. Previous work has shown the Scatter/Gather model to be useful in exploratory activities that occur when users encounter unknown document collections. However, the topic structure communicated by Scatter/Gather is closely tied to the behavior of the underlying clustering algorithm; this structure may not reflect the mental model most applicable to the information need. In this paper we describe the initial design of a mixed-initiative information structuring tool that leverages aspects of the well-studied Scatter/Gather model but permits the user to impose their own desired structure when necessary.
Text collections for FIRE BIBAFull-Text 699-700
  Prasenjit Majumder; Mandar Mitra; Dipasree Pal; Ayan Bandyopadhyay; Samaresh Maiti; Sukanya Mitra; Aparajita Sen; Sukomal Pal
The aim of the Forum for Information Retrieval Evaluation (FIRE) is to create a Cranfield-like evaluation framework in the spirit of TREC, CLEF and NTCIR, for Indian Language Information Retrieval. For the first year, six Indian languages have been selected: Bengali, Hindi, Marathi, Punjabi, Tamil, and Telugu. This poster describes the tasks as well as the document and topic collections that are to be used at the FIRE workshop.
A longitudinal study of real-time search assistance adoption BIBAFull-Text 701-702
  Peter Anick; Raj Gopal Kantamneni
We present findings from a log based study designed to track the adoption of features of a new real-time query refinement interface deployed on the Yahoo search engine. Several trends from the first four months are noted and discussed.
TopicRank: bringing insight to users BIBKFull-Text 703-704
  Ivan Berlocher; Kyung-il Lee; Kono Kim
Keywords: automated annotation, tag clouds, word clustering
Talking the talk vs. walking the walk: salience of information needs in querying vs. browsing BIBAFull-Text 705-706
  Mikhail Bilenko; Ryen W. White; Matthew Richardson; G. Craig Murray
Traditional information retrieval models assume that users express their information needs via text queries (i.e., their "talk"). In this poster, we consider Web browsing behavior outside of interactions with retrieval systems (i.e., users' "walk") as an alternative source of signal describing users' information needs, and compare it to the query-expressed information needs on a large dataset. Our findings demonstrate that information needs expressed in different behavior modalities are largely non-overlapping, and that past behavior in each modality is the most accurate predictor of future behavior in that modality. Results also show that browsing data provides a stronger source of signal than search queries due to its greater volume, which explains previous work that has found implicit behavioral data to be a valuable source of information for user modeling and personalization.
Exploring mouse movements for inferring query intent BIBAFull-Text 707-708
  Qi Guo; Eugene Agichtein
Clickthroughs on search results have been successfully used to infer user interest and preferences, but they are often noisy and potentially ambiguous. We explore the potential of a complementary, more sensitive signal -- mouse movements -- in providing insights into the intent behind a web search query. We report preliminary results of studying user mouse movements on search result pages, with the goal of inferring user intent -- in particular, exploring whether we can automatically distinguish query classes such as navigational vs. informational. Our preliminary exploration confirms the value of studying mouse movements for user intent inference and suggests interesting avenues for future exploration.
Emulating query-biased summaries using document titles BIBAFull-Text 709-710
  Hideo Joho; David Hannah; Joemon M. Jose
Generating query-biased summaries can take up a large part of the response time of interactive information retrieval (IIR) systems. This paper proposes to use document titles as an alternative to queries in the generation of summaries. The use of document titles allows us to pre-generate summaries statically, and thus, improve the response speed of IIR systems. Our experiments suggest that title-biased summaries are a promising alternative to query-biased summaries.
Hierarchical naive bayes models for representing user profiles BIBAFull-Text 711-712
  J. F. Huete; L. M. de Campos; J. M. Fernandez-Luna; M. A. Rueda-Morales
In this paper, we show how a user profile can be enhanced when a more detailed description of the products is included. Two main assumptions have been considered: the first implies that the set of features used to describe an item can be organized into a well-defined set of components or categories, and the second is that the user's rating for a given item is obtained by combining user opinions of the relevance of each component.
A topical PageRank based algorithm for recommender systems BIBAFull-Text 713-714
  Liyan Zhang; Kai Zhang; Chunping Li
In this paper, we propose a Topical PageRank based algorithm for recommender systems, which aims to rank products by analyzing previous user-item relationships and to recommend top-ranked items to potentially interested users. We evaluate our algorithm on the MovieLens dataset, and empirical experiments demonstrate that it outperforms other state-of-the-art recommendation algorithms.
The impact of history length on personalized search BIBAFull-Text 715-716
  Yangbo Zhu; Jamie Callan; Jaime Carbonell
Personalized search is a promising way to better serve different users' information needs. Search history is one of the major information sources for search personalization. We investigated the impact of history length on the effectiveness of personalized ranking. We carried out a task-based user study of Web search and obtained ranked relevance judgments for all queries. Query contexts derived from previous queries in the same task are used to re-rank results for the current query. Experimental results show that the performance of personalization generally improves as more queries are accumulated, but most of the benefit comes from a few immediately preceding queries.
User preference choices for complex question answering BIBAFull-Text 717-718
  Mingfang Wu; Falk Scholer; Andrew Turpin
Question answering systems increasingly need to deal with complex information needs that require more than simple factoid answers. The evaluation of such systems is usually carried out using precision- or recall-based system performance metrics. Previous work has demonstrated that when users are shown two search result lists side-by-side, they can reliably differentiate between the qualities of the lists. We investigate the consistency between this user-based approach and system-oriented metrics in the question answering environment. Our initial results indicate that the two methodologies show a high level of disagreement.
Towards personalized distributed information retrieval BIBAFull-Text 719-720
  Mark J. Carman; Fabio Crestani
Our aim is to investigate if and how the performance of Distributed Information Retrieval (DIR) systems can be improved through personalization. Toward this aim we are building a testbed of document collections and corresponding personalized relevance judgments. In this paper we discuss our intended approach for personalizing the three different phases of the DIR process. We also describe the test collection we are building and discuss our methodology for evaluating personalized DIR using relevance information taken from social bookmarking data.
Task-aware search personalization BIBAFull-Text 721-722
  Julia Luxenburger; Shady Elbassuoni; Gerhard Weikum
Search personalization has been pursued in many ways, in order to provide better result rankings and better overall search experience to individual users [5]. However, blindly applying personalization to all user queries, for example, by a background model derived from the user's long-term query-and-click history, is not always appropriate for aiding the user in accomplishing her actual task. User interests change over time, a user sometimes works on very different categories of tasks within a short timespan, and history-based personalization may impede a user's desire of discovering new topics. In this paper we propose a personalization framework that is selective in a twofold sense. First, it selectively employs personalization techniques for queries that are expected to benefit from prior history information, while refraining from undue actions otherwise. Second, we introduce the notion of tasks representing different granularity levels of a user profile, ranging from very specific search goals to broad topics, and base our reasoning selectively on query-relevant user tasks. These considerations are cast into a statistical language model for tasks, queries, and documents, supporting both judicious query expansion and result re-ranking. The effectiveness of our method is demonstrated by an empirical user study.

Posters group 2: blog, tagging, opinion analysis and web IR

Personal vs non-personal blogs: initial classification experiments BIBAFull-Text 723-724
  Erik Elgersma; Maarten de Rijke
We address the task of separating personal from non-personal blogs, and report on a set of baseline experiments where we compare the performance on a small set of features across a set of five classifiers. We show that with a limited set of features a performance of up to 90% can be obtained.
Exploiting subjectivity analysis in blogs to improve political leaning categorization BIBAFull-Text 725-726
  Maojin Jiang; Shlomo Argamon
In this paper, we address a relatively new and interesting text categorization problem: classifying a political blog as either liberal or conservative, based on its political leaning. Our subjectivity-analysis-based method is twofold: 1) we identify subjective sentences that contain at least two strong subjective clues based on the General Inquirer dictionary; 2) from the subjective sentences identified, we extract opinion expressions and other features to build political leaning classifiers. Experimental results with a political blog corpus we built show that using features from subjective sentences can significantly improve the classification performance. In addition, by extracting opinion expressions from subjective sentences, we are able to reveal, to some extent, opinions that are characteristic of a specific political leaning.
Ranking opinionated blog posts using OpinionFinder BIBAFull-Text 727-728
  Ben He; Craig Macdonald; Iadh Ounis
The aim of an opinion finding system is not just to retrieve relevant documents, but to also retrieve documents that express an opinion towards the query target entity. In this work, we propose a way to use and integrate an opinion-identification toolkit, OpinionFinder, into the retrieval process of an Information Retrieval (IR) system, such that opinionated, relevant documents are retrieved in response to a query. In our experiments, we vary the number of top-ranked documents that must be parsed in response to a query, and investigate the effect on opinion retrieval performance and required parsing time. We find that opinion finding retrieval performance is improved by integrating OpinionFinder into the retrieval system, and that retrieval performance grows as more posts are parsed by OpinionFinder. However, the benefit eventually tails off at a deep rank, suggesting that an optimal setting for the system has been achieved.
Searching blogs and news: a study on popular queries BIBAFull-Text 729-730
  Aixin Sun; Meishan Hu; Ee-Peng Lim
Blog and news search engines are important channels for finding information about real-time happenings. In this paper, we study popular queries collected over a one-year period and compare the search results returned for them by a blog search engine (i.e., Technorati) and a news search engine (i.e., Google News). We observed that the numbers of hits returned by the two search engines for the same set of queries were highly correlated, suggesting that blogs often provide commentary on current events reported in the news. As many popular queries are related to events, we further observed high cohesiveness among the returned search results for these queries.
Aggregated click-through data in a homogeneous user community BIBAFull-Text 731-732
  Mingfang Wu; Andrew Turpin; Justin Zobel
There are many proposed methods for using clickthrough data for common queries to improve the quality of search results returned for that query. In this study we examine the search behaviour of users in a close-knit community on such queries. We argue that the benefit of using aggregated clickthrough data varies from task to task: it may improve document rankings for navigational or specific informational queries, but is less likely to be of value to users issuing a broad informational query.
To tag or not to tag: harvesting adjacent metadata in large-scale tagging systems BIBAFull-Text 733-734
  Adriana Budura; Sebastian Michel; Philippe Cudré-Mauroux; Karl Aberer
We present HAMLET, a suite of principles, scoring models and algorithms to automatically propagate metadata along edges in a document neighborhood. As a showcase scenario we consider tag prediction in community-based Web 2.0 tagging applications. Experiments using real-world data demonstrate the viability of our approach in large-scale environments where tags are scarce. To the best of our knowledge, HAMLET is the first system to promote an efficient and precise reuse of shared metadata in highly dynamic, large-scale Web 2.0 tagging systems.
Exploring question subjectivity prediction in community QA BIBAFull-Text 735-736
  Baoli Li; Yandong Liu; Ashwin Ram; Ernest V. Garcia; Eugene Agichtein
In this paper we begin to investigate how to automatically determine the subjectivity orientation of questions posted by real users in community question answering (CQA) portals. Subjective questions seek answers containing private states, such as personal opinion and experience. In contrast, objective questions request objective, verifiable information, often with support from reliable sources. Knowing the question orientation would be helpful not only for evaluating answers provided by users, but also for guiding the CQA engine to process questions more intelligently. Our experiments on Yahoo! Answers data show that our method exhibits promising performance.
On the evolution of the yahoo! answers QA community BIBAFull-Text 737-738
  Yandong Liu; Eugene Agichtein
While question answering communities have been gaining popularity for several years, we wonder whether this increased popularity actually improves or degrades the user experience. In addition, automatic QA systems, which utilize different sources such as search engines and social media, are emerging rapidly. QA communities have already created abundant resources of millions of questions and hundreds of millions of answers. Whether they will continue to serve as an effective source of information for web search and question answering is therefore of vital importance. In this poster, we investigate the temporal evolution of a popular QA community, Yahoo! Answers, with respect to its effectiveness in answering three basic types of questions: factoid, opinion, and complex questions. Our experiments show that Yahoo! Answers keeps growing rapidly, while its overall quality as an information source for factoid question answering degrades. However, it may be more effective as a source for answering opinion and complex questions than factoid questions.
Detecting synonyms in social tagging systems to improve content retrieval BIBAFull-Text 739-740
  Maarten Clements; Arjen P. de Vries; Marcel J. T. Reinders
Collaborative tagging used in online social content systems is naturally characterized by many synonyms, causing low-precision retrieval. We propose a mechanism based on user preference profiles to identify synonyms that can be used to retrieve more relevant documents by expanding the user's query. Using a popular online book catalog, we discuss the effectiveness of our method compared to usual similarity-based expansion methods.
SOPING: a Chinese customer review mining system BIBAFull-Text 741-742
  Chao Zhou; Guang Qiu; Kangmiao Liu; Jiajun Bu; Mingcheng Qu; Chun Chen
With the booming development of the Web, popular Chinese forums enable people to find experienced customers' reviews of products. In order to get an all-around opinion about a product, users need to go through plenty of web pages, which is time-consuming and inefficient. Consequently, automatic review mining and summarization has become a hot research topic recently. However, previous approaches are not applicable to mining Chinese customer reviews. In this paper, we introduce SOPING, a Chinese customer review mining system that mines reviews from forums. Specifically, we propose a novel search-based approach to extracting product features and a feature-oriented sentence orientation determination method. Our experimental results show that our proposed techniques are highly effective.
Combining learn-based and lexicon-based techniques for sentiment detection without using labeled examples BIBAFull-Text 743-744
  Songbo Tan; Yuefen Wang; Xueqi Cheng
In this work, we propose a novel scheme for sentiment classification without labeled examples which combines the strengths of both "learn-based" and "lexicon-based" approaches as follows: we first use a lexicon-based technique to label a portion of informative examples from the given task (or domain); we then learn a new supervised classifier based on these labeled examples; finally, we apply this classifier to the task. The experimental results indicate that the proposed scheme can dramatically outperform both "learn-based" and "lexicon-based" techniques.
Semi-supervised spam filtering: does it work? BIBAFull-Text 745-746
  Mona Mojdeh; Gordon V. Cormack
The results of the 2006 ECML/PKDD Discovery Challenge suggest that semi-supervised learning methods work well for spam filtering when the source of available labeled examples differs from those to be classified. We have attempted to reproduce these results using data from the 2005 and 2007 TREC Spam Track, and have found the opposite effect: methods like self-training and transductive support vector machines yield inferior classifiers to those constructed using supervised learning on the labeled data alone. We investigate differences between the ECML/PKDD and TREC data sets and methodologies that may account for the opposite results.
Limits of opinion-finding baseline systems BIBAFull-Text 747-748
  Craig Macdonald; Ben He; Iadh Ounis; Ian Soboroff
In opinion finding, the retrieval system is tasked with retrieving not just relevant documents, but documents which also express an opinion towards the query target entity. Most opinion-finding systems are based on a two-stage approach, where initially the system aims to retrieve relevant documents, which are then re-ranked according to the extent to which they are detected to be of an opinionated nature. In this work, we investigate how the performance of the underlying 'baseline' retrieval system affects the overall opinion-finding performance. We apply two effective opinion-finding techniques to all the baseline runs submitted to the TREC 2007 Blog track, and draw new insights and conclusions.
Web query translation via web log mining BIBAFull-Text 749-750
  Rong Hu; Weizhu Chen; Peng Bai; Yansheng Lu; Zheng Chen; Qiang Yang
This paper describes a method to automatically acquire query translation pairs by mining web click-through data. The extraction requires no crawling or Chinese word segmentation, and can capture popular translations. Experimental results on real click-through data show that only 17.4% of the extracted queries are in the dictionary, and our method can achieve 62.2% (top-1) to 80.0% (top-5) precision in translating web queries. Moreover, the extracted translations are semantically relevant to the source query, which is particularly useful for Cross-Lingual Information Retrieval (CLIR).
Analyzing web text association to disambiguate abbreviation in queries BIBAFull-Text 751-752
  Xing Wei; Fuchun Peng; Benoit Dumoulin
We introduce a statistical model for abbreviation disambiguation in Web search, based on analysis of Web data resources, including anchor text, click log and query log. By combining evidence from multiple sources, we are able to accurately disambiguate the abbreviation in queries. Experiments on real Web search queries show promising results.
Bloggers as experts: feed distillation using expert retrieval models BIBAFull-Text 753-754
  Krisztian Balog; Maarten de Rijke; Wouter Weerkamp
We address the task of (blog) feed distillation: to find blogs that are principally devoted to a given topic. The task may be viewed as an association finding task, between topics and bloggers. Under this view, it resembles the expert finding task, for which a range of models have been proposed. We adopt two language modeling-based approaches to expert finding, and determine their effectiveness as feed distillation strategies. The two models capture the idea that a human will often search for key blogs by spotting highly relevant posts (the Posting model) or by taking global aspects of the blog into account (the Blogger model). Results show the Blogger model outperforms the Posting model and delivers state-of-the-art performance, out-of-the-box.
Search effectiveness with a breadth-first crawl BIBAFull-Text 755-756
  Dennis Fetterly; Nick Craswell; Vishwa Vinay
Previous scalability experiments found that early precision improves as collection size increases. However, that was under the assumption that a collection's documents are all sampled with uniform probability from the same population. We contrast this to a large breadth-first web crawl, an important scenario in real-world Web search, where the early documents have quite different characteristics from the later documents.
Guide focused crawler efficiently and effectively using on-line topical importance estimation BIBAFull-Text 757-758
  Ziyu Guan; Can Wang; Chun Chen; Jiajun Bu; Junfeng Wang
Focused crawling is a critical technique for topical resource discovery on the Web. We propose a new frontier prioritizing algorithm, namely, the OTIE (On-line Topical Importance Estimation) algorithm, which efficiently and effectively combines link-based and content-based analysis to evaluate the priority of an uncrawled URL in the frontier. We then demonstrate OTIE's advantages over traditional prioritizing algorithms by real crawling experiments.
Web page retrieval in ubiquitous sensor environments BIBAFull-Text 759-760
  Takuya Maekawa; Yutaka Yanagisawa; Yasushi Sakurai; Yasue Kishino; Koji Kamei; Takeshi Okadome
This paper proposes a new concept of query-free web search for daily living. We would ordinarily benefit from additional information about the daily activities we are currently engaged in. When washing a coffee maker, for example, we benefit from obtaining information such as 'cleaning a coffee maker with vinegar removes its stains well.' Our proposed method automatically searches for a web page containing such information related to an activity of daily living when the activity is performed. We assume that wireless sensor nodes are attached to everyday objects to detect object use; our method constructs a query from the names of the objects in use, and then retrieves a web page related to the activity of daily living using that query.
Automatic document prior feature selection for web retrieval BIBAFull-Text 761-762
  Jie Peng; Craig Macdonald; Iadh Ounis
Document prior features, such as PageRank and URL depth, can improve the retrieval effectiveness of Web Information Retrieval (IR) systems. However, not all queries benefit equally from the application of a document prior feature. This paper investigates whether retrieval performance can be further enhanced by selecting the best document prior feature on a per-query basis, and presents a novel method for doing so. We evaluate our technique on the TREC .GOV Web test collection and its associated TREC 2003 Web search tasks. Our experiments demonstrate the effectiveness and robustness of our proposed selection method.
Using parsimonious language models on web data BIBAFull-Text 763-764
  Rianne Kaptein; Rongmei Li; Djoerd Hiemstra; Jaap Kamps
In this paper we explore the use of parsimonious language models for web retrieval. These models are smaller, and thus more efficient, than standard language models and are therefore well suited for large-scale web retrieval. We have conducted experiments on four TREC topic sets, and found that the parsimonious language model improves retrieval effectiveness over the standard language model for all data sets and measures. In all cases the improvement is significant, and more substantial than in earlier experiments on newspaper/newswire data.
Query preprocessing: improving web search through a Vietnamese word tokenization approach BIBAFull-Text 765-766
  Doan Nguyen
In this poster paper, we propose a novel approach to improving web search relevancy by tokenizing a Vietnamese query text prior to submitting it to a search engine. Evaluations demonstrate its effectiveness and practical value.

Posters group 3: multimedia and domain specific IR

AdImage: video advertising by image matching and ad scheduling optimization BIBAFull-Text 767-768
  Wei-Shing Liao; Kuan-Ting Chen; Winston H. Hsu
With the prevalence of recording devices and the ease of media sharing, consumers are embracing huge numbers of Internet videos. This gives rise to the need for effective video advertising systems, following their phenomenal success in text. We propose a novel advertising system, AdImage, which automatically associates relevant ads by matching characteristic images, referred to here as adImages (analogous to adWords). The proposed image matching method is invariant to certain distortions commonly observed in shared videos. AdImage also avoids the pitfalls of poor tagging quality in shared videos and provides a brand-new venue for specifying ad targets by image objects. Moreover, we formulate the image matching scores and the parameterized bidding information as a nonlinear optimization problem for maximizing system revenue and user perception.
Bag-of-visual-words expansion using visual relatedness for video indexing BIBAFull-Text 769-770
  Yu-Gang Jiang; Chong-Wah Ngo
Bag-of-visual-words (BoW) has been popular for visual classification in recent years. In this paper, we propose a novel BoW expansion method to alleviate the visual word correlation problem. We achieve this by diffusing the weights of visual words in BoW based on visual word relatedness, which is rigorously defined within a visual ontology. The proposed method is tested in a video indexing experiment on the TRECVID-2006 video retrieval benchmark, and an improvement of 7% over the traditional BoW is reported.
A word shape coding method for camera-based document images BIBAFull-Text 771-772
  Linlin Li; Chew Lim Tan
This paper reports a word shape coding method to facilitate the retrieval of camera-based document images without OCR. Due to perspective distortion, many reported word shape coding methods fail on camera-based images. In this paper, the problem is addressed by approximating the perspective transformation with an affine transformation, and employing an affine invariant, namely the length ratio, to represent the connected components. Components in a document image are classified into a few clusters, each of which is assigned a representative symbol. Retrieval is based on "words" composed of symbols. The experimental results show that the proposed method achieved an average retrieval precision of 93.43% and recall of 94.22%.
Term clouds as surrogates for user generated speech BIBAFull-Text 773-774
  Manos Tsagias; Martha Larson; Maarten de Rijke
User generated spoken audio remains a challenge for Automatic Speech Recognition (ASR) technology and content-based audio surrogates derived from ASR-transcripts must be error robust. An investigation of the use of term clouds as surrogates for podcasts demonstrates that ASR term clouds closely approximate term clouds derived from human-generated transcripts across a range of cloud sizes. A user study confirms the conclusion that ASR-clouds are viable surrogates for depicting the content of podcasts.
A faceted interface for multimedia search BIBAFull-Text 775-776
  Robert Villa; Nicholas Gildea; Joemon M. Jose
With the rapid increase in online video services, video retrieval systems are becoming increasingly important search tools to many users in many different fields. In this poster we present a novel video retrieval interface, which supports the creation of multiple search "facets", to aid users carrying out complex, multi-faceted search tasks. The interface allows multiple searches to be executed and viewed simultaneously, and allows material to be reorganized between the facets. An experiment is presented which compares the faceted interface to a tabbed interface similar to that on modern web browsers, and some preliminary results are given.
WISA: a novel web image semantic analysis system BIBAFull-Text 777-778
  Hongtao Xu; Xiangdong Zhou; Lan Lin
We present a novel Web Image Semantic Analysis (WISA) system, which explores the problem of adaptively modeling the distributions of a web image's semantic labels over its surrounding text. To deal with this problem, we employ a new piecewise penalty weighted regression model to learn the weights of the contributions of the different parts of the surrounding text to the semantic labels of images. Experimental results on a real web image data set show that it can significantly improve the performance of web image semantic annotation.
One-button search extracts wider interests: an empirical study with video bookmarking search BIBAFull-Text 779-780
  Masayuki Okamoto; Masaaki Kikuchi; Tomohiro Yamasaki
This poster presents an overview of the characteristics of a one-button information retrieval interface driven by closed captions from TV watching, which is intended to lighten the burden of remembering and entering query terms while watching TV. We investigated this interface with an experimental system named Video Bookmarking Search, which estimates query terms from closed captions using named-entity recognition and sentence labeling techniques. In an empirical evaluation of 1,138 search queries from 206 bookmarks using seven actual TV shows on city life, travel, health, and cuisine, we found that broader queries and search results are acceptable through the query-input-free interface, even though the number of queries and search results directly relevant to the users' original intentions is not high. The main reason is that a viewer's interests are wider than what is expressed with query terms.
Product retrieval for grocery stores BIBAFull-Text 781-782
  Petteri Nurmi; Eemil Lagerspetz; Wray Buntine; Patrik Floréen; Joonas Kukkonen
We introduce a grocery retrieval system that maps shopping lists written in natural language into actual products in a grocery store. We have developed the system using nine months of shopping basket data from a large Finnish supermarket. To evaluate the system, we used 70 real shopping lists gathered from customers of the supermarket. Our system achieves over 80% precision for products at rank one, and the precision is around 70% for products at rank 5.
A reranking model for genomics aspect search BIBAFull-Text 783-784
  Qinmin Hu; Xiangji Huang
In this paper, we propose a reranking model to improve aspect-level performance in the biomedical domain. This model iteratively computes the maximum hidden aspect for every retrieved passage and then reranks these passages from aspect subsets. The experimental results show improvements in aspect-level performance of up to 27.14% for the 2006 Genomics topics and 27.09% for the 2007 Genomics topics.
Improving biomedical document retrieval using domain knowledge BIBAFull-Text 785-786
  Shuguang Wang; Milos Hauskrecht
Research articles typically introduce new results or findings and relate them to knowledge entities of immediate relevance. However, a large body of context knowledge related to the results is often not explicitly mentioned in the article. To overcome this limitation, state-of-the-art information retrieval approaches rely on latent semantic analysis, in which terms in articles are projected to a lower-dimensional latent space and the best possible matches in this space are identified. However, this approach may not perform well enough if the number of explicit knowledge entities in the articles is too small compared to the amount of knowledge in the domain. We address the problem by exploiting a domain knowledge layer: a rich network of relations among knowledge entities in the domain, extracted from a large corpus of documents. The knowledge layer supplies the context knowledge that lets us relate different knowledge entities and hence improve information retrieval performance. We develop and study a new framework for i) learning and aggregating the relations in the knowledge layer from the literature corpus, and ii) exploiting these relations to improve the retrieval of relevant documents.
Kleio: a knowledge-enriched information retrieval system for biology BIBAFull-Text 787-788
  Chikashi Nobata; Philip Cotter; Naoaki Okazaki; Brian Rea; Yutaka Sasaki; Yoshimasa Tsuruoka; Jun'ichi Tsujii; Sophia Ananiadou
Kleio is an advanced information retrieval (IR) system developed at the UK National Centre for Text Mining (NaCTeM). The system offers textual and metadata searches across MEDLINE and provides enhanced searching functionality by leveraging terminology management technologies.
Enhancing keyword-based botanical information retrieval with information extraction BIBAFull-Text 789-790
  Xiaoya Tang
Keyword-based retrieval matches search terms and documents via term co-occurrence. Such an approach does not allow matching based on the specific plant characteristic descriptions that are often used in botanical text retrieval. This study applies information extraction techniques to automatically extract plant characteristic information from text and allows users to search using such information in combination with keywords. An evaluation experiment was conducted using actual users. The results indicate that this approach enhances task-based retrieval performance.
How medical expertise influences web search interaction BIBAFull-Text 791-792
  Ryen W. White; Susan Dumais; Jaime Teevan
Domain expertise can have an important influence on how people search. In this poster we present findings from a log-based study into how medical domain experts search the Web for information related to their expertise, as compared with non-experts. We find differences in sites visited, query vocabulary, and search behavior. The findings have implications for the automatic identification of domain experts from interaction logs, and the use of domain knowledge in applications such as query suggestion or page recommendation to support non-experts.
Generating diverse katakana variants based on phonemic mapping BIBAFull-Text 793-794
  Kazuhiro Seki; Hiroyuki Hattori; Kuniaki Uehara
In Japanese, it is quite common for the same word to be written in several different ways. This is especially true for katakana words which are typically used for transliterating foreign languages. This ambiguity becomes critical for automatic processing such as information retrieval (IR). To tackle this problem, we propose a simple but effective approach to generating katakana variants by considering phonemic representation of the original language for a given word. The proposed approach is evaluated through an assessment of the variants it generates. Also, the impact of the generated variants on IR is studied in comparison to an existing approach using katakana rewriting rules.
Exploiting sequential dependencies for expert finding BIBAFull-Text 795-796
  Pavel Serdyukov; Henning Rode; Djoerd Hiemstra
We propose an expert finding method based on the assumption of sequential dependence between a candidate expert and the query terms within the scope of a document. We assume that the strength of a candidate's relation to the document's content depends on its position in the document with respect to the positions of the query terms. The experiments on the official Enterprise TREC data demonstrate the advantage of our method over a method based on the independence of query terms and persons in a document.
Modeling expert finding as an absorbing random walk BIBAFull-Text 797-798
  Pavel Serdyukov; Henning Rode; Djoerd Hiemstra
We introduce a novel approach to expert finding based on multi-step relevance propagation from documents to related candidates. Relevance propagation is modeled with an absorbing random walk. The evaluation on the two official Enterprise TREC data sets demonstrates the advantage of our method over the state-of-the-art method based on one-step propagation.
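The multi-step propagation idea named here can be illustrated with a small sketch, where candidate experts are absorbing states of a random walk over retrieved documents. This is a generic illustration, not the authors' exact model; the mass-splitting scheme, the parameter `alpha`, and the assumption that every document mentions at least one candidate are all simplifications for clarity.

```python
import numpy as np

def absorbing_walk_scores(doc_relevance, doc_candidates, alpha=0.5, steps=50):
    """Propagate relevance mass from documents to candidate experts via an
    absorbing random walk: candidates absorb, documents remain transient.
    Assumes every document is linked to at least one candidate."""
    docs = list(doc_relevance)
    rel = np.array([doc_relevance[d] for d in docs], dtype=float)
    rel /= rel.sum()                      # start the walk at relevant documents
    mass = rel.copy()                     # transient probability mass on docs
    absorbed = {}
    for _ in range(steps):
        for i, d in enumerate(docs):
            linked = doc_candidates[d]
            share = alpha * mass[i] / len(linked)   # mass absorbed per candidate
            for c in linked:
                absorbed[c] = absorbed.get(c, 0.0) + share
        # surviving mass redistributes over documents, proportional to relevance
        mass = (1.0 - alpha) * mass.sum() * rel
    return absorbed
```

After enough steps essentially all mass is absorbed, so the candidate scores form a distribution reflecting multi-step, rather than one-step, relevance propagation.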
A scalable assistant librarian: hierarchical subject classification of books BIBAFull-Text 799-800
  Steven P. Crain; Jian Huang; Hongyuan Zha
In this paper, we discuss our work in progress towards a scalable hierarchical classification system for books using the Library of Congress subject hierarchy. We examine the characteristics of this domain which make the problem very challenging, and we look at several appropriate performance measurements. We show that both Hieron and Hierarchical Support Vector Machines perform moderately well.
Information retrieval on bug locations by learning co-located bug report clusters BIBAFull-Text 801-802
  Ing-Xiang Chen; Hojun Jaygarl; Cheng-Zen Yang; Ping-Jung Wu
Bug locating usually involves intensive search activities and incurs unpredictable costs in labor and time. This paper addresses the problem of information retrieval on bug locations, to facilitate identifying bugs in software code. We propose a novel bug retrieval approach with co-location shrinkage (CS). The proposed approach has been evaluated on open-source software projects collected from real-world repositories, and consistently improves the retrieval accuracy of a state-of-the-art Support Vector Machine (SVM) model.
Summarization of compressed text images: an experience on Indic script documents BIBAFull-Text 803-804
  Utpal Garain
Automatic summarization of JBIG2-coded textual images is discussed. Compressed images are partially decompressed to compute relevant features. The feature extraction method does not use any character recognition module. Summary sentences are ranked. The experiments consider documents in Indic scripts, which lack efficient OCR systems. The script-independent aspect of the approach is highlighted through the use of the two most popular Indic scripts. A sentence selection efficiency of about 61% is achieved when judged against human-made summaries. A nonparametric (distribution-free) rank statistic shows a correlation coefficient of 0.33 as a measure of the (minimum) strength of the association between sentence rankings by machine and human.

Posters group 4: theory and IR models

A method for transferring retrieval scores between collections with non-overlapping vocabularies BIBAFull-Text 805-806
  Fernando D. Diaz
We present a method for projecting retrieval scores across two corpora with a shared, parallel corpus.
Improving relevance feedback in language modeling with score regularization BIBAFull-Text 807-808
  Fernando D. Diaz
We demonstrate that regularization can improve feedback in a language modeling framework.
Theoretical bounds on and empirical robustness of score regularization to different similarity measures BIBAFull-Text 809-810
  Fernando D. Diaz
We present theoretical bounds and empirical robustness of score regularization given changes in the similarity measure.
A study of query length BIBAFull-Text 811-812
  Avi Arampatzis; Jaap Kamps
We analyse query length, and fit power-law and Poisson distributions to four different query sets. We provide a practical model for query length, based on the truncation of a Poisson distribution for short queries and a power-law distribution for longer queries, that better fits real query length distributions than earlier proposals.
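The hybrid distribution this abstract describes, a Poisson body for short queries joined to a power-law tail for long ones, can be sketched as below. The parameter values (`lam`, `gamma`, `cutoff`) are illustrative assumptions, not the values fitted in the paper.

```python
import math

def query_length_model(lam=2.2, gamma=2.5, cutoff=5, max_len=50):
    """Hybrid query-length distribution: a truncated Poisson for lengths up
    to `cutoff`, a power-law tail beyond it, joined continuously and then
    normalized. All parameters here are illustrative, not fitted values."""
    poisson = {k: lam**k * math.exp(-lam) / math.factorial(k)
               for k in range(1, cutoff + 1)}
    # scale the power-law tail so it meets the Poisson at the cutoff
    scale = poisson[cutoff] * cutoff**gamma
    tail = {k: scale * k**-gamma for k in range(cutoff + 1, max_len + 1)}
    probs = {**poisson, **tail}
    z = sum(probs.values())
    return {k: p / z for k, p in probs.items()}
```

The Poisson body captures the sharp peak at two- and three-word queries, while the power-law tail accounts for the rare but non-negligible very long queries that a pure Poisson would underestimate.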
Don't have a stemmer?: be un+concern+ed BIBAFull-Text 813-814
  Paul McNamee; Charles Nicholas; James Mayfield
The choice of indexing terms used to represent documents crucially determines how effective subsequent retrieval will be. IR systems commonly use rule-based stemmers to normalize surface word forms to combat the problem of not finding documents that contain words related to query terms by inflectional or derivational morphology. But such stemmers are not available in all languages. In this paper we explore the effectiveness of unsupervised morphological segmentation as an alternative to stemming using test sets in thirteen European languages. We find that unsupervised segmentation is significantly better than unnormalized words, in several cases by more than 20%. However, rule-based stemming, if available, is better in low complexity languages. We also compare these methods to the use of character n-grams, finding that on average n-grams yield the best performance.
Parsimonious concept modeling BIBKFull-Text 815-816
  Edgar Meij; Dolf Trieschnigg; Maarten de Rijke; Wessel Kraaij
Keywords: language models, parsimonious models, relevance feedback
Parsimonious relevance models BIBAFull-Text 817-818
  Edgar Meij; Wouter Weerkamp; Krisztian Balog; Maarten de Rijke
We describe a method for applying parsimonious language models to re-estimate the term probabilities assigned by relevance models. We apply our method to six topic sets from test collections in five different genres. Our parsimonious relevance models (i) improve retrieval effectiveness in terms of MAP on all collections, (ii) significantly outperform their non-parsimonious counterparts on most measures, and (iii) have a precision enhancing effect, unlike other blind relevance feedback methods.
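The parsimonization step underlying these models can be sketched as an EM re-estimation that pushes mass already explained by the background collection model out of a term distribution. This is a generic sketch of parsimonious language model estimation, not the authors' exact estimator; the background weight `lam` and iteration count are illustrative.

```python
def parsimonize(term_counts, background, lam=0.9, iters=20):
    """EM re-estimation of a parsimonious term distribution: mass that the
    background model p(t|C) already explains is removed from the topical
    model, concentrating probability on distinctive terms."""
    total = sum(term_counts.values())
    p = {t: c / total for t, c in term_counts.items()}   # initial MLE
    for _ in range(iters):
        # E-step: expected counts attributed to the topical model
        e = {t: term_counts[t] * (1 - lam) * p[t]
                / ((1 - lam) * p[t] + lam * background[t])
             for t in term_counts}
        # M-step: renormalize the expected counts
        z = sum(e.values())
        p = {t: v / z for t, v in e.items()}
    return p
```

A frequent function word with high background probability loses mass, while a distinctive content term gains it, which is exactly the effect being applied to relevance model term probabilities here.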
Author-topic evolution analysis using three-way non-negative Paratucker BIBAFull-Text 819-820
  Wei Peng; Tao Li
Analyzing three-way data has attracted a lot of attention recently due to the intrinsically rich structures in real-world datasets. The PARATUCKER model has been proposed to combine the axis capabilities of the Parafac model and the structural generality of the Tucker model. However, no algorithms have been developed for fitting the PARATUCKER model. In this paper, we propose the TANPT algorithm to solve the PARATUCKER model. We apply the algorithm to temporal relation co-clustering of author-topic evolution. Experiments on DBLP datasets demonstrate its effectiveness.
Exploiting proximity feature in bigram language model for information retrieval BIBAFull-Text 821-822
  Seung-Hoon Na; Jungi Kim; In-Su Kang; Jong-Hyeok Lee
Language modeling approaches have effectively dealt with the dependencies among query terms using N-gram models such as bigram or trigram models. However, bigram language models suffer from the adjacency-sparseness problem: dependent terms are not always adjacent in documents, but can be far from each other, sometimes separated by a few sentences in a document. To resolve the adjacency-sparseness problem, this paper proposes a new type of bigram language model that explicitly incorporates the proximity feature between two adjacent terms in a query. Experimental results on three test collections show that the proposed bigram language model significantly improves on the previous bigram model as well as on Tao's approach, the state-of-the-art proximity-based method.
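The idea of rewarding query-term pairs that occur close together, rather than only strictly adjacent, can be sketched with a proximity-weighted bigram match. The Gaussian-style decay kernel and the smoothing constant below are illustrative assumptions, not the paper's exact model.

```python
import math

def proximity_bigram_score(query_terms, doc_terms, sigma=2.0, mu=0.5):
    """Score each adjacent pair of query terms by the closest co-occurrence
    of the two terms in the document, decayed with a Gaussian-style
    proximity kernel; unmatched pairs fall back to a smoothing constant."""
    positions = {}
    for i, t in enumerate(doc_terms):
        positions.setdefault(t, []).append(i)
    score = 0.0
    for a, b in zip(query_terms, query_terms[1:]):
        if a in positions and b in positions:
            dist = min(abs(i - j) for i in positions[a] for j in positions[b])
            score += math.log(mu + math.exp(-dist**2 / (2 * sigma**2)))
        else:
            score += math.log(mu)    # smoothing for pairs never co-occurring
    return score
```

Unlike a strict bigram model, a document where the two terms sit a few words apart still earns partial credit, which is precisely what mitigates adjacency sparseness.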
Measuring concept relatedness using language models BIBAFull-Text 823-824
  Dolf Trieschnigg; Edgar Meij; Maarten de Rijke; Wessel Kraaij
Over the years, the notion of concept relatedness has attracted considerable attention. A variety of approaches, based on ontology structure, information content, association, or context have been proposed to indicate the relatedness of abstract ideas. We propose a method based on the cross entropy reduction between language models of concepts which are estimated based on document-concept assignments. The approach shows improved or competitive results compared to state-of-the-art methods on two test sets in the biomedical domain.
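The cross-entropy reduction idea in this abstract can be made concrete with toy unigram concept models. The sketch below is a hypothetical minimal illustration, not the paper's exact formulation: the function names, the Jelinek-Mercer smoothing choice, and the lambda value are all assumptions.

```python
import math

def cross_entropy(p, q, bg, lam=0.5):
    """H(p, q'): cross entropy of concept model p against concept model q,
    with q Jelinek-Mercer smoothed by the background model bg."""
    return -sum(pw * math.log(lam * q.get(w, 0.0) + (1 - lam) * bg[w])
                for w, pw in p.items())

def ce_reduction(p, q, bg, lam=0.5):
    """Cross-entropy reduction: how much better q explains p than the
    background model alone does; larger values suggest more related concepts."""
    return cross_entropy(p, bg, bg, lam) - cross_entropy(p, q, bg, lam)

# Toy concept language models over a four-word vocabulary.
bg  = {'cell': 0.25, 'dna': 0.25, 'court': 0.25, 'law': 0.25}
bio = {'cell': 0.5, 'dna': 0.5}    # "cell biology"
gen = {'dna': 0.6, 'cell': 0.4}    # "genetics" -- shares vocabulary with bio
law = {'court': 0.5, 'law': 0.5}   # "jurisprudence" -- unrelated to bio
```

On these toy models, the related pair (bio, gen) yields a positive reduction, while the unrelated pair (bio, law) yields a negative one.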
Query-drift prevention for robust query expansion BIBAFull-Text 825-826
  Liron Zighelnic; Oren Kurland
Pseudo-feedback-based automatic query expansion yields effective retrieval performance on average, but results in performance inferior to that of using the original query for many information needs. We address an important cause of this robustness issue, namely, the query drift problem, by fusing the results retrieved in response to the original query and to its expanded form. Our approach posts performance that is significantly better than that of retrieval based only on the original query and more robust than that of retrieval using the expanded query.
Adaptive label-driven scaling for latent semantic indexing BIBAFull-Text 827-828
  Xiaojun Quan; Enhong Chen; Qiming Luo; Hui Xiong
This paper aims to enhance Latent Semantic Indexing (LSI) by exploiting category labels. Specifically, in the term-document matrix, the vector for each term either appearing in labels or semantically close to labels is scaled before performing Singular Value Decomposition (SVD), to boost its impact on the generated left singular vectors. As a result, the similarities among documents in the same category are increased. Furthermore, an adaptive scaling strategy is designed to better utilize the hierarchical structure of categories. Experimental results show that the proposed approach significantly improves the performance of hierarchical text categorization.
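As a rough illustration of the scaling step only (the paper's adaptive, hierarchy-aware strategy is more involved), one might boost the rows of the term-document matrix for label-related terms before running SVD. Everything below, including the fixed boost factor and the function name, is a hypothetical sketch:

```python
def scale_label_terms(matrix, terms, label_terms, factor=2.0):
    """Scale the rows of a term-document matrix whose term appears in (or is
    deemed semantically close to) a category label, boosting their influence
    on the left singular vectors computed by a subsequent SVD."""
    return [[x * factor for x in row] if term in label_terms else list(row)
            for term, row in zip(terms, matrix)]

# Toy term-document matrix: rows = terms, columns = documents.
terms = ['sports', 'game', 'price']
A = [[1, 0], [2, 1], [0, 3]]
A_scaled = scale_label_terms(A, terms, {'sports'})  # boost the label term row
```

The SVD itself is unchanged and would be computed with any standard linear-algebra routine on the scaled matrix.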
Fixed-threshold SMO for Joint Constraint Learning Algorithm of Structural SVM BIBAFull-Text 829-830
  Changki Lee; HyunKi Kim; Myung-Gil Jang
In this paper, we describe a fixed-threshold sequential minimal optimization (FSMO) for a joint constraint learning algorithm of structural classification SVM problems. Because FSMO uses the fact that the joint constraint formulation of structural SVM has b=0, FSMO breaks down the quadratic programming (QP) problems of structural SVM into a series of smallest QP problems, each involving only one variable. By using only one variable, FSMO is advantageous in that each QP sub-problem does not need subset selection.
Posterior probabilistic clustering using NMF BIBAFull-Text 831-832
  Chris Ding; Tao Li; Dijun Luo; Wei Peng
We introduce the posterior probabilistic clustering (PPC), which provides a rigorous posterior probability interpretation for Nonnegative Matrix Factorization (NMF) and removes the uncertainty in clustering assignment. Furthermore, PPC is closely related to probabilistic latent semantic indexing (PLSI).
On document splitting in passage detection BIBAFull-Text 833-834
  Nazli Goharian; Saket S. R. Mengle
Passages can be hidden within a text to circumvent their disallowed transfer. Such release of compartmentalized information is of concern to all corporate and governmental organizations. We explore a methodology to detect such hidden passages within a document. A document is divided into passages using various document splitting techniques, and a text classifier is used to categorize such passages. We present a novel document splitting technique called dynamic windowing, which significantly improves precision, recall and the F1 measure.
Learning with support vector machines for query-by-multiple-examples BIBAFull-Text 835-836
  Dell Zhang; Wee Sun Lee
We explore an alternative Information Retrieval paradigm called Query-By-Multiple-Examples (QBME) where the information need is described not by a set of terms but by a set of documents. Intuitive ideas for QBME include using the centroid of these documents or the well-known Rocchio algorithm to construct the query vector. We consider this problem from the perspective of text classification, and find that a better query vector can be obtained through learning with Support Vector Machines (SVMs). For online queries, we show how SVMs can be learned from one-class examples in linear time. For offline queries, we show how SVMs can be learned from positive and unlabeled examples together in linear or polynomial time. The effectiveness and efficiency of the proposed approaches have been confirmed by our experiments on four real-world datasets.
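The centroid baseline this abstract mentions is easy to state concretely. Below is a hypothetical sketch using sparse dict vectors; the SVM learning step the authors advocate would replace this simple construction.

```python
def centroid_query(docs):
    """Build a Query-By-Multiple-Examples query vector as the centroid of the
    example documents' sparse term-weight vectors (the intuitive baseline)."""
    centroid = {}
    for doc in docs:
        for term, weight in doc.items():
            centroid[term] = centroid.get(term, 0.0) + weight / len(docs)
    return centroid

# Two example documents describing the information need.
query_vec = centroid_query([{'jazz': 2.0, 'sax': 1.0}, {'jazz': 1.0}])
```

The resulting vector can then be matched against the collection with any standard similarity measure, e.g. cosine similarity.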
Question classification with semantic tree kernel BIBAFull-Text 837-838
  Yan Pan; Yong Tang; Luxin Lin; Yemin Luo
Question Classification plays an important role in most Question Answering systems. In this paper, we exploit semantic features in Support Vector Machines (SVMs) for Question Classification. We propose a semantic tree kernel to incorporate semantic similarity information. A diverse set of semantic features is evaluated. Experimental results show that SVMs with semantic features, especially semantic classes, can significantly outperform the state-of-the-art systems.
Generalising multiple capture-recapture to non-uniform sample sizes BIBAFull-Text 839-840
  Paul Thomas
Algorithms in distributed information retrieval often rely on accurate knowledge of the size of a collection. The "multiple capture-recapture" method of Shokouhi et al. is one of the more reliable algorithms for determining collection size, but it relies on samples with a uniform number of documents. Such uniform samples are often hard to obtain in a working system.
   A simple generalisation of multiple capture-recapture does not rely on uniform sample sizes. Simulations show it is as accurate as the original method even when sample sizes vary considerably, making it a useful technique in real tools.
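For context, the classic two-sample (Lincoln-Petersen) capture-recapture estimator already tolerates unequal sample sizes; the paper generalises the multi-sample method of Shokouhi et al. along similar lines. A minimal sketch of the two-sample case (function name and the infinity convention for zero overlap are assumptions):

```python
def petersen_estimate(sample1, sample2):
    """Two-sample capture-recapture: with samples of possibly different sizes
    n1 and n2 drawn uniformly from a collection, an overlap of m documents
    gives the collection-size estimate N ~= n1 * n2 / m."""
    s1, s2 = set(sample1), set(sample2)
    overlap = len(s1 & s2)
    if overlap == 0:
        return float('inf')  # no recaptures: size cannot be bounded
    return len(s1) * len(s2) / overlap
```

For example, two samples of 50 and 75 document identifiers sharing 25 documents yield an estimate of 150 documents.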
Predicting when browsing context is relevant to search BIBAFull-Text 841-842
  Mandar Rahurkar; Silviu Cucerzan
We investigate a representative case of sudden information need change of Web users. By analyzing search engine query logs, we show that the majority of queries submitted by users after browsing documents in the news domain are related to the most recently browsed document. We investigate ways of identifying whether a query is a good candidate for contextualization conditioned on the most recently browsed document by a user. We build a successful classifier for this task, which achieves 96% precision at 90% recall.

Posters group 5: structured IR, ranking, classification and filtering

XML-aided phrase indexing for hypertext documents BIBAFull-Text 843-844
  Miro Lehtonen; Antoine Doucet
We combine techniques of XML Mining and Text Mining for the benefit of Information Retrieval. By manipulating the word sequence according to the XML structure of the marked-up text, we strengthen phrase boundaries so that they are more obvious to the algorithms that extract multiword sequences from text. Consequently, the quality of the indexed phrases improves, which has a positive effect on the average precision measured by the INEX 2007 standards.
Proximity-aware scoring for XML retrieval BIBAFull-Text 845-846
  Andreas Broschart; Ralf Schenkel
Proximity-aware scoring functions lead to significant effectiveness improvements for text retrieval. For XML IR, we can sometimes enhance the retrieval quality by exploiting knowledge about the document structure combined with established text IR methods. This paper introduces modified proximity scores that take the document structure into account and demonstrates the effect for the INEX benchmark.
Locating relevant text within XML documents BIBAFull-Text 847-848
  Jaap Kamps; Marijn Koolen; Mounia Lalmas
Traditional document retrieval has been shown to be a competitive approach in XML element retrieval, which is counter-intuitive since the element retrieval task requires all and only relevant document parts to be retrieved. This paper conducts a comparative analysis of document and element retrieval, highlights the relative strengths and weaknesses of both approaches, and explains the relative effectiveness of document retrieval approaches at element retrieval tasks.
A flexible extension of XPath to improve XML querying BIBAFull-Text 849-850
  Ernesto Damiani; Stefania Marrara; Gabriella Pasi
This work presents a flexible XML selection language, FleXPath, which allows the formulation of flexible constraints on both the structure and content of XML documents. Some experimental results, obtained with a preliminary prototype, are described to show that the approach is promising.
Combining document- and paragraph-based entity ranking BIBAFull-Text 851-852
  Henning Rode; Pavel Serdyukov; Djoerd Hiemstra
We study entity ranking on the INEX entity track and propose a simple graph-based ranking approach that makes it possible to combine scores at the document and paragraph levels. The combined approach improves the retrieval results not only on the INEX test set, but also on TREC's expert finding task.
Re-ranking search results using document-passage graphs BIBAFull-Text 853-854
  Michael Bendersky; Oren Kurland
We present a novel passage-based approach to re-ranking documents in an initially retrieved list so as to improve precision at top ranks. While most work on passage-based document retrieval ranks a document based on the query similarity of its constituent passages, our approach leverages information about the centrality of the document passages with respect to the initial document list. Passage centrality is induced over a bipartite document-passage graph, wherein edge weights represent document-passage similarities. Empirical evaluation shows that our approach yields effective re-ranking performance. Furthermore, the performance is superior to that of previously proposed passage-based document ranking methods.
Utilizing phrase based semantic information for term dependency BIBAFull-Text 855-856
  Yang Xu; Fan Ding; Bin Wang
Previous work on term dependency has not taken into account the semantic information underlying query phrases. In this work, we study the impact of utilizing phrase-based concepts for term dependency. We use Wikipedia to separate important and less important term dependencies, and treat them accordingly as features in a linear feature-based retrieval model. We compare our method with a Markov Random Field (MRF) model on four TREC document collections. Our experimental results show that utilizing phrase-based concepts improves the retrieval effectiveness of term dependency and reduces the size of the feature set to a large extent.
Inferring the most important types of a query: a semantic approach BIBAFull-Text 857-858
  David Vallet; Hugo Zaragoza
In this paper we present a technique for ranking the most important types or categories for a given query. Rather than trying to find the category of the query, a task known as query categorization, our approach seeks to find the most important types related to the query results. The query category does not necessarily fall into this ranking of types, and therefore our approach can be complementary.
On multiword entity ranking in peer-to-peer search BIBAFull-Text 859-860
  Yuval Merhav; Ophir Frieder
Previously [2], we postulated the advantage of using entity extraction to implement a new Peer-to-Peer (P2P) search framework for reducing network traffic and providing a trade-off between precision and recall. We now propose an entity ranking method designed for the 'short documents' characteristic of P2P, which significantly improves both precision and recall in 'top results' P2P search. We construct a dynamic entity corpus using n-gram statistics and metadata, study its reliability, and use it to identify correlations between user query terms.
Site-based dynamic pruning for query processing in search engines BIBAFull-Text 861-862
  Ismail Sengor Altingovde; Engin Demir; Fazli Can; Ozgür Ulusoy
Web search engines typically index and retrieve at the page level. In this study, we investigate a dynamic pruning strategy that allows the query processor to first determine the most promising websites and then carry out similarity computations only for the pages within these sites.
Exploiting MDS Projections for Cross-language IR BIBAFull-Text 863-864
  Rafael E. Banchs; Andreas Kaltenbrunner
In this paper, we describe some preliminary work on using monolingual projections of document collections for performing cross-language information retrieval tasks. The proposed methodology uses multidimensional scaling for projecting the vector-space representations of a given multilingual document collection into spaces of lower dimensionality. An independent projection is computed for each different language, and the structural similarities of the resulting projections are exploited for information retrieval tasks.
Local approximation of PageRank and reverse PageRank BIBAFull-Text 865-866
  Ziv Bar-Yossef; Li-Tal Mashiach
We consider the problem of approximating the PageRank of a target node using only local information provided by a link server. We prove that local approximation of PageRank is feasible if and only if the graph has low in-degree and admits fast PageRank convergence. While natural graphs, such as the web graph, abound with high in-degree nodes, making local PageRank approximation too costly, we show that reversed natural graphs tend to have low in-degree while maintaining fast PageRank convergence. It follows that calculating Reverse PageRank locally is frequently more feasible than computing PageRank locally. Finally, we demonstrate the usefulness of Reverse PageRank in five different applications.
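Reverse PageRank is simply PageRank computed on the graph with every edge flipped. The sketch below (hypothetical names, pure-Python power iteration on an adjacency-list graph) illustrates that construction, not the paper's local approximation algorithm:

```python
def pagerank(graph, damping=0.85, iters=50):
    """Power-iteration PageRank. graph maps every node to its list of
    out-neighbours (all nodes must appear as keys)."""
    nodes = list(graph)
    n = len(nodes)
    pr = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        nxt = {u: (1 - damping) / n for u in nodes}
        for u in nodes:
            out = graph[u]
            if not out:  # dangling node: spread its mass uniformly
                for v in nodes:
                    nxt[v] += damping * pr[u] / n
            else:
                for v in out:
                    nxt[v] += damping * pr[u] / len(out)
        pr = nxt
    return pr

def reverse_pagerank(graph, damping=0.85, iters=50):
    """Reverse PageRank: run PageRank on the graph with all edges flipped."""
    rev = {u: [] for u in graph}
    for u, outs in graph.items():
        for v in outs:
            rev[v].append(u)  # edge u->v becomes v->u
    return pagerank(rev, damping, iters)
```

On a small cycle-like graph the two rankings differ, reflecting that a node's Reverse PageRank rewards reaching many nodes rather than being reached.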
Improving text classification accuracy using topic modeling over an additional corpus BIBAFull-Text 867-868
  Somnath Banerjee
The World Wide Web has many document repositories that can act as valuable sources of additional data for various machine learning tasks. In this paper, we propose a method of improving text classification accuracy by using such an additional corpus that can easily be obtained from the web. This additional corpus can be unlabeled and independent of the given classification task. The method proposed here uses topic modeling to extract a set of topics from the additional corpus. Those extracted topics then act as additional features of the data of the given classification task. An evaluation on the RCV1 dataset shows significant improvement over a baseline method.
An algorithm for text categorization BIBAFull-Text 869-870
  Anestis Gkanogiannis; Theodore Kalamboukis
A novel and efficient learning algorithm is proposed for the binary linear classification problem. The algorithm is trained using Rocchio's relevance feedback technique and builds a classifier from the intermediate hyperplane of two common tangent hyperplanes for the given category and its complement. The experimental results presented are very encouraging and justify further research.
Hypergraph partitioning for document clustering: a unified clique perspective BIBAFull-Text 871-872
  Tianming Hu; Hui Xiong; Wenjun Zhou; Sam Yuan Sung; Hangzai Luo
Hypergraph partitioning has been considered as a promising method to address the challenges of high dimensionality in document clustering. With documents modeled as vertices and the relationship among documents captured by the hyperedges, the goal of graph partitioning is to minimize the edge cut. Therefore, the definition of hyperedges is vital to the clustering performance. While several definitions of hyperedges have been proposed, a systematic understanding of desired characteristics of hyperedges is still missing. To that end, in this paper, we first provide a unified clique perspective of the definition of hyperedges, which serves as a guide to define hyperedges. With this perspective, based on the concepts of hypercliques and shared (reverse) nearest neighbors, we propose three new types of clique hyperedges and analyze their properties regarding purity and size issues. Finally, we present an extensive evaluation using real-world document datasets. The experimental results show that, with shared (reverse) nearest neighbor based hyperedges, the clustering performance can be improved significantly in terms of various external validation measures without the need for fine tuning of parameters.
Pagerank based clustering of hypertext document collections BIBAFull-Text 873-874
  Konstantin Avrachenkov; Vladimir Dobrynin; Danil Nemirovsky; Son Kim Pham; Elena Smirnova
Clustering a hypertext document collection is an important task in Information Retrieval. Most clustering methods are based on document content and do not take the hypertext links into account. Here we propose a novel PageRank-based clustering (PRC) algorithm which uses the hypertext structure. The PRC algorithm produces graph partitionings with high modularity and coverage. A comparison of the PRC algorithm with two content-based clustering algorithms shows that there is a good match between PRC clustering and content-based clustering.
An alignment-based pattern representation model for information extraction BIBKFull-Text 875-876
  Seokhwan Kim; Minwoo Jeong; Gary Geunbae Lee
Keywords: information extraction, pattern representation model
Relational distance-based collaborative filtering BIBAFull-Text 877-878
  Wei Zhang
In this paper, we present a novel hybrid recommender system called RelationalCF, which integrates content and demographic information into a collaborative filtering framework by using relational distance computation approaches, without the effort of form transformation and feature construction. Our experiments suggest that the effective combination of various kinds of information based on relational distance approaches provides more accurate recommendations than other approaches.


Minexml: bridging unstructured query with structured resources via mediated query BIBKFull-Text 879
  Keng Hoon Gan; Phang Keat Keong; Saravadee Sae Tan; Tang Enya Kong
Keywords: query, search, xml retrieval
Clustering search results for mobile terminals BIBAFull-Text 880
  Michiko Yasukawa; Hidetoshi Yokoo
Mobile terminals such as cell phones are much more restricted in terms of input/output functionality, and therefore some special techniques must be incorporated to enable them to be easily used for Web searching. Furthermore, a search for a location name can relate to a wide variety of topics. We relate these two factors to each other to yield a new search system for map and text information. Presenting search results as clusters is helpful for users, especially in a mobile environment. The system makes mobile web searching easier and more efficient.
Refining search results with facet landscapes BIBKFull-Text 881
  Mark Sifer; Jian Lin
Keywords: faceted search, information visualization, olap
Ice-tea: an interactive cross-language search engine with translation enhancement BIBKFull-Text 882
  Dan Wu; Daqing He
Keywords: clir, ice-tea, query expansion (qe), relevance feedback (rf), translation enhancement (te)
Cross-lingual search over 22 european languages BIBAFull-Text 883
  Blaž Fortuna; Jan Rupnik; Boštjan Pajntar; Marko Grobelnik; Dunja Mladenić
In this paper we present a system for cross-lingual information retrieval which can handle tens of languages and millions of documents. The functioning of the system is demonstrated on a corpus of European legislation (22 languages, more than 400,000 documents per language). The system offers an interactive web interface that allows the user to dynamically re-rank the retrieval results based on their mapping onto a predefined thesaurus.
Social recommendations at work BIBAFull-Text 884
  Tom Crecelius; Mouna Kacimi; Sebastian Michel; Thomas Neumann; Josiane X. Parreira; Ralf Schenkel; Gerhard Weikum
Online communities have become popular for publishing and searching content, and also for connecting to other users. User-generated content includes, for example, personal blogs, bookmarks, and digital photos. Items can be annotated and rated by different users, and users can connect to others that are usually friends and/or share common interests.
   We demonstrate a social recommendation system that takes advantage of users' connections and tagging behavior to compute recommendations of items in such communities. The advantages can be verified via comparison to a standard IR technique.
Bilkent news portal: a personalizable system with new event detection and tracking capabilities BIBKFull-Text 885
  Fazli Can; Seyit Kocberber; Ozgur Baglioglu; Suleyman Kardas; Huseyin Cagdas Ocalan; Erkan Uyar
Keywords: new event detection and tracking, news portal, web
Geographic IR and visualization in time and space BIBAFull-Text 886
  Ray R. Larson
This demonstration will show how graphical geospatial query specifications can be used to obtain sets of georeferenced data ranked by probability of relevance, and displayed geographically and temporally in a geospatial browser with temporal support.
Fine-grained relevance feedback for XML retrieval BIBAFull-Text 887
  Hanglin Pan; Ralf Schenkel; Gerhard Weikum
This demonstration presents an XML IR system that allows users to give feedback of different granularities and types, using Dempster-Shafer theory of evidence to compute expanded and reweighted queries.
Dynamic visualization of music classification systems BIBKFull-Text 888
  Kris West; J. Stephen Downie; Xiao Hu; M. Cameron Jones
Keywords: classification, evaluation, music information retrieval
From concepts to implementation and visualization: tools from a team-based approach to ir BIBAFull-Text 889
  Uma Murthy; Ricardo da Silva Torres; Edward A. Fox; Logambigai Venkatachalam; Seungwon Yang; Marcos A. Gonçalves
Researchers have been studying and developing teaching materials for information retrieval (IR), such as [3]. Toolkits have also been built that provide hands-on experience to students. For example, IR-Toolbox [4] is an effort to close the gap between students' understanding of IR concepts and real-life indexing and search systems. Such tools may be good for helping students in non-technical areas, such as the Library and Information Science field, to develop their conceptual model of search engines. However, they do not cover emerging topics and skills, such as content-based image retrieval (CBIR) and fusion search. Although open source software exists (such as the tools listed at http://www.searchtools.com/tools/tools-opensource.html) that can be used to teach basic and advanced IR topics, it requires a student to have high-level technical knowledge and to spend a long time gaining a practical understanding of these topics.
   We present a new and rapid approach to teach basic and advanced IR topics, such as text retrieval, web-based IR, CBIR, and fusion search, to Computer Science (CS) graduate students. We designed projects that would help students grasp the above-mentioned IR topics. Students, working in teams, were given a practical application to start with -- the Superimposed Application for Image Description and Retrieval [5]. SAIDR (earlier, SIERRA) allows users to associate parts of images with multimedia information such as text annotations. Also, users may retrieve information in one of two ways: (1) Perform text-based retrieval on annotations; (2) Perform CBIR on images and parts of images that look like a query image (or part of a query image).
   Each team was asked to build an enhancement for this application, involving text retrieval and/or CBIR, in three weeks' time. The sub-projects are described in Table 1. The outcome of this activity was that students learned about IR concepts while being able to relate their applicability to a real-world problem (Figure 1). Details of these projects may be found at http://collab.dlib.vt.edu/runwiki/wiki.pl?TabletPcImageRetrievalSuperimposedInformation. We will demonstrate the tools developed along with the IR concepts they illustrate (Table 1). We believe these tools may aid others in learning about basic and advanced topics in IR.

Doctoral consortium

Exploiting XML structure to improve information retrieval in peer-to-peer systems BIBAFull-Text 890
  Judith Winter
With the advent of XML as a standard for the representation and exchange of structured documents, a growing number of XML-documents are being stored in Peer-to-Peer (P2P) networks. Current research on P2P search engines proposes the use of Information Retrieval (IR) techniques to perform content-based search, but does not take into account structural features of documents.
   P2P systems typically have no central index, thus avoiding single points of failure, but distribute all information among participating peers. Accordingly, a querying peer has only limited access to the index information and should select carefully which peers can help answer a given query by contributing resources such as local index information or CPU time for ranking computations. Bandwidth consumption is a major issue. To guarantee scalability, P2P systems have to reduce the number of peers involved in the retrieval process. As a result, the retrieval quality in terms of recall and precision may suffer substantially.
   In the proposed thesis, document structure is considered as an extra source of information to improve the retrieval quality of XML-documents in a P2P environment. The thesis centres on the following questions: how can structural information help to improve the retrieval of XML-documents in terms of result quality such as precision, recall, and specificity? Can XML structure support the routing of queries in distributed environments, especially the selection of promising peers? How can XML IR techniques be used in a P2P network while minimizing bandwidth consumption and considering performance aspects?
   To answer these questions and to analyze possible achievements, a search engine is proposed that exploits structural hints expressed explicitly by the user or implicitly by the self-describing structure of XML-documents. Additionally, more focused and specific results are obtained by providing ranked retrieval units that can be either XML-documents as a whole or the most relevant passages of these documents. XML information retrieval techniques are applied in two ways: to select the peers participating in the retrieval process, and to compute the relevance of documents.
   The indexing approach includes both content and structural information of documents. To support efficient execution of multi term queries, index keys consist of rare combinations of (content, structure)-tuples. Performance is increased by using only fixed-sized posting lists: frequent index keys are combined with each other iteratively until the new combination is rare, with a posting list size under a pre-set threshold. All posting lists are sorted by taking into account classical IR measures such as term frequency and inverted term frequency as well as weights for potential retrieval units of a document, with a slight bias towards documents on peers with good collections regarding the current index key and with good peer characteristics such as online times, available bandwidth, and latency.
   When extracting the posting list for a specific query, a re-ordering of the posting list is performed that takes into account the structural similarity between key and query. According to this preranking, peers are selected that are expected to hold information about potentially relevant documents and retrieval units.
   The final ranking is computed in parallel on those selected peers. The computation is based on an extension of the vector space model and distinguishes between weights for different structures of the same content. This allows weighting XML elements with respect to their discriminative power, e.g. a title will be weighted much higher than a footnote. Additionally, relevance is computed as a mixture of content relevance and structural similarity between a given query and a potential retrieval unit.
   Currently, a first prototype for P2P Information Retrieval of XML-documents called SPIRIX is being implemented. Experiments to evaluate the proposed techniques and use of structural hints will be performed on a distributed version of the INEX Wikipedia Collection.
Affective feedback: an investigation into the role of emotions in the information seeking process BIBAFull-Text 891
  Ioannis Arapakis
User feedback is considered to be a critical element in the information seeking process. An important aspect of the feedback cycle is relevance assessment, which has progressively become a popular practice in web searching activities and interactive information retrieval (IR). The value of relevance assessment lies in the disambiguation of the user's information need, which is achieved by applying various feedback techniques. Such techniques vary from explicit to implicit and help determine the relevance of the retrieved documents.
   The former type of feedback is usually obtained through the explicit and intentional indication of documents as relevant (positive feedback) or irrelevant (negative feedback). Explicit feedback is a robust method for improving a system's overall retrieval performance and producing better query reformulations [1], at the expense of users' cognitive resources. On the other hand, implicit feedback techniques tend to collect information on search behavior in a more intelligent and unobtrusive manner. By doing so, they relieve users of the cognitive burden of document rating and relevance judgments. Information-seeking activities such as reading time, saving, printing, selecting and referencing have all been treated as indicators of relevance, despite the lack of sufficient evidence to support their effectiveness [2].
   Despite their apparent differences, both categories of feedback techniques determine document relevance with respect to the cognitive and situational levels of the interactive dialogue that occurs between the user and the retrieval system [5]. However, this approach does not account for the dynamic interplay and adaptation that take place between the different dialogue levels; most importantly, it does not consider the affective dimension of interaction. Users interact with intentions, motivations and feelings, apart from real-life problems and information objects, which are all critical aspects of cognition and decision-making [3][4]. By evaluating users' affective response towards an information object (e.g. a document), before and after their exposure to it, a more accurate understanding of the object's properties and degree of relevance to the current information need may be facilitated. Furthermore, systems that can detect and respond appropriately to user emotions could potentially improve the naturalness of human-computer interaction and progressively optimize their retrieval strategy. The current study investigates the role of emotions in the information seeking process, as they are communicated through multi-modal interaction, and reconsiders relevance feedback with respect to what occurs on the affective level of interaction as well.
Exploring and measuring dependency trees for information retrieval BIBAFull-Text 892
  Chang Liu
Natural language processing (NLP) techniques are believed to hold tremendous potential to supplement the purely quantitative methods of text information retrieval. This has led to the emergence of a large number of NLP-based IR research projects over the last few years, even though the empirical evidence to support this potential has often been inadequate. Most contributions of NLP to IR concentrate on document representation and compound term matching strategies. Researchers have noted that simple term-based representations of document content, such as the vector representation, are usually inadequate for accurate discrimination. The "bag of words" representation does not invoke linguistic considerations or allow modelling of relationships between subsets of words. However, even though a variety of content indicators, such as syntactic phrases, have been tried and investigated for representing documents rather than single terms in IR systems, the matching strategy over those representations still cannot go beyond traditional statistical techniques that measure term co-occurrence characteristics and proximity in analyzing text structure.
   In this paper, we propose a novel IR strategy (SIR) that involves NLP techniques at the syntactic level. Within SIR, document and query representations are built on a syntactic data structure of the natural language text -- the dependency tree -- in which syntactic relationships between words are identified and structured in the form of a tree. To capture the syntactic relations between words in this hierarchical representation, the matching strategy in SIR upgrades the traditional statistical techniques by introducing, as the key determiner, a similarity measure that operates on the graph representation. A basic IR experiment on TREC data evaluates whether this novel IR model is feasible. Experimental results indicate that the approach has the potential to outperform the standard bag-of-words IR model, especially in response to syntactically structured queries.
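   The abstract does not specify SIR's similarity measure, but the core idea -- matching query and document dependency trees rather than bags of words -- can be sketched as an overlap score over (head, relation, dependent) triples, with a back-off to unlabeled head-dependent pairs. The scoring weights here are illustrative assumptions, not the paper's method.

```python
# Toy sketch of dependency-tree matching in the spirit of SIR. A parsed
# sentence is reduced to a set of (head, relation, dependent) triples; query
# and document are compared by triple overlap, crediting half a point when
# the head-dependent pair matches but the relation label differs.

def dependency_similarity(query_edges, doc_edges):
    q, d = set(query_edges), set(doc_edges)
    exact = len(q & d)                         # same head, label, and dependent
    q_pairs = {(h, m) for h, _, m in q}
    d_pairs = {(h, m) for h, _, m in d}
    partial = len(q_pairs & d_pairs) - exact   # pair matches with a different label
    if not q:
        return 0.0
    return (exact + 0.5 * partial) / len(q)    # normalise by query tree size

# Query "retrieve medical documents": retrieve -obj-> documents -mod-> medical
query = [("retrieve", "obj", "documents"), ("documents", "mod", "medical")]
doc   = [("retrieve", "obj", "documents"), ("documents", "mod", "legal")]
print(dependency_similarity(query, doc))  # 0.5: one of two query edges matches
```

In a real system the triples would come from a dependency parser rather than being written by hand, and the score would be combined with term-level statistics.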
The search for expertise: to the documents and beyond BIBKFull-Text 893
  Pavel Serdyukov
Keywords: enterprise search, expert finding, expertise retrieval
Task detection for activity-based desktop search BIBAFull-Text 894
  Sergey Chernov
Desktop search tools provide powerful query capabilities and result presentation techniques. However, they do not take the user's context into account. We propose to exploit collected information about user activities with desktop files and applications for activity-based desktop search. When I prepare for a project review and type the name of a colleague into a search box, I expect to find her latest deliverable draft, not her email with a paper review or our joint conference presentation. Ideally, the desktop search system should be able to infer my current task from the logs of my previous activities and present task-specific search results.
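   One simple way to realise this idea is to build a per-task profile of accessed files from the activity log and boost search results belonging to the inferred current task. The names, boost factor, and scoring below are hypothetical illustrations, not the system proposed above.

```python
# Hedged sketch of activity-based re-ranking: files co-accessed within the
# user's current task are boosted in the desktop search result list.

from collections import defaultdict

def build_task_profiles(activity_log):
    """activity_log: list of (task_id, file) access events."""
    profiles = defaultdict(set)
    for task, f in activity_log:
        profiles[task].add(f)
    return profiles

def rerank(results, current_task, profiles, boost=2.0):
    """results: list of (file, score) pairs from a baseline desktop search."""
    task_files = profiles.get(current_task, set())
    boosted = [(f, s * boost if f in task_files else s) for f, s in results]
    return sorted(boosted, key=lambda fs: fs[1], reverse=True)

log = [("project_review", "deliverable_draft.doc"),
       ("project_review", "budget.xls"),
       ("paper_writing", "review_email.eml")]
profiles = build_task_profiles(log)
results = [("review_email.eml", 0.9), ("deliverable_draft.doc", 0.8)]
print(rerank(results, "project_review", profiles))
# deliverable_draft.doc (0.8 * 2 = 1.6) now outranks review_email.eml (0.9)
```

The open research problem the abstract identifies is the first step: inferring `current_task` automatically from the activity logs rather than assuming it is known.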
Using a mediated query approach for matching unstructured query with structured resources BIBKFull-Text 895
  Keng Hoon Gan
Keywords: XML retrieval, query, search
Understanding system implementation and user behavior in a collaborative information seeking environment BIBKFull-Text 896
  Chirag Shah
Keywords: collaborative information seeking, evaluation, user study
Biomedical cross-language information retrieval BIBKFull-Text 897
  Dolf Trieschnigg
Keywords: CLIR, biomedical IR, language models
Towards a combined model for search and navigation of annotated documents BIBKFull-Text 898
  Edgar Meij
Keywords: document annotations, language models, semantic relatedness
Context and linking in retrieval from personal digital archives BIBAFull-Text 899
  Liadh Kelly
Advances in digital capture and storage technologies mean that it is now possible to capture and store one's entire life experiences in personal digital archives. These vast personal archives, or Human Digital Memories (HDMs), pose new challenges and opportunities for the research community, not the least of which is developing effective means of retrieval from HDMs. Personal archive retrieval research is still in its infancy and there is much scope for novel research. My PhD proposes to develop effective HDM retrieval algorithms by combining rich sources of context associated with items, such as data on location and people present, with information obtained by linking HDM items in novel ways.
Extending language modeling techniques to models of search and browsing activity in a digital library BIBAFull-Text 900
  G. Craig Murray
Users searching for information in a digital library or on the WWW can be modeled as individuals moving through a semantic space by issuing queries and clicking on hyperlinks. As they go, they emit a stream of interaction data. Most of it is linguistic data. Lots of it is captured in logs. Some of it is used to guess what the user is searching for. But to most information retrieval systems, each user interaction is a stateless point in this space. There is a timeline connecting each of these points, but systems seldom make use of this as sequence data, in part because there is no clear way to systematically characterize the meaningful relations within a sequence of user activity. It is a problem of pragmatics as much as it is of semantics -- the fact that a user clicked on a particular link, or added a particular term to their query, has meaning primarily in relation to the preceding actions. A remaining challenge in IR is to extract features of the user interaction data that will give meaning to those relations.
   Meanwhile, from the user's perspective, each of these points in time and semantic space is just part of a path of exploration. To the user, the exact terms in a query, or the specific words surrounding a hypertext link, may be less important than the trajectory those terms establish in relation to the user's path. Identifying the meaningful relations between queries and page views within a sequence of activity increases our understanding of users and their information needs. Formally, we can model query and browsing behaviors as surface forms of a hidden process. What is missing is a layer of abstraction for mapping sequences of interaction in a way that is both descriptive of users' needs and useful for automation.
   The work I describe is an effort to identify features of data in logs of query and browsing activity that are highly predictive of certain types of behavior. Sequences of interaction data from individual users are modeled as sequences of expression. Statistical modeling techniques that are effective for modeling sequences in natural language processing and bioinformatics are examined for their ability to model sequences of interaction between an information searcher and an information retrieval system. Queries and click-throughs in this stream of interaction can be tagged with features such as semantic coordinates, timing, frequency of use, type of action, etc. By analyzing large collections of interaction sequences it is possible to identify frequent patterns of user behavior. From these patterns we can make predictions about future interactions. For example, certain patterns of link following in a digital library are highly predictive of users' next steps while other patterns are not.
   General models of user interaction are useful for design and evaluation of search interfaces. Individual models of user interaction are useful for personalized search and customized content. Yet very little research has been done to investigate which features are optimal for modeling user queries and browsing as interaction sequences. An important first step is to identify informative features and the relationships between features. I propose to construct models of user behavior based on user data in logs of query and browsing activity and to identify features that are highly predictive of certain types of user behaviors. I examine activity within search sessions on a digital library as a microcosm of larger systems. I expect to find features that are useful in predictive models of user behavior both at an individual and aggregate level. Where possible, I hope to identify meaningful relationships between those features. The work has implications beyond the scope of digital libraries, to larger systems and broader search domains.
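   The simplest instance of the sequence models described above is a Markov chain over action types: learn bigram transition counts from logged sessions, then predict the most likely next action. The action vocabulary and sessions below are invented for illustration; the proposed work considers much richer features (semantic coordinates, timing, frequency).

```python
# Minimal Markov-chain sketch of modeling interaction sequences: each user
# action is a token, and bigram counts over sessions yield a predictor for
# the next action given the current one.

from collections import Counter, defaultdict

def train_bigram_model(sessions):
    """sessions: lists of action tokens, e.g. ['query', 'click', 'refine']."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for prev, nxt in zip(session, session[1:]):
            transitions[prev][nxt] += 1
    return transitions

def predict_next(transitions, action):
    """Most frequent follower of `action`, or None if the action is unseen."""
    counts = transitions.get(action)
    return counts.most_common(1)[0][0] if counts else None

sessions = [["query", "click", "click", "refine", "click"],
            ["query", "refine", "click", "download"],
            ["query", "click", "download"]]
model = train_bigram_model(sessions)
print(predict_next(model, "query"))   # 'click': follows 'query' in 2 of 3 sessions
print(predict_next(model, "refine"))  # 'click'
```

The techniques borrowed from NLP and bioinformatics that the abstract mentions (e.g. hidden-state models, longer n-grams) generalise this by conditioning on more history and on latent task state.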