[1]
A web based multi-linguists symbol-to-text AAC application
The Paciello group accessibility challenge
/
Ding, Chaohai
/
Halabi, Nawar
/
Al-Zaben, Lama
/
Li, Yunjia
/
Draffan, E. A.
/
Wald, Mike
Proceedings of the 2015 International Cross-Disciplinary Conference on Web
Accessibility (W4A)
2015-05-18
p.24
© Copyright 2015 ACM
Summary: There are several commercial or freely available symbol sets for
Augmentative and Alternative Communication (AAC) use; all these symbol sets
have the same issue when trying to use them in a multilingual setting.
Symbol Dragoman is a Web based application that aims to allow users who have
no spoken language and use pictograms or images to communicate in Arabic or
English. Users can combine chosen 'symbols' in any way they want to produce a
sentence that can be read or heard in both languages, with the potential of
offering any combination of languages in the future.
[2]
Linked data-driven decision support for accessible travelling
Google doctoral consortium
/
Ding, Chaohai
/
Wald, Mike
/
Wills, Gary
Proceedings of the 2015 International Cross-Disciplinary Conference on Web
Accessibility (W4A)
2015-05-18
p.39
© Copyright 2015 ACM
Summary: With the aim of addressing the gap between users' needs for accessible
travelling and the complex environmental barriers of physical places in the real
world, this paper summarizes research investigating the use of Linked
Data principles for enhanced accessible travelling decision support. Firstly,
this paper reviews current research and projects to identify some problems and
challenges. Then a conceptual model and the reference architecture of Linked
Data-driven decision support system (DSS) for accessible travelling are
proposed to address such problems to enhance the accessible travelling for
people with disabilities (PwD), especially for people with mobility
difficulties. As a result, this research would not only benefit PwD, but also
contribute to the research of a novel model to address accessibility
information barriers by applying the Linked Data principles to DSSs for
enhanced accessible travelling.
[3]
Open Accessibility Data Interlinking
Mobility Support and Accessible Tourism
/
Ding, Chaohai
/
Wald, Mike
/
Wills, Gary
ICCHP'14: International Conference on Computers Helping People with Special
Needs, Part 2
2014-07-09
v.2
p.73-80
Keywords: Linked Data; Open Accessibility Data; Information Retrieval; Data
Interlinking
© Copyright 2014 Springer International Publishing
Summary: This paper presents the research of using Linked Open Data to enhance
accessibility data for accessible travelling. Open accessibility data is the
data related to the accessibility issues associated with geographical data,
which could benefit people with disabilities and their special needs. With the
aim of addressing the gap between users' special needs and data, this paper
presents the results of a survey of open accessibility data retrieved from four
different sources in the UK. An ontology based data integration approach is
proposed to interlink these datasets together to generate a linked open
accessibility repository, which also links to other resources on the Linked
Data Cloud. As a result, this research would not only enrich the open
accessibility data, but also contribute to a novel framework to address
accessibility information barriers by establishing a linked data repository for
publishing, linking and consuming the open accessibility data.
[4]
A survey of open accessibility data
Mobility
/
Ding, Chaohai
/
Wald, Mike
/
Wills, Gary
Proceedings of the 2014 International Cross-Disciplinary Conference on Web
Accessibility (W4A)
2014-04-07
p.37
© Copyright 2014 ACM
Summary: This paper presents the research of using Linked Data for enhancing
accessibility data, especially for accessible travelling. With the aim of
addressing the gap between users' special needs and accessibility data, this
research initially explores the current situation of open accessibility data.
Open accessibility data is the data related to the accessibility issues and
associated with geographical data, which could benefit people with disabilities
or special needs. This paper presents a survey of open accessibility data in the UK
based on the datasets retrieved from five different resources. After examining
the features of each dataset, a mapping approach using Semantic Web
technologies is proposed to interlink these datasets together to generate a
linked open accessibility repository and link this repository to other
resources on the Linked Open Data Cloud (LODC). As a result, this research
would not only benefit people with disabilities, but also contribute to a novel
method to address accessibility information barriers by establishing a linked
open accessibility data repository for publishing, integrating and consuming
the accessibility data.
[5]
Probabilistic solutions of influence propagation on social networks
IR track: networks
/
Zhang, Miao
/
Dai, Chunni
/
Ding, Chris
/
Chen, Enhong
Proceedings of the 2013 ACM Conference on Information and Knowledge
Management
2013-10-27
p.429-438
© Copyright 2013 ACM
Summary: Given fixed budgets, companies attempt to obtain maximum coverage on a
social network by targeting influential individuals. This viral marketing is
often modeled by the independent cascade model. However, identifying the most
influential people by computing influence spread is NP-hard, and various
approximate algorithms have been developed. In this paper, we emphasize the
probabilistic nature of influence propagation. We propose to use exact
probabilistic solutions and prove an inclusion-exclusion principle for
computing influence spread. Our probabilistic solutions can significantly speed
up the computation of influence spread. We also give a probabilistic-additive
incremental search strategy to solve the influence maximization problem, i.e.,
to find a subset of individuals that has the largest influence spread in the
end. Experiments on real data sets demonstrated the effectiveness and
efficiency of our methods.
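A minimal sketch of the independent cascade model named in this abstract, written as a plain Monte Carlo estimate of influence spread. It is not the exact probabilistic solution or the inclusion-exclusion formula proposed in the paper; the function names, the adjacency format and the toy graph are assumptions made for illustration.

```python
import random


def simulate_cascade(graph, seeds, rng):
    """Run one independent cascade and return the set of activated nodes.

    graph maps a node to a list of (neighbor, activation_probability) pairs
    (a hypothetical format chosen for this sketch).
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # Each newly active node gets one chance to activate each neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Average number of activated nodes over repeated simulated cascades."""
    rng = random.Random(seed)
    total = sum(len(simulate_cascade(graph, seeds, rng)) for _ in range(runs))
    return total / runs


# Toy graph: "a" always activates "b", never "c"; "b" always activates "d".
toy_graph = {"a": [("b", 1.0), ("c", 0.0)], "b": [("d", 1.0)]}
toy_spread = estimate_spread(toy_graph, ["a"], runs=10)
```

Simulation of this kind is the usual baseline whose cost motivates faster ways of computing influence spread.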
[6]
Infobox suggestion for Wikipedia entities
Knowledge management poster session
/
Sultana, Afroza
/
Hasan, Quazi Mainul
/
Biswas, Ashis Kumer
/
Das, Soumyava
/
Rahman, Habibur
/
Ding, Chris
/
Li, Chengkai
Proceedings of the 2012 ACM Conference on Information and Knowledge
Management
2012-10-29
p.2307-2310
© Copyright 2012 ACM
Summary: Given the sheer amount of work and expertise required in authoring Wikipedia
articles, automatic tools that help Wikipedia contributors in generating and
improving content are valuable. This paper presents our initial step towards
building a full-fledged author assistant, particularly for suggesting infobox
templates for articles. We build SVM classifiers to suggest infobox template
types, among a large number of possible types, to Wikipedia articles without
infoboxes. Unlike prior work on Wikipedia article classification, which
deals with only a few label classes for named entity recognition, the much
larger 337-class setup in our study is geared towards realistic deployment of
an infobox suggestion tool. We also emphasize testing on articles without
infoboxes, because labeled and unlabeled data exhibit different
distributions of features, which departs from the typical assumption that they
are drawn from the same underlying population.
[7]
Simultaneous clustering of multi-type relational data via symmetric
nonnegative matrix tri-factorization
Machine learning for information retrieval
/
Wang, Hua
/
Huang, Heng
/
Ding, Chris
Proceedings of the 2011 ACM Conference on Information and Knowledge
Management
2011-10-24
p.279-284
© Copyright 2011 ACM
Summary: The rapid growth of the Internet and modern technologies has brought data
involving objects of multiple types that are related to each other, called
multi-type relational data. Traditional clustering methods for single-type data
rarely work well on them, which calls for more advanced clustering techniques
to deal with multiple types of data simultaneously to utilize their
interrelatedness. A major challenge in developing simultaneous clustering
methods is how to effectively use all available information contained in a
multi-type relational data set including inter-type and intra-type
relationships. In this paper, we propose a Symmetric Nonnegative Matrix
Tri-Factorization (S-NMTF) framework to cluster multi-type relational data at
the same time. The proposed S-NMTF approach employs NMTF to simultaneously
cluster different types of data using their inter-type relationships, and
incorporates the intra-type information through manifold regularization. In
order to deal with the symmetric usage of the factor matrix in S-NMTF, we
present a new generic matrix inequality to derive the solution algorithm, which
involves a fourth-order matrix polynomial, in a principled way. Promising
experimental results have validated the proposed approach.
[8]
Robust nonnegative matrix factorization using L21-norm
Classification and evaluation
/
Kong, Deguang
/
Ding, Chris
/
Huang, Heng
Proceedings of the 2011 ACM Conference on Information and Knowledge
Management
2011-10-24
p.673-682
© Copyright 2011 ACM
Summary: Nonnegative matrix factorization (NMF) is widely used in data mining and
machine learning. However, many data sets contain noise and outliers, so a
robust version of NMF is needed. In this paper, we propose a robust formulation
of NMF using an L21-norm loss function. We also derive a computational algorithm
with rigorous convergence analysis. Our robust NMF approach (1) can handle
noise and outliers; (2) provides very efficient and elegant updating rules; and
(3) incurs almost the same computational cost as standard NMF, and thus can potentially
be used in more real-world application tasks. Experiments on 10 datasets
show that the robust NMF provides more faithful basis factors and consistently
better clustering results as compared to standard NMF.
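A small sketch of why an L21-norm loss is less sensitive to outliers than the squared Frobenius loss of standard NMF. It assumes the common convention that each column of the residual matrix is one data point, so the L21 norm is the sum of the Euclidean norms of the columns; the helper names and the toy residual are illustrative and not taken from the paper.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean norms of the columns of a row-major matrix."""
    n_cols = len(matrix[0])
    return sum(
        math.sqrt(sum(row[j] ** 2 for row in matrix))
        for j in range(n_cols)
    )


def squared_frobenius_norm(matrix):
    """Sum of squared entries, the loss used by standard NMF."""
    return sum(value ** 2 for row in matrix for value in row)


# Toy residual X - FG: the first column is an outlier, the second a small error.
residual = [[3.0, 0.3],
            [4.0, 0.4]]

l21_loss = l21_norm(residual)
frobenius_loss = squared_frobenius_norm(residual)
```

Under the L21 norm each data point contributes its error norm rather than the square of that norm, so a single large residual dominates the total less than it does under the squared Frobenius loss.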
[9]
Cross-language web page classification via dual knowledge transfer using
nonnegative matrix tri-factorization
Multilingual IR
/
Wang, Hua
/
Huang, Heng
/
Nie, Feiping
/
Ding, Chris
Proceedings of the 34th Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
2011-07-25
p.933-942
© Copyright 2011 ACM
Summary: The lack of sufficient labeled Web pages in many languages, especially
less commonly used ones, makes it difficult for traditional
supervised classification methods to achieve satisfactory Web page
classification performance. To address this, we propose a novel Nonnegative
Matrix Tri-factorization (NMTF) based Dual Knowledge Transfer (DKT) approach
for cross-language Web page classification, which is based on the following two
important observations. First, we observe that Web pages on the same topic from
different languages usually share some common semantic patterns, though in
different representation forms. Second, we also observe that the associations
between word clusters and Web page classes are a more reliable carrier than raw
words to transfer knowledge across languages. With these recognitions, we
attempt to transfer knowledge from the auxiliary language, in which abundant
labeled Web pages are available, to target languages, in which we want to classify
Web pages, through two different paths: word cluster approximations and the
associations between word clusters and Web page classes. Due to the
reinforcement between these two different knowledge transfer paths, our
approach can achieve better classification accuracy. We evaluate the proposed
approach in extensive experiments using a real world cross-language Web page
data set. Promising results demonstrate the effectiveness of our approach that
is consistent with our theoretical analyses.
[10]
Exploiting user interests for collaborative filtering: interests expansion
via personalized ranking
Poster session 3: KM track
/
Liu, Qi
/
Chen, Enhong
/
Xiong, Hui
/
Ding, Chris H. Q.
Proceedings of the 2010 ACM Conference on Information and Knowledge
Management
2010-10-26
p.1697-1700
© Copyright 2010 ACM
Summary: In real applications, a given user buys or rates an item based on his/her
interests. Learning to leverage this interest information is often critical for
recommender systems. However, in existing recommender systems, the information
about latent user interests is largely under-explored. To that end, in this
paper, we propose an interest expansion strategy via personalized ranking based
on the topic model, named iExpand, for building an interest-oriented
collaborative filtering framework. The iExpand method introduces a three-layer,
user-interest-item, representation scheme, which leads to more interpretable
recommendation results and helps the understanding of the interactions among
users, items, and user interests. Moreover, iExpand strategically deals with
many issues, such as the overspecialization and the cold-start problems.
Finally, we evaluate iExpand on benchmark data sets, and experimental results
show that iExpand outperforms state-of-the-art methods.
[11]
Closed form solution of similarity algorithms
Poster presentations
/
Cai, Yuanzhe
/
Zhang, Miao
/
Ding, Chris
/
Chakravarthy, Sharma
Proceedings of the 33rd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
2010-07-19
p.709-710
Keywords: linkage mining, similarity calculation
© Copyright 2010 ACM
Summary: Algorithms defining similarities between objects of an information network
are important for many IR tasks. The SimRank algorithm and its variations are
widely used in many applications, and many fast algorithms have also been developed.
In this note, we first reformulate them as random walks on the network and
express them using forward and backward transition probabilities in matrix form.
Second, we show that P-Rank (of which SimRank is a special case) has a
unique solution of ee^T when the decay factor c is equal to 1. We also show that
the SimFusion algorithm is a special case of the P-Rank algorithm and prove that the
similarity matrix of SimFusion is the product of PageRank vector. Our
experiments on web datasets show that for P-Rank the decay factor c doesn't
seriously affect the similarity accuracy, and that the accuracy of P-Rank is also higher
than SimFusion and SimRank.
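For orientation, a minimal sketch of the basic iterative SimRank computation that this note starts from: two distinct nodes are similar when their in-neighbors are similar, scaled by a decay factor c. This is the standard textbook iteration, not the closed-form or random-walk formulation derived in the paper; the function name, the input format and the toy graph are assumptions.

```python
def simrank(in_neighbors, decay=0.8, iterations=10):
    """Iterative SimRank scores for every ordered pair of nodes.

    in_neighbors maps every node to the list of nodes that link to it
    (every node must appear as a key, possibly with an empty list).
    """
    nodes = sorted(in_neighbors)
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new_sim = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new_sim[(a, b)] = 1.0
                    continue
                ia, ib = in_neighbors[a], in_neighbors[b]
                if not ia or not ib:
                    # A node without in-links is similar only to itself.
                    new_sim[(a, b)] = 0.0
                    continue
                total = sum(sim[(i, j)] for i in ia for j in ib)
                new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Toy graph: "a" links to both "b" and "c".
toy_scores = simrank({"a": [], "b": ["a"], "c": ["a"]}, decay=0.8)
```

In the toy graph, "b" and "c" share their only in-neighbor, so their score equals the decay factor.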
[12]
Feature subset non-negative matrix factorization and its applications to
document understanding
Poster presentations
/
Wang, Dingding
/
Ding, Chris
/
Li, Tao
Proceedings of the 33rd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
2010-07-19
p.805-806
Keywords: NMF, feature subset selection
© Copyright 2010 ACM
Summary: In this paper, we propose feature subset non-negative matrix factorization
(NMF), which is an unsupervised approach to simultaneously cluster data points
and select important features. We apply our proposed approach to various
document understanding tasks including document clustering, summarization, and
visualization. Experimental results demonstrate the effectiveness of our
approach for these tasks.
[13]
Modeling reliability for wireless sensor node coverage in assistive testbeds
Networking technologies for healthcare information storage, transmission,
processing, and feedback
/
Le, Zhengyi
/
Becker, Eric
/
Konstantinides, Dimitrios G.
/
Ding, Chris
/
Makedon, Fillia
Proceedings of the 3rd International Conference on PErvasive Technologies
Related to Assistive Environments
2010-06-23
p.46
Keywords: assistive system, fault tolerance, network, quality of service, risk
management, sensor node management, system lifetime, wireless sensor network
© Copyright 2010 ACM
Summary: Wireless Sensor Networks (WSNs) are a prevailing technology in assistive
environments. Assistive environments may include both home and work spaces such
as factories, military installations, industrial spaces, and offices. Critical
quality-of-service properties of WSN are reliability, availability, and
serviceability. This paper focuses on reliability for healthcare applications.
Reliable WSN-based monitoring services can prevent accidents, improve the
quality of life, and even help with early health diagnosis and treatments.
However, because patients/the elderly may have cognitive or other health
problems, reliability is the dominant factor in the quality of service of a WSN.
This paper presents an approach to analyze the reliability of a WSN with the
most popular tree structures. The analysis is based on two distribution models,
exponential distribution and Weibull distribution. The simulation results also
give options to users on the cost vs. reliability issue.
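A brief sketch of the two lifetime models named in this abstract, using their standard reliability functions: R(t) = exp(-rate * t) for the exponential distribution and R(t) = exp(-(t / scale)^shape) for the Weibull distribution. The series assumption for a leaf-to-root path (the path works only if every node on it works) and all parameter values are illustrative assumptions, not the paper's model.

```python
import math


def exponential_reliability(t, rate):
    """Probability that a node with a constant failure rate survives to time t."""
    return math.exp(-rate * t)


def weibull_reliability(t, scale, shape):
    """Probability that a node with a Weibull-distributed lifetime survives to time t."""
    return math.exp(-((t / scale) ** shape))


def path_reliability(node_reliabilities):
    """Reliability of a path whose nodes must all work (series assumption)."""
    result = 1.0
    for r in node_reliabilities:
        result *= r
    return result


# Hypothetical three-node path (leaf, relay, root) evaluated at t = 100 hours.
t = 100.0
leaf = weibull_reliability(t, scale=1000.0, shape=1.5)
relay = exponential_reliability(t, rate=0.0005)
root = exponential_reliability(t, rate=0.0001)
path = path_reliability([leaf, relay, root])
```

With shape equal to 1, the Weibull model reduces to the exponential model with rate 1 / scale, which makes the two easy to compare.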
[14]
Knowledge transformation for cross-domain sentiment classification
Posters
/
Li, Tao
/
Sindhwani, Vikas
/
Ding, Chris
/
Zhang, Yi
Proceedings of the 32nd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
2009-07-19
p.716-717
Keywords: non-negative matrix factorization, sentiment analysis, transfer learning
© Copyright 2009 ACM
Summary: With the explosion of user-generated Web 2.0 content in the form of blogs,
wikis and discussion forums, the Internet has rapidly become a massive dynamic
repository of public opinion on an unbounded range of topics. A key enabler of
opinion extraction and summarization is sentiment classification: the task of
automatically identifying whether a given piece of text expresses positive or
negative opinion towards a topic of interest. Building high-quality sentiment
classifiers using standard text categorization methods is challenging due to
the lack of labeled data in a target domain. In this paper, we consider the
problem of cross-domain sentiment analysis: can one, for instance, download
rated movie reviews from rottentomatoes.com or IMDB discussion forums, learn
linguistic expressions and sentiment-laden terms that generally characterize
opinionated reviews and then successfully transfer this knowledge to the target
domain, thereby building high-quality sentiment models without manual effort?
We outline a novel sentiment transfer mechanism based on constrained
non-negative matrix tri-factorizations of term-document matrices in the source
and target domains. We report some preliminary results with this approach.
[15]
Knowledge transformation from word space to document space
Clustering: 1
/
Li, Tao
/
Ding, Chris
/
Zhang, Yi
/
Shao, Bo
Proceedings of the 31st Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
2008-07-20
p.187-194
© Copyright 2008 ACM
Summary: In most IR clustering problems, we directly cluster the documents, working
in the document space, using cosine similarity between documents as the
similarity measure. In many real-world applications, however, we usually have
knowledge on the word side and wish to transform this knowledge to the document
(concept) side. In this paper, we provide a mechanism for this knowledge
transformation. To the best of our knowledge, this is the first model for this
type of knowledge transformation. This model uses a nonnegative matrix
factorization model X = FSG^T, where X is the word-document semantic matrix, F
is the posterior probability of a word belonging to a word cluster and
represents knowledge in the word space, G is the posterior probability of a
document belonging to a document cluster and represents knowledge in the
document space, and S is a scaled matrix factor which provides a condensed view
of X. We show how knowledge on words can improve document clustering, i.e.,
knowledge in the word space is transformed into the document space. We perform
extensive experiments to validate our approach.
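To make the shapes in the tri-factorization concrete, here is a tiny sketch that rebuilds a word-document matrix X from given factors F (words by word clusters), S (word clusters by document clusters) and G (documents by document clusters). It only illustrates how the three factors fit together; it does not fit them to data, and the helper names and toy values are assumptions.

```python
def matmul(a, b):
    """Multiply two row-major matrices given as lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Swap rows and columns of a row-major matrix."""
    return [list(column) for column in zip(*m)]


def reconstruct(F, S, G):
    """Return F S G^T, the approximation of the word-document matrix X."""
    return matmul(matmul(F, S), transpose(G))


# Three words in two word clusters, two documents in two document clusters.
F = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # word -> word cluster
S = [[2.0, 0.0], [0.0, 3.0]]              # word cluster -> document cluster
G = [[1.0, 0.0], [0.0, 1.0]]              # document -> document cluster
X_approx = reconstruct(F, S, G)
```

Here the first two words fall in the first word cluster, so they receive identical rows in the reconstructed matrix.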
[16]
Multi-document summarization via sentence-level semantic analysis and
symmetric matrix factorization
Summarization
/
Wang, Dingding
/
Li, Tao
/
Zhu, Shenghuo
/
Ding, Chris
Proceedings of the 31st Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
2008-07-20
p.307-314
© Copyright 2008 ACM
Summary: Multi-document summarization aims to create a compressed summary while
retaining the main characteristics of the original set of documents. Many
approaches use statistics and machine learning techniques to extract sentences
from documents. In this paper, we propose a new multi-document summarization
framework based on sentence-level semantic analysis and symmetric non-negative
matrix factorization. We first calculate sentence-sentence similarities using
semantic analysis and construct the similarity matrix. Then symmetric matrix
factorization, which has been shown to be equivalent to normalized spectral
clustering, is used to group sentences into clusters. Finally, the most
informative sentences are selected from each group to form the summary.
Experimental results on DUC2005 and DUC2006 data sets demonstrate the
improvement of our proposed framework over the implemented existing
summarization systems. A further study on the factors that benefit the high
performance is also conducted.
[17]
Posterior probabilistic clustering using NMF
Posters group 4: theory and IR models
/
Ding, Chris
/
Li, Tao
/
Luo, Dijun
/
Peng, Wei
Proceedings of the 31st Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
2008-07-20
p.831-832
© Copyright 2008 ACM
Summary: We introduce the posterior probabilistic clustering (PPC), which provides a
rigorous posterior probability interpretation for Nonnegative Matrix
Factorization (NMF) and removes the uncertainty in clustering assignment.
Furthermore, PPC is closely related to probabilistic latent semantic indexing
(PLSI).
[18]
Multiple Evidence Combination in Web Site Search Based on Users' Access
Histories
Poster Papers
/
Ding, Chen
/
Zhou, Jin
Proceedings of User Modeling 2007
2007-07-25
p.405-409
© Copyright 2007 Springer-Verlag
Summary: Despite the success of global search engines, web site search is still
problematic in its retrieval accuracy. In this study, we propose to extract
terms based on users' access histories to build web page representations, and
then use multiple evidence combination to combine these log-based terms with
text-based and anchor-based terms. We test different combination approaches and
baseline retrieval models. Our experimental results show that the server log,
when used in multiple evidence combination, can improve the effectiveness of
the web site search, although the impact varies across retrieval models.
[19]
NMF and PLSI: equivalence and a hybrid algorithm
Posters
/
Ding, Chris
/
Li, Tao
/
Peng, Wei
Proceedings of the 29th Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
2006-08-06
p.641-642
© Copyright 2006 ACM
Summary: In this paper, we show that PLSI and NMF optimize the same objective
function, although PLSI and NMF are different algorithms as verified by
experiments. In addition, we also propose a new hybrid method that runs PLSI
and NMF alternately to achieve better solutions.
[20]
PageRank, HITS and a unified framework for link analysis
Poster session
/
Ding, Chris
/
He, Xiaofeng
/
Husbands, Parry
/
Zha, Hongyuan
/
Simon, Horst D.
Proceedings of the 25th Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
2002-08-11
p.353-354
Summary: Two popular link-based webpage ranking algorithms are (i) PageRank[1] and
(ii) HITS (Hypertext Induced Topic Selection)[3]. HITS makes the crucial
distinction of hubs and authorities and computes them in a mutually reinforcing
way. PageRank considers the hyperlink weight normalization and the equilibrium
distribution of random surfers as the citation score. We generalize and combine
these key concepts into a unified framework, in which we prove that rankings
produced by PageRank and HITS are both highly correlated with the ranking by
in-degree and out-degree.
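A compact sketch of the two ranking algorithms compared in this abstract, in their usual power-iteration form. It is a generic illustration rather than the unified framework of the paper; the damping value, the even redistribution of rank from dangling nodes, the function names and the toy link graph are assumptions.

```python
def all_nodes(links):
    """Collect every node that appears as a source or a target."""
    return sorted(set(links) | {v for targets in links.values() for v in targets})


def pagerank(links, damping=0.85, iterations=50):
    """PageRank by power iteration; dangling nodes spread their rank evenly."""
    nodes = all_nodes(links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
        rank = new_rank
    return rank


def hits(links, iterations=50):
    """HITS hub and authority scores, normalized to unit Euclidean length."""
    nodes = all_nodes(links)
    hub = {u: 1.0 for u in nodes}
    auth = {u: 1.0 for u in nodes}
    for _ in range(iterations):
        # Authority: sum of the hub scores of pages linking in.
        auth = {v: 0.0 for v in nodes}
        for u in nodes:
            for v in links.get(u, []):
                auth[v] += hub[u]
        norm = sum(a * a for a in auth.values()) ** 0.5 or 1.0
        auth = {v: a / norm for v, a in auth.items()}
        # Hub: sum of the authority scores of pages linked to.
        hub = {u: sum(auth[v] for v in links.get(u, [])) for u in nodes}
        norm = sum(h * h for h in hub.values()) ** 0.5 or 1.0
        hub = {u: h / norm for u, h in hub.items()}
    return hub, auth


# Toy web: pages "a" and "b" both link to "c".
toy_links = {"a": ["c"], "b": ["c"], "c": []}
toy_rank = pagerank(toy_links)
toy_hub, toy_auth = hits(toy_links)
```

In the toy graph, "c" has the highest in-degree and also ranks first under both PageRank and HITS authority, in line with the correlation the abstract describes.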
[21]
Towards an Adaptive and Task-Specific Ranking Mechanism in Web Searching
Poster Session
/
Ding, Chen
/
Chi, Chi-Hung
Proceedings of the 23rd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
2000-07-24
p.375-376
© Copyright 2000 ACM
[22]
Beyond the Traditional Query Operators
Poster Session
/
Ding, Chen
/
Chi, Chi-Hung
Proceedings of the 23rd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
2000-07-24
p.377-378
© Copyright 2000 ACM
[23]
A Similarity-Based Probability Model for Latent Semantic Indexing
LSI & Theory
/
Ding, Chris H. Q.
Proceedings of the 22nd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
1999-08-15
p.58-65
© Copyright 1999 ACM