HCI Bibliography : Search Results
Database updated: 2016-05-10 Searches since 2006-12-01: 32,646,502
Hosted by ACM SIGCHI
The HCI Bibliography was moved to a new server on 2015-05-12 and again on 2016-01-05, substantially degrading the environment for making updates.
There are no plans to add to the database.
Please send questions or comments to director@hcibib.org.
Query: Gao_L* Results: 23 Sorted by: Date
[1] Supervised Hashing with Pseudo Labels for Scalable Multimedia Retrieval Poster Session 1 / Song, Jingkuan / Gao, Lianli / Yan, Yan / Zhang, Dongxiang / Sebe, Nicu Proceedings of the 2015 ACM International Conference on Multimedia 2015-10-26 p.827-830
ACM Digital Library Link
Summary: There is an increasing interest in using hash codes for efficient multimedia retrieval and data storage. The hash functions are learned in such a way that the hash codes can preserve essential properties of the original space or the label information. Then the Hamming distance of the hash codes can approximate the data similarity. Existing works have demonstrated the success of many supervised hashing models. However, labeling data is time- and labor-consuming, especially for large-scale datasets. In order to utilize the supervised hashing models to improve the discriminative power of hash codes, we propose Supervised Hashing with Pseudo Labels (SHPL), which uses the cluster centers of the training data to generate pseudo labels, based on which the hash codes can be generated using the criteria of supervised hashing. More specifically, we utilize linear discriminant analysis (LDA) with the trace ratio criterion as a showcase for hash function learning, and during the optimization, we prove that the pseudo labels and the hash codes can be jointly learned and iteratively updated in a unified framework. The learned hash functions can harness the discriminant power of the trace ratio criterion, and thus can achieve better performance. Experimental results on three large-scale unlabeled datasets (i.e., SIFT1M, GIST1M, and SIFT1B) demonstrate the superior performance of our SHPL over existing hashing methods.

[2] Exploring Viewable Angle Information in Georeferenced Video Search Poster Session 1 / Hu, Gang / Shao, Jie / Gao, Lianli / Yang, Yang Proceedings of the 2015 ACM International Conference on Multimedia 2015-10-26 p.839-842
ACM Digital Library Link
Summary: As positioning data and other sensor information such as orientation measurement became powerful contextual features generated by mobile devices during video recording, a model capturing geographic field-of-view (FOV) has been developed for georeferenced video search. The accurate representation of an FOV is through the geometric shape of a circular sector. However, previous work simply employed a rectilinear vector model to represent the coverage area of a video scene. In this study, we propose to use a novel circular sector model with beginning-ending vectors for FOV representation which additionally explores viewable angle information. Its major advantage is that it leads to a more accurate georeferenced video search without false positives or false negatives (which occur in the previous model using a single vector). We demonstrate how our model can be applied to perform different types of overlap queries for spatial data selection in a unified framework, while providing competitive performance in terms of efficiency.

[3] Scalable Multimedia Retrieval by Deep Learning Hashing with Relative Similarity Learning Poster Session 1 / Gao, Lianli / Song, Jingkuan / Zou, Fuhao / Zhang, Dongxiang / Shao, Jie Proceedings of the 2015 ACM International Conference on Multimedia 2015-10-26 p.903-906
ACM Digital Library Link
Summary: Learning-based hashing methods are becoming the mainstream for approximate scalable multimedia retrieval. They consist of two main components: hash code learning for training data and hash function learning for new data points. Tremendous efforts have been devoted to designing novel methods for these two components, i.e., supervised and unsupervised methods for learning hash codes, and different models for inferring hash functions. However, there is little work integrating supervised and unsupervised hash code learning into a single framework. Moreover, the hash function learning component is usually based on hand-crafted visual features extracted from the training images. The performance of a content-based image retrieval system crucially depends on the feature representation, and such hand-crafted visual features may degrade the accuracy of the hash functions. In this paper, we propose a semi-supervised deep learning hashing (DLH) method for fast multimedia retrieval. More specifically, in the first component, we utilize both visual and label information to learn a relative similarity graph that can more precisely reflect the relationship among training data, and then generate the hash codes based on the graph. In the second stage, we apply a deep convolutional neural network (CNN) to simultaneously learn a good multimedia representation and hash functions. Extensive experiments on three popular datasets demonstrate the superiority of our DLH over both supervised and unsupervised hashing methods.

[4] Chronological Citation Recommendation with Information-Need Shifting Session 6E: Citation Networks / Jiang, Zhuoren / Liu, Xiaozhong / Gao, Liangcai Proceedings of the 2015 ACM Conference on Information and Knowledge Management 2015-10-19 p.1291-1300
ACM Digital Library Link
Summary: As the volume of publications has increased dramatically, an urgent need has developed to assist researchers in locating high-quality, candidate-cited papers from a research repository. Traditional scholarly-recommendation approaches ignore the chronological nature of citation recommendations. In this study, we propose a novel method called "Chronological Citation Recommendation" which assumes initial user information needs could shift while users are searching for papers in different time slices. We model the information-need shifts with two-level modeling: dynamic time-related ranking feature construction and dynamic evolving feature weight training. In more detail, we employed a supervised document influence model to characterize the content "time-varying" dynamics and constructed a novel heterogeneous graph that encapsulates dynamic topic-based information, time-decay paper/topic citation information, and word-based information. We applied multiple meta-paths for different ranking hypotheses which carried different types of information for citation recommendation in various time slices, along with information-need shifting. We also used multiple learning-to-rank models to optimize the feature weights for different time slices to generate the final "Chronological Citation Recommendation" rankings. The use of Chronological Citation Recommendation suggests time-series ranking lists based on initial user textual information need and characterizes the information-need shifting. Experiments on the ACM corpus show that Chronological Citation Recommendation can significantly enhance citation recommendation performance.

[5] Scientific Information Understanding via Open Educational Resources (OER) Session 8B: Citations / Liu, Xiaozhong / Jiang, Zhuoren / Gao, Liangcai Proceedings of the 2015 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2015-08-09 p.645-654
ACM Digital Library Link
Summary: Scientific publication retrieval/recommendation has been investigated in the past decade. However, to the best of our knowledge, few efforts have been made to help junior scholars and graduate students to understand and consume the essence of those scientific readings. This paper proposes a novel learning/reading environment, OER-based Collaborative PDF Reader (OCPR), that incorporates innovative scaffolding methods that can: 1. auto-characterize a student's emerging information need while reading a paper; and 2. enable students to readily access open educational resources (OER) based on their information need. By using metasearch methods, we pre-indexed 1,112,718 OERs, including presentation videos, slides, algorithm source code, or Wikipedia pages, for 41,378 STEM publications. Based on the computational information need, we use text mining and heterogeneous graph mining algorithms to recommend high quality OERs to help students better understand the scientific content in the paper. Evaluation results and exit surveys for an information retrieval course show that the OCPR system along with the recommended OERs can effectively help graduate students better understand complex STEM publications. For instance, 78.42% of participants believe the OCPR system and recommended OERs can provide precise and useful information they need, while 78.43% of them believe the recommended OERs are close to exactly what they need when reading the paper. From the OER ranking viewpoint, MRR, MAP and NDCG results show that learning to rank and cold start solutions can efficiently integrate different text and graph ranking features.

[6] WikiMirs 3.0: A Hybrid MIR System Based on the Context, Structure and Importance of Formulae in a Document Session 7 -- Non-text Collections / Wang, Yuehan / Gao, Liangcai / Wang, Simeng / Tang, Zhi / Liu, Xiaozhong / Yuan, Ke JCDL'15: Proceedings of the 2015 ACM/IEEE-CS Joint Conference on Digital Libraries 2015-06-21 p.173-182
ACM Digital Library Link
Summary: Nowadays, mathematical information is increasingly available in websites and repositories, such as ArXiv, Wikipedia and growing numbers of digital libraries. Mathematical formulae are highly structured and usually presented in layout presentations, such as PDF, LaTeX and Presentation MathML. The differences in presentation between text and formulae challenge traditional text-based index and retrieval methods. To address the challenge, this paper proposes an upgraded Mathematical Information Retrieval (MIR) system, namely WikiMirs 3.0, based on the context, structure and importance of formulae in a document. In WikiMirs 3.0, users can easily "cut" formulae and contexts from PDF documents as well as type in queries. Furthermore, a novel hybrid indexing and matching model is proposed to support both exact and fuzzy matching. In the hybrid model, both context and structure information of formulae are taken into consideration. In addition, the concept of formula importance within a document is introduced into the model for more reasonable ranking. Experimental results, compared with two classical MIR systems, demonstrate that the proposed system along with the novel model provides higher accuracy and better ranking results over Wikipedia.

[7] Querying Web-Scale Information Networks Through Bounding Matching Scores Technical Papers / Jin, Jiahui / Khemmarat, Samamon / Gao, Lixin / Luo, Junzhou Proceedings of the 2015 International Conference on the World Wide Web 2015-05-18 v.1 p.527-537
ACM Digital Library Link
Summary: Web-scale information networks containing billions of entities are common nowadays. Querying these networks can be modeled as a subgraph matching problem. Since information networks are incomplete and noisy in nature, it is important to discover answers that match exactly as well as answers that are similar to queries. Existing graph matching algorithms usually use graph indices to improve the efficiency of query processing. For web-scale information networks, it may not be feasible to build the graph indices due to the amount of work and the memory/storage required. In this paper, we propose an efficient algorithm for finding the best k answers for a given query without precomputing graph indices. The quality of an answer is measured by a matching score that is computed online. To speed up query processing, we propose a novel technique for bounding the matching scores during the computation. By using bounds, we can efficiently prune the answers that have low quality without having to evaluate all possible answers. The bounding technique can be implemented in a distributed environment, allowing our approach to efficiently answer the queries on web-scale information networks. We demonstrate the effectiveness and the efficiency of our approach through a series of experiments on real-world information networks. The results show that our bounding technique can reduce the running time by up to two orders of magnitude compared to an approach that does not use bounds.

[8] Scalable Distributed Belief Propagation with Prioritized Block Updates KM Session 15: Knowledge Representation & Reasoning II / Yin, Jiangtao / Gao, Lixin Proceedings of the 2014 ACM Conference on Information and Knowledge Management 2014-11-03 p.1209-1218
ACM Digital Library Link
Summary: Belief propagation (BP) is a popular method for performing approximate inference on probabilistic graphical models. However, its message updates are time-consuming, and the schedule for updating messages is crucial to its running time and even convergence. In this paper, we propose a new scheduling scheme that selects a set of messages to update at a time and leverages a novel priority to determine which messages are selected. Additionally, an incremental update approach is introduced to accelerate the computation of the priority. As the size of the model grows, it is desirable to leverage the parallelism of a cluster of machines to reduce the inference time. Therefore, we design a distributed framework, Prom, to facilitate the implementation of BP algorithms. We evaluate the proposed scheduling scheme (supported by Prom) via extensive experiments on a local cluster as well as the Amazon EC2 cloud. The evaluation results show that our scheduling scheme outperforms the state-of-the-art counterpart.

[9] Comic2CEBX: A system for automatic comic content adaptation Data transformation and description / Li, Luyuan / Wang, Yongtao / Gao, Liangcai / Tang, Zhi / Suen, Ching Y. JCDL'14: Proceedings of the 2014 ACM/IEEE-CS Joint Conference on Digital Libraries 2014-09-08 p.299-308
Keywords: Feature extraction; Image edge detection; Image segmentation; Layout; Nonhomogeneous media; Pattern recognition; Visualization; CEBX Document Standard; Comic Image; Conditional Random Fields; Content Reflow and Adaptation; Page Layout Analysis; Panel Detection
dx.doi.org/10.1109/JCDL.2014.6970183
Summary: Comics are popular almost throughout the world. With the help of comic document digitization, it is much easier for people to archive and browse comic works. However, there are still some big challenges along with comic document digitization progress. Among these challenges, comic content adaptation is an important one to be tackled. The existing works only focus on parts of this problem and do not provide a tangible solution to display comic contents on different devices. In this paper, we solve these problems by proposing Comic2CEBX, a system which can automatically convert a set of scanned comic page images into a CEBX file that allows reflowing of the original comic pages with fixed layouts. Taking raw comic images as inputs, our system first extracts three kinds of low-level visual patterns and then uses multilayer Conditional Random Fields to detect all the panels. Meanwhile, our system automatically identifies the reading orders of the panels within each page. Finally, we encapsulate the comic page images and the obtained page structure information (i.e., the panels detection results and the corresponding reading orders) to generate a CEBX file. Experimental results show that our comic page layout analysis method achieves better performance than the existing ones, and use case presentation of the CEBX files produced by our system demonstrates that it brings better comic reading experience especially on mobile devices.

[10] Full-text based context-rich heterogeneous network mining approach for citation recommendation Citation, citation, citation / Liu, Xiaozhong / Yu, Yingying / Guo, Chun / Sun, Yizhou / Gao, Liangcai JCDL'14: Proceedings of the 2014 ACM/IEEE-CS Joint Conference on Digital Libraries 2014-09-08 p.361-370
Keywords: Abstracts; Citation analysis; Context; Data mining; Educational institutions; Focusing; Inference algorithms; Citation Recommendation; Full-text Citation Analysis; Heterogeneous Information Network; Meta-Path
dx.doi.org/10.1109/JCDL.2014.6970191
Summary: Citation relationships between scientific publications have been successfully used for scholarly bibliometrics, information retrieval and data mining tasks, and citation-based recommendation algorithms are well documented. While previous studies investigated citation relations from various viewpoints, most of them share the same assumption that, if paper1 cites paper2 (or author1 cites author2), they are connected, regardless of citation importance, sentiment, reason, topic, or motivation. However, this assumption is oversimplified. In this study, we employ an innovative "context-rich heterogeneous network" approach, which paves a new way for the citation recommendation task. In the network, we characterize 1) the importance of citation relationships between citing and cited papers, and 2) the topical citation motivation. Unlike earlier studies, the citation information, in this paper, is characterized by citation textual contexts extracted from the full-text citing paper. We also propose an algorithm to cope with the situation in which a large portion of full-text information is missing from the bibliographic repository. Evaluation results show that the context-rich heterogeneous network can significantly enhance the citation recommendation performance.

[11] A mathematics retrieval system for formulae in layout presentations Session 7c: signs and symbols / Lin, Xiaoyan / Gao, Liangcai / Hu, Xuan / Tang, Zhi / Xiao, Yingnan / Liu, Xiaozhong Proceedings of the 2014 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2014-07-06 p.697-706
ACM Digital Library Link
Summary: The semantics of mathematical formulae depend on their spatial structure, and they usually exist in layout presentations such as PDF, LaTeX, and Presentation MathML, which challenges previous text index and retrieval methods. This paper proposes an innovative mathematics retrieval system along with novel algorithms, which enables efficient formula index and retrieval from both webpages and PDF documents. Unlike prior studies, which require users to manually input formula markup language as a query, the new system enables users to "copy" formula queries directly from PDF documents. Furthermore, by using a novel indexing and matching model, the system is aimed at searching for similar mathematical formulae based on both textual and spatial similarities. A hierarchical generalization technique is proposed to generate sub-trees from the semi-operator tree of formulae and support substructure match and fuzzy match. Experiments based on massive Wikipedia and CiteSeer repositories show that the new system along with the novel algorithms, compared with two representative mathematics retrieval systems, provides more efficient mathematical formula index and retrieval, while simplifying user query input for PDF documents.

[12] A Study of Kinect-Based Smart TV Control Mode Cross-Cultural Issues in Interaction / Li, He / Qiu, Jing / Gao, Long CCD 2014: 6th International Conference on Cross-Cultural Design 2014-06-22 p.174-183
Keywords: Kinect; Gesture Control; Smart TV Control Mode
Link to Digital Content at Springer
Summary: TV plays an increasingly important role in daily life and will be a protagonist in our homes. However, progress in smart TV control modes has not kept up with the development of smart TV software and hardware. In this paper, a survey was conducted to identify the likely form of future smart TV control modes. Based on the results of the survey, an experiment was carried out to investigate design parameters of the new smart TV control mode. Finally, precautions for designing the smart TV control mode are proposed.

[13] Road traffic prediction by incorporating online information Connecting online & offline life workshop (COOL 2014) / Zhou, Tian / Gao, Lixin / Ni, Daiheng Companion Proceedings of the 2014 International Conference on the World Wide Web 2014-04-07 v.2 p.1235-1240
ACM Digital Library Link
Summary: Road traffic conditions are typically affected by events such as extreme weather or sports games. With the advance of the Web, events and weather conditions can be readily retrieved in real time. In this paper, we propose a traffic condition prediction system incorporating both online and offline information. An RFID-based system has been deployed for monitoring road traffic. By incorporating data from both the road traffic monitoring system and online information, we propose a hierarchical Bayesian network to predict road traffic conditions. Using historical data, we establish a hierarchical Bayesian network to characterize the relationships among events and road traffic conditions. To evaluate the model, we use the traffic data collected in Western Massachusetts as well as online information about events and weather. Our proposed prediction achieves an accuracy of 93% overall.

[14] WikiMirs: a mathematical information retrieval system for wikipedia Web 2.0 / Hu, Xuan / Gao, Liangcai / Lin, Xiaoyan / Tang, Zhi / Lin, Xiaofan / Baker, Josef B. JCDL'13: Proceedings of the 2013 ACM/IEEE-CS Joint Conference on Digital Libraries 2013-07-22 p.11-20
ACM Digital Library Link
Summary: Mathematical formulae in structural formats such as MathML and LaTeX are becoming increasingly available. Moreover, repositories and websites, including ArXiv and Wikipedia, and growing numbers of digital libraries use these structural formats to present mathematical formulae. This presents an important new and challenging area of research, namely Mathematical Information Retrieval (MIR). In this paper, we propose WikiMirs, a tool to facilitate mathematical formula retrieval in Wikipedia. WikiMirs is aimed at searching for similar mathematical formulae based upon both textual and spatial similarities, using a new indexing and matching model developed for layout structures. A hierarchical generalization technique is proposed to generate sub-trees from presentation trees of mathematical formulae, and similarity is calculated based upon matching at different levels of these trees. Experimental results show that WikiMirs can efficiently support sub-structure matching and similarity matching of mathematical formulae. Moreover, WikiMirs obtains both higher accuracy and better ranked results over Wikipedia in comparison to Wikipedia Search and Egomath. We conclude that WikiMirs provides a new, alternative, and hopefully better service for users to search mathematical expressions within Wikipedia.

[15] Webpage Designs for Diverse Cultures: An Exploratory Study of User Preferences in China Cross-Cultural, Intercultural and Social Issues / Su, Yin / Liu, David / Yuan, Xiaomeng / Ting, Justin / Jiang, Jingguo / Wang, Li / Gao, Lin Proceedings of IFIP INTERACT'13: Human-Computer Interaction-1 2013 v.1 p.339-346
Keywords: webpage design; cross-culture; diversity; Chinese users
Link to Digital Content at Springer
Summary: A wealth of studies has revealed a cross-cultural difference in the user preference on webpage designs. Users from other cultures often criticize a widely accepted webpage design in one culture. Designs for diverse cultures are thus expected to be specific to address diverse user preferences. This study investigated the preferences of Chinese users on four essential design elements related to the readability of texts of the result pages of search engines. The results suggested that the search result pages of the Bing search engine designed for typical 'US users' did not satisfy Chinese users. Chinese users, in general, preferred huge-sized texts for titles, a more compact layout of the search result pages, and keywords to be highlighted in red. The findings of the study contributed to webpage design guidelines for Chinese users, and may serve as a catalyst in exploring user preferences in designing for diverse cultures.

[16] Web-based citation parsing, correction and augmentation Citations / Gao, Liangcai / Qi, Xixi / Tang, Zhi / Lin, Xiaofan / Liu, Ying JCDL'12: Proceedings of the 2012 Joint International Conference on Digital Libraries 2012-06-10 p.295-304
ACM Digital Library Link
Summary: Considering the tremendous value of citation metadata, many methods have been proposed to automate Citation Metadata Extraction (CME). The existing methods primarily rely on the content analysis of citation text. However, the results from such content-based methods are often unreliable. Moreover, the extracted citation metadata is only a small part of the relevant metadata that spreads across the Internet. As opposed to the content-based CME methods, this paper proposes a Web-based CME approach and a citation enriching system, called BibAll, which is capable of correcting the parsing results of content-based CME methods and augmenting citation metadata by leveraging relevant bibliographic data from digital repositories and cited-by publications on the Web. BibAll consists of four main components: citation parsing, Web-based bibliographic data retrieval, irrelevant bibliographic data filtering, and relevant bibliographic data integration. The system has been tested on the publicly available FLUX-CIM dataset. Experimental results show that BibAll significantly improves the citation parsing accuracy and augments the metadata of the original citation.

[17] Adaboost with SVM-Based Classifier for the Classification of Brain Motor Imagery Tasks Eye Tracking, Gestures and Brain Interfaces / Wang, Jue / Gao, Lin / Zhang, Haoshi / Xu, Jin UAHCI 2011: 6th International Conference on Universal Access in Human-Computer Interaction, Part II: Users Diversity 2011-07-09 v.2 p.629-634
Keywords: Adaboost; SVM; Classification; Kolmogorov entropy; ERS/ERD; Motor imagery
Link to Digital Content at Springer
Summary: Adaboost with an SVM-based component classifier is generally considered to break the Boosting principle because of the difficulty of training SVMs and the imbalance between diversity and accuracy among the basic SVM classifiers. The Adaboost classifier in the paper trains an SVM as the base classifier with a changing kernel function parameter σ, which is progressively reduced as the weights of the training samples change. To verify the validity of the classifier, it is tested on human subjects to classify left- and right-hand motor imagery tasks. The average classification accuracy reaches 90.2% on test data, which greatly outperforms SVM classifiers without Adaboost and the commonly used Fisher Linear Discriminant classifier. The results confirm that the proposed combination of Adaboost with an SVM classifier may improve accuracy for classification of motor imagery tasks, and may have applications in improving the performance of brain-computer interface (BCI) systems.

[18] Structure extraction from PDF-based book documents Automated methods to help our understanding of texts / Gao, Liangcai / Tang, Zhi / Lin, Xiaofan / Liu, Ying / Qiu, Ruiheng / Wang, Yongtao JCDL'11: Proceedings of the 2011 Joint International Conference on Digital Libraries 2011-06-13 p.11-20
ACM Digital Library Link
Summary: Nowadays PDF documents have become a dominant knowledge repository for both academia and industry, largely because they are very convenient to print and exchange. However, methods for automated structure information extraction are yet to be fully explored, and the lack of effective methods hinders the information reuse of PDF documents. To enhance the usability of PDF-formatted electronic books, we propose a novel computational framework to analyze the underlying physical structure and logical structure. The analysis is conducted at both page level and document level, including global typographies, reading order, logical elements, chapter/section hierarchy and metadata. Moreover, two characteristics of PDF-based books, i.e., style consistency in the whole book document and natural rendering order of PDF files, are fully exploited in this paper to improve the conventional image-based structure extraction methods. This paper employs the bipartite graph as a common structure for modeling various tasks, including reading order recovery, figure and caption association, and metadata extraction. Based on the graph representation, the optimal matching (OM) method is utilized to find the global optima in those tasks. Extensive benchmarking using real-world data validates the high efficiency and discrimination ability of the proposed method.

[19] CEBBIP: a parser of bibliographic information in chinese electronic books Session 2 / Gao, Liangcai / Tang, Zhi / Lin, Xiaofan JCDL'09: Proceedings of the 2009 Joint International Conference on Digital Libraries 2009-06-15 p.73-76
Keywords: bibliography, chinese electronic book, digital library, machine learning, metadata extraction
ACM Digital Library Link
Summary: Bibliographic information is essential for many digital library applications, such as citation analysis, academic searching and topic discovery, and bibliographic data extraction has attracted a great deal of attention in recent years. In this paper, we address the problem of automatic extraction of bibliographic data in Chinese electronic books and propose a tool called CEBBIP for the task, which includes three main systems: data preprocessing, data parsing and data postprocessing. In the data preprocessing system, the tool adopts a rule-based method to locate citation data in a book and to segment citation data into citation strings of individual referenced works. A learning-based approach, Conditional Random Fields (CRF), is employed to parse citation strings in the data parsing system. Finally, the tool takes advantage of document-intrinsic local format consistency to enhance citation data segmentation and parsing through clustering techniques. CEBBIP has been used in a commercial E-book production system. Experimental results show that CEBBIP's precision rate is very high. More specifically, adopting the document-intrinsic local format consistency clearly improves the citation data segmentation and parsing accuracy.

[20] XEB: a markup language document container format suitable for handheld devices Demos / Tang, Zhi / Gao, Liangcai / Jia, Aixia / Lin, Xiaofan JCDL'09: Proceedings of the 2009 Joint International Conference on Digital Libraries 2009-06-15 p.481-482
Keywords: document parsing, handheld device, markup language document
ACM Digital Library Link
Summary: We propose a new document container format (XEB, eXtensible Electronic Book) based on block mechanism to efficiently process markup language documents in handheld devices. And random document access is also supported in the format through a pagination mechanism. The format has already been applied to a number of handheld devices' Chinese E-book readers and XEB documents can be downloaded from a Chinese E-book store.

[21] A Motion Compensated De-interlacing Algorithm for Motive Object Capture Part I: Shape and Movement Modeling and Anthropometry / Gao, Lei / Li, Chao / Zhu, Chengjun / Xiong, Zhang DHM 2007: 1st International Conference on Digital Human Modeling 2007-07-22 p.74-81
Keywords: de-interlacing; motion compensation; motion estimation; motion detect; motion object
Link to Digital Content at Springer
Summary: A motion compensated de-interlacing algorithm is proposed to repair the defects of interlaced video frames when capturing motion objects. In this algorithm, two anti-noise background fields are formed by analyzing the temporal correlation of pixels between adjacent same-parity fields. For each field, subtraction of the corresponding background is used to detect motion objects. To avoid inaccurate detection caused by the difference between the spatial scanning positions of the odd and even fields, the motion objects are detected using a field and a background field of the same parity. Then motion estimation is used to measure the inter-field motion and find the motion vector between the odd field and the even field. Based on the motion vector, an interpolation filter is designed to shift the pixels of the motion object in the two temporally displaced fields to a common point in time. This de-interlacing algorithm maximizes the vertical resolution of the motion objects. Experimental results show that the proposed algorithm can achieve higher image quality on motion objects, and the computational complexity is acceptable for consumer computer applications.

[22] Application specific data replication for edge services Consistency and replication / Gao, Lei / Dahlin, Mike / Nayate, Amol / Zheng, Jiandan / Iyengar, Arun Proceedings of the 2003 International Conference on the World Wide Web 2003-05-20 p.449-460
Keywords: availability, data replication, distributed objects, edge services, performance, wide area networks (WAN)
ACM Digital Library Link
Summary: The emerging edge services architecture promises to improve the availability and performance of web services by replicating servers at geographically distributed sites. A key challenge in such systems is data replication and consistency so that edge server code can manipulate shared data without incurring the availability and performance penalties that would be incurred by accessing a traditional centralized database. This paper explores using a distributed object architecture to build an edge service system for an e-commerce application, an online bookstore represented by the TPC-W benchmark. We take advantage of application specific semantics to design distributed objects to manage a specific subset of shared information using simple and effective consistency models. Our experimental results show that by slightly relaxing consistency within individual distributed objects, we can build an edge service system that is highly available and efficient. For example, in one experiment we find that our object-based edge server system provides a factor of five improvement in response time over a traditional centralized cluster architecture and a factor of nine improvement over an edge service system that distributes code but retains a centralized database.

[23] Resource management for scalable disconnected access to Web services / Chandra, Bharat / Dahlin, Mike / Gao, Lei / Khoja, Amjad-Ali / Nayate, Amol / Razzaq, Asim / Sewani, Anil Proceedings of the 2001 International Conference on the World Wide Web 2001-05-01 p.245-256
ACM Digital Library Link