| Language technology infrastructures in support to multilingualism | | BIBA | Full-Text | 3-11 | |
| Joseph Mariani | |||
| The challenges of multilingualism are many, and the needs are significant, both in Europe and internationally. Language technologies can help meet those needs, but they require appropriate infrastructures and the resources needed to conduct research on the different languages. Some programs support this area, but they suffer from a lack of scale, continuity, and cohesion. The effort deserves to be coordinated between nations and international organizations in order to facilitate multilingualism in Europe and globally. | |||
| Communications and open systems | | BIBA | Full-Text | 12-13 | |
| Mario Tokoro | |||
| The ultimate purpose of communications is understanding each other. Natural
languages play the central role in communications, but other means such as
gestures, facial expressions, and gaze are equally important in a given situation.
Physical and social common sense is indispensable, and the historical
backgrounds of the nations, regions, families, and individuals of speakers and
listeners are never negligible. All of these means, modes, and aspects are
mutually dependent and change as time progresses.
The method of modern science established in the 17th century contributed enormously to scientific advances and technological progress. In the method, we first define the domain of a problem, then reduce the problem in a way that exposes its true nature, and finally discover the underlying principles of the problem domain. When the domain of a problem is too unwieldy and too large for easily reducing the problem, it is broken up into smaller elements that are subjected to the same process. Hence it is called reductionism. Nonetheless, there are still plenty of stubborn issues that are not easily resolved. These unsolved issues are complicated ones that could not be addressed simply by reductionism alone. Earth sustainability is an example of such an issue. It involves energy, climate, population, food, biodiversity, safety assurance, etc., which are mutually dependent, and cannot be solved independently from the others. Another example is life and health. Many properties of the human body have been discovered through molecular biology, but real life also seems to be stochastic, contingent, and historical. Yet another example is the safety of gigantic infrastructures connected through networks. These infrastructures grow and change while they continue to function even in the event of various incidents without having any significant effect on the everyday lives of people. All these issues are related to the problems of integrated systems consisting of numerous interrelated subsystems. The solutions of individual problems cannot solve the overall problem and may even cause another problem or worsen the overall problem. Communications issues are such problems and may not be solved independently from the others. To solve such problems of integrated complex systems, a new approach called open systems science is proposed. The comparison of closed systems and open systems is presented first, and then the definition of open systems science is given. 
Some applications of this method to actual important problems are exemplified, and the issues on communications are discussed in depth. | |||
| A computer scientist looks at the energy problem | | BIB | Full-Text | 14-21 | |
| Randy H. Katz | |||
| Discarding monotone composed rule for hierarchical phrase-based statistical machine translation | | BIBAK | Full-Text | 25-29 | |
| Zhongjun He; Yao Meng; Hao Yu | |||
| Hierarchical phrase-based statistical machine translation systems often
suffer from a huge rule table. This paper proposes a basic and efficient method
for rule table reduction: discarding monotone composed rules. These rules are
redundant because they can be monotonically recreated from minimal rules.
Experiments show that the rule table is reduced by 57% to 71% without worsening
translation quality. Keywords: monotone composed rule, rule table reduction, statistical machine
translation | |||
| Accuracy evaluation of sentences translated to intermediate language in back translation | | BIBAK | Full-Text | 30-35 | |
| Mai Miyabe; Takashi Yoshino | |||
| The back-translation method is used to check the accuracy of a sentence
translated to a native language. We believe that there exists a positive
correlation between the accuracy of sentences translated to an intermediate
language and that of back-translated sentences. However, this has not yet been
verified, and some back-translated sentences have high accuracy even if
the translated sentence is inaccurate. Therefore, we have to verify the
correlation between the accuracy of sentences translated to an intermediate
language and that of back-translated sentences. We have evaluated the accuracy
of back-translated sentences and that of sentences translated to an
intermediate language to establish the correlation between the two accuracies.
We have obtained the following results: (1) There exists a positive correlation
between the accuracy of sentences translated to an intermediate language and
that of back-translated sentences. (2) The occurrence rate of an accuracy
mismatch case, wherein a back-translated sentence is accurate but the
translated sentence is inaccurate, is less than or equal to 0.5%. (3)
Back-translation can be used to check the accuracy of a translated sentence. Keywords: back translation, machine translation, translation accuracy | |||
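The two quantities reported in the abstract above, a correlation between the two accuracies and a mismatch rate, can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient or rating scale was used, so the Pearson coefficient, the function names `pearson` and `mismatch_rate`, and the boolean "accurate" flags below are all assumptions made for illustration, not the authors' procedure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of accuracy scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


def mismatch_rate(intermediate_ok, back_ok):
    """Fraction of sentences whose back-translation is judged accurate
    although the intermediate-language translation is judged inaccurate."""
    pairs = list(zip(intermediate_ok, back_ok))
    if not pairs:
        raise ValueError("no sentences given")
    mismatches = sum(1 for inter, back in pairs if back and not inter)
    return mismatches / len(pairs)
```

With per-sentence scores for the intermediate translation and the back-translation, `pearson` gives the strength of the relationship, and `mismatch_rate` gives the share of cases where back-translation would wrongly suggest that a translation is accurate.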
| Language independent word segmentation for statistical machine translation | | BIBA | Full-Text | 36-40 | |
| Michael Paul; Andrew Finch; Eiichiro Sumita | |||
| This paper proposes an unsupervised word segmentation algorithm that identifies word boundaries in continuous text in order to optimize the translation quality of statistical machine translation (SMT) approaches. The proposed method is language-independent and uses a parallel corpus to align source language characters to the corresponding word units separated by whitespace in the target language. Successive characters aligned to the same target words are merged to a larger source language unit and a Maximum Entropy (ME) algorithm is applied to learn the word segmentation that optimizes the translation quality of an SMT system trained on the re-segmented bitext. Experimental results translating five Asian languages into English revealed that the proposed method outperforms a baseline system that translates unigram segmented source language sentences. | |||
| Automatic extraction of bilingual terms from a Chinese-Japanese parallel corpus | | BIBAK | Full-Text | 41-45 | |
| Xiaorong Fan; Nobuyuki Shimizu; Hiroshi Nakagawa | |||
| This paper proposes a new approach for the automatic extraction of bilingual
terms from a domain-specific bilingual parallel corpus. We combine an existing
monolingual term extractor and a word alignment tool to extract bilingual
terms. Our method differs from past studies in that we simply use a word
alignment tool to extract multi-word terms, and we use one monolingual term
extractor for both languages to reduce extraction imbalance. We obtained
good precision and an improved BLEU score in our experiment based on a
Chinese-Japanese parallel corpus. Keywords: automatic extraction, bilingual corpus, bilingual term, multi-words term,
segmentation, word alignment | |||
| Utilizing semantic equivalence classes of Japanese functional expressions in machine translation | | BIBAK | Full-Text | 46-53 | |
| Akiko Sakamoto; Takehito Utsuro; Suguru Matsuyoshi | |||
| This paper applies the "Sandglass" machine translation architecture to the task
of translating Japanese functional expressions into English. We employ the
semantic equivalence classes of a recently compiled large-scale hierarchical
lexicon of Japanese functional expressions. We examine whether each class is
monosemous. We realize this procedure by empirically studying whether
functional expressions within a class can be translated into a single canonical
English expression. Furthermore, in order to precisely identify the class of
functional expressions to which our translation rule is directly applicable, we
introduce two types of ambiguities of functional expressions and
identify monosemous functional expressions. We finally show that the proposed
framework outperforms commercial machine translation software products. Keywords: Japanese functional expressions, machine translation, polysemy, sense
disambiguation | |||
| 3-D display and communication technology | | BIBAK | Full-Text | 57-63 | |
| Min-Chul Park; Jung-Young Son | |||
| In this paper, we describe 3-D display from the perspective of communication
technology. A 3-D display provides viewers with 3-D images that carry more accurate
and realistic information than a 2-D display does. This feature is an
essential component of communication technology. Generally, communication
technology aims at the exchange and sharing of thoughts, feelings, and ideas.
3-D displays are effective contact media for achieving these goals. The concept of
the accessible spatial dimension of a person is used to describe 3-D display from the
perspective of communication technology. It is classified into three dimensions, and
each dimension represents a dimension of contact media. Several research
results related to 3-D display and communication technology are introduced
based on this concept. Keywords: 3-D, communication, contact, dimension, display, media | |||
| Analysis and compensation of spatial distortion in integral three-dimensional imaging | | BIBA | Full-Text | 64-69 | |
| Hisayuki Sasaki; Masahiro Kawakita; Jun Arai; Makoto Okui; Fumio Okano; Yasuyuki Haino; Makoto Yoshimura; Masahito Sato | |||
| We have been conducting research on three-dimensional (3D) television using
the integral imaging method. To enhance integral 3D image quality, Extremely
High-Resolution (EHR) imaging technology would be essential. Now, projection
display systems are practical for EHR images and have some advantages for 3D
imaging.
We theoretically and experimentally analyzed the effects of distorted elemental images on a reconstructed image. We also study an image processing method for compensating distorted elemental images in projection-type 3D imaging systems. The experimental results show its effectiveness in eliminating distortion of reconstructed 3D images and in mitigating the limitation of the viewing zone. | |||
| Electronic holography generated from integral photography | | BIBAK | Full-Text | 70-73 | |
| Ryutaro Oi; Kenji Yamamoto; Tomoyuki Mishina; Takanori Senoh; Taiichiro Kurita | |||
| In this paper, we describe electronic holography for non-coherent
lighting environments. We use integral photography (IP) to obtain 3D
information of the scene. This method demands neither laser beams nor a
darkroom during recording. Therefore, living or moving objects can be captured
onto a hologram. The converter hardware calculates fringe patterns from
the IP at 30 frames per second by using our previously proposed conversion
algorithm. In our experiment, color holograms of 3840x2160 pixels are generated
in real time. Keywords: FFT, electronic holography, holography, integral photography | |||
| An improved optical device for floating displays | | BIBAK | Full-Text | 74-77 | |
| Sandor Markon; Satoshi Maekawa | |||
| We propose an improved design of an optical device for projecting floating
images. The improved device is a modification of the original design of
dihedral corner reflector arrays reported earlier [2], improving its
manufacturability while largely maintaining its image forming capability. We
describe the construction of the device, and show its properties by
mathematical analysis and optical simulation. Keywords: 3D display, dihedral corner reflector array, floating images | |||
| Wearable robotics as a behavioral assist interface like oneness between horse and rider | | BIBAK | Full-Text | 81-88 | |
| Taro Maeda; Hideyuki Ando; Hiroyuki Iizuka | |||
| The Parasitic Humanoid (PH) is a wearable robot for modeling nonverbal human
behavior. This anthropomorphic robot senses the behavior of the wearer and has
internal models that learn the process of human sensory motor integration;
thereafter, it begins to predict the next behavior of the wearer using the
learned models. When the reliability of the prediction is sufficient, the PH
outputs the errors from the actual behavior as a request for motion to the
wearer. Through symbiotic interaction, the internal model and the process of
human sensory motor integration approximate each other asymptotically. Keywords: ability extension, embodiment, human hack, motion induction | |||
| Drowsy driving detection based on human pulse wave by photoplethysmography signal processing | | BIBAK | Full-Text | 89-92 | |
| Hanbit Park; Seungwon Oh; Minsoo Hahn | |||
| Driver drowsiness is a major cause of traffic accidents, and much research has
therefore addressed preventing and detecting drowsy driving.
Recent studies have focused on motion detection using cameras to determine
drowsy driving. We, in contrast, have focused on non-invasive and inexpensive
drowsiness detection systems. In our previous research, we suggested a system
based on the driver's head movement using infrared sensors. In this paper, we
suggest another non-invasive and inexpensive system based on the driver's pulse
wave, using photoplethysmography (PPG) signal processing. The system first
collects a pulse wave from a PPG sensor on the steering wheel and then
processes the signal to analyze the driver's state. To evaluate the
effectiveness of the human pulse wave for drowsiness detection, we integrated the two
systems. The experimental result using the new integrated system showed an 83 percent
drowsy driving detection rate under real driving conditions. Keywords: driving, drowsiness detection, human pulse wave, photoplethysmography (PPG),
sensing, signal processing | |||
| Use of active RFID and environment-embedded sensors for indoor object location estimation | | BIBAK | Full-Text | 93-99 | |
| Ming Li; Taketoshi Mori; Hiroshi Noguchi; Masamichi Shimosaka; Tomomasa Sato | |||
| This paper describes a method for localizing objects in an actual living
environment. We have developed this method by using a complementary combination
of 1) received signal strength indicators (RSSIs) and vibration data acquired
from active RFID tags, and 2) human behavior detected from various types of
sensors embedded in the environment. Regarding the former, we use a pattern
recognition method to select features that appear in the RSSIs received by several
radio frequency (RF) readers at different places and to classify them into a
particular location. In our work, we regard the estimated location as the most
probable location where the object is placed. As for the latter, we use the
detected human behavior to support the estimation based on the analysis of
RSSIs. Experimental results showed that the proposed method improved the
estimation performance from about 50% to 95% compared with using only RSSIs to
localize objects. Moreover, the results also suggested that we can estimate
object location indoors without sensors for detecting human position. This
indoor object localization method can contribute to constructing an indoor
object management system that improves living comfort. Keywords: RSSI, active RFID, environment-embedded sensor, indoor localization | |||
| 3D hand posture estimation with single camera by two-stage searches from database | | BIBA | Full-Text | 100-106 | |
| Motomasa Tomida; Kiyoshi Hoshino | |||
| Previous systems for human hand posture estimation have adopted a clustered, multi-layer, large-scale database and narrowed the search space using past estimation results. However, once an estimated result falls outside the search space, such a system cannot find the true or optimal value. Our system therefore adopts a non-clustered large-scale database; rather than narrowing the search space from past results, it performs a coarse search at the first stage according to some aspects of the input hand images, and an accurate search at the second stage with low-order image features. The experimental results showed that the averaged estimation error is -2.11 degrees and that the candidates for the accurate search at the second stage are reduced from 28,386 to 137.7 data sets, indicating that our system achieves stable hand posture estimation with accuracy and processing speed as high as those of the previous system, without using past results. | |||
| Proposal for a multilanguage text input support system that is easy for beginner language learners | | BIBA | Full-Text | 109-114 | |
| Kayo Ikeda; Hideho Numata; Masakatsu Kaneko; Kazuhiko Machida | |||
| In this paper, we propose an input support system which supports multiple languages and which will make it possible for users -- even users who are in the midst of learning a foreign language that they wish to use and are not familiar with it yet -- to easily access the information resources in their desired language. As a text input method that is not restricted by the OS or target language, we propose a system which performs input operations using a web browser. In text string input, by using characters within the ASCII domain, all of the text strings can be assigned to keys on the keyboard. For each language (script), a conversion dictionary is available which shows how the key input string and output string correspond. By devising a conversion dictionary, this system can support all languages (scripts). We perform text conversion in incremental search as a method to speed up input for users who are beginner language learners. Detailed Information Display is a function which displays information related to the vocabulary items that are among the conversion candidates. Using the proposed method, we succeeded in creating an environment in which Japanese students of foreign languages and foreigners living in Japan can input text regardless of their computer's environment. | |||
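The conversion-dictionary idea in the abstract above (ASCII key strings mapped to output strings, with candidates narrowed incrementally as the user types) can be sketched as follows. This is only an illustrative sketch: the function name `incremental_candidates`, the `limit` parameter, and the toy romanized-Russian entries are hypothetical and are not taken from the described system.

```python
def incremental_candidates(typed, conversion_dict, limit=5):
    """Return up to `limit` (key, output) pairs whose ASCII key starts
    with what the user has typed so far, in alphabetical key order."""
    if not typed:
        return []
    matches = sorted(
        (key, output)
        for key, output in conversion_dict.items()
        if key.startswith(typed)
    )
    return matches[:limit]


# Hypothetical toy dictionary for one script: ASCII key -> output string.
TOY_DICT = {
    "privet": "привет",
    "priroda": "природа",
    "spasibo": "спасибо",
}

# Each additional keystroke narrows the candidate list.
for prefix in ("p", "pri", "priv"):
    print(prefix, incremental_candidates(prefix, TOY_DICT))
```

Because the lookup depends only on ASCII keys and a per-script dictionary, supporting another script in this sketch amounts to supplying another dictionary, which mirrors the abstract's point about devising conversion dictionaries.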
| QRpotato: a system that exhaustively collects bilingual technical term pairs from the web | | BIBAK | Full-Text | 115-119 | |
| Takeshi Abekawa; Kyo Kageura | |||
| This paper reports on the system QRpotato, which exhaustively collects
bilingual technical term pairs from the Web. The system uses bilingual
(Japanese-English) term pairs taken from an existing terminological dictionary as
seed pairs, searches Web pages using the seed pairs, and extracts bilingual term
pair candidates from the retrieved Web pages, using relational patterns
identified between seed term pairs. We have successfully collected about 2.2
million different term pair candidates by using about 210,000 seed term pairs.
A manual evaluation of part of the candidates shows the effectiveness of
the method. Keywords: automatic term extraction, bilingual term pairs, bilingual terminology, web | |||
| Topic relatedness in evaluative information extraction | | BIBA | Full-Text | 120-125 | |
| Takuya Kawada; Tetsuji Nakagawa; Kentaro Inui; Sadao Kurohashi | |||
| The task of extracting opinions/evaluations related to a given topic from a large number of documents such as Web documents is crucial for developing an automatic evaluation finding system, which can handle a wide variety of topics as input. In this paper, we discuss the topic relatedness of extracted evaluation through analysis of a corpus we developed. We suggest here that the semantic relationship between the target of each extracted evaluation and a given topic helps in judging topic relatedness. In addition, we point out other factors that are beyond the analysis of topic-target relations for judging the topic relatedness of evaluation. | |||
| Development of a large-scale web crawler and search engine infrastructure | | BIBAK | Full-Text | 126-131 | |
| Susumu Akamine; Yoshikiyo Kato; Daisuke Kawahara; Keiji Shinzato; Kentaro Inui; Sadao Kurohashi; Yutaka Kidawara | |||
| This paper reports the ongoing development of a large-scale Web crawler and
search engine infrastructure at the National Institute of Information and
Communications Technology. This infrastructure has the following
characteristics: (1) It collects one billion Japanese Web pages while keeping
them up-to-date. (2) It selects 100 million pages from among the collected
pages and converts them into a standard data format to store the results of
morphological analysis, dependency parsing, and synonym augmentation. (3) The
selected set of pages is searchable and accessible to the users. (4) The
scalability of the system is achieved by using a large-scale cluster machine
for distributed data processing. Keywords: crawler, search engine, web information analysis | |||
| A web service for automatic word class acquisition | | BIBAK | Full-Text | 132-138 | |
| Stijn De Saeger; Jun'ichi Kazama; Kentaro Torisawa; Masaki Murata; Ichiro Yamada; Kow Kuroda | |||
| In this paper we present a Web service for building NLP resources to
construct semantic word classes in Japanese. The system takes a few seed words
belonging to the target class as input and uses automatic class expansion to
suggest semantically similar training samples for the user to label. The system
automatically generates random negative training samples as well, and then
trains a supervised classifier on this labeled data to generate the target word
class from 10^7 candidate words extracted from a corpus of 10^8 Web
documents. This system eliminates the need for expert machine learning
knowledge in creating semantic word classes, and we experimentally show that it
significantly reduces the human effort required to build them. Keywords: lexical acquisition, web service, word class construction | |||
| One-dimensional integral imaging 3D display systems | | BIBAK | Full-Text | 141-145 | |
| Yuzo Hirayama | |||
| We have developed several kinds of autostereoscopic display systems using
the one-dimensional integral imaging method. The integral imaging system reproduces
light beams similar to those produced by a real object. Therefore, our displays
have continuous motion parallax. The displays have been designed, fabricated, and
optically evaluated. By using our proprietary software,
fast playback of CG movie content and real-time interaction are also
realized with the aid of a graphics card. Producing 3D images that are safe
for human viewers is very important. We have measured the effects on
visual function and evaluated the biological effects. We have found that our
displays show better results than a conventional stereoscopic display.
Our display architecture is suitable for flatbed configurations because it has
a large margin for viewing distance and angle. Mixed reality of virtual 3D
objects and real objects is also realized on a flatbed display. The new
technology opens up new areas of application for 3D displays, including
communications, arcade games, e-learning, simulations of buildings and
landscapes, and even 3D menus in restaurants. Keywords: display, flatbed, integral imaging, three-dimension, visual function | |||
| Surrounding image projection with convex mirrors | | BIBAK | Full-Text | 146-149 | |
| Naoki Hashimoto; Yuki Ishiwata; Makoto Sato | |||
| Immersive projection technologies that surround users with large, high-quality
images are fundamental elements of our near-future information society.
However, such large projection systems frequently require large
installations and high-cost components such as special projectors and screens.
This limits the number of users who can benefit from those technologies.
Therefore, in this paper, we propose an effective immersive projection system
that uses simple projectors and convex mirrors on everyday surfaces such as a
wall in a room. We also introduce a simple calibration method that makes the
system easy for many people to use. Keywords: IPT, convex mirror, multi-projection, virtual reality | |||
| Video-based telemedicine with reliable color: field experiments of natural vision technology | | BIBAK | Full-Text | 150-153 | |
| Masahiro Yamaguchi; Junko Kishimoto; Yasuhiro Komiya; Yoshifumi Kanno; Yuri Murakami; Hiroyuki Hashizume; Ryouji Yamada; Kosuke Miyajima; Hideaki Haneishi | |||
| High-fidelity color imaging technology that incorporates a spectrum-based
color reproduction system, called "natural vision" (NV), is applied to field
experiments in telemedicine. The experiments comprise two main parts: 1)
high-fidelity color video of open surgery was captured by a six-band
multispectral camera, and the image quality was visually evaluated by medical
doctors; 2) a video-based teleconsultation experiment between a regional general
hospital and a clinic on an island near the hospital was conducted using
the natural vision system. Keywords: color, image reproduction, multispectral imaging, natural vision,
telemedicine, video transmission | |||
| Modeling the spatial behavior of virtual agents in groups for non-verbal communication in virtual worlds | | BIBAK | Full-Text | 154-159 | |
| Hamid Laga; Toshitaka Amaoka | |||
| In this paper we propose a mathematical model for the concept of Personal
Space (PS) and apply it to simulate the non-verbal communication between agents
in virtual worlds. Persons within a group tend to maintain the distances
between each other within a certain range that maximizes their degree of
comfort. These distances reflect the type of their relationship, and changes in
these distances reflect the evolution over time of their relationship.
Human-like autonomous virtual agents should also be equipped with such a
capability to simulate natural interactions in virtual worlds. First, we model
the space around an agent as a probability distribution function that reflects,
at each point in the space, the importance of that point to the agent. The agent
dynamically updates this function according to (1) his relation and distance to
other agents in the virtual space, (2) his face orientation, and (3) the
evolution of the relationship over time as a stranger agent may become a
friend. We demonstrate the concept on a multi-agent platform and show that
space-aware agents exhibit better natural behavior. Keywords: personal space, proxemics | |||
| Implicit interaction with daily objects: applications and issues | | BIBAK | Full-Text | 163-168 | |
| Kaori Fujinami | |||
| This paper describes augmentation of daily objects as a means to interact
with a ubiquitous/pervasive computing environment. A daily object employs a
context-aware capability, where a user's specific context is captured
implicitly and naturally by sensors from its original usage because such an
everyday object has inherent roles and functionalities. Also, information is
presented naturally and effectively during the utilization. A user does not
need to learn how to get information, which fills the gap between a user and a
complex ubiquitous/pervasive computing environment.
In this paper, some projects on augmenting daily objects are presented, where possible applications and a technique to complement a missing piece of context that is obtained only from an instrumental object are presented. Also, we propose to assure a sensor placement for reliable sensing by a daily object. Keywords: context-awareness, implicit interaction, information presentation, smart
object | |||
| A preliminary exploration of augmented social landscapes | | BIBAK | Full-Text | 169-171 | |
| Shin'ichi Konomi | |||
| The ubiquity of sensing devices, including location-aware, sensor-enabled
mobile phones, creates an opportunity to design a novel digital layer of a
city, which senses and shapes the experiences of urban inhabitants. This paper
explores the possibility of using ubiquitous sensing devices to generate alternative
social landscapes of a city and to facilitate universal communication. Sensors
have critical dual roles in this process: (1) analyzing existing social
relations, and (2) providing resources for establishing new relations. Several
examples are discussed in relation to the latter role of sensors in shaping
social landscapes, suggesting the possibility to create various representations
that could support novel communication and collaboration practices. Keywords: augmented social landscapes, connectability, context awareness, geo-social
networking, urban sensing | |||
| Network management architecture toward universal communication | | BIBAK | Full-Text | 172-175 | |
| Yoshihiro Kawahara; Ahmad Kamil Abdul Hamid; TaeYoung Song; Kei Wada; Tohru Asami | |||
| Ubiquity of networked devices is one of the first steps toward realization
of universal communication services. However, not much attention has been paid
to the management architecture of the mashed-up services provided across the
network domains. The absence of a scalable cross-domain network management
architecture restricts the availability and penetration of such services. In this
paper, we propose the Tambourine framework, which defines a web-service-based
network management API. Tambourine allows applications to access the
management and control information of networked devices across domains. Keywords: network management, new generation network, service composition, smart
environment, webservice | |||
| People, clouds, and interaction for information access | | BIBAK | Full-Text | 179-180 | |
| Tetsuya Sakai | |||
| Microsoft Research Asia (MSRA) currently has nineteen research groups that
cover various areas in computer science. The Web InTelligence (WIT) Group, led
by Chin-Yew Lin, is a recent spin-off from the Natural Language Computing
Group, and tackles problems in sentiment analysis, expert and social search,
social question answering and summarisation, user intent/activity recognition
and prediction, assisting inarticulate users, and information access
evaluation. In this talk, I will try to illustrate current strategies and
future visions of the WIT group by discussing human-human interaction,
computer-computer interaction, human-computer interaction and "evaluation
evolution," each in turn. Keywords: evaluation, information access, natural language processing, question
answering, search, web intelligence | |||
| Using web page layout for extraction of sender names | | BIBAK | Full-Text | 181-186 | |
| Rintaro Miyazaki; Ryo Momose; Hideyuki Shibuki; Tatsunori Mori | |||
| Recently, the credibility of information available on the Web has been
regarded as an important issue. The sender name is one of the important indicators
of the credibility of information. In this paper, we propose a new method
for extracting sender names. The proposed method uses named entity
recognition, with reduction of DOM nodes based on Web page layout as
preprocessing. Experimental results show that our proposed method can
effectively extract sender names when the preprocessing is successful. Keywords: information credibility, natural language processing, sender name, web page
layout | |||
| Summarizing evaluative information on the web for information credibility analysis | | BIBA | Full-Text | 187-192 | |
| Daisuke Kawahara; Tetsuji Nakagawa; Takuya Kawada; Kentaro Inui; Sadao Kurohashi | |||
| The World Wide Web comprises a wide variety of evaluative information. It consists of positive and negative opinions on innumerable topics from various perspectives, thus proving to be a useful information source for information credibility analysis. To present an informative and at-a-glance summary of any topic that a user of such an analysis system searches for, it is important to summarize many diverse evaluative expressions on the topic. In this paper, we describe a method for summarizing an extensive variety of evaluative expressions that are automatically extracted. | |||
| Web information credibility analysis by geographical social support | | BIBAK | Full-Text | 193-196 | |
| Hiroaki Ohshima; Satoshi Oyama; Hiroyuki Kondo; Katsumi Tanaka | |||
| Since our daily lives strongly depend on information obtained by Web search,
the credibility of Web search results has become crucial. An important aspect
of the credibility of search results is the regionality of Web pages. In this
paper, we propose a system for helping users assess the credibility of search
results by measuring and presenting the regionality of support to Web pages. We
conceive two different types of measures for evaluating "geographical social
support": the uniformity of support and the proximity of support. The
uniformity of geographical support (US) indicates how uniformly the Web pages
linking to a Web page are distributed geographically. It is calculated by using the
Kullback-Leibler (KL) divergence. The proximity of geographical support (PS)
expresses the extent to which a page is supported by pages located
geographically close to it. We describe our implemented prototype system that shows the two measures
for Web search results. Keywords: information credibility, local web search, social support | |||
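A minimal sketch of how a KL-divergence-based uniformity score could be computed is given here. It assumes that supporting pages are counted per geographic region and that the resulting distribution is compared with a uniform one. The reference distribution, the function name `kl_from_uniform`, and the region counts are illustrative assumptions, not details taken from the paper.

```python
import math

def kl_from_uniform(counts):
    """KL divergence D(P || U) between the empirical distribution P of
    supporting pages over regions and the uniform distribution U.
    A value of 0 means perfectly uniform support (assumed convention)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no supporting pages")
    n = len(counts)
    kl = 0.0
    for c in counts:
        if c > 0:  # terms with p = 0 contribute 0 by convention
            p = c / total
            kl += p * math.log(p * n)  # p * log(p / (1/n))
    return kl

# Hypothetical link counts per region for two pages.
evenly_supported = kl_from_uniform([5, 5, 5, 5])
locally_supported = kl_from_uniform([10, 0, 0, 0])
```

Under these assumptions, a lower value indicates more uniform geographic support; support concentrated in a single one of `n` regions gives the maximum value `log(n)`.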
| Application of 3D sound technology to intelligent robots | | BIBAK | Full-Text | 199-204 | |
| Youngjin Park | |||
| Various high-fidelity VAD systems have been developed for many practical
application fields, including games, home theatre, virtual reality, and military
simulators. The head-related transfer function is one of the key functions
widely used in VAD systems.
We developed robot auditory systems for sound source localization to achieve effective human-robot interaction. The developed robot auditory system, which includes an artificial ear, a MEMS sensor, and an SoC (system-on-chip) for sound localization, can be used by intelligent robots to process speech/acoustic signals. Keywords: acoustic MEMS sensor, human robot interaction, robot artificial ear, sound
direction estimation, sound source localization, spatially mapped GCC function | |||
| Headphone calibration for 3D-audio listening | | BIBA | Full-Text | 205-210 | |
| Ryouichi Nishimura; Parham Mokhtari; Hironori Takemoto; Hiroaki Kato | |||
| This paper proposes a new headphone calibration function for precise reproduction of 3D audio generated using simulated head-related transfer functions (HRTFs) or binaural recordings. In order to compensate for individual characteristics of the earcanal transfer functions and the eardrum impedance, which generally differ from person to person, the method consists of two steps: measuring sound pressure with blocked earcanals and measuring it with open earcanals. The vibration of the eardrum can thereby be precisely reproduced as if the listener were in the original sound scene. Results of experiments using a head and torso simulator (HATS) revealed that sound pressure is correctly reproduced at the position of the eardrum as well as at the entrance of the earcanal over a wide frequency range. | |||
| Representation and comparison of HRTF in spatio-temporal frequency domain | | BIBAK | Full-Text | 211-214 | |
| Yasuko Morimoto; Takanori Nishino; Kazuya Takeda | |||
| We represent a head-related transfer function (HRTF) in the spatio-temporal
frequency domain. Since an HRTF is defined as an acoustic function of time and
location of sound source, the spatio-temporal frequency characteristics of
HRTFs can be visualized and analyzed by multi-dimensional Fourier transform in
time and space. In our experiments, we investigate a basic property of the
spatio-temporal frequency characteristic and the difference between HRTFs
obtained by numerical analysis and actual measurements. The influence of the
pinnae on the spatio-temporal frequency characteristics is also examined.
It is found that the spatio-temporal spectral components are mostly
concentrated in specific frequency bands, and these components are different in
each measurement condition. Keywords: Fourier transform, head-related transfer function, spatio-temporal frequency
analysis, visualization | |||
| Subjective effect of synthesis conditions in 3D sound field reproduction system using a few transducers and wave field synthesis | | BIBAK | Full-Text | 215-220 | |
| Toshiyuki Kimura; Munenori Naoe; Yoko Yamakata; Michiaki Katsumoto | |||
| In a conventional 3D sound field reproduction system using wave field
synthesis, numerous loudspeakers are placed around the listener. However, since
such a system is very expensive and loudspeakers are in the listener's field of
vision, it is very difficult to construct an audio-visual virtual reality
system. We have proposed a 3D sound field reproduction system using wave field
synthesis and eight transducers, which are placed at the vertices of a cube. In
this study, we evaluated the effect of the synthesis conditions on localization
perception by varying those conditions, namely the directivity of the microphones and
the size of the cubic array. As a result, localization
performance was good when shotgun microphones were used and the
array was a cube measuring 0.4 m on each side. Keywords: microphone directivity, sound field reproduction, wave field synthesis | |||
| Multi-sensor based human activity detection for smart homes | | BIBA | Full-Text | 223-229 | |
| Liyanage C. De Silva | |||
| At the University of Brunei Darussalam, we have designed and built a prototype smart home that monitors human activities in order to improve energy efficiency and support elderly people. In this paper we present some of our early work on smart monitoring, control and communication, along with a review of related research initiatives around the world. In particular, we looked at research carried out in Singapore, Japan and New Zealand. Our main objective was to examine research that enhances energy efficiency and eldercare through the use of a multitude of sensors. With our simple prototype implementation we have also demonstrated the use of smart home technologies to reduce energy consumption in an average house. | |||
| Human shape reconstruction via graph cuts for voxel-based markerless motion capture in intelligent environment | | BIBA | Full-Text | 230-236 | |
| Masamichi Shimosaka; Kazuhiko Murasaki; Taketoshi Mori; Tomomasa Sato | |||
| In this paper, we propose a robust, real-time 3D human shape reconstruction method for daily life spaces, with the aim of making voxel-based motion capture systems practical. Our algorithm extracts the human silhouette and reconstructs the human shape via volume intersection from multi-viewpoint images. The method is based on energy minimization via graph cuts, and its main features are: 1) it reduces the background subtraction errors caused by background clutter, 2) it is robust to the influence of shadows, and 3) it segments the foreground region even when moving objects other than humans are present. The precise human shape reconstructed by the method improves the accuracy of human pose estimation. In particular, feature 3) extends the range of applications of voxel-based human pose estimation. We demonstrate the effectiveness of our approach in terms of both quantitative and qualitative performance in an intelligent environment where strong shadows appear and moving objects are present. | |||
| Stereo camera model of feed horns in focal plane array | | BIBAK | Full-Text | 237-240 | |
| Jung-Young Son; Seokwon Yeom; V. P. Guschin; Yuriy Vashpanov; Dong-Su Lee; Shin-Hwan Kim | |||
| The equivalent camera model of two feed horns in a millimeter-wave imaging
system is a radial type stereo camera with diverging axes. The stereo image
characteristics of the camera are analyzed. This camera model allows minimizing
the distance between cameras. Keywords: feed horn, imaging system, radial type stereo camera with diverging axes,
radiometry | |||
| A context-adaptive haptic interaction and its application | | BIBAK | Full-Text | 241-244 | |
| Youngjae Kim; Youmin Kim; Minsoo Hahn | |||
| Haptics is a promising interface for next-generation ubiquitous computing
environments. Most haptics-related studies are limited to first-person
human-computer interaction [5] and do not address human-to-human
communication. The proposed system focuses on personal communication
such as chatting or text messaging. Our system is designed to
handle multiple sensors and multiple actuators within a single
framework. Our contribution can be summarized in three parts: (1) the design of a
framework for multi-sensor and multi-actuator interaction; (2) an XML-based data
structure for haptic description; and (3) context-adaptive actuation control
using a feedback mechanism. Keywords: communication, haptic, interaction, location, spatial | |||
| Normalization on the modulation spectrum of the subband temporal envelopes for automatic speech recognition in reverberant environments | | BIBAK | Full-Text | 247-254 | |
| X. Lu; M. Unoki; S. Nakamura | |||
| In this study, we proposed a feature extraction method based on subband
temporal envelopes (STEs) and their normalization for reverberant speech
recognition. The STEs were extracted by using a series of constant-bandwidth
band-pass filters with a Hilbert transform followed by low-pass filtering. In
the normalization, the modulation spectra (MS) of the subband temporal
envelopes of both clean and reverberated speech are normalized to a reference MS
calculated from a clean speech data set. The inverse Fourier transform of the
normalized subband MS was then used to restore the subband temporal
envelopes. We tested the proposed method on speech recognition in a reverberant
room with different speaker-to-microphone distances (SMDs). For comparison, the
recognition performance obtained with traditional Mel-cepstral coefficients with
mean and variance normalization was used as the baseline. Experimental results
showed that, averaged over SMDs from 50 cm to 400 cm, there was a 44.96%
relative improvement from subband temporal envelope processing alone, and
a further 15.68% relative improvement from normalization of the subband
modulation spectrum. In total, there was about a 53.59% relative improvement,
which was better than that obtained with other temporal filtering and normalization
methods. Keywords: automatic speech recognition, dereverberation, subband temporal envelope,
temporal modulation | |||
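The three percentages in the abstract above are consistent if each one is read as a relative reduction measured against the preceding system, so that successive improvements multiply rather than add. The short sketch below checks that arithmetic; this interpretation and the function name `compound_relative_improvement` are assumptions for illustration, not statements from the paper.

```python
def compound_relative_improvement(*steps):
    """Combine successive relative improvements, each expressed as a
    fraction of what remained after the previous step."""
    remaining = 1.0
    for step in steps:
        remaining *= (1.0 - step)
    return 1.0 - remaining

# 44.96% from STE processing, then a further 15.68% from MS normalization.
total = compound_relative_improvement(0.4496, 0.1568)
print(f"{total:.2%}")  # prints 53.59%
```

That is, 1 - (1 - 0.4496)(1 - 0.1568) = 0.5359, which matches the reported overall figure of about 53.59%.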
| Evaluation for WFST-based dialog management | | BIBAK | Full-Text | 255-260 | |
| Chiori Hori; Kiyonori Ohtake; Teruhisa Misu; Hideki Kashioka; Satoshi Nakamura | |||
| To construct an expandable and adaptable dialog system that handles
multiple tasks, we propose a dialog system using a weighted finite-state
transducer (WFST) in which user concept tags and system action tags are the input and
output of the transducer, respectively. To test the potential of the WFST-based
dialog management (DM) platform using statistical DM models, we construct a
dialog system using a human-to-human spoken dialog corpus for hotel
reservation, which is annotated with Interchange Format (IF). Scenario,
Spoken Language Understanding (SLU), and Sentence Generation (SG) WFSTs are
obtained from the corpus and then composed together and optimized to generate a
Dialog Management (DM) WFST. We evaluate the detection accuracy of the system's
next actions using Mean Reciprocal Ranking (MRR). We evaluated how WFST
optimization operations contribute to dialog systems and confirmed that the
optimization improves the accuracy of next-action detection. Keywords: WFST optimization operation, interchange format (IF), spoken dialog,
statistical dialog management, weighted finite-state transducer (WFST) | |||
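As a rough illustration of the evaluation measure named in the abstract above, the sketch below computes a mean reciprocal rank over ranked lists of candidate next actions. The function name and the action tags are hypothetical, and the standard MRR definition is assumed; the paper's exact evaluation protocol is not given in the abstract.

```python
def mean_reciprocal_rank(ranked_lists, gold):
    """Average of 1/rank of the correct item in each ranked candidate list.
    A list that does not contain the correct item contributes 0."""
    if not gold:
        raise ValueError("no test cases")
    total = 0.0
    for candidates, correct in zip(ranked_lists, gold):
        if correct in candidates:
            total += 1.0 / (candidates.index(correct) + 1)
    return total / len(gold)

# Hypothetical system action tags, ranked best-first for three dialog turns.
ranked = [
    ["confirm", "greet"],     # correct action ranked 1st -> 1.0
    ["greet", "confirm"],     # correct action ranked 2nd -> 0.5
    ["offer", "greet"],       # correct action missing    -> 0.0
]
gold = ["confirm", "confirm", "close"]
mrr = mean_reciprocal_rank(ranked, gold)  # (1.0 + 0.5 + 0.0) / 3 = 0.5
```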
| SOBEX: distributed service search engine that exploits service collaboration context | | BIBA | Full-Text | 261-268 | |
| Rong Zhang; Koji Zettsu; Takafumi Nakanishi; Yutaka Kidawara; Yasushi Kiyoki | |||
| Service-oriented architecture (SOA) is emerging as a paradigm for developing
distributed applications. With the development of hardware and software
technology, the rapid increase in peers and services has raised critical problems
for the popularity of SOA. One is system scalability and robustness, and the
other is service location validation. In this paper, we introduce SOBEX, a
web service search engine that uses a distributed indexing structure, SIKA,
and proposes a proactive web service reuse mechanism by introducing a service
context model, SPOT.
SIKA is a community-oriented virtual hierarchical distributed indexing structure based on the classic Chord algorithm. Although it groups nodes into interest-based communities, it is completely distributed and has no central management. It therefore promises search efficiency together with scalability and robustness. The growing number of web services available within an organization and on the web raises a new problem: locating the desired web services. Keyword-based search generally yields high recall but low precision. In order to improve search efficiency, SOBEX proposes to qualify services using a service usage context model, which tries to reduce the gap in concept understanding between human and computer by proactively assigning each service its own background story. In addition to traditional keyword-based methods, it introduces context-based queries to improve service reusability. | |||
| Towards moving phenomena data management | | BIBAK | Full-Text | 269-272 | |
| Koji Zettsu; Kyoung-Sook Kim; Yutaka Kidawara; Yasushi Kiyoki | |||
| With the spread of the Geoweb, people can more easily create and exchange
geo-spatiotemporal content on the Web. Consequently, managing the exploding
amount of Geoweb content more efficiently than
traditional approaches, which map content onto a digital earth and/or a time
line, has become an emerging issue. In this paper, we propose a novel approach for managing Geoweb content
based on the idea of moving phenomena such as a typhoon and price escalation.
Our moving phenomenon DBMS defines data types and predicates of moving
phenomena, and the visual query interface, called Sticker, allows users to
aggregate Geoweb content with a 3D view of the moving phenomena. Our framework
works well for obtaining comprehensive knowledge about the situations or
influences of real-world phenomena from the Geoweb contents. Keywords: Geoweb, Sticker, event aggregate, moving phenomena, spatiotemporal databases | |||
| Support tools for literature-based information access in molecular biology | | BIBA | Full-Text | 273-277 | |
| Fabio Rinaldi; Dietrich Rebholz-Schuhmann | |||
| The fast production of information in molecular biology, driven by high-throughput experiments, leads to strong ongoing demands for the integration of the literature into the information and knowledge discovery channels of the biomedical research domain. This paper describes tools developed by the authors with the aim of supporting professional biologists in accessing the information contained in the scientific literature. | |||
| Usage of change-related non-invasive imaging paradigms to investigate the representation of sound in the human brain | | BIBA | Full-Text | 281-284 | |
| Christian F. Altmann | |||
| Efficiently recognizing and localizing sounds is of paramount importance in our everyday life. However, the computational processes that underlie these capabilities in the human brain are still not fully understood. Change-related paradigms are a powerful tool for studying the representation and transformation of sensory information in the human brain. This text reviews three recent examples from our lab that employed change-related paradigms with different brain imaging modalities to characterize the representation of sounds in the human brain. Specifically, a first experiment used functional magnetic resonance imaging and signal response suppression after stimulus repetition to characterize the representation of natural sounds. A second, magnetoencephalographic experiment used a two-tone paradigm to describe the time-course of adaptation to natural sounds. In a third experiment, we employed a spatial mismatch negativity oddball paradigm during electroencephalography to test for head-related versus allocentric representation of sound sources in the human brain. | |||
| Measurements of vergence and accommodation while viewing a real 3D scene and its 2D image on a display | | BIBA | Full-Text | 285-288 | |
| Haruki Mizushina; Hiroshi Ando; Takanori Kochiyama; Shinobu Masaki | |||
| It is widely thought that conflict between vergence and accommodation may be
a major factor in the visual fatigue and discomfort caused by viewing stereoscopic
images on 3D displays. However, few studies have measured vergence and accommodation
simultaneously while viewing a real 3D scene and its 2D image on a traditional
display.
In this study we measured vergence and accommodation responses simultaneously while participants viewed a real 3D object located at various distances from the participant and its 2D image (a photograph), including the background scene, presented on a display located at a fixed distance. The results show that vergence and accommodation varied with changing target distance while viewing the real 3D object, as expected. On the other hand, changing the target distance depicted in the photographic image while viewing the 2D display evoked no systematic change in vergence or accommodation. Some participants showed noticeable accommodation lag and fixation disparity. In addition, we observed considerable conflicts between vergence and accommodation in both the 3D and 2D conditions, but no one reported perceived defocus and/or double vision. The pattern of the conflicts varied considerably among individuals. | |||
| Method for identifying aromas using adjective characteristics to enhance the reality of visual images | | BIBAK | Full-Text | 289-295 | |
| Chika Oshima; Koichi Nakayama; Hiroshi Ando | |||
| Some works have suggested that certain aromas can enhance the reality of
visual images of distant locations on the basis of the CONTENTS constituting
the images; and these CONTENTS can be referred to using nouns. Aromatic
materials need to be classified in order to identify the ones corresponding to
each CONTENT. In this paper, we conducted two experiments. The subjects first
rated the extent to which the aromas enhanced the reality of the visual images.
CONTENTS of these visual images were different kinds of trees. Nine aromatic
materials received high ratings. The subjects then rated the aromas using
adjectives. The nine aromatic materials were classified into two clusters on
the basis of the adjectives. These results showed that the use of adjectives to
describe aromas is practical for such a classification. Keywords: adjective, aroma, recommendation system | |||
| Neural correlates of externalized auditory motion perception under reverberation | | BIBAK | Full-Text | 296-299 | |
| Akiko Callan; Hiroshi Ando | |||
| Using functional magnetic resonance imaging, we investigated neural
substrates of realistic auditory motion perception. "Realistic" here means
experiencing the sound as located outside the head instead of originating
inside the head. In order to examine neural effects of moving sounds and neural
effects of externalized sounds separately, we included two experimental factors
in our design: whether auditory stimuli were externalized or not
(externalizability factor) and whether auditory stimuli were moving or not
(motion factor). Externalized sounds activated planum temporale (PT) more than
non-externalized sounds. Moving sounds activated posterior middle temporal
gyrus (pMTG) more than stationary sounds. An interaction effect was found in
the right PT. Our results indicate that the PT and pMTG are involved in
realistic auditory motion perception. The fidelity of auditory space
presentation may be evaluated by observing neural activity change in the PT and
pMTG. Keywords: auditory motion, externalized, fMRI, neuroimaging | |||
| Eye-gaze experiments for conversation monitoring | | BIBAK | Full-Text | 303-308 | |
| Kristiina Jokinen; Masafumi Nishida; Seiichi Yamamoto | |||
| Eye-tracking technology has recently matured to the point where unobtrusive
and natural user experiments have become easier to
conduct. Simultaneously, human-computer interactions have become more
conversational in style, and more challenging in that they require various
human conversational strategies, such as giving feedback and managing
turn-taking. In this paper, we focus on eye-gaze in order to investigate turn
taking signals and conversation monitoring in naturally occurring dialogues. We
seek to build models that deal with the important aspects of which interlocutor
the speaker is talking to, and what kind of turn taking signals the partners
elicit, and we report the first results of our eye-tracking experiments. Keywords: eye-tracking, human-human interaction, multiparty conversation | |||
| A speech-driven embodied entrainment wall picture system for supporting virtual communication | | BIBAK | Full-Text | 309-314 | |
| Yoshihiro Sejima; Tomio Watanabe | |||
| We have developed a speech-driven embodied entrainment system called
"InterPicture" and have demonstrated the effectiveness of the system using an
embodied virtual communication system. InterPicture is an image containing
flowers that react to the speech input of talkers. We confirmed the importance
of providing a communication environment in which not only avatars but also CG
objects placed around the avatars are related to virtual communication. In this
study, we have developed an advanced speech-driven embodied entrainment system
called "InterWall". This system projects a wide wall picture onto the walls
surrounding the avatars and behaves as a listener by producing nodding and body
movements on the basis of the speech input of a talker. Further, a
communication experiment has been performed, and the effectiveness of
"InterWall" has been demonstrated by carrying out a sensory evaluation and a
speech-overlap analysis for 20 pairs (40 talkers). Keywords: embodied communication, entrainment, human interaction, nonverbal
communication, virtual communication | |||
| Sensing web: to globally share sensory data avoiding privacy invasion | | BIBAK | Full-Text | 315-318 | |
| Ikuhisa Mitsugami; Michihiko Minoh; Tsuneo Ajisaka; Noboru Babaguchi | |||
| This paper gives an overview of the Sensing Web project, launched in 2007 in
Japan. The project's aim is to open the data obtained by the sensors existing
in our daily living environment for various purposes. Since the data obtained
by observing the real world directly with sensors include real-world
information different from the Web, a new worldwide social information
infrastructure -- Sensing Web -- is realized. In this article, we discuss the
research issues arising in connection with the Sensing Web. Keywords: information infrastructure, privacy-invasion-free, sensory data,
symbolization | |||
| An agent-based management scheme of context information for context-aware service | | BIBAK | Full-Text | 319-324 | |
| Hideyuki Takahashi; Takuo Suganuma; Norio Shiratori | |||
| This paper describes a scheme to increase the availability of context-aware
services in ubiquitous computing environments by managing context information
effectively where computational and network resources are insufficient. For a
context-aware service, the function of the overall system and the quality of service
must also be maintained by circulating context information of adequate
quality. The scheme manages context information based on the relationship
between quality of context and quality of service, and can avoid the
degradation of available network bandwidth and computational resources caused by
the circulation of excessive context information. Initial experimental
results using a prototype of a ubiquitous live streaming video service
confirm that the available network bandwidth and computational resources are
effectively maintained and recovered by controlling the update frequency of the
user's location information according to the situation. Keywords: context-aware service, multiagent systems, quality of context, quality of
service | |||
| The utilization method of idle PC resources | | BIBAK | Full-Text | 325-328 | |
| Yutaka Hirakawa; Yoshifumi Matsuda | |||
| Few of the large number of personal computers (PCs) in homes and small
offices are used continuously. This article discusses a method of utilizing
idle PC resources by assigning download jobs to idle PCs and distributing the jobs
among them. The requirements for the utilization method are as follows:
R1: When a user suddenly starts using an idle PC, he/she must be able to work effectively.
R2: When a user shuts down a PC abruptly, the system must continue to operate without any interruption.
The proposed method of resource utilization monitors bandwidth usage and avoids inefficiency in users' work. The evaluation results of the experimental system are described. The proposed method requires the existence of a leader PC in a network. The evaluation results of a new effective leader election method that assumes the existence of network attached storage (NAS) are also described. Keywords: distributed systems, leader election, utilization of idle PCs | |||
| Analysis of hand movement variation related to speed in Japanese sign language | | BIBAK | Full-Text | 331-334 | |
| Yuta Yasugahira; Yasuo Horiuchi; Shingo Kuroiwa | |||
| To achieve greater accessibility for deaf people, sign language
recognition systems and sign language animation systems must be developed. In
Japanese sign language (JSL), previous studies have suggested that emphasis and
emotion cause changes in hand movements. However, the relationship between
emphasis and emotion and the signing speed has not been sufficiently studied. In
this study, we analyzed the hand movement variation in relation to the signing
speed. First, we recorded 20 signed sentences at three speeds (fast, normal,
and slow) using a digital video recorder and a 3D position sensor. Second, we
segmented sentences into three types of components (sign words, transitions,
and pauses). In our previous study, we analyzed hand movement variations of
sign words in relation to the signing speed. In this study, we analyzed
transitions between adjacent sign words by a method similar to that in the
previous study. As a result, sign words and transitions showed a similar
tendency, and we found that the variation in signing speed mainly caused
changes in the distance hands moved. Furthermore, we compared transitions with
sign words and found that transitions were slower than sign words. Keywords: Japanese sign language, hand movement, transition | |||
| High resolution computer-generated cylindrical hologram | | BIBAK | Full-Text | 335-338 | |
| Tomohisa Ito; Takeshi Yamaguchi; Hiroshi Yoshikawa | |||
| We investigate the computer-generated cylindrical hologram. Since the
general flat-format hologram has a limited viewable area, we usually cannot see
the other side of the reconstructed object. Some types of hologram solve
this problem. The cylindrical-type hologram is well known as a 360-deg viewable
hologram. There are two kinds of cylindrical holograms: the multiplex hologram
and the laser reconstruction 360-deg hologram. Since the multiplex hologram
consists of many 2-D pictures, the reconstructed image is not truly 3-D. In
contrast, a laser reconstruction 360-deg hologram has a true 3-D effect. In our
previous study, the computer-generated cylindrical hologram was realized as a
Fresnel hologram. However, the spatial resolution and pixel pitch of the output
device were insufficient: the hologram was made using a Liquid Crystal on Silicon
panel (panel size 14.5 mm x 10.9 mm, resolution 1,400 x 1,050 pixels, pixel pitch
10.4 μm) with the image reduced to 1/12. In this report, the hologram is made using
a Liquid Crystal on Silicon panel (panel size 13.8 mm x 7.56 mm, resolution
1,920 x 1,080 pixels, pixel pitch 7 μm) with the image reduced to 1/16. To scale up the reconstructed image size,
we calculated a high-resolution computer-generated cylindrical hologram. We then
printed these fringes with the improved output device. As a result, we obtained a
good reconstructed image from a computer-generated cylindrical hologram. Keywords: computer-generated hologram, cylindrical hologram, fringe printer,
holography | |||
| Spatial memorization aid system: registration of mental memory space | | BIBAK | Full-Text | 339-343 | |
| Ken Ishigaki; Yasushi Ikei | |||
| The present paper proposes a novel approach to augment human memory based on
spatial and graphic information mediated by an electronic device. A technique
for memorizing a number of unstructured items is called mnemonics. Although
mnemonics are highly effective, acquiring the skill to use them is
generally difficult. The new spatial mnemonics system presented in this paper
resolves this problem by facilitating the acquisition of the skill. For this
purpose, a virtual memory peg is introduced, based on images of
real spaces and objects. The characteristics of the virtual memory peg were
investigated in terms of the length of the peg that was created in the real
physical environment. Another creation process based on photographs
provided by the experimenter was examined to show that the peg could be built
without walking through the real environment. Both results clearly
demonstrated the effectiveness of the proposed method. Keywords: imagery, memory peg, mnemonics, photo-montage | |||
| Searching for comparison points between two objects from the web | | BIBA | Full-Text | 344-349 | |
| Shinya Aoki; Takayuki Yumoto; Manabu Nii; Yutaka Takahashi | |||
| Recently, it has become common to compare two objects using search engines. However, users often browse only the Web pages ranked highly by the search engine, which may give biased information. We propose a method for searching for Web pages in which two objects are compared, extracting comparison points from those Web pages, and showing these points to users. Comparison points are keywords for comparing objects. The proposed method extracts points for efficient comparison by using comparison expressions such as "Liquid Crystal TVs are better ..." and "... than Plasma TVs." | |||
| How-to information search by lightweight analysis of web pages | | BIBA | Full-Text | 350-354 | |
| Ryouji Nonaka; Takayuki Yumoto; Manabu Nii; Yutaka Takahashi | |||
| We propose a method for searching for comprehensible how-to information on the Web. In our how-to information search, we use lightweight analysis of Web pages to extract how-to information from pages obtained by conventional Web search engines and rank the results according to how easily viewable they are. In the extraction process, we focus on expressions in Web page text blocks that describe procedures. In the ranking process, we focus on images, the effect of letter strings, and the length of the how-to information. | |||
| Linking Wikipedia entries to blog feeds by machine learning | | BIBAK | Full-Text | 355-362 | |
| Mariko Kawaba; Hiroyuki Nakasaki; Daisuke Yokomoto; Takehito Utsuro; Tomohiro Fukuhara | |||
| This paper studies the issue of conceptually indexing the blogosphere
through the whole hierarchy of Wikipedia entries. This paper proposes how to
link Wikipedia entries to blog feeds in the Japanese blogosphere by machine
learning, where about 300,000 Wikipedia entries are used for representing a
hierarchy of topics. In our experimental evaluation, we achieved over 80%
precision in the task. Keywords: Wikipedia, blog feed retrieval, blogosphere, topics | |||
| JSPad: a sign language writing tool using SignWriting | | BIBAK | Full-Text | 363-367 | |
| Tadahiro Matsumoto; Mihoko Kato; Takashi Ikeda | |||
| SignWriting is a practical writing system for sign languages. In this paper
we present a software program, JSPad, for writing Japanese sign language (JSL)
with SignWriting. SignWriting has a large set of visually iconic symbols that
represent handshapes, movements, locations and facial expressions; it can take
a lot of time to find and choose appropriate symbols from the set to compose
signs, particularly for novice users of SignWriting. JSPad can generate
SignWriting signs from JSL text written in the JJS notation, which is a
gloss-based notation system we proposed. This facility allows users to write
JSL sentences in SignWriting in less time. Keywords: SignWriting, deaf education, notation system, sign language, sign language
text, writing system | |||
| Resources for Mongolian language | | BIBAK | Full-Text | 368-371 | |
| Purev Jaimai; Odbayar Chimeddorj | |||
| The Mongolian language is spoken by about 8 million people. This paper
summarizes the current status of its resources in Mongolia. Keywords: Mongolian language resources, corpus for Mongolian, natural language
processing, tools for Mongolian language | |||
| Dialogue act annotation for consulting dialogue corpus | | BIBA | Full-Text | 372-378 | |
| Kiyonori Ohtake; Teruhisa Misu; Chiori Hori; Hideki Kashioka; Satoshi Nakamura | |||
| This paper introduces a new corpus of consulting dialogues, which is designed for training a dialogue manager that can handle consulting dialogues through spontaneous interactions from the tagged dialogue corpus. We have collected 130 h of consulting dialogues in the tourist guidance domain. This paper outlines our taxonomy of dialogue act annotation that can describe two aspects of an utterance: the communicative function (speech act), and the semantic content of the utterance. We provide an overview of the Kyoto tour guide dialogue corpus and a preliminary analysis using the dialogue act tags. | |||
| Unit selection using k-nearest neighbor search for concatenative speech synthesis | | BIBAK | Full-Text | 379-382 | |
| Hideyuki Mizuno; Satoshi Takahashi | |||
| We propose a new approach to rapidly identifying adequate synthesis units in
extremely large speech corpora. Our aim is to develop a concatenative speech
synthesis system with high performance (both speech quality and throughput) for
various practical applications. Utilizing very large speech corpora allows more
natural sounding synthesized speech to be created; the downside is an increase
in the time taken to locate the synthesis units needed. The key to overcoming
this problem is introducing state-of-the-art database retrieval technologies.
The first selection step, based on simple hash search, tabulates all synthesis
unit candidates. The second step selects N best candidates using nearest
neighbor search, a typical database retrieval technique. Finally, the best
sequence of synthesis units is determined by Viterbi search. A runtime
measurement test and subjective experiment are carried out. Their results
confirm that the proposed approach reduces the runtime by about 40% compared to
using only hash search with no degradation in the quality of synthesized speech
for a 15 hour corpus. Keywords: concatenative speech synthesis, nearest neighbor search, synthesis unit
selection, text to speech | |||
| Dynamic selection method of the best search engine for a user's query | | BIBAK | Full-Text | 383-388 | |
| Kodai Mizuno; Kyoji Kawagoe | |||
| In this paper, we propose a new method for dynamically selecting the best
search engine for a user's query. When searching the Internet, expert users
manually select the best search engine for their queries. However, novice
users do not understand the features of all search engines. Consequently,
such users cannot select the best search engine and therefore cannot obtain
the best retrieval results. In this paper, we focus on the number of
retrieval results and use it to calculate, for each search engine, a
matching score that indicates its suitability for the user's query.
As a result, novice users can select the best search engine using the
scores calculated by our system. Keywords: information retrieval, query, search engine selection, web search | |||
| Hyperbolic structure of fundamental frequency contour | | BIBAK | Full-Text | 389-394 | |
| Jinfu Ni; Shinsuke Sakai; Hisashi Kawai; Satoshi Nakamura | |||
| In this paper, we propose an approach to transformation of fundamental
frequency (F0) contours for conversational speech synthesis. The figure
of F0 in relation to the period of sound-wave cycles is one branch
of the rectangular hyperbola. Based on a few symmetry assumptions on the
hyperbolic property, we achieve a generalized hyperbolic structure so as to
aggressively manipulate F0 contours. The modeling proves an equivalent
expression of the resonance mechanism capable of dealing with the interaction
of tone and intonation. Also, it is language-independent because no
language-dependent hypothesis is necessary. This paper describes two
applications of the hyperbolic structures of F0 contours to prosodic
information processing. One modulates the baseline F0 contours when
fusing additional makeup information onto them without altering the underlying
linguistic information. The other separates local rise/fall F0 movements
and global scale component from observed F0 contours, both being useful
for estimating dynamical F0 variation. Our experimental results are very
positive. Keywords: F0 control, intonation, speech prosody, speech synthesis | |||
| Automatic plagiarism detection among term papers | | BIBAK | Full-Text | 395-399 | |
| Takahisa Ota; Shigeru Masuyama | |||
| Recently, plagiarized term papers have become a serious problem. Therefore,
we propose, in this paper, a method to detect plagiarized parts between two
term papers. Our method is based on the Smith-Waterman algorithm, which
detects similar regions between two sequences and was originally developed for
molecular sequences. Moreover, we evaluated our method using a document set
consisting of actually submitted term papers and
artificially-produced ones that plagiarized a paper written on the same theme.
Experimental results show that our method attains higher accuracy than
conventional ones. Keywords: Smith-Waterman algorithm, dynamic programming, partial text alignment,
plagiarism detection, term papers | |||
| Spoken document retrieval using topic models | | BIBAK | Full-Text | 400-403 | |
| Xinhui Hu; Ryosuke Isotani; Satoshi Nakamura | |||
| In this paper, we propose a document topic model (DTM) based on the
non-negative matrix factorization (NMF) approach to explore spontaneous spoken
document retrieval. The model uses latent semantic indexing to detect
underlying semantic relationships within documents. Each document is
interpreted as a generative topic model belonging to many topics. The relevance
of a document to a query is expressed by the probability of a query being
generated by the model. The term-document matrix used for NMF is built
stochastically from the speech recognition N-best results, so that multiple
recognition hypotheses can be utilized to compensate for the word recognition
errors. Using this approach, experiments are conducted on a test collection
from the Corpus of Spontaneous Japanese (CSJ), with 39 queries for over 600
hours of spontaneous Japanese speech. The retrieval performance of this model
is shown to be superior to the conventional vector space model (VSM) when the
dimension or topic number exceeds a certain threshold. Moreover, whether from
the viewpoint of retrieval performance or the ability of topic expression, the
NMF-based topic model is verified to surpass another latent indexing method
that is based on the singular value decomposition (SVD). The extent to which
this topic model can resist speech recognition error, which is a special
problem of spoken document retrieval, is also investigated. Keywords: NMF, document topic model, spoken document retrieval | |||
| Soft margin estimation on improving environment structures for ensemble speaker and speaking environment modeling | | BIBAK | Full-Text | 404-408 | |
| Yu Tsao; Jinyu Li; Chin-Hui Lee; Satoshi Nakamura | |||
| Recently, we proposed an ensemble speaker and speaking environment modeling
(ESSEM) approach to enhance the robustness of automatic speech recognition
(ASR) under adverse conditions. The ESSEM framework comprises two phases:
offline and online. In the offline phase, we prepare an environment
structure that is formed by multiple sets of hidden Markov models (HMMs). Each
HMM set represents a particular speaker and speaking environment. In the online
phase, ESSEM estimates a mapping function to transform the prepared environment
structure to a set of HMMs for the unknown testing condition. In this study, we
incorporate soft margin estimation (SME) to increase the discriminative
power of the environment structure in the offline phase and therefore enhance
the overall ESSEM performance. We evaluated the performance on the Aurora-2
connected digit database. With the SME refined environment structure, ESSEM
provides better performance than the original framework. By using our best
online mapping function, ESSEM achieves a word error rate (WER) of 4.62%,
corresponding to a 14.60% relative WER reduction over the best baseline
performance of 5.41% WER. Keywords: ASR, ESSEM, SME, model adaptation, noise robustness | |||
| A method for helpdesk-oriented question answering | | BIBAK | Full-Text | 409-415 | |
| Satoru Sasaki; Atsushi Fujii | |||
| We propose a Question Answering (QA) method that answers a how-question with
actions. We model an action as a verb phrase consisting of a main verb and
its governing noun phrase. Existing QA methods resemble consulting dictionaries
and encyclopedias, in which users satisfy their intellectual cravings. In
contrast, our method is a step toward automation of a helpdesk or a call
center, which suggests solutions to alleviate users' problems. We show the
effectiveness of our method experimentally. Keywords: question answering, web retrieval | |||