EJC 2016
Programme of the 26th International Conference
on Information Modelling and Knowledge Bases
June 6–10, 2016
Tampere, Finland
Welcome to EJC 2016
The series of International Conferences on Information Modelling and Knowledge Bases (EJC) started in 1988 as a co-operation initiative between Japan and Finland, continuing five earlier conferences of Scandinavian scope that covered the same variety of study topics. The practical arrangements were then organized by Professor Ohsuga from Tokyo University RCAST together with Professors Hannu Kangassalo from the University of Tampere and Hannu Jaakkola from Tampere University of Technology. The geographical scope has since expanded, first to Europe and then worldwide. The 26th International Conference on Information Modelling and Knowledge Bases (EJC 2016) constitutes a worldwide research forum for the exchange of scientific results and experiences. To be exact, this is actually the 34th conference, if the preceding Scandinavian (five since 1982) and Scandinavian–Japanese (three since 1988) conferences are counted. In this way a platform has been established that draws together researchers as well as practitioners dealing with information modelling and knowledge bases. Most of all, welcome to EJC 2016, which is organized by the Tampere University of Technology Pori Department and held in the city of Tampere at Holiday Club Tampere Spa (Tampereen Kylpylä).
Welcome to Tampere. The city was founded in 1775 and belongs to the Pirkanmaa Region. The population of the city is 223 292, growing to 313 058 in the urban area and 364 000 in the metropolitan area. Tampere is the second-largest urban area in Finland and the third most populous individual municipality in the country. Tampere is located between two lakes, Näsijärvi and Pyhäjärvi. Since the two lakes differ in level by 18 metres, the rapids linking them, Tammerkoski, have been an important power source throughout history. The rapids are also the source of the city's industrial past as the former center of Finnish industry. The Tampere area has four institutions of higher education with a total of 40 000 students: the University of Tampere (more than 16 000 students), Tampere University of Technology (close to 10 000 students), and two polytechnic institutes (Tampere University of Applied Sciences and the Police College of Finland). The current plan is to merge the two universities and the University of Applied Sciences into one organization (working name T3) from the beginning of 2018. The new university will total 35 000 students and represent a wide variety of research areas.
The organizers from the Tampere University of Technology Pori Department (TUT Pori) welcome you to EJC 2016 and wish you an enjoyable time, both in scientific discussions and at leisure. TUT Pori is a satellite organization of the university and is located in the city of Pori as a part of the University Consortium of Pori (UCPori). UCPori is an umbrella organisation for a unique partnership of four Finnish universities – Aalto University, Tampere University of Technology, the University of Tampere and the University of Turku – bringing together 2 500 students and 170 experts representing a multidisciplinary scope of research and education. UCPori is located on the bank of the Kokemäenjoki River in renovated industrial premises whose oldest parts date from the late 1890s. The industrial area originally belonged to the cotton factory Porin Puuvilla, which was merged into Finlayson in 1973; this brings us back to Tampere and its industrial history. In addition, the Kokemäenjoki River has its starting point in Pyhäjärvi, the lake on the western side of Tampere.
An important milestone in the birth of industrial Tampere was the machinery workshop established by the Scotsman James Finlayson in 1820. It produced carding and spinning machines for the spinning of wool and linen. This business was not very successful, and the company moved from making machines to the spinning of cotton and wool thread and the weaving of cloth. Finlayson was an important factor in the industrialization of Tampere; in the middle of the 19th century every third inhabitant of Tampere worked for it. The first electric lights in the Nordic countries were lit at the Finlayson factory in 1882, in its big textile hall Plevna.
Tampereen Puuvillatehdas – a spinning mill and textile factory – started its operation in 1899. It built its first factory building on the Lapinniemi cape of Lake Näsijärvi; its topping-out party was held on 26 January 1899. In 1934 it was bought by and merged into Tampella, which moved all its textile industry to Lapinniemi in 1977. The factories were closed in the mid-1980s. After extensive renovation and modernization work, this industrial area now hosts our conference in the form of Holiday Club Tampereen Kylpylä. In spite of these changes, some original rooms and atmosphere can still be seen in the area.
The short introduction above provides a snapshot of the industrial history of Tampere – albeit a very focused and narrow one. As mentioned earlier, Tampere has been and still is one of the most important industrial cities in Finland; the same applies to Pori. Traditional industries – textile and metal – have been replaced by high-tech industry, and the human capital needed by the new companies has its roots in higher education. The industrial premises have found new users: Lapinniemi is used as a spa, the original Tampella area in the city of Tampere houses offices and apartments, and the Finlayson area is mainly in office use. Additional activities in these areas include restaurants, theatres, museums, etc.; for example, the Plevna Hall in Finlayson hosts one of the best breweries in Finland.
We thank all colleagues for their support of the EJC conference, especially the program committee, the organizing committee, and the program coordination team. We also express our gratitude to all our sponsors for their financial support.
The series of EJC conferences started in the Lake Rautavesi area in Ellivuori. This year we have once again returned to the Finnish lakeside. Enjoy the conference week.
EJC 2016 Organizers
Conference Organization
General Program Chair
Hannu Kangassalo, University of Tampere, Finland
Program Committee Co-Chairs
Yasushi Kiyoki, Keio University, Japan
Bernhard Thalheim, Christian-Albrechts University at Kiel, Germany
Program Committee
Boštjan Brumen, University of Maribor, Slovenia
Pierre-Jean Charrel, University of Toulouse and IRIT, France
Xing Chen, Kanagawa Institute of Technology, Japan
Marie Duží, VSB-Technical University Ostrava, Czech Republic
Jørgen Fischer Nilsson, Technical University of Denmark, Denmark
Anneli Heimbürger, University of Jyväskylä, Finland
Jaak Henno, Tallinn University of Technology, Estonia
Yoshihide Hosokawa, Gunma University, Japan
Hannu Jaakkola, Tampere University of Technology (Pori), Finland
Sebastian Link, University of Auckland, New Zealand
Heinrich C. Mayr, Alpen-Adria University Klagenfurt, Austria
Tommi Mikkonen, Tampere University of Technology, Finland
Tomoya Noro, Fujitsu Laboratories Ltd., Japan
Jari Palomäki, Tampere University of Technology, Finland
Bernhard Rumpe, RWTH Aachen, Germany
Shiori Sasaki, Keio University, Japan
Tetsuya Suzuki, Shibaura Institute of Technology, Japan
Naofumi Yoshida, Komazawa University, Japan
External Reviewers
Boštjan Šumak, University of Maribor, Slovenia
Marek Mensik, VSB-Technical University Ostrava, Czech Republic
Taoufiq Dkaki, University of Toulouse and IRIT, France
General Organizing Chair
Hannu Jaakkola, Tampere University of Technology (Pori Department), Finland
Organizing Committee
Xing Chen, Kanagawa Institute of Technology, Japan
Ulla Nevanranta, Tampere University of Technology (Pori Department), Finland
Program Coordination Team
Naofumi Yoshida, Komazawa University, Japan (Chair)
Anneli Heimbürger, University of Jyväskylä, Finland
EJC 2016 Programme
All presentations will take place in conference room Sarka, ground floor of the conference hotel.
Monday, 6.6.2016
18:00–20:00 Get together and registration, at Holiday Club conference center, ground floor
Tuesday, 7.6.2016
8:30–9:00 Conference Registration (if not possible on Monday)
9:00–9:30 Conference Opening
Welcome speech and information about the arrangements
Organizing General Chair, Professor Hannu Jaakkola, Tampere University of Technology
9:30–10:30 The 25-Year Celebration Session of Research Collaboration between "TUT Pori" and "KEIO University SFC"
Chairs: Hannu Jaakkola and Yasushi Kiyoki
Invited talk: Challenges in Creating Smart City Services with IoT/CPS Platforms
Professor Hideyuki Tokuda, Keio University SFC
10:30–10:45 Coffee Break
10:45–12:30 Session 1: Appetizer
Chair: Bernhard Thalheim
12:30–13:30 Lunch
13:30–15:00 Session 2: Software development
Chair: Hannu Jaakkola
13:30–14:00 Visualizations for Software Development Process Management
Timo Lehtonen, Timo Aho, Kati Kuusinen, Tommi Mikkonen
14:00–14:20 Refactoring - key to success for constantly developed projects
Janari Põld, Ahto Kalja, Tarmo Robal
14:20–14:40 Ad-hoc Synthesis of Composite Content Ontology Design Patterns
Pavel Lomov, Maxim Shishaev
14:40–15:00 Data Federation by Using a Governance of Data Framework Artifact as the Tool – Case
clinical breast cancer treatment data
Tomi Dahlberg, Tiina Nokkala, Jukka Heikkilä, Marikka Heikkilä
15:00–15:15 Coffee Break
15:15–16:45 Session 3: Environment and context studies
Chair: Yasushi Kiyoki
15:15–15:45 A Multi-dimensional River-water Quality Analysis System for interpreting Environmental
Situations
Chalisa Veesommai, Yasushi Kiyoki, Shiori Sasaki
15:45–16:15 A Needs-Based Context Aware Application Model for Pervasive Environments
Manal A. Yahya, Ajantha Dahanayake
16:15–16:45 Prediction of Alum Dosage in Water Supply by WEKA Data Mining Software
Petchporn Chawakitchareon, Nattanan Boonnao, Rawin Taychamekiatchai, Pakorn Charutragulchai
Meeting point Laukontori harbour, address: Laukontori 4
18:00–18:20 Boat departs from Laukontori harbour to the island of Viikinsaari, time to explore the island
19:30–21:00 Dinner in Restaurant Viikinsaari
21:30–21:50 Boat departs from Viikinsaari to Laukontori harbour
Wednesday, 8.6.2016
9:00–10:00 Invited talk: How does the brain process information and learn?
Senior Research Fellow Marja-Leena Linne, Signal Processing, Tampere University of Technology
Chair: Hannu Kangassalo
10:00–10:15 Coffee Break
10:15–12:20 Session 4: Environment and context studies
Chair: Xing Chen
10:15–10:35 Building the Prototype of Vector-Control Strategy Interoperability in Dengue Fever:
Case Surabaya, Kuala Lumpur, Bangkok
Wahjoe Sesulihatien, Yasushi Kiyoki, Shiori Sasaki, Azis Safei, Subagyo Yotopranoto, Virach
Sornlertlamvanich, Aran Hansuebai, Petchporn Chawakitchareon
10:35–11:05 A Globally-Integrated Environmental Analysis and Visualization System with Multi-Spectral
& Semantic Computing in “Multi-Dimensional World Map”
Yasushi Kiyoki, Xing Chen, Shiori Sasaki, Chawan Koopipat
11:05–11:35 Monitoring Atmospheric Moisture Using GPS Precipitable Water Vapor
Prawit Uang-aree, Sununtha Kingpaiboon
11:35–11:50 Vietnamese Online Hotel Reviews Classification Based on Term Features Selection
Tran Sy Bang, Choochart Haruechaiyasak, Virach Sornlertlamvanich
11:50–12:20 Accelerating Reinforcement Learning by Mirror Images
Takehiro Kitao, Takao Miura
12:30–13:30 Lunch
13:30–14:20 Session 5: Multi-cultural environments
Chair: Hannu Jaakkola
13:30–14:00 Recognising the Culture Context in Information Search
Hannu Jaakkola, Bernhard Thalheim
14:00–14:20 On Modelling e-Education Ecosystems in Multicultural Contexts
Anneli Heimbürger
14:25–15:25 Session 6: Database & Multimedia technology
Chair: Tatjana Welzer
14:25–14:55 Data Provenance Management Based on Metadata
Frank Kramer, Bernhard Thalheim
14:55–15:25 Automating Transformations in Data Vault Data Warehouse Loads
Mikko Puonti, Timo Raitalaakso, Timo Aho, Tommi Mikkonen
15:25–15:40 Coffee Break
15:40–16:50 Session 6: Database & Multimedia technology (continues)
Chair: Tatjana Welzer
15:40–16:00 An Application of Neural Network in Method for Use Case based Effort Estimation
Radoslav Štrba, Svatopluk Štolfa, Jakub Štolfa, Ivo Vondrák, Václav Snášel
16:00–16:20 Building Change Detection via Semantic Segmentation and Difference Extraction Method
Siti Nor Khuzaimah Binti Amit, Shuta Saito, Yoshimitsu Aoki, Yasushi Kiyoki
16:20–16:50 Human-Microbiome-Relations Extraction Method with Context-dependent Clustering and
Semantic Analysis
Shiori Hikichi, Shiori Sasaki, Yasushi Kiyoki
16:50–17:35 Session: Introduction of the Next EJC 2017 in Krabi, Thailand
Aran Hansuebai, Virach Sornlertlamvanich, Petchporn Chawakitchareon, Chawan Koopipat
Evening Free
Thursday, 9.6.2016
9:00–10:00 Invited talk: Reading images in real-time
Professor Moncef Gabbouj, Academy Professor and Head of Multimedia Research Group, Tampere
University of Technology
Chair: Hannu Jaakkola
10:00–10:15 Coffee Break
10:15–12:25 Session 7: Learning and prediction
Chair: Shiori Sasaki
10:15–10:35 Power law vs. exponential law in artificial learning
Boštjan Brumen, Ivan Rozman, Aleš Černezel
10:35–10:50 MERS-CoV Spread Prediction in Saudi Arabia: A Conceptual Model
Alhanouf Alnasser, Lujain Althunayan, Nuha Alnabit, Noor Alothaim, Wafa Alanazi, Ajantha
Dahanayake
10:50–11:05 Towards employee based knowledge interactions to facilitate group learning within a team
collaboration tool: An exploratory case study analysis
Tero Kaisti, Rauno Pirinen
11:05–11:35 Emotions Recognition System for Acoustic Music Data based on Human Perception
Features
Tatiana Endrjukaite, Yasushi Kiyoki
11:35–11:55 Content Aware Playlist Generation with Multi-Dimensional Similarity Measure
Jan Wohlfahrt-Laymann, Anneli Heimbürger
11:55–12:25 A Multispectral Imaging and Semantic Computing System for Agricultural Monitoring and
Analysis
Jinmika Wijitdechakul, Yasushi Kiyoki, Shiori Sasaki, Chawan Koopipat
12:30–13:30 Lunch
13:30–14:55 Session 8: Social media technology
Chair: Anneli Heimbürger
13:30–14:00 Integrating culture into crowdsource product designing
May Al-Sohibani, Ajantha Dahanayake
14:00–14:20 Tag Suggestions from Social Media Profiles
Petri Rantanen, Pekka Sillberg, Jari Soini, Hannu Jaakkola
14:20–14:40 Evaluation Indexes of Customer Journey for Contents of Owned Media
Kyohei Matsumoto, Takafumi Nakanishi, Takashi Kitagawa
14:40–14:55 Towards optimization of processes in multimedia content publishing
Boštjan Šumak, Marjan Heričko, Tatjana Welzer Družovec, Maja Pušnik
14:55–15:10 Coffee Break
15:10–16:10 Session 9: Data analysis
Chair: Jaak Henno
15:10–15:40 Classification of imbalanced data with allocation method and sampling
Sašo Karakatič, Marjan Heričko, Vili Podgorelec
15:40–16:10 Time series analysis and forecasting technique for converting industrial waste
management: Case study of a tape converting production in Thailand
Krittiya Lertpocasombut, Supawut Sriploy
Meeting point in front of Restaurant Plevna, address: Itäinenkatu 8 (Finlayson area)
19:00–21:00 Dinner, The Plevna Brewery Pub & Restaurant
Friday, 10.6.2016
9:00–9:45 Invited talk: Information Content of Concepts, Theories and Their Development for
Information Systems
Emeritus Professor Hannu Kangassalo, University of Tampere
Chair: Hannu Jaakkola
9:45–10:00 Coffee Break
10:00–12:00 Session 10: Information and knowledge
Chair: Marie Duži
10:00–10:30 Logic of Inferable Knowledge
Marie Duží, Marek Mensik
10:30–11:00 Information and Interaction
Jaak Henno
11:00–11:30 An Ontology-Driven Approach for Expert Knowledge Acquisition in the Medical Field
Nassim Abdeldjallal Otmani, Catherine Comparot, Malik Si Mohammed, Pierre-Jean Charrel
11:30–12:00 Detecting Topic Evolutions in Bibliographic Databases Exploiting Citations
Hiroyoshi Ito, Toshiyuki Amagasa, Hiroyuki Kitagawa
12:00–12:30 Closing of EJC and Farewell
12:30–13:30 Lunch
Abstracts
Tuesday, 7.6.2016
Session: Software development
Visualizations for Software Development Process Management
Timo Lehtonen, Timo Aho, Kati Kuusinen, Tommi Mikkonen
Software development projects have increasingly been adopting new practices, such as continuous
delivery and deployment to enable rapid delivery of new features to end users. Tools that are commonly
utilized with these practices generate a vast amount of data concerning various development events.
Analysis of the data provides a lightweight data driven view on the software process. We present an
efficient way of visualizing software process data to provide a good overall view on the features and
potential problems of the process. We use the visualization in a case project that has become more agile
by applying continuous integration and delivery together with development and infrastructure
automation. We compare data visualizations with information gathered from the development team and
describe how the evolution can be understood through our visualizations. The case project is a good
example of how a change from a traditional long cycle development to a rapid cycle DevOps culture can
actually be made in a few years. However, the results show that the team has to focus on process improvement continuously in order to maintain continuous delivery. As the main contribution, we present a lightweight approach to software process visualization. Moreover, we discuss how such a heuristic can be used to track the characteristics of the target process.
Refactoring - key to success for constantly developed projects
Janari Põld, Ahto Kalja, Tarmo Robal
Each day new applications are developed and extra features are added to existing ones. This means writing new program code or modifying existing code. To ease future modifications, the software design should be constantly improved to cope with the changes added. The quality of the software architecture plays an important role: it determines the time to market for new features, the maintainability and extensibility of applications, and the readability of the source code. Sustaining the overall quality of the architecture requires refactoring phases so that the design can cope with new changes, which also restores source code readability and extensibility. More experienced developers conduct code reviews in different phases of application development to pinpoint places that need improvement. Several refactoring steps can be applied to source code to recover its maintainability and extensibility. Refactoring plays an important role not only in software development but also in other fields. In this article the authors give an overview of what can be achieved by refactoring and point out some success stories. Several aspects of the refactoring process are studied, and a refactoring strategy is proposed.
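As a rough, generic illustration of the kind of refactoring step discussed in the abstract above, the Python sketch below shows an "extract method" refactoring; the registration scenario and all names in it are invented for the example and are not taken from the paper.

```python
# Before: one function mixes input validation and persistence.
def register_user_before(raw: dict, store: dict) -> None:
    name = raw.get("name", "").strip()
    if not name:
        raise ValueError("name is required")
    store[name] = {"active": True}


# After: validation is extracted into its own function, so each piece has a
# single responsibility and can be read, tested and changed independently.
def _validate_name(raw: dict) -> str:
    name = raw.get("name", "").strip()
    if not name:
        raise ValueError("name is required")
    return name


def register_user(raw: dict, store: dict) -> None:
    store[_validate_name(raw)] = {"active": True}
```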
Ad-hoc Synthesis of Composite Content Ontology Design Patterns
Pavel Lomov, Maxim Shishaev
The use of Ontology Design Patterns (ODPs) has become useful for developing and re-engineering ontologies. ODPs are encodings of best practices that support ontology construction by facilitating the reuse of proven solution principles. In this paper, we focus on Content ODPs (CDPs), which represent small ontology fragments that encode general use cases (e.g. participation in an event, role playing, parts of an object). CDPs are used as building blocks during ontology development. In such cases they can be specialized, extended and integrated by the user to obtain a new composite CDP, which allows a more expressive representation of a domain concept in the ontology being developed. However, this may demand additional skills from the user. Therefore, this paper considers the automated selection of a CDP combination and the subsequent synthesis of a new composite CDP.
Data Federation by Using a Governance of Data Framework Artifact as the Tool – Case clinical breast
cancer treatment data
Tomi Dahlberg, Tiina Nokkala, Jukka Heikkilä, Marikka Heikkilä
Widespread breast cancer takes patients to an early grave. Early detection and the ability to predict the effectiveness of treatments are among the means of fighting this malignant disease. Data federation from dozens of data sources is needed for data analytics. The granularity, internality, structure and all other characteristics of the federated data differ. We discuss alternative approaches to data federation and their theoretical basis, especially the ontology and governance of data. We have developed an artifact in our ongoing research. The artifact is used to support the federation of cancer data at a university hospital. We found that our federative approach and the artifact improved the interoperability of data in this case. We suggest that our approach is capable of doing so in other contexts as well.
Wednesday, 8.6.2016
Session: Environment and context studies
A Multi-dimensional River-water Quality Analysis System for interpreting Environmental Situations
Chalisa Veesommai, Yasushi Kiyoki, Shiori Sasaki
Multi-dimensional analysis is a promising approach to a new interpretation of environments, grounded in the value information and language information of intellectual activities, and bringing various environmental meanings to society. This paper presents a new analysis system with semantic computing for environments in water-quality areas, integrating the fundamentally important water-quality parameters to create new meaning for society. Multi-water-parameter analysis in a multi-dimensional space is important for current research issues in several water-quality research fields; it builds on the values and meanings of each parameter to obtain meaningful terms in the categories of agriculture, aquatic life, fish, drinking, industry and irrigation. The multi-dimensional semantic space is utilized for various interpretations related to water quality.
A Needs-Based Context Aware Application Model for Pervasive Environments
Manal A. Yahya, Ajantha Dahanayake
The concept of context-aware personalization enables implicit detection of user context to achieve
personalization of services. Coupled with pervasive technologies such as augmented reality, this concept
forms a view into the future of computing. This research argues that for such technology to be most
beneficial, it is essential to understand the needs of the human-being using it. Hence, the goal of this
research is to provide a description of the relationship between human needs and context information,
and its role in the personalization of applications. This paper answers the research question: “How to
enhance personalization using context awareness in applications for a better pervasive experience?” The
research work resulted in the development of a user-centric model that embodies the concepts of context
awareness and needs prediction. The proposed model is applicable in pervasive technology aiming to
provide information or services with minimum attention from users to attain a high level of satisfaction. The pragmatic value of this work lies in the fields of ambient assisted living, healthcare, entertainment, and advertising.
Prediction of Alum Dosage in Water Supply by WEKA Data Mining Software
Petchporn Chawakitchareon, Nattanan Boonnao, Rawin Taychamekiatchai, Pakorn Charutragulchai
This paper presents a comparison of prediction methods for the alum dosage used in the water supply treatment process. Artificial neural networks are a common method that has been used in many works. In this research, we compared the results of M5P, M5Rules and REPTree with the results of a multilayer perceptron, one type of artificial neural network. Six input variables related to the coagulation reaction were used: turbidity, alkalinity, pH, conductivity, color and suspended solids. The data in this research were collected from the Bangkhen Branch Office of the Metropolitan Waterworks Authority, Bangkok, Thailand, from 1 January 2006 to 31 July 2015, totalling 3,466 records. To find the most efficient method we used the 10-fold cross-validation technique, which divides the data into ten sets of size n/10 (where n is the number of records). The experimental results showed that M5P yielded the highest accuracy compared to the other methods.
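A minimal sketch of the 10-fold cross-validation protocol described above, on synthetic stand-in data. The study itself uses WEKA's M5P, M5Rules and REPTree, which have no direct scikit-learn equivalents, so a plain regression tree and a multilayer perceptron are used here purely to illustrate the evaluation procedure.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the six inputs (turbidity, alkalinity, pH,
# conductivity, colour, suspended solids) and the alum dosage target.
rng = np.random.default_rng(0)
X = rng.random((500, 6))
y = X @ np.array([3.0, 1.5, -2.0, 0.5, 1.0, 2.5]) + rng.normal(0, 0.1, 500)

models = {
    "regression tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "multilayer perceptron": make_pipeline(
        StandardScaler(), MLPRegressor(max_iter=2000, random_state=0)),
}
cv = KFold(n_splits=10, shuffle=True, random_state=0)  # ten folds of size n/10
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
    print(f"{name}: mean R^2 over 10 folds = {scores.mean():.3f}")
```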
Session: Environment and context studies
Building the Prototype of Vector-Control Strategy Interoperability in Dengue Fever: Case Surabaya,
Kuala Lumpur, Bangkok
Wahjoe Sesulihatien, Yasushi Kiyoki, Shiori Sasaki, Azis Safie, Subagyo Yotopranoto, Virach
Sornlertlamvanich, Aran Hansuebai, Petchporn Chawakitchareon
Dengue fever is a communicable disease that has attacked more than 120 countries in the world over the last 50 years. Therefore, it makes sense to say that collaboration among countries, especially neighbouring countries, is one important key to combating dengue. Currently, apart from serological collaboration, collaboration on dengue is sporadic and temporary. This paper addresses an initiative to build vector-control strategy interoperability among Surabaya (Indonesia), Kuala Lumpur (Malaysia) and Bangkok (Thailand). Deriving the global policy from the World Health Organization (WHO), we build a system that (1) extracts global features from the local features, (2) selects the significant features, weighting each feature to determine a ranking of its importance, and (3) matches the pattern of the data to a suitable strategy by measuring similarity. We built the system from real data from Surabaya, Kuala Lumpur and Bangkok in 2012. We verified the reliability of the system by comparing the data with the real actions taken in January 2012. The results show that the system is feasible to implement, although more preparation is still needed before implementing it.
A Globally-Integrated Environmental Analysis and Visualization System with Multi-Spectral & Semantic Computing in "Multi-Dimensional World Map"
Yasushi Kiyoki, Xing Chen, Shiori Sasaki, Chawan Koopipat
In the design of multimedia data mining systems, one of the most important issues is how to search and
analyze media data according to contexts. We have introduced a semantic associative search method based on our "Mathematical Model of Meaning (MMM)" [1,2,3]. This model is applied to compute semantic correlations between keywords, images, music and documents dynamically in a context-dependent way.
We have constructed "A Multimedia Data Mining System for International and Collaborative Research in
Global Environmental Analysis," as a new platform of a multimedia data mining environment between
our research team and international organizations. This environment is constructed by creating the
following subsystems: (1) Multimedia Data Mining System with semantic associative-search functions and
(2) 5D Space Sharing and Collaboration System for cooperative creation and manipulation of multimedia
objects.
It is very important to memorize those situations and to compute environmental change in various aspects and contexts, in order to discover what is happening in the nature of our planet. There are various (almost infinite) aspects of and contexts for environmental change on our planet, and it is essential to realize a new analyzer for computing differences between those situations, so as to discover the actual aspects and contexts existing in nature. We propose a new method for differential computing in our Multi-dimensional World Map [4,5,6]. We utilize a multi-dimensional computing model, the Mathematical Model of Meaning (MMM), and a multi-dimensional space filtering method with an adaptive axis adjustment mechanism to implement differential computing. By computing environmental changes in multiple aspects and contexts using differential computing, important factors that change the natural environment are highlighted. We present a method to visualize the highlighted factors using our Multi-dimensional World Map.
Semantic computing is an important and promising approach to semantic analysis for various
environmental phenomena and changes in real world. This paper presents a new semantic computing
method with multi-spectral images for analyzing and interpreting environmental phenomena and changes
occurring in the physical world.
We have presented a concept of "Semantic Computing System” for realizing global environmental
analysis. This paper presents a new semantic computing method to realize semantic associative search
for the multiple-colours-spectral images in the multi-dimensional semantic space, that is “multi-spectral
semantic-image space” consisting of (a) Infra-Red filtered axis, (b) Red axis, (c) Green filtered axis, (d) Blue
filtered axis, (e) NDVI axis, and (f) NDWI axis, with semantic projection functions. This space is created for
dynamically computing semantic equivalence, similarity and difference between multi-spectral images
and environmental situations.
The most essential and significant point of our “multispectral-semantic computing method” is that it
realizes “the interpretation of substances (materials)” appearing and reflected in the multi-spectrum
images by using “6-dimensional multi-spectral semantic-image space” and “semantic projection
functions”. That is, this method interprets the substances appearing in the image into “the names of
substances” by using “knowledge of substances” expressed in this semantic-image space. This corresponds to the human-level interpretation when we look at an image and recognize the substances
appearing in the image. This method realizes this human-level interpretation with “multi-spectral
semantic-image space” and “semantic projection functions”.
We apply this system to global environmental analysis as a new platform of environmental computing.
We have already presented the 5D World Map System, as an international research environment with
spatio-temporal and semantic analysers. We also present several new approaches to global
environmental-analysis for multi-spectrum images in “multi-spectral semantic-image space.”
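The following is only a generic sketch of context-dependent similarity computation in a small multi-dimensional feature space; it is not the MMM or the authors' semantic projection functions. It assumes each image has been reduced to a 6-dimensional feature vector ordered as (IR, Red, Green, Blue, NDVI, NDWI), and a "context" simply selects the axes used for comparison.

```python
import numpy as np

def contextual_similarity(a: np.ndarray, b: np.ndarray, axes: list[int]) -> float:
    """Cosine similarity of two feature vectors restricted to the axes
    selected by the current context (a simple subspace projection)."""
    a_sub, b_sub = a[axes], b[axes]
    denom = np.linalg.norm(a_sub) * np.linalg.norm(b_sub)
    return float(a_sub @ b_sub / denom) if denom else 0.0

# Two hypothetical image descriptors in the order (IR, R, G, B, NDVI, NDWI).
img_a = np.array([0.31, 0.12, 0.25, 0.08, 0.62, -0.10])
img_b = np.array([0.28, 0.15, 0.22, 0.10, 0.55, -0.05])

vegetation_context = [0, 4, 5]      # compare only on the IR, NDVI and NDWI axes
print(contextual_similarity(img_a, img_b, vegetation_context))
```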
Monitoring Atmospheric Moisture Using GPS Precipitable Water Vapor
Prawit Uang-aree, Sununtha Kingpaiboon
This article aims to indicate the correlation between climatic change, atmospheric moisture and precipitation using GPS precipitable water vapor (PWV) values and meteorological data in Khon Kaen, Thailand. PWV, average temperature, and precipitation data from 2001 to 2014 were analyzed to determine the changes over the time period. The estimation showed average, maximum and minimum PWV values in Khon Kaen of 48.42, 69.88, and 11.23 mm, respectively, with a standard deviation of 13.42 mm. Additionally, there was an increasing trend of PWV changes following temperature changes, which could be because a warm atmosphere can hold vapor better than a dry one. Again, at high temperatures, water in the environment vaporizes more easily than at low temperatures. However, precipitation tends to decrease, which could be due to the topographical conditions of Khon Kaen, which lies on a high plain surrounded by mountains. As a result, the monsoon wind is not able to bring moisture into the area. Therefore, the slightly increasing moisture cannot be a major cause of precipitation such as storms.
Vietnamese Online Hotel Reviews Classification Based on Term Features Selection
Tran Sy Bang, Choochart Haruechaiyasak, Virach Sornlertlamvanich
This paper presents improved techniques for classifying user feedback on hotel service quality. The data were collected mainly from online feedback sources by a PHP program. The training set was manually tagged as NEGATIVE, POSITIVE, or NEUTRAL. In total, 2969 Vietnamese-language terms were collected. In the first part, common machine learning techniques such as the K-Nearest Neighbor algorithm (KNN), Decision Tree, Naive Bayes (NB) and Support Vector Machines (SVM) were applied for classification. In the second part, we enhanced the efficiency of the text categorization by applying a feature selection technique, χ2 (CHI). At the end of the paper, we conclude that the overall performance of the general machine learning techniques was significantly improved by applying feature selection.
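A minimal scikit-learn sketch of the χ2 feature selection step described above, on a tiny invented corpus (the real study uses 2969 Vietnamese terms and several classifiers; here only Naive Bayes is shown, and the example reviews are placeholders, not the authors' data).

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB

# Toy reviews standing in for the hotel feedback data.
reviews = ["clean room, friendly staff", "dirty room, poor service",
           "average location", "comfortable bed, will return",
           "noisy and overpriced", "okay, nothing special"]
labels = ["POSITIVE", "NEGATIVE", "NEUTRAL", "POSITIVE", "NEGATIVE", "NEUTRAL"]

vec = CountVectorizer()
X = vec.fit_transform(reviews)                     # bag-of-words term counts

selector = SelectKBest(chi2, k=8).fit(X, labels)   # keep the 8 most label-dependent terms
selected = vec.get_feature_names_out()[selector.get_support()]
print("selected terms:", list(selected))

clf = MultinomialNB().fit(selector.transform(X), labels)
print(clf.predict(selector.transform(vec.transform(["friendly staff, clean room"]))))
```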
Accelerating Reinforcement Learning by Mirror Images
Takehiro Kitao, Takao Miura
In this investigation we propose how to accelerate Q-learning, one of the most successful reinforcement learning methods, by using mirror images for hunting problems. Mirror images differ only by a symmetry of the view, and they allow us to accelerate Q-learning dramatically. In this investigation we show that Q-learning runs 2 times faster (one mirror) or 3 times faster (two mirrors), while the capturing ability decreases only slightly. Moreover, we prove that the new approach converges whenever the mirror-free approach does.
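A minimal sketch of the idea, assuming a small grid world whose left–right mirror symmetry is known in advance; the helpers mirror_state and mirror_action are invented for this illustration and are not the authors' formulation. Each experienced transition also updates the Q-value of its mirrored copy, so one sample trains two table entries.

```python
import numpy as np

GRID, N_ACTIONS = 5, 4          # 5x5 states; actions: 0=up, 1=down, 2=left, 3=right
ALPHA, GAMMA = 0.1, 0.9
Q = np.zeros((GRID * GRID, N_ACTIONS))

def mirror_state(s: int) -> int:
    """Reflect a grid position left-right."""
    r, c = divmod(s, GRID)
    return r * GRID + (GRID - 1 - c)

def mirror_action(a: int) -> int:
    """Swap 'left' and 'right' under the reflection."""
    return {0: 0, 1: 1, 2: 3, 3: 2}[a]

def q_update(s: int, a: int, reward: float, s_next: int) -> None:
    """Standard Q-learning update applied to the real transition and its mirror."""
    for st, ac, st2 in ((s, a, s_next),
                        (mirror_state(s), mirror_action(a), mirror_state(s_next))):
        Q[st, ac] += ALPHA * (reward + GAMMA * Q[st2].max() - Q[st, ac])

q_update(s=0, a=3, reward=0.0, s_next=1)  # also updates the mirrored entry (state 4, action 'left')
```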
Session: Multi-cultural environments
Recognising the Culture Context in Information Search
Hannu Jaakkola, Bernhard Thalheim
The importance of information in our daily life is increasing rapidly. Simultaneously, the availability of information from different sources has grown exponentially. Data-oriented work has progressed from the traditional approach based on database queries to the beneficial use of a wide variety of openly available, often non-structured, data sources. The complexity of data needs has also increased, and solutions are based on combined multi-query results. The development has also taken us towards a global context, in which data is used across geographical and cultural borders. Information search is a communication-oriented task in which the cultures of users meet the culture-related aspects of data repositories. A mismatch between users' national-culture-based expectations and the behavior of global information services (information systems and their user interfaces) is the source of a variety of problems. In our paper we analyse the characteristics of information search and the cultural aspects guiding the behavior of the users of information systems. These two approaches are merged in the form of query-answer profiles. The purpose of the paper is to find guidelines that developers and users can generalize and apply in order to survive in a global information system context.
On Modelling e-Education Ecosystems in Multicultural Contexts
Anneli Heimbürger
A sociotechnical system is a complex inter-relationship of people and technology, including hardware, software, data, physical and virtual surroundings, people, procedures, laws and regulations. An e-Education environment is a particularly complex example of a sociotechnical system that requires equal support for user needs and technological innovations. The challenge for e-Education environment development is that, in addition to the producers, users, domain experts and software developers, pedagogical experts are also key stakeholders. In our paper, we discuss different meta-aspects and components of modelling e-Education ecosystems in multicultural contexts.
Session: Database & Multimedia technology
Data Provenance Management Based on Metadata
Frank Kramer, Bernhard Thalheim
Users often need to know the quality of their data, its evolution history, its origin, the workflows of its occurrence, the concurrent users of the data, and the changes applied to the data by others. Such metadata are supported by why-, wherefrom-, how-, who-, where-, whereby-, etc. provenance. Provenance must, however, be systematically maintained. We propose a schematic approach that models provenance metadata based on a database schema for distributed and component-based databases and that is realised using current database technology.
Automating Transformations in Data Vault Data Warehouse Loads
Mikko Puonti, Timo Raitalaakso, Timo Aho, Tommi Mikkonen
Data warehousing is a process of integrating multiple data sources into one, e.g. for reporting purposes. An emerging modeling technique for this is the data vault method. The use of a data vault creates many structurally similar data processing modifications in the transform phase of ETL work. Is it possible to automate the creation of transformations? Based on our study, the answer is mostly affirmative. Data vault modeling places certain constraints on data warehouse entities. These model constraints and the data vault table populating principles can be used to generate transformation code. Based on the original relational database model and data flow metadata we can gather the populating principles. These can then be used to create general templates for each entity. Nevertheless, we note that the use of data flow metadata can be only partially automated and constitutes the only manual work phases in the process. In the end we can generate the actual transformation code automatically. In this paper, we describe the creation of the automation procedure in detail and analyze the practical problems based on our experience with a PL/SQL proof-of-concept implementation. To the best of our knowledge, a similar approach has not yet been described in the scientific literature.
An Application of Neural Network in Method for Use Case based Effort Estimation
Radoslav Štrba, Svatopluk Štolfa, Jakub Štolfa, Ivo Vondrák, Václav Snášel
Effort overruns are a common problem in software development. Our main intention is to support estimation by a method for the classification of use cases. The goal of this paper is to evaluate the use of a feed-forward neural network for use case classification. Experimental results show that a feed-forward neural network classifier, using a softmax activation function in the output layer and a hyperbolic tangent activation function in the hidden layer, offers the best classification performance.
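A short sketch of the network configuration named above, using scikit-learn on synthetic stand-in data (the real features describing the use cases are not reproduced here). MLPClassifier uses a softmax output for multi-class targets, and the activation argument sets the hidden-layer function.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Stand-in data: feature vectors describing use cases, labelled by effort class.
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Feed-forward network: hyperbolic-tangent hidden layer, softmax output.
clf = MLPClassifier(hidden_layer_sizes=(16,), activation="tanh",
                    max_iter=2000, random_state=0).fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```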
Building Change Detection via Semantic Segmentation and Difference Extraction Method
Siti Nor Khuzaimah Binti Amit, Shuta Saito, Yasushi Kiyoki, Yoshimitsu Aoki
Google Earth, with its high-resolution imagery, basically takes months to process new images before online updates. This is a time-consuming and slow process, especially for post-disaster applications. In this study, we aim to develop a fast and accurate method of updating maps by detecting local differences that have occurred over different time series, where only regions with differences are updated. In our system, aerial imagery from the Massachusetts buildings open dataset is used as training data, while Saitama district datasets are used as input images. Semantic segmentation is then applied to the input images to obtain predicted map patches of buildings. Semantic segmentation is a pixel-wise classification of images, implemented here with a convolutional neural network. The convolutional neural network technique is used because it is not only efficient in learning highly discriminative image features such as buildings, but also partially robust to incomplete and poorly registered target maps. Next, in order to understand the overall changes that have occurred in an area, both semantically segmented images of the same scene undergo a change detection method. Lastly, a difference extraction method is applied to specify the category of building change. The results reveal that our proposed method is able to overcome the current time-consuming map updating problem. Hence map updating becomes cheaper, faster and more effective, especially for post-disaster applications, by leaving unchanged regions as they are and updating only the changed regions.
Human-Microbiome-Relations Extraction Method with Context-dependent Clustering and Semantic
Analysis
Shiori Hikichi, Shiori Sasaki, Yasushi Kiyoki
Human-microbiome-relation extraction is important for analyzing the effects on the human gut microbiome of differences in human attributes such as country, sex, age and so on. The human gut microbiome, a set of bacteria, has various pathological and biological impacts on the hosting human body. This paper presents a new analytical method for data resources that are difficult to understand, such as the human gut microbiome, by extracting unknown relations with other adjunct metadata (e.g. human attribute data) using context-dependent clustering and semantic analysis. This method realizes the acquisition of the significant bacterial components for categorizing human attributes. The most important feature of our method is that it analyzes the unknown relations between humans and the microbiome with or without a correlation between a human attribute and bacteria found in related bacteriological studies. With this method, an analyst is able to grasp an overview of the bacteria data clustered by several clustering algorithms (k-means clustering / hierarchical clustering), using bacteria data selected by human attributes as a set of contexts. In addition, even without an association between human attributes and bacteria as heuristic knowledge, an analyst is able to extract human-microbiome relations focusing on a number of bacteria selected from all bacteria combinations by one-way analysis of variance (ANOVA) and our original criterion called the "degree of separation" of the clustering. This paper also presents an experimental study on human-microbiome-relation extraction and the experimental results that show the feasibility and effectiveness of this method.
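A minimal sketch of the two analysis steps mentioned above (context-dependent k-means clustering followed by one-way ANOVA), on synthetic abundance data; the data layout and the choice of "context" taxa are assumptions made for the example, not the authors' dataset.

```python
import numpy as np
from scipy.stats import f_oneway
from sklearn.cluster import KMeans

# Rows = human subjects, columns = relative abundances of bacterial taxa.
rng = np.random.default_rng(0)
abundances = rng.random((60, 8))                  # 60 subjects x 8 taxa (synthetic)

# 1) Cluster subjects using only the taxa selected by the current context.
context_taxa = [0, 2, 5]
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    abundances[:, context_taxa])

# 2) One-way ANOVA per taxon: does its abundance separate the clusters?
for taxon in range(abundances.shape[1]):
    groups = [abundances[labels == k, taxon] for k in range(3)]
    f_stat, p_value = f_oneway(*groups)
    print(f"taxon {taxon}: F = {f_stat:.2f}, p = {p_value:.3f}")
```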
Thursday, 9.6.2016
Session: Learning and prediction
Power law vs. exponential law in artificial learning
Boštjan Brumen, Ivan Rozman, Aleš Černezel
Human cognitive performance has received quite a lot of attention in research: the power function is generally accepted as an appropriate description in psychophysics, in skill acquisition, and in retention. Power curves have been observed so frequently, and in such varied contexts, that the term "power law" is now commonplace. Recently, some arguments have arisen against the power law; the main argument is that it holds only at the aggregate level, while at the level of a specific learner the exponential law fits much better. Thus, in human learning performance, the power law is a common description of a population of learners, and the exponential law has recently been proposed for a single person. Interestingly, this dilemma has not yet been addressed properly in the machine learning world. This paper addresses the problem of a proper functional description of an artificial learner at the individual and aggregate levels.
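For reference, the two competing curve families discussed above are commonly written as follows, with E(n) the error (or response time) after n practice trials and a, b, c fitted constants; this is the standard textbook form, not a formula taken from the paper.

```latex
E_{\mathrm{power}}(n) = a\, n^{-b} + c
\qquad \text{vs.} \qquad
E_{\mathrm{exp}}(n) = a\, e^{-b n} + c
```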
MERS-CoV Spread Prediction in Saudi Arabia: A Conceptual Model
Alhanouf Alnasser, Lujain Althunayan, Nuha Alnabit, Noor Alothaim, Wafa Alanazi, Ajantha Dahanayake
The recent spread of a new virus known as Middle East respiratory syndrome coronavirus (MERS-CoV) in
Saudi Arabia indicates a worrisome occurrence of a new epidemic in the region. The impact of the
emergence of MERS-CoV and the consequences of the spread of the virus have caused a concern for
health authorities and the public in Saudi Arabia. The severity level of the spread of MERS-CoV remains
unpredictable as the spread pattern increases exponentially. Therefore, the situation highly demands
establishing control and preventive measures. The purpose of this study is to develop a conceptual model
to predict the spread level of MERS-CoV. The main aim of this research is to develop and evaluate the
model. A similar model for Dengue Disease has been looked at and used as a foundation for this study.
Several factors that strongly influence the spread of MERS-CoV are identified and incorporated into the model. These factors are temperature, humidity, mass gatherings, hospitals, and social trends. A point system and a theoretical example are proposed to scale each factor and to predict the severity level of the spread of MERS-CoV by weighting these factors for a specific month in a specific city. The aim of such prediction is to provide an early warning for health authorities so that they can reassess their current disease prevention and control measures.
Towards employee based knowledge interactions to facilitate group learning within a team
collaboration tool: An exploratory case study analysis
Tero Kaisti, Rauno Pirinen
The intent of this study is a case study analysis of an organizational knowledge nexus and the related collective learning in the customer support operations of a high-tech company. The unit of analysis was an existing information flow as knowledge-related interaction within a customer support process. The study focused on organizational learning targets, namely strengthening the effectiveness of the organization's knowledge building and management. The exploratory social network analysis indicates that experience and tenure within the company are a common denominator of key personnel. In addition, employees' contribution ratio to the community progresses over time as they gradually shift from learners to mentors. The findings derived from the social network analysis of a team collaboration tool need to be further researched in conformity studies.
Emotions Recognition System for Acoustic Music Data based on Human Perception Features
Tatiana Endrjukaite, Yasushi Kiyoki
Music plays an important role in human life. It is not only a set of sounds – music evokes emotions subjectively perceived by listeners. The growing amount of audio data creates a need for content-based searching. Traditionally, tune information has been retrieved based on reference information, for example the title of a tune, the name of an artist, the genre and so on. When users would like to find music pieces in a specific mood, such standard reference information is not sufficiently effective. We need new methods and approaches to realize emotion-based search and tune content analysis. This paper proposes a new music-tune analysis approach to realize automatic emotion recognition by means of essential musical features. The innovativeness of this research is that it uses new musical features for tune analysis, based on human perception of music. The most important distinction of the proposed approach is that it covers a broader range of tune genres, which is very significant for a music emotion recognition system. Describing emotions on a continuous plane instead of in categories supports more adjectives for emotion description, which is also a great advantage.
Content Aware Playlist Generation with Multi-Dimensional Similarity Measure
Jan Wohlfahrt-Laymann, Anneli Heimbürger
Music players and cloud solutions for music recommendation and automatic playlist creation are becoming increasingly popular, as they aim to overcome the difficulty users have in finding fitting music based on context, mood and impression. Much research on the topic has been conducted, recommending different approaches to overcome this problem. This paper suggests a system which uses a multi-dimensional vector space, based on the music's key elements as well as the mood expressed through them and the song lyrics, which allows differences and similarities to be found in order to automatically generate a contextually meaningful playlist.
A Multispectral Imaging and Semantic Computing System for Agricultural Monitoring and Analysis
Jinmika Wijitdechakul, Yasushi Kiyoki, Shiori Sasaki, Chawan Koopipat
Multispectral images are widely used in environmental analysis to detect objects or phenomena that human eyes cannot capture. They are one of the main types of images acquired by remote sensing, such as satellite or aircraft earth observation. This paper presents a multispectral analysis of aerial images captured by dual cameras (a visible and an infrared camera) mounted on an unmanned aerial vehicle (UAV), or drone. In our experiments, four spectral bands (three visible and one infrared band) were imaged, processed and analyzed to detect agricultural areas and measure the health of vegetation. To interpret environmental phenomena and realize an environmental analysis, this study applies semantic analysis by creating a multispectral semantic-image space, combined with three numerical indicators – the normalized difference vegetation index (NDVI), the normalized difference water index (NDWI) and the soil-adjusted vegetation index (SAVI) – that can be used to analyze plant health and photosynthetic activity and to detect environmental objects in order to determine an agricultural area. This paper also proposes the concept of a multispectral semantic-image space for agricultural monitoring, defining the correlational meanings of multi-dimensional parameters related to agricultural analysis in order to realize and explain agricultural conditions. This paper presents an experimental study on a rice field, a cornfield, a salt farm and a coconut farm in Thailand.
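The three indicators named above have standard definitions in terms of the spectral bands; they are reproduced here for reference (NDWI in the green/NIR form, and L in SAVI is the soil-brightness correction factor, typically about 0.5). The exact variants used in the paper are not specified here.

```latex
\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}, \qquad
\mathrm{NDWI} = \frac{\mathrm{Green} - \mathrm{NIR}}{\mathrm{Green} + \mathrm{NIR}}, \qquad
\mathrm{SAVI} = \frac{(\mathrm{NIR} - \mathrm{Red})\,(1 + L)}{\mathrm{NIR} + \mathrm{Red} + L}
```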
Session: Social media technology
Integrating culture into crowdsource product designing
May Al-Sohibani, Ajantha Dahanayake
Crowdsourcing platforms provide various services and products to crowds of different backgrounds and cultures. These differences affect the results of the services and products that are developed using crowdsourcing platforms. This study presents how cultural factors can be integrated into the design activities of crowdsource product designing. The research presents the activities of crowdsource product designing and the cultural factors, to derive the theoretical underpinning for formulating cultural factors for crowdsource product designing. The methods that are used to adapt the cultural factors into crowdsourcing product design platforms are also derived. The research illustrates the design of the user interface of crowdsourcing product design platforms taking cultural factors into consideration, and is validated by prototyping and conducting test cases. Finally, it presents a discussion of the research results and explains the impact of cultural factors on crowdsource product designing, thus confirming the necessity of designing platform activities that integrate cultural factors in order to satisfy the needs of crowdsource product design users.
Tag Suggestions from Social Media Profiles
Petri Rantanen, Pekka Sillberg, Jari Soini, Hannu Jaakkola
Attaching any kind of clue – an event, location, person, tag or keyword – to a photo eases the process of searching. Often the problem is that the user finds it difficult to think of good tags or feels that the tagging process is too tedious or cumbersome. At the same time, users use social media daily and write about topics they feel are important and are actively interested in. This paper presents a method for extracting metadata (tag suggestions) from social media profiles and illustrates the use of the tags for photo tagging by means of a web-based photo application.
Evaluation Indexes of Customer Journey for Contents of Owned Media
Kyohei Matsumoto, Takafumi Nakanishi, Takashi Kitagawa
In this paper, we propose an evaluation of the customer journey for the contents of Owned Media. In recent years, many companies have published Owned Media in order to brand their products and services. Owned Media is useful for providing new information correctly and rapidly. Against this background, there is a demand for evaluating the effectiveness of Owned Media. It is necessary to recognize which detailed contents of the Owned Media are important. Our proposed method makes it possible to produce attractive contents by assessing which contents reach customers. We demonstrate the evaluation using a certain website as Owned Media and show the effectiveness of our method.
Optimization of processes in multimedia content publishing: a proposal for architecture design
Boštjan Šumak, Marjan Heričko, Tatjana Welzer Družovec, Maja Pušnik
Organizations in the publishing field accumulate large amounts of data and store them in usually poorly organized knowledge databases. In the publishing field specifically, the presentation of various multimedia contents is involved, including contemporary e-book standard formats (ePub) as well as traditional standard formats such as HTML, PDF, and others. Publishing organizations must provide the same content in various formats in order to meet the needs of their clients. However, business (and organizational) processes in publishing typically have little IT support and almost no automation. Successful business flow and competitive advantage for publishing organizations, especially for SMEs, require a holistic solution for the optimization of publishing processes, including data/knowledge acquisition, data/knowledge aggregation and the creation of new knowledge based on existing data. Although partial IT solutions are available and can be used to support individual steps in the publishing process, such solutions do not come as out-of-the-box solutions suitable for adequate automation and optimization of publishing processes. Two basic challenges with the existing solutions available on the market are: (1) they do not provide support for the automation of individual steps in the publishing processes, and (2) they are usually very expensive and consequently not acceptable for SMEs. This paper focuses on analyzing publishing processes and their quality aspects, emphasizing the automation of all possible process steps, especially the optimization of the acquisition, aggregation and building of multimedia content for any device. In addition, a solution built on open-source components is proposed. Based on a case study, we present the solution's architectural design and analyze standards and technologies, providing a solution that is acceptable for SMEs from the economic point of view and optimizes their business processes to the largest possible extent. A renewed publishing process is proposed and future research directions are discussed.
Session: Data analysis
Classification of imbalanced data with allocation method and sampling
Sašo Karakatič, Marjan Heričko, Vili Podgorelec
In this paper we deal with the classification of imbalanced data with an ensemble technique – the allocation method. This method is a two-level classifier that combines unsupervised and supervised learning, where unsupervised anomaly detection is used as an allocator. The allocation method is tested on 10 imbalanced datasets and the results are compared to two widely used sampling methods. For under-sampling we used under-sampling of majority instances, and for over-sampling we used SMOTE, which introduces new artificial instances of the minority class into the dataset. The results of all methods were compared on accuracy and average F-score metrics. The results show that the allocation method clearly produces the best classification model, which is also supported by statistical analysis.
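A minimal sketch of the two sampling baselines described above, using the imbalanced-learn package as a convenience (the package is an assumption of this sketch and is not named in the paper); the allocation method itself is not reproduced here.

```python
from collections import Counter

from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from sklearn.datasets import make_classification

# Imbalanced toy data: roughly 5% minority class.
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.95, 0.05], random_state=0)
print("original class counts:", Counter(y))

# Over-sampling: SMOTE synthesizes new minority instances by interpolating
# between a minority point and its nearest minority-class neighbours.
X_over, y_over = SMOTE(random_state=0).fit_resample(X, y)
print("after SMOTE:          ", Counter(y_over))

# Under-sampling: randomly drop majority instances instead.
X_under, y_under = RandomUnderSampler(random_state=0).fit_resample(X, y)
print("after under-sampling: ", Counter(y_under))
```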
Time series analysis and forecasting technique for converting industrial waste management: Case
study of a tape converting production in Thailand
Krittiya Lertpocasombut, Supawut Sriploy
The aim of this research is to explore a fitting forecasting technique for the existing waste data of a tape converting production using time series methods. The optimal time series technique would minimize the error between actual and forecast data, compared across three types of time series techniques. The data analysis, referring to the accuracy of the outputs, showed that "Double Exponential Smoothing" is the preferable choice, since its error value is lower than that of the other techniques; the projected forecast values are 13.09, 11.08, 11.77 and 10.25 (in units of 10,000 kilograms) for January, February, March and April 2015. After benchmarking the error values (MAPE) against similar techniques and problems from other studies, the error value of this study (18%) was lower than those of the benchmarking sources (18% [8] and 33% [9], respectively). This technique was more accurate than the benchmarked techniques by 17% and 83%, respectively. After rechecking the forecast time series data against actual data, we found that the average MAPE was around 15%, so this error value is still lower than the error values reported in other reference papers.
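For reference, one common form of double exponential smoothing (Holt's linear-trend version) and the MAPE measure used for benchmarking are given below; these are the standard textbook equations, and the paper may use a slightly different variant (e.g. Brown's).

```latex
\begin{aligned}
\ell_t &= \alpha\, y_t + (1-\alpha)\,(\ell_{t-1} + b_{t-1}) && \text{(level)} \\
b_t    &= \beta\,(\ell_t - \ell_{t-1}) + (1-\beta)\, b_{t-1} && \text{(trend)} \\
\hat{y}_{t+h} &= \ell_t + h\, b_t && \text{($h$-step forecast)} \\
\mathrm{MAPE} &= \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|
\end{aligned}
```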
Friday, 10.6.2016
Session: Information and knowledge
Logic of Inferable Knowledge
Marie Duží, Marek Mensik
Intensional epistemic logics are not apt for handling properly the specification of communication and
reasoning of resource-bounded agents in a multi-agent system. They oscillate between two unrealistic extremes: either the explicit knowledge of an
‘idiot’ agent, deprived of any inferential capabilities, or the implicit knowledge of an agent who is a
logical/mathematical genius. The goal of this paper is to introduce the notion of inferable knowledge of a
rational yet resource-bounded agent. The stock of inferable knowledge of such an agent a is the closure
of a chain-of-knowledge sequence validly derivable from a’s existing stock of explicit knowledge via one
or more rules of inference that a masters. We are using Pavel Tichý’s Transparent Intensional Logic as our
framework. This logic models knowing as a relation-in-intension between an agent and a construction (a
hyperintensional mode of presentation of a possible-world proposition) rather than a set of possible
worlds or a piece of syntax. We motivate the restriction of the epistemic closure principle to inferable
knowledge, present the theoretical framework, define the concept of inferable knowledge, and explain
the technicalities of the so restricted closure principle.
Information and Interaction
Jaak Henno
This paper considers information and information growth in interactions between Information Processing Systems (IPS). Information does not exist 'per se' – it is always stored in some Information Processing System (a living system, social system, business system, administrative/government system etc.). All IPSs are finite and can be modelled as Finite State Machines (FSM). They are connected with each other and interact by exchanging messages. Their messages (responses to input queries) reveal to others information about their functioning; thus IPSs with more memory learn behaviour, i.e. infer the information stored in other, smaller IPSs. Thus, in a network of connected IPSs, information accumulates in the IPSs with more memory, and smaller IPSs become parts of ever greater 'super' IPSs – IPSs on the next level of the IPS development hierarchy. The whole of human society is currently moving into a new era – the era of networked Super Information Processing Systems.
An Ontology-Driven Approach for Expert Knowledge Acquisition in the Medical Field
Nassim Abdeljallal Otmani, Catherine Comparot, Malik Si Mohammed, Pierre-Jean Charrel
Each discipline, to some extent, has its own concise and precise vocabulary used to describe unambiguously the special concepts of the domain and the relationships binding them. In medical science, for instance, doctors use specialized vocabulary and knowledge for (1) effective and efficient communication, such as filling in an EHR (Electronic Health Record), and (2) problem solving, such as the diagnosis process. For those who are unfamiliar with that vocabulary, it is hard to express relevant information, such as describing symptoms to a doctor in order to be diagnosed. In our work, we aim at bringing common-sense knowledge and basic vocabulary closer to expert knowledge for effective communication between what we call a layperson and an expert, illustrated with a case of patient/doctor communication. In this paper, we define a communication process, pointing out the beneficial use of expert knowledge and the choice of ontology-driven modelling, which enhances the notion of progressivity in the knowledge acquisition process. We also define our cyclic acquisition process step by step, starting from the first and foremost step of processing the messages, through information extraction and the reasoning process, to the last but not least step of outputting the ontological representation.
Detecting Topic Evolutions in Bibliographic Databases Exploiting Citations
Hiroyoshi Ito, Toshiyuki Amagasa, Hiroyuki Kitagawa
This paper proposes a scheme for detecting topic evolutions in bibliographic databases. There are many scientific bibliographies, such as DBLP, CiteSeerX, MEDLINE/PubMed, ADS, arXiv, etc., and it has therefore become extremely important to extract useful information from these databases. It should be noted that, in such databases, citations play a crucial role in representing relationships among different publications. To make the best use of citation information, as well as textual features, for extracting topic evolutions in a bibliographic database, we propose a scheme based on non-negative matrix factorization (NMF). More precisely, we first partition the set of publications in a database according to their publication years, and apply NMF to extract clusters of publications. Notice that we take citation information into account when performing NMF for better clustering. Having obtained sets of publications for each time span, we associate similar clusters in consecutive time spans according to their similarity. Thus we can obtain the time evolution of topics and clusters of publications. In the experiments we demonstrate that the proposed scheme can successfully extract topic evolutions from real bibliographic databases, CiteSeerX and arXiv.
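A minimal sketch of the pipeline described above on a toy corpus split into two publication-year slices; the citation-aware part of the factorization is omitted, and topics are matched across years with a simple cosine similarity over shared vocabulary terms. Everything here is illustrative rather than the authors' implementation.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

slices = {
    2014: ["matrix factorization for recommendation",
           "collaborative filtering with matrix factorization",
           "topic models for text clustering"],
    2015: ["deep matrix factorization for recommendation",
           "neural topic models for document clustering",
           "word embeddings for text classification"],
}

# Factorize each yearly slice separately: rows of components_ are per-slice topics.
topics = {}
for year, docs in slices.items():
    vec = TfidfVectorizer()
    H = NMF(n_components=2, init="nndsvda", random_state=0).fit(
        vec.fit_transform(docs)).components_
    topics[year] = dict(zip(vec.get_feature_names_out(), H.T))

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

# Associate topics across consecutive years via their weights on shared terms.
shared = sorted(set(topics[2014]) & set(topics[2015]))
H_a = np.array([topics[2014][t] for t in shared]).T   # topics x shared terms
H_b = np.array([topics[2015][t] for t in shared]).T
for i, row in enumerate(H_a):
    sims = [cosine(row, other) for other in H_b]
    print(f"2014 topic {i} -> 2015 topic {int(np.argmax(sims))} (sim {max(sims):.2f})")
```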