A Short Report of the Conference
Zheng Li
Student number: 1003643
Supervisors: Mark Atherton, David Harrison
19 April 2013
Briefing
The International Information Conference on Data Search, Mining and Visualisation
2013 (II-SDV) closed on 17 April in Nice, France. Over 100 people attended the
conference. Around 10 information processing companies demonstrated their latest
software products, including Questel (France), Basis Technology (USA), VantagePoint
(USA), Linguamatics (UK), Patinformatics (USA) and Visualizing Data (UK). Staff
from the European Patent Office (EPO), the World Intellectual Property Organization
(WIPO), the Instituto Nacional da Propriedade Industrial (INPI) and the National
Research Council Canada (NRC) attended, and 21 speakers gave presentations.
• Subjects: the conference was mainly about big data processing. Half of the
presentations covered social media data processing (data from Twitter,
Facebook, public webpages, etc.); the other half were directly relevant to
our research target, patents;
• Methods: 4 presentations focused on data search methods, 5 concentrated on
data mining, 3 presented the latest results in data visualisation, and the
remaining 9 presented combined tools for data search, mining and
visualisation;
• Speakers: 15 came from data processing consultancies, 2 from industrial
companies (one in industrial chemistry, one in pharmaceuticals), one from a
library, 2 from government-related organisations, and one from a university
(us);
• Aims: most representatives want to extract market information and
competitive intelligence from big data with their software, and to sell the
results (services under licence) or the tools (software packages) to
individuals, small and large industrial companies, governments and
universities. Most of the methods they offer are good at data storage, search,
mining and visualisation, but they deliver only objective results, with no
suggestions on business strategy. Neither patent strategy formulation nor
infringement identification was mentioned. However, one speaker, Dr Stellmach
from the EPO, did give a talk on how to identify novelty and inventive step in
a patent application.
Results
• The problems across the different areas of data search, mining and
visualisation are largely the same: language processing, fake data, time
delay, classification criteria, etc. These are now commonly recognised as the
key issues;
• Data processing services are becoming customised. More services are being
developed to suit the different demands and interests of the various
information hunters;
• Various kinds of data are being integrated. More types of data (patents,
Twitter posts, emails and pictures) are now included in data processing;
• Big data is not new, and the main methods of data processing have not changed
much. Over the last two decades, however, there has been a significant
improvement in identifying the purposes and application areas of data
processing. These commercial purposes and computerised applications are the
drivers behind the rise of big data processing;
• Still, only a few methods can deal with graphic mining in patents. The
information held in technical drawings and sketches in patents is
significant, which calls for more mining methods;
• When it comes to specific issues, computers alone can do little. Speakers
mentioned this more than once, despite the existence of some well-developed
machine learning methods and expert systems. Industry experts remain crucial
in the evaluation of technologies;
• Mature tools have been developed for data storage and visualisation.
However, many problems remain in data search and mining methods, for example
the feasibility and validity of data classification;
• Patent data mining methods differ between industries. For instance,
chemistry patents contain many chemical formulae, which are handled by text
mining; patents in life science and biology contain many lists of disease
names, which make classification, keyword search and mining very complicated;
and the evaluation of our own concern, mechanical patents (technical drawings
and sketches), relies on expert opinion, as confirmed by Mr Hill from Questel
in their case study of patent mining for swimming pool cleaners.
Conclusions
Nowadays, the number of specialised data scientists or researchers is decreasing. One
reason is that data processing, a cross-disciplinary field, is becoming more and more
complex and its sub-areas increasingly overlap, so the current education system cannot
supply enough graduates. Another reason is that the aims and tasks of data search,
mining and visualisation are becoming deeper and more difficult to achieve.
As for the future of data processing, half of the attendees agree that language mining
in patents, webpages and emails is one of the most urgent problems and expect new
results within the next decade; the other half believe that information security will
be the foremost issue in five or ten years' time.
It is commonly agreed that data integration (texts, pictures, sounds, etc.) is the trend,
and that methods of search, mining and visualisation in big data will become
increasingly important.