A Short Report of the Conference

Zheng Li
Student number: 1003643
Supervisors: Mark Atherton, David Harrison
19 April 2013

Briefing

The International Information Conference on Data Search, Mining and Visualisation 2013 (II-SDV) closed on 17 April in Nice, France. Over 100 people attended. Around 10 information processing companies demonstrated their latest software products, including Questel (France), Basis Technology (USA), VantagePoint (USA), Linguamatics (UK), Patinformatics (USA) and Visualizing Data (UK). Staff from the European Patent Office (EPO), the World Intellectual Property Organization (WIPO), the Instituto Nacional da Propriedade Industrial (INPI) and the National Research Council Canada (NRC) attended, and 21 speakers gave presentations.

Subjects: The conference was mainly about big data processing. Half of the presentations dealt with social media data (Twitter, Facebook, public webpages, etc.); the other half were directly relevant to our research target, patents.

Methods: 4 presentations focused on data search methods, 5 concentrated on data mining, 3 covered the latest results in data visualisation, and the remaining 9 presented combined tools for data search, mining and visualisation.

Speakers: 15 came from data processing consultancies, 2 from industrial companies (one in industrial chemistry, one in pharmacy), 1 from a library, 2 from government-related organisations, and 1 from a university (us).

Aims: Most representatives want to gain market information and competitive intelligence from big data through their software, and to sell either the results (services under licence) or the tools (software packages) to individuals, small and large industrial companies, governments and universities. Most of the methods presented are good at data storage, search, mining and visualisation, but they offer only objective results, with no business strategy suggestions.
Neither patent strategy formulation nor infringement identification was mentioned. However, one speaker, Dr Stellmach from the EPO, did give a talk on how to identify novelty and inventive step in a patent application.

Results

The same problems recur across the different areas of data search, mining and visualisation: language processing, fake data, time delay, classification criteria, and so on. There is now a shared understanding that these are the key issues.

Data processing services are becoming customised. More services are being developed to suit the different demands and interests of various information hunters.

Various kinds of data are being integrated. More types of data (patents, Twitter posts, emails and pictures) are included in data processing.

Big data is not new, and the main methods of data processing have not changed much. Over the last two decades, however, there has been significant progress in identifying the purposes and application areas of data processing. These commercial purposes and computerised applications are what drive the rise of big data processing.

Only a few methods can yet deal with graphic mining in patents. The technical drawings and sketches in patents carry significant value, so more mining methods are needed for them.

On domain-specific questions, computers alone are of little help. Speakers made this point more than once, even though some well-developed machine learning methods and expert systems exist. Industry experts remain crucial in evaluating technologies.

Mature tools have been developed for data storage and visualisation. However, many problems remain in data search and mining methods, for example the feasibility and validity of data classification.

Patent data mining methods differ between industries.
For instance, chemistry patents contain many chemical formulae, which fall under text mining. Patents in life science and biology contain long lists of disease names, which makes classification, keyword search and mining very complicated. The evaluation of mechanical patents (technical drawings and sketches), which are our own concern, relies on expert opinion, as Mr Hill from Questel confirmed in a case study of patent mining for swimming pool cleaners.

Conclusions

The number of specialised data scientists and researchers is currently decreasing. One reason is that data processing, a cross-disciplinary field, is becoming more complicated and its subfields increasingly overlap, and the current education system cannot supply enough graduates. Another reason is that the aims and tasks of data search, mining and visualisation are becoming deeper and more difficult to achieve.

On the future of data processing, half of the attendees agreed that language mining in patents, webpages and emails is one of the most urgent problems, and they expect new results within the next decade. The other half believe that information security will be the primary issue in five to ten years' time.

It is widely accepted that data integration (texts, pictures, sounds, etc.) is the trend, and that methods of search, mining and visualisation for big data will become increasingly important.