International Journal of Enhanced Research Publications, ISSN: XXXX-XXXX
Vol. 2 Issue 4, April-2013, pp: (1-4), Available online at: www.erpublications.com
Knowledge Representation for Artificial Intelligence to Natural Intelligence
Priyanka Jain, Priyanka Pawar
Centre for Development of Advanced Computing, Pune, India
[email protected], [email protected]
Abstract - The motivation behind this work is to pursue the benefits of expertise developed under Artificial Intelligence and to explore natural intelligence: the internal information processing mechanisms of the brain and the processes involved in perception and cognition. Building on theories of cognitive computing, we attempt to simulate the knowledge acquisition process as it happens in natural intelligence, using brain behaviours such as thinking, inference, and learning. We propose a knowledge acquisition design based on the OAR model for computer vision. In this paper, knowledge understanding is developed for Vision to Text [V2T] and for Text to Vision [T2V] as two separate applications.
Keywords - Cognitive Computing, Knowledge Acquisition, Natural Intelligence, Artificial Intelligence, Image Processing, Natural Language Generation, Natural Language Visualization, Human Computer Interface, Object-Attribute-Relation (OAR) model.
Introduction
Over the last decades, the ideology of informatics and its concept of the object of information have evolved from modern informatics and classic information theory to cognitive informatics. A fundamental research area in cognitive informatics is the cognitive model of internal information representation and knowledge presentation in human brains. Cognitive Informatics (CI) is a transdisciplinary enquiry of computer science, information science, cognitive science, and intelligence science that investigates the internal information processing mechanisms and processes of the brain and natural intelligence, as well as their engineering applications in cognitive computing. Recent studies in cognitive computing reveal that the computing power in computational intelligence can be classified at four levels, from the bottom up: data, information, knowledge, and intelligence. Vision is a remarkable feat of computational intelligence, and its backbone is the understanding of images and videos. Computer image analysis and machine vision are based on image processing algorithms. Choosing the best concept to express a meaning is one of the hardest tasks in natural language generation. A vision system navigating in an environment should be able to recognize what the main objects in the scene are. The generated language can be spoken aloud for a visually impaired or differently abled person, enabling a blind user to acquire the description of what is drawn in a picture.
This paper describes work on the development of an Object-Attribute-Relation (OAR) model. The OAR model illustrates how knowledge and information are represented in the brain. It can also be used to analyse and demonstrate a wide range of cognitive mechanisms and mental processes in natural and artificial intelligence, such as learning, comprehension, and reasoning.
RELATED WORK
Cognitive Computing is an emerging paradigm based on intelligent computing methodologies and systems of Computational Intelligence. Computational intelligence has been implemented through autonomous inference and perception simulations of brain mechanisms [1]. Cognitive Computing has been developed and materialized through integrative research in Computational Intelligence [2]. A Cognitive Learning Engine, known as the "CPU" of Cognitive Computing [3] and based on concept algebra, is being developed in the Cognitive Informatics and Cognitive Computing Lab [4]. This engine implements the basic and advanced cognitive computational operations on concepts and knowledge. A fundamental solution to this work may also be linked to computing with natural language and computing with words [5]. Lau et al. (2003) outlined a research agenda for automatically acquiring procedural knowledge for use in autonomic systems. It is based on learning procedures by observing experts perform them on live systems and dynamically building a procedural model that can be executed on a new system to repeat the same task [27].
Work has started on natural language generation systems, and a large body of work has been reported in image processing and computer graphics. Recent trends in communication rely heavily on multimodal systems. We conducted a literature survey and present some of the related work here. The Poet Image Description Tool, developed by the DIAGRAM Center, is an open-source, web-based tool for creating image descriptions for images in existing DAISY books [21].
Lately, domain adaptation for computer vision applications has been attempted in several works that use language to aid image scene understanding. Pre-determined production rules have also been used to describe the actions in videos [6]. News captions were used to assign names to faces appearing in images [7], and this work was extended to associate poses detected in the images with verbs in the caption [8]. The above-mentioned approaches use well-aligned examples from a finite caption corpus to learn a combined image-text model, so that new, unfamiliar images can easily be annotated with textual information. These efforts have not been tested on complex everyday images, where the immense variation of objects and poses makes it very difficult, if not impossible, to learn a general model. An attempt was made to introduce a framework that parses images and videos to produce textual descriptions [9]. Work has been done on generating sentences learned from a set of human-annotated examples [10]; if images share common properties, the same sentence is produced, and novel sentences are not generated from these annotations. Natural language generation (NLG) is a deep-rooted problem, and classic approaches based on selection, planning, and realization have been introduced [11]. More recently, work has addressed approaches that generate from formal and specific inputs, such as the output of theorem provers [12] or databases [13]. A notable peripheral problem that has been ignored in the generation literature is how to deal with noisy inputs; most of the effort here is focused on selection and realization.
Cognitive Informatics for Knowledge Acquisition Using the OAR Model
Cognition covers any internal mental process, including phenomena such as perception, recognition, imagining, remembering, thinking, judging, reasoning, problem solving, conceptualizing, and planning. These cognitive processes can arise from human language, thought, imagery, and symbols. Cognitive informatics theory is often referred to as "information processing" [14].
Knowledge acquisition is a basic building block of cognitive informatics. It can be seen as the practice by which a knowledge engineer assembles information, principally from experts, but also from technical manuals, research papers, textbooks, and other recognized sources, for eventual translation into a knowledge base that is intelligible to both machines and humans. Artificial Intelligence is the branch of computer science that deals with ways of representing knowledge using symbols rather than numbers, and with rule-of-thumb, or heuristic, methods for processing information.
One of the fundamental issues in artificial intelligence is the problem of knowledge representation [17]. Intelligent machines must be provided with a precise definition of the knowledge they possess, in a manner that is independent of procedural considerations, context-free, and easy to manipulate, exchange, and reason about. Any comprehensive approach to knowledge representation has to take into account the inherently dynamic nature of knowledge: as new information is acquired, new pieces of knowledge need to be dynamically added to or removed from the knowledge base. For inference, decision-making, dialogue exchange, or machine learning, the fundamental requirement is reasoning. Reasoning is the process of arriving at new conclusions. To reach a conclusion, we generally draw on certain investigations; if these investigations are not formally represented in a knowledge representation language that is clear and user-friendly, reasoning becomes a daunting task.
[Figure 1: Knowledge building as a cycle of events - see, sense, interpret, learn, decide, act.]
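As a minimal sketch of the dynamic knowledge base described above, the Python snippet below stores facts as triples that can be asserted, retracted, and queried. It is our own illustration of the idea, not the implementation used in this paper.

# Minimal dynamic knowledge base: facts are (subject, predicate, object) triples.
# Hypothetical illustration only; the paper does not prescribe an implementation.

class KnowledgeBase:
    def __init__(self):
        self.facts = set()

    def assert_fact(self, subj, pred, obj):
        """Add a new piece of knowledge as it is acquired."""
        self.facts.add((subj, pred, obj))

    def retract_fact(self, subj, pred, obj):
        """Remove knowledge that is no longer valid."""
        self.facts.discard((subj, pred, obj))

    def query(self, subj=None, pred=None, obj=None):
        """Return all facts matching the given (possibly partial) pattern."""
        return [f for f in self.facts
                if (subj is None or f[0] == subj)
                and (pred is None or f[1] == pred)
                and (obj is None or f[2] == obj)]

kb = KnowledgeBase()
kb.assert_fact("man", "can", "walk")
kb.assert_fact("man", "located_in", "park")
print(kb.query(subj="man"))   # all facts currently known about "man"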
Knowledge representation is intertwined with reasoning: if we cannot determine how to state what we are thinking, we cannot use the representation to communicate with the reasoning system. A good knowledge representation can be judged by several measures: support for efficient reasoning, expressivity, whether the represented knowledge is adequate for the goal, the quality and uncertainty of the expressed knowledge, and how consistent the knowledge remains as the world it describes changes. One of the challenges of knowledge acquisition is to support better reasoning and inference over the represented knowledge as new information is added and the specification or conceptualization changes. It is recognized that a knowledge acquisition and manipulation process that mimics the brain has machine learning as its generic form. The Object-Attribute-Relation (OAR) model for knowledge representation [15] is adopted in this paper's design. A knowledge structure may be considered a set of events - see, sense, interpret, learn, decide, and act (Figure 1).
The OAR model states that human memory and knowledge are represented by relations, i.e., by the connections between neurons rather than by the neurons themselves. The OAR model gives a concise explanation of the mechanisms of internal knowledge and information representation in the human brain, and it informs a wide range of mental processes and cognitive mechanisms in natural as well as artificial intelligence. On the basis of the OAR model, human knowledge can be described as the active composition of the existing OAR with a newly generated OAR, i.e., new objects, attributes, and/or relations [16]. Knowledge is a sub-content of memory, and the content of memory has traditionally been depicted by a container metaphor. The container metaphor cannot explain how a large amount of knowledge can be possessed without increasing the number of neurons in the human brain; this calls for an associative metaphor to explain the active mechanisms of
knowledge and memory. According to the associative metaphor of knowledge, the human brain does not create new neurons for new information; instead, it creates new synapses between existing neurons to represent the knowledge. The logical model of long-term memory, i.e., the form of knowledge representation described by the DNC model, can be formally described by the OAR model [19]. It is noteworthy that in the OAR model the relations themselves represent information and knowledge in the brain. This relational metaphor is fundamentally different from the traditional container metaphor in neuropsychology and computer science, because the latter assumes that memory and knowledge are stored in individual neurons, with the neurons functioning as containers. The novelty of the OAR model [20] is that it matches the process of human cognition.
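The following is a minimal, hypothetical Python sketch of the OAR idea described above: objects are linked to attributes and to relations with other objects, and knowledge grows by composing an existing OAR set with a newly generated one. Names such as compose are ours, not from [15] or [16].

from dataclasses import dataclass, field

@dataclass
class OAR:
    """Simplified Object-Attribute-Relation store; illustrative only."""
    objects: set = field(default_factory=set)
    attributes: dict = field(default_factory=dict)   # object -> set of attributes
    relations: set = field(default_factory=set)      # (object, relation, object) triples

    def compose(self, other):
        """Model knowledge growth as the merge of an existing OAR with a new one."""
        all_objects = self.objects | other.objects
        return OAR(all_objects,
                   {o: self.attributes.get(o, set()) | other.attributes.get(o, set())
                    for o in all_objects},
                   self.relations | other.relations)

existing = OAR({"man"}, {"man": {"walk", "read"}}, {("man", "in", "park")})
new      = OAR({"eagle"}, {"eagle": {"fly"}}, {("eagle", "in", "sky")})
knowledge = existing.compose(new)   # combined OAR, in the spirit of the relational metaphor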
COMPUTATIONAL EXPERIMENTS
Knowledge representation experiments have been explored for computer vision. The first application, "Text-to-Vision" [T2V], understands a natural language narrative and forms the scene depicted in it; the second application, "Vision-to-Text" [V2T], analyses an image and generates a description of it in natural language. Both processes involve three components: knowledge acquisition from the input, knowledge representation as an abstraction in machine-readable form, and information generation for the human user. A basic architecture of the [T2V] system is presented in Figure 2, and of the [V2T] system in Figure 3.
[Figure 2: Architecture of the [T2V] application. Input text is passed through natural language processing (pre-processing, POS tagging, parsing, semantic role labeling, dependency identification, anaphora resolution), then through knowledge acquisition and representation supported by a grammar, and finally through image generation (background preparation, object selection, spatial relations, merging and positioning, scene placement, image rendering) to produce the scene output.]
[Figure 3: Architecture of the [V2T] application. The input image is passed through image processing (background identification, identification of the number of objects, of each object, of size and colour, of position, and of spatial relations), then through knowledge acquisition and representation supported by a grammar, and finally through natural language generation (theme identification, structure selection, lexicon identification, spatial relations, template selection, language generation) to produce the Hindi output.]
In Step 1 - Knowledge Acquisition Model: Acquisition of knowledge from the given input is highly dependent on deep domain knowledge and expertise. In the [T2V] system, natural language processing components such as preprocessing, POS tagging, parsing, semantic role labeling, dependency identification, and anaphora resolution are applied. The outcome of this component is a parsed structure as presented in [Jain P.-visit and others citations]. To analyse a scene, the [V2T] system performs visual processing, aiming to understand an image as a human brain does. The main tasks of visual processing are identifying the objects and their respective positions in the image; identifying the colour composition and reflection properties is also a major role of visual processing. The image is decomposed into its constituent visual patterns by an image parsing engine [22].
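To illustrate the kind of output Step 1 produces for [T2V], the sketch below runs the open-source spaCy parser on an English sentence to obtain POS tags, dependency labels, and heads. The actual system described here is rule based and targets Hindi output downstream, so this is only an analogous example, not the authors' pipeline.

import spacy

# Assumes the small English model is installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("A man reads a newspaper in the park.")
for token in doc:
    # POS tag, dependency label, and head word: the raw material for
    # semantic role labeling and anaphora resolution in later stages.
    print(token.text, token.pos_, token.dep_, token.head.text)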
In Step 2 - Knowledge Representation Model: A knowledge base is an information repository that provides a means for information to be collected, organized, shared, searched, and utilized. Experiments have been carried out based on the OAR model, which holds that human memory and knowledge are represented by relations. As explained in Section III, an ontological model (see Figure 4) has been prepared using the OAR model, which links the attributes and relations of objects.
[Figure 4: Knowledge space of the ontological model, linking objects (e.g., Man, Book, Newspaper) to attributes (sit, read, work, swim, eat, walk) and to scene relations (kitchen, restaurant, room, roof, office, factory, shop, park, road, library, swimming pool, sea).]
[Figure 5: OAR-based behavioural mapping of an object (Man) to its attributes and possible scene relations.]
The same theory has been applied in the [T2V] system. Anaphora and other co-references are resolved using the dependency structures found during natural language parsing. Role matching and behaviour assignment to the identified entities is labeled between the words; this is done by matching the corresponding word with a list of keywords in the entity descriptor. Co-references are resolved into explicit names in the text. The graphical constraints that represent the position, orientation, size, colour, texture, and poses of objects in the scene are obtained from the semantic relations. From the components obtained, a scene instance is constructed by marking up events with pragmatic context. A scene instance represents the predictable information extracted by identifying the functionally and visually important objects of the environment. Transduction rules and adequate heuristics are applied to resolve conflicts and to add implicit constraints to the environment. An example of an ontological knowledge representation using the OAR model [19] is shown in Figure 5. It couples the behavioural mapping of the object (Man), attribute (walk), and possible relations (jungle, room, park, etc.), and subsequently provides a basis for ambience rendering in the output image. Visual knowledge representation, referred to as the visual grammar, is the heart of the basic [V2T] system. The most informative visualizations come from fields where quantitative data are an integral feature of the discipline, such as statistics, the natural sciences, engineering, technical drawing, and applied mathematics. This transforms the understanding of visual data into a specifically designed knowledge form. It can be considered a visual grammar that represents the identified objects' properties in pre-designed structures. This visual grammar structures the information about possible objects, their associated attributes, and their relations. After objects have been identified in the input scene, this information is passed to the NLG module to generate a description of the scene. Spatial relations between multiple extracted objects and their relative behaviours are also identified using the same visual grammar, as shown in Figure 5; a sketch of such a lookup is given below.
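A hypothetical sketch of the visual-grammar lookup implied by Figure 5: given an object detected in the image, the grammar returns its plausible attributes (actions) and the scenes it may relate to, which later stages use to select the ambience and sentence content. The table entries are illustrative, not the system's actual ontology.

# Illustrative visual grammar: object -> (possible attributes, possible scene relations).
VISUAL_GRAMMAR = {
    "man":   ({"walk", "sit", "read", "work", "swim", "eat"},
              {"park", "room", "office", "road", "swimming pool"}),
    "eagle": ({"fly"}, {"sky"}),
    "book":  ({"read"}, {"library", "room"}),
}

def describe(obj, scene_hint=None):
    """Pick an attribute and a relation for a detected object (naive selection)."""
    attrs, scenes = VISUAL_GRAMMAR.get(obj, (set(), set()))
    scene = scene_hint if scene_hint in scenes else next(iter(scenes), None)
    attr = next(iter(attrs), None)
    return obj, attr, scene

print(describe("eagle", scene_hint="sky"))   # ('eagle', 'fly', 'sky')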
In Step 3 - Information Generation Model: Reasoning is the process of arriving at new conclusions. The knowledge space collects the information in a structured format and supplies it to generate the output for the human user, as natural intelligence would. The [T2V] system places heavy demands on image processing techniques. There are two basic steps in scene synthesis and generation, as explained in [25]. Many methods are available for detecting object collisions when positioning objects and for spatial partitioning. Object mesh voxelization with surface extraction is often used to normalize model mesh representations. A wide spectrum of models, represented by a library of polygon mesh models, is often employed; these models represent the various types of stored objects used in the system. Collisions of new objects with the reference object are avoided by the placement algorithm. For scene description processing and scene synthesis, there are three major activities in the visualization construction process: sequential data attribute selection, visual template selection, and visual mapping specification. An environment is generated by identifying objects and relations in a variety of fields, and then default instantiations of the objects are generated. In the [V2T] system, a quadruplet T = {n, v, s, p} (Noun-Verb-Scene-Preposition) represents the core sentence structure: the important nouns (objects) that participate in the image, a description of the action verbs associated with these objects (attributes), and the background ambience together with the preposition that relates the objects to the scene (relations) are considered. A basic NLG system needs stages of planning and merging of information to enable generation; appropriate structure selection and lexicon selection are the two tasks that make the language natural enough to meet the specified communicative goals.
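A minimal sketch of quadruplet-driven generation for [V2T]: the noun, verb, scene, and preposition slots are filled from the knowledge base and inserted into a fixed surface template. The real system targets Hindi output with richer structure and lexicon selection; the English template below is only an assumed example.

def generate_sentence(quad):
    """quad = (noun, verb, scene, preposition), e.g. taken from the knowledge base."""
    n, v, s, p = quad
    # One fixed surface template; a full NLG stage would also perform structure
    # selection, lexicon choice, and smoothening (agreement, morphology).
    article = "An" if n[0].lower() in "aeiou" else "A"
    return f"{article} {n} is {v}ing {p} the {s}."

print(generate_sentence(("eagle", "fly", "sky", "in")))   # "An eagle is flying in the sky."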
RESULTS AND ANALYSIS
Two separate small proofs of concept have been implemented as the "Text-to-Vision" [T2V] and "Vision-to-Text" [V2T] systems. Having classified a semantic-role-labeled feature set associated with the possible objects in the input, an ontological knowledge base based on the OAR model has been designed. A rule-based natural language processing engine has been implemented for the [T2V] system. The output of the engine is a dependency tree, which populates the knowledge base using the defined ontological framework. The knowledge base is framed in a standardized XML format in which all captured information is stored as structured data, as shown in Figure 6.
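As a rough illustration of what such an XML knowledge base entry might look like, the sketch below builds one with Python's xml.etree.ElementTree. The tag and attribute names are our assumption for illustration, not the actual schema shown in Figure 6 or Figure 7.

import xml.etree.ElementTree as ET

# Hypothetical layout of an OAR knowledge base entry for one detected object.
scene = ET.Element("scene")
obj = ET.SubElement(scene, "object", name="eagle")
ET.SubElement(obj, "attribute").text = "fly"
ET.SubElement(obj, "relation", type="spatial", target="sky").text = "in"

# Serialize the element tree to a string for storage in the knowledge base file.
print(ET.tostring(scene, encoding="unicode"))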
For acquisition in the [V2T] system, we have developed an image processing engine that is able to identify a limited number of objects in pictures. Using grey-level segmentation, thresholding methods, and edge-detection techniques, it is also able to recognize the position and colour of objects and to identify the colour of the background. The knowledge base is a structured XML file, as shown in Figure 7. As explained in [26], the knowledge base should be able to represent analogical, propositional, and procedural structure. It should allow quick access to information and be easily and gracefully extensible. It should support inquiries to the analogical structures, belief maintenance,
inference, and planning. We have prepared a semantic feature set for the nouns identified in the input text and input image. A repository of ten nouns has been prepared, tagged with attributes and semantic relations. The ontology framework has been designed and developed for the selected nouns and their possible behavioural exhibits. A standard XML structure was finalized to represent the captured knowledge; this XML is the input to the last engine, which generates output for human cognition.
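For the grey-level segmentation, thresholding, and edge-detection steps mentioned above, the following is a minimal OpenCV sketch; the file name input.jpg, the Otsu threshold, and the Canny parameters are assumptions for illustration, since the actual engine and its settings are not detailed in this paper.

import cv2

# Hypothetical acquisition step: threshold the grey-level image, find edges,
# and count external contours as a crude estimate of the number of objects.
img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
edges = cv2.Canny(binary, 100, 200)
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4 API
print("objects found:", len(contours))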
[Figure 6: XML generated for image recognition.]
[Figure 7: XML generated for shape recognition.]
Output generation in the form of an image for the [T2V] system is carried out in three steps. In the first step, a background is selected for the scene to be generated. In the second step, objects are selected and their relative positions are determined. In the final step, an image is rendered using surface extraction and spatial partitioning, together with the detection and avoidance of object collisions when positioning objects. We use an image repository that stores various types of objects tagged with a wide spectrum of features. In the [V2T] system, the NLG component reads the data from the XML and plans a descriptive text. A limited vocabulary and set of sentence structures have been incorporated into the system to keep it a proof of concept. Natural Language Generation (NLG) is the natural language processing task of generating natural language from a machine representation such as a knowledge base or a logical form; it is a process of deliberately constructing a natural language text to meet specified communicative goals. Document structuring, sentence planning, lexicon selection, and language smoothening are the steps taken to form a correct construction with respect to syntax, morphology, and orthography.
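A toy sketch of the second and third [T2V] steps described above: objects are placed one by one, and a new object is shifted until its bounding box no longer collides with those already placed. Real placement also involves spatial partitioning and mesh voxelization, which are outside this sketch; the box format and step size are our assumptions.

def collides(a, b):
    """Axis-aligned bounding boxes (x, y, w, h): True if they overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def place(objects, step=10):
    """Naively place each object, shifting it right until no collision remains."""
    placed = []
    for (w, h) in objects:
        box = (0, 0, w, h)
        while any(collides(box, p) for p in placed):
            box = (box[0] + step, box[1], w, h)
        placed.append(box)
    return placed

print(place([(50, 80), (50, 80), (30, 40)]))   # non-overlapping positions for three objects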
[Figure 8: Image recognition example; generated description: "Flying eagle and sky."]
[Figure 9: Shape recognition example.]
The figures above show the images used for the experiments. To be suitable for computer processing, an image f(x, y) must be digitized spatially. Image representation converts the input data into a form suitable for computer processing; image description is then used to extract features that yield quantitative information of interest or that are basic for differentiating one class of objects from another. Finally, the image is recognized by assigning a label to an object based on the information provided by its descriptors. The shapes are recognized in Figure 9, and the XML generated from them is shown in Figure 7. Similarly, an eagle flying in the sky is recognized by features such as its wings and the sky background; the spatial features of the image provide the information for recognition. The XML generated in this case is shown in Figure 6.
CONCLUSION
We have presented a study on cognitive informatics, knowledge representation, and the OAR model, along with a literature survey. We have presented the basic architectures of the "Text-to-Vision" [T2V] and "Vision-to-Text" [V2T] systems, together with details of their internal modules, and have demonstrated a proof of concept with the implementation of two separate applications.
As a utility, the main impact of the presented research is to help the community with different forms of modality and computer vision. It may help in building a reasoning machine for complex inference, problem solving, and decision making using traditional logic and rule-based technologies, and it may assist in framing an autonomous learning system for cognitive knowledge acquisition and processing. A cognitive medical diagnosis system, a cognitive computing node for natural intelligence, and a cognitive processor for cognitive robots are further areas where this research may be fruitful. This work also opens many new directions and lines of thought for further research in artificial intelligence, image processing, computer graphics, and visualization.
References
[1] Wang, Y., Berwick, R. C., Haykin, S., Pedrycz, W., Kinsner, W., Baciu, G., Zhang, D., Bhavsar, V. C., & Gavrilova, M. Cognitive Informatics and Cognitive Computing in Year 10 and Beyond [Wang, 2006, 2009b, 2009c, 2010a; Wang, Tian, & Hu, 2011].
[2] Wang, Y. (Ed.). Cognitive Informatics for Revealing Human Cognition: Knowledge Manipulations in Natural Intelligence [Wang, 2002a, 2003, 2007b; Wang, Zhang, & Kinsner, 2010; Wang, Kinsner et al., 2009].
[3] A Cognitive Learning Engine (CLE) [Tian et al., 2011].
[4] Wang, Y. (2008). On concept algebra: A denotational mathematical structure for knowledge and software modeling. International Journal of Cognitive Informatics and Natural Intelligence, 2(2), 1-19. doi:10.4018/jcini.2008040101.
[5] Fundamental solution to computational linguistics, computing with natural language (CNL), and computing with words (CWW) [Zadeh, 1965, 1975, 1999, 2008; Wang, 2010a, 2010c, 2010d].
[6] Kojima, A., Izumi, M., Tamura, T., & Fukunaga, K. (2000). Generating natural language description of human behavior from video images. In Proceedings of the 15th International Conference on Pattern Recognition, vol. 4, pp. 728-731.
[7] Berg, T. L., Berg, A. C., Edwards, J., & Forsyth, D. A. (2004). Who's in the picture? In NIPS.
[8] Jie, L., Caputo, B., & Ferrari, V. (2009). Who's doing what: Joint modeling of names and verbs for simultaneous face and pose annotation. In Advances in Neural Information Processing Systems (NIPS).
[9] Yao, B., Yang, X., Lin, L., Lee, M. W., & Zhu, S.-C. (2010). I2T: Image parsing to text description. Proceedings of the IEEE, 98(8), 1485-1508.
[10] Farhadi, A., Hejrati, S. M. M., Sadeghi, M. A., Young, P., Rashtchian, C., Hockenmaier, J., & Forsyth, D. A. (2010). Every picture tells a story: Generating sentences from images. In Daniilidis, K., Maragos, P., & Paragios, N. (Eds.), ECCV (4), Lecture Notes in Computer Science, vol. 6314, pp. 15-29. Springer.
[11] Traum, D., Fleischman, M., & Hovy, E. (2003). NL generation for virtual humans in a complex social environment. In Proceedings of the AAAI Spring Symposium on Natural Language Generation in Spoken and Written Dialogue, pp. 151-158.
[12] McKeown, K. (2009). Query-focused summarization using text-to-text generation: When information comes from multilingual sources. In Proceedings of the 2009 Workshop on Language Generation and Summarisation (UCNLG+Sum 2009), p. 3, Suntec, Singapore. Association for Computational Linguistics.
[13] Golland, D., Liang, P., & Klein, D. (2010). A game-theoretic approach to generating spatial descriptions. In Proceedings of EMNLP.
[14] Cognitive Information Processing Theory. http://expertlearners.com/cip_theory.php
[15] Wang, Y. The OAR model for knowledge representation. http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4054984&tag=1
[16] Wang, Y. The OAR Model of Neural Informatics for Internal Knowledge Representation in the Brain. University of Calgary, Canada. http://www.ucalgary.ca/icic/files/icic/90-IJCINI-1305-OAR.pdf
[17] Mahalakshmi, G. S., & Geetha, T. V. Representing Knowledge Effectively Using Indian Logic. http://www.tmrfindia.org/eseries/ebookv1-c2.pdf
[18] Wang, Y. (2006, July). Keynote: Cognitive informatics - Towards the future generation computers that think and feel. In Proceedings of the 5th IEEE International Conference on Cognitive Informatics, Beijing, China, pp. 3-7. Washington, DC: IEEE Computer Society.
[19] Poet Image Description Tool, Digital Image and Graphic Resources for Accessible Materials. http://diagramcenter.org/development/poet.html
[20] POLLy: A Conversational System that uses a Shared Representation to Generate Action and Social Language. https://users.soe.ucsc.edu/~maw/papers/swati_ijcnlp08.pdf
[21] Kulkarni, G., Premraj, V., Dhar, S., Li, S., Choi, Y., Berg, A. C., & Berg, T. L. Baby Talk: Understanding and Generating Image Descriptions. Stony Brook University, NY 11794, USA. http://homes.cs.washington.edu/~yejin/Papers/cvpr11_generation.pdf
[22] Yao, B. Z., Yang, X., Lin, L., Lee, M. W., & Zhu, S.-C. I2T: Image Parsing to Text Description.
[23] Mahalakshmi, G. S., & Geetha, T. V. Representing Knowledge Effectively Using Indian Logic. http://www.tmrfindia.org/eseries/ebookv1-c2.pdf