Readings in MIS: Key Articles Database and Analysis

Prepared By: Mary Burns, Katherine Carl, Soo Mi Cheong, Jeisi Cheng, Koren Elder, Li Fan, Edward Huang, Brent Langhals, Matt Pickard, Nathan Twyman, Shou Zeng, Ray Zhao

MIS696A: Readings in MIS, Fall 2008. Oversight by Dr. Jay F. Nunamaker, Jr.

Abstract

In this report, we summarize the contributions of previous classes, which form a solid foundation on which our project is based. Building on the work of previous years and the MIS definition (by Brabb) that we adopted, we developed our research philosophy and defined our main contributions. The main contributions of the Class of 2008 are a systematic approach to building an MIS research repository, evaluating the research along predefined dimensions, and automating the classification analysis. Further, a user-friendly interface was constructed for future use. Besides constructing the system, we also extended the previous MIS research collection: we interviewed faculty during the semester and asked them to recommend the studies that influenced them most, and all of these studies were added to our repository to make the collection more complete. Our project details are also described in this report. We used three methodologies, Text Mining, Clustering Analysis, and Citation Analysis, to gain better insights from our MIS paper repository. We then discuss limitations and future research; future classes should read the Limitations and Future Research section of this paper for important tips on how to conduct follow-up research. Finally, we summarize the findings of this project and the lessons we learned from it.

Table of Contents

Abstract
Introduction
Contributions from Classes 1998 through 2007
Contributions from Class of 2008
Definitions
Our Research Philosophy
Database System Development
Database Design
Testing
Research Methodologies / Results
    Text Mining
    Clustering Analysis of MIS Papers
        1. Introduction
        2. Experiment Design
        3. Result Analysis
        4. Discussion and Future Work
    Citation Analysis
Limitations and Future Research
Conclusion
Appendix 1: User Reading Guide
Appendix 2: 2007 Classification Model
Appendix 3: General SQL Queries
Appendix 4: Database Design
Appendix 5: Data Mining Analysis – Data
Appendix 6: Clustering Analysis
Appendix 7: Clustering Results

Introduction

The purpose of this research is to familiarize the reader with the key researchers, research categories, and research articles crucial to a foundation in the Management Information Systems (MIS) discipline. Building on the analyses, models, and reports from previous classes, we categorized the 185 research articles we collected, built a database, populated it with the articles in both PDF and text format, conducted a variety of statistical and data/text mining analyses, and summarized our findings.

Contributions from Classes 1998 through 2007

Previous classes have compiled excellent research to familiarize readers with the MIS domain, including key researchers and their respective research. The Class of 1998 began the process by listing seven subdomains of MIS; for each subdomain, they listed over 45 influential researchers with a one-paragraph biography. The Class of 1999 created a list of 47 key researchers in MIS, grouped by ten research areas described in several paragraphs. The Class of 2000 expanded the research areas to 15, listing 90 key researchers with highlighted research for those authors. In 2001, the class presented a timeline of events in MIS and defined eight subdomains of MIS. The Class of 2002 recategorized MIS into nine subdomains with a visual representation of the subdomain relationships; for each subdomain, they described seminal works. For the project in 2003, the class identified the top 101 MIS researchers, categorizing them by subdomain.
The Class of 2003 also presented a three-dimensional model of MIS research characteristics, with axes representing behavioral vs. technical, application vs. theory, and rigor vs. relevance. Each seminal work was mapped onto the three-dimensional model for enhanced visualization, and profiles of researchers with key contributions were identified. Most importantly for future classes, the Class of 2003 developed an EndNote reference library of the research articles that they had collected. In 2004, the class added to the body of knowledge by identifying the U.S. departments of key researchers and their corresponding key research. The Class of 2005 compiled the models of MIS from 2002, 2003, and 2004 into one comprehensive model and used it to identify key research; in addition, they included charts explaining the quantitative contribution of research from each subdomain. The Class of 2006 extended the previous projects by exploring various research methodologies and methodological paradigms. The Class of 2007 took the first steps toward automating the classification of MIS research by developing an algorithm and decision tree (Appendix 2) to aid readers in defining and identifying key MIS research articles and how they fit within the MIS continuum.

Contributions from Class of 2008

Early on, we decided that our key goals were to: categorize the foundation articles based on the 2007 algorithm/decision tree, develop a database of the articles, and perform analyses for our class project/report.

Description of our Project/Process: Each member of the class familiarized him/herself with the materials and models from previous project reports. We learned that there were 160 MIS 'foundation' research articles formerly stored in an EndNote library developed by an earlier project group. While we could not find that entire library, we did find subsets of it that contained PDF versions of these papers; therefore, we needed to re-create the previous library as an early step. In September, we selected a team leader, Brent Langhals, and a project leader, Koren Elder. Both of them worked quickly to develop a project plan to which the rest of the team agreed:

a. As one deliverable, we would deliver a database that contained all of the articles from the previous library list, in both PDF and text format, along with each article's bibliographic data. This database would be placed on a web site accessible not only by our class but also by future classes.
b. As part of that deliverable, we would identify other key research articles recommended to us by faculty in the MIS department.
c. The articles would be read, categorized according to the existing decision tree, and rated on a variety of dimensions, such as Theoretical or Applied.
d. Another deliverable would consist of the analyses of the articles.

To start, we divided the existing list of articles among the class members. Each class member needed to find the PDF version of the articles (if not already available via the remainder of the earlier EndNote library) and place those on a shared Website set up for the class. Due to some duplication, erroneous EndNote entries, or articles that were impossible to find (within the scope of the project), the number of articles uploaded to the Website is 160. To augment the list of articles, the class, broken into four teams, interviewed faculty members who recommended additional articles to be added to the existing 160 articles.
Twenty-five of these articles (in PDF format) were added to the Website. As a class, we decided to categorize the articles using the decision tree developed by the Class of 2007. We realized that we could re-develop that tree or enhance it; however, as first-year Ph.D. students, our limited knowledge and experience would not permit us to develop a significantly better decision tree efficiently. Therefore, to keep moving forward to the key contributions of the project, we decided not to reinvent the wheel. Future classes can re-examine the decision tree as necessary. To support data mining and analyses, we chose to keep track of five key words and a rating (on a scale of 1 to 5, with 5 the highest rating) for each of the following domain types: Theoretic, Application, Rigorous, Relevance, Review, Innovative, Technical, and Behavioral, for each article. For each article, we obtained the citation counts from both Web of Knowledge and Google Scholar. Finally, we asked each group to answer the following question per article: "Should this article be considered for removal from the corpus?"

For the data described above, Koren built a database that included attributes for the various ratings, keywords, and citation counts. In addition, she provided attributes to store text as well as PDF versions of the articles. The text version of the articles was essential for the text mining analyses conducted at the end of the project. To access the database, Koren, Li, and Ray developed a Web interface for team members to use to update the fields for each article. The four teams split the 185 articles equally by random assignment. Each group member read each of the articles in the subset and independently chose corresponding classifications, ratings, and key words. Later, each group met to develop a group decision about the categories and ratings for each article. Finally, the group updated the database via the Web interface. Once the database was set up, several teams conducted statistical analyses and text mining. The results were collected for this report.

Definitions

While there is academic debate as to an all-encompassing definition of MIS, this study uses the succinct and widely accepted definition of Brabb [1]: "A management information system is the complement of people, machines, and procedures that develops the right information and communicates it to the right managers at the right time." Using this definition, each of the previous classes from 1998 through 2005 created a conceptual model of MIS. Each class subdivided MIS into various subdomains, ranging from 7 to 9. While each class justified the uniqueness of its findings, the distinguishing characteristics have become progressively similar. Therefore, until a completely novel model of MIS is developed, we choose to follow the model presented by the Class of 2005. For simplicity's sake, we refer the reader to their paper for discussion, defense, and justification of the model (see Figure "MIS Model by Class of 2005").

[1] Brabb, George J. Computers and Information Systems in Business. Houghton Mifflin Co., Boston, 1976, pp. 26, 37.

While a difficult and potentially controversial task, defining "key" research and "key" researchers has been documented in previous MIS696 class research projects. In earlier classes, "key" represents those papers and individuals that have highly influenced and contributed to the IS body of knowledge.
To employ this definition, previous classes used interviews with established researchers and citation counts to guide their selection. This paper accepts the definition and selection criteria of the past eight projects. We specifically use the list of key articles and researchers provided by the Class of 2005's research paper, with a few updates to account for recent research. The focus of this paper is not to dispute the past papers' findings, but to augment them with a discovery and discussion of the categories of key research articles. We wish we could include a much larger body of high-quality IS research in this paper, but that would encompass thousands of papers; recognizing the project's limited scope, we narrowed the number of papers to 185.

[Figure: MIS Model by Class of 2005]

Using the MIS model from previous classes allowed us to focus on a unique contribution: identifying the categories of influential research papers. We hope that this data will provide a foundation for future researchers in selecting and justifying a research domain/subdomain.

Our Research Philosophy

Our research philosophy involved developing a tool that would not only automate the classification of MIS research, but also offer a lasting repository of seminal MIS literature that future classes could build upon. Our team determined that the best way to accomplish this goal was to develop a relational database containing the text from the 185 articles and key attributes of each article.

Database System Development

The team decided to load the paper metadata and content into a database in order to use more sophisticated query and mining techniques. SQL Server was chosen as the database management tool because the CMI lab had access to developer licenses, the team had some expertise in SQL Server, and the SQL Server text mining tools were going to be used for one method of analysis. Working from an EndNote library, the team loaded the paper metadata into SQL Server. Then, the full text was added for each paper by the individual members of the team. Finally, the groups added categories and dimensions after reading the papers. Once the database was complete, queries to extract data for analysis were run against the Papers table.

Database Design

The database design is documented in Appendix 4.

Testing

We performed general testing of the database functionality by conducting multiple queries based on potential research questions. It was determined that the database is fully functional and all errors were resolved. A sample of the results of our general queries is located in Appendix 3.
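To give a concrete sense of these general queries, the sketch below runs one representative aggregate query against the Papers table. It is illustrative only: the connection string is a placeholder, and the column names (Category, Rigor, Relevance) are assumptions based on the attributes described above, so the actual schema may differ.

```python
# Illustrative sketch of a general query against the Papers table.
# Assumptions: a local SQL Server instance, a CodifyNet database, and
# columns named Category, Rigor, and Relevance (hypothetical names).
import pyodbc

conn = pyodbc.connect(
    "DRIVER={SQL Server};SERVER=localhost;DATABASE=CodifyNet;Trusted_Connection=yes;"
)
cursor = conn.cursor()

# Average rigor/relevance rating per category, largest categories first.
cursor.execute("""
    SELECT Category,
           COUNT(*)                      AS PaperCount,
           AVG(CAST(Rigor AS FLOAT))     AS AvgRigor,
           AVG(CAST(Relevance AS FLOAT)) AS AvgRelevance
    FROM Papers
    GROUP BY Category
    ORDER BY PaperCount DESC
""")
for category, count, rigor, relevance in cursor.fetchall():
    print(f"{category:<40} {count:>3}  rigor={rigor:.2f}  relevance={relevance:.2f}")

conn.close()
```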
Research Methodologies / Results

The primary objective of our project was to create a complete relational database of the MIS corpus. However, our team also sought to perform some analysis of the data, for three reasons: 1) to validate the classification model from the 2007 class, 2) to validate that the database was capable of supporting future text mining and key word analysis, and 3) to use citation counts as a measure to validate that an article did belong in the MIS corpus. The following sections describe the types of data analysis we performed and the results we discovered.

Text Mining

Our goal for this project was to perform a limited text mining analysis of the articles, particularly with regard to the ability of text mining techniques to classify them. Unfortunately, our team had limited data mining experience, so the following is an example of the methods we used and some preliminary results.

Process used to build the mining model:

1. Build a dictionary
   a. Using the Fulltext field on the Papers table, extract nouns and noun phrases (TF-IDF frequency = 10, length = 2) to build the Dictionary table.
   b. The TF-IDF (term frequency-inverse document frequency) weight is a statistical measure used to evaluate how important a word is to a document in a collection or corpus. The importance increases proportionally to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus.
   c. Run the mining models with the full dictionary (no editing of words).
2. Build term vectors
   a. Use the "Term Lookup" transform to capture frequency and paper id into the TermVectors table.
3. Prepare train/test samples
   a. Set the sampling rate to 70%.
   b. Train sample: 70% into TrainPapers.
   c. Test sample: 30% into TestPapers.
4. Build/test/refine data mining models
   a. Use TermVectors, TrainPapers, and TestPapers as input.
   b. Case table: TrainPapers.
   c. Nested table: TermVectors.
   d. Microsoft Decision Trees (PapersDM_DT)
   e. Microsoft Naïve Bayes (PapersDM_NB)
   f. Microsoft Logistic Regression (PapersDM_NN)
   g. Microsoft Clustering (PapersDM_CL)
5. Check mining accuracy
   a. Case table: TestPapers.
   b. Nested table: TermVectors.
6. Browse the models.

The "Population correct" value is shown in the lift chart legend if you do not specify the predict value, showing the overall correctness of the model (how many predictions are correct, regardless of the state of the target variable). Predict Probability is the probability of the most popular prediction state for the model. The Score column in the legend shows the overall quality of a model: the higher the score, the better the model. Ordered by score, the models rank: Neural Network, Logistic Regression, Decision Tree, Clustering, and then Naïve Bayes.

A second set of lift charts depicts the models predicting specific Category values. Target Population indicates how much of the target population you capture at the gray intercept line; Predict Probability shows the probability score needed for each prediction to capture the shown target population; Score is used for comparison with other models. By selecting all rows with the specified probability or higher, you capture that percentage of the total possible target rows. So, in the Naïve Bayes model, selecting rows with .76% probability or higher would capture 93.33% of the target rows. Logistic Regression and Naïve Bayes are the best for prediction, with Neural Network close behind. Systems Analysis & Design is harder to predict; the Logistic Regression and Neural Network models score best there. We can reach 100% with Logistic Regression by selecting rows with a probability of 23.75% or higher, and with Neural Network by selecting rows with a probability of 31.12% or above.

The dependency network displays the dependencies between the input attributes and the predictable attributes in a model. Additional data from our text mining analysis can be found in Appendix 5.
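The mining models above were built with SQL Server Analysis Services. For readers more comfortable in Python, the sketch below reproduces the same pipeline shape in scikit-learn: TF-IDF term weighting, a 70/30 train/test split, and a comparison of classifiers on the Category label. The inline documents are toy stand-ins for the 185 full texts, and min_df is lowered accordingly (we used a frequency threshold of 10 on the real corpus).

```python
# Sketch of the text mining pipeline in scikit-learn: TF-IDF term vectors,
# a 70/30 split, and several classifiers predicting the Category label.
# The documents below are toy stand-ins for the papers' Fulltext field.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

texts = [
    "relational model data banks schema query",
    "data schema integration query optimization",
    "group decision support meeting collaboration",
    "collaboration electronic meeting group consensus",
    "data management relational query",
    "group support systems collaboration research",
]
categories = ["Data Management", "Data Management", "Collaboration",
              "Collaboration", "Data Management", "Collaboration"]

# Steps 1-2: dictionary of terms (uni- and bigrams) weighted by TF-IDF;
# on the real corpus we used a term-frequency threshold of 10.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
X = vectorizer.fit_transform(texts)

# Step 3: 70% into the training sample, 30% into the test sample.
X_train, X_test, y_train, y_test = train_test_split(
    X, categories, test_size=0.3, random_state=42)

# Steps 4-5: build each mining model and check accuracy on held-out papers.
models = {
    "Naive Bayes": MultinomialNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: {accuracy_score(y_test, model.predict(X_test)):.2f}")
```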
Clustering Analysis of MIS Papers

1. Introduction

1) Purpose

The purpose of this clustering analysis is to classify the MIS papers from a different aspect and to build a map of the field from a different angle. Our analysis uses an unsupervised clustering method as the tool, and uses generalized characteristics of the papers as the raw data: the domain of general research (theoretical vs. applied), the research methodology (rigor vs. relevance), the character of the content (review vs. innovation), and the track a paper belongs to (technical vs. behavioral), rather than the specific research domain and keywords. The methodology we use is the fuzzy k-means clustering method, which is suitable for unsupervised classification. The result of this analysis will provide useful information to assist trend analysis and prediction about MIS research.

2) Fuzzy k-Means Clustering Algorithm

Data clustering is the process of dividing data elements into classes or clusters so that items in the same class are as similar as possible, and items in different classes are as dissimilar as possible. Depending on the nature of the data and the purpose for which clustering is being used, different measures of similarity may be used to place items into classes; the similarity measure controls how the clusters are formed. Some examples of measures that can be used in clustering include distance, connectivity, and intensity.

In this case, we use the fuzzy k-means clustering method, one of the most widely used clustering methods today. In fuzzy clustering, data elements can belong to more than one cluster, and associated with each element is a set of membership values. Each point can belong to every cluster to a certain degree, as in fuzzy logic, rather than belonging completely to just one cluster; the membership value indicates the strength of the association between a data element and a particular cluster. Thus, points on the edge of a cluster may belong to it to a lower degree than points in its center. Fuzzy clustering is the process of assigning these membership levels and then using them to assign data elements to a specific cluster.

The clustering procedure consists of three steps: fuzzy k-means clustering, validation, and cluster evaluation. In the fuzzy clustering step, the number of clusters must be set by the user. Because we do not know the best number of clusters for this data set at the outset, we run the clustering procedure many times with a series of different numbers of clusters, to provide information for the validation step. In the validation step, we choose an index to evaluate the performance of these clustering results, in order to find the best number of clusters for this data set. In the cluster evaluation step, we re-cluster the data set with the best number of clusters and label each paper with the number of the cluster to which it belongs. (A sketch of the algorithm follows.)
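Below is a minimal NumPy sketch of the fuzzy k-means (fuzzy c-means) update rules: each point's membership values sum to 1 across clusters, centers are membership-weighted means, and each point is finally labeled with its highest-membership cluster. The fuzzifier m = 2 and the toy data are assumptions for illustration.

```python
# Minimal fuzzy c-means sketch: alternate between updating cluster centers
# (membership-weighted means) and membership values until convergence.
import numpy as np

def fuzzy_c_means(X, c, m=2.0, tol=1e-5, max_iter=300, seed=0):
    """X: (n, d) data. Returns centers (c, d) and memberships U (c, n)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0)                    # each point's memberships sum to 1
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # distance of every point to every center, shape (c, n)
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)             # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)     # standard fuzzy membership update
        if np.linalg.norm(U_new - U) < tol:
            U = U_new
            break
        U = U_new
    return centers, U

# Toy usage: 20 random points in the 4-dimensional attribute space.
points = np.random.default_rng(1).uniform(-2, 2, size=(20, 4))
centers, U = fuzzy_c_means(points, c=3)
labels = U.argmax(axis=0)                 # label = cluster with max membership
print(labels)
```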
2. Experiment Design

1) Coordinate System Definition

Each paper was scored on 8 attributes, which can be grouped into 4 pairs: Theoretical vs. Applied (TA), Rigor vs. Relevance (RR), Review vs. Innovation (RI), and Technical vs. Behavioral (TB). Four axes are used to represent these attribute pairs, forming what we call the MIS-Paper Attributes Space:

X1: Theoretical (negative) to Applied (positive)
X2: Rigor (negative) to Relevance (positive)
X3: Review (negative) to Innovation (positive)
X4: Technical (negative) to Behavioral (positive)

Thus, all 185 papers have their own coordinates in this 4-dimensional space, which will be used for clustering.

2) Data Processing

Before fuzzy clustering, the coordinates of each paper are calculated from its 8 attribute scores, so that each paper can be represented as a point in the MIS-Paper Attributes Space. Each score is denoted $score_{i,Attribute}$, $i = 1, 2, \ldots, N$, where $i$ is the number of the paper and $Attribute$ is the name of the attribute. Because the score of each attribute ranges from 1 to 5, the two scores in a pair are placed symmetrically around the origin along the same axis (one positive, one negative), and their arithmetic mean, that is, half the difference of the two scores, is used as the coordinate on that axis. The coordinates of paper $i$ are the vector $p_i = (x_{i1}, x_{i2}, x_{i3}, x_{i4})^T$, where

$$x_{i1} = \frac{score_{i,Applied} - score_{i,Theoretical}}{2}, \quad x_{i2} = \frac{score_{i,Relevance} - score_{i,Rigor}}{2},$$
$$x_{i3} = \frac{score_{i,Innovation} - score_{i,Review}}{2}, \quad x_{i4} = \frac{score_{i,Behavior} - score_{i,Technical}}{2}, \quad i = 1, 2, \ldots, N.$$

3) Clustering Procedure

The number of clusters is set by the user during the clustering procedure. Because we are not sure at the outset how many clusters is best for this data set, we execute the clustering procedure many times with a series of different numbers of clusters, starting from a low number and going up to a sufficiently high one. In this case, we chose the numbers from 3 to 15. The results of clustering are later used in the validation procedure to determine the best number of clusters.

4) Validation Procedure

Clustering validity refers to the question of whether a given fuzzy partition fits the data at all. The clustering algorithm always tries to find the best fit for a fixed number of clusters and the parameterized cluster shapes; however, this does not mean that even the best fit is meaningful. The number of clusters might be wrong or the cluster shapes might not correspond to the groups in the data, if the data can be grouped in a meaningful way at all.

Based on the series of clustering experiments, we use the Partition Index to evaluate the performance of the clustering results. The Partition Index (SC) is the ratio of the sum of compactness and separation of the clusters: a sum of individual cluster validity measures, normalized by the cardinality of each cluster.

$$SC(c) = \sum_{i=1}^{c} \frac{\sum_{j=1}^{N} (\mu_{ij})^m \, \lVert x_j - v_i \rVert^2}{N_i \sum_{k=1}^{c} \lVert v_k - v_i \rVert^2}$$

where $i$ is the cluster number, $j$ is the object number, $\mu_{ij}$ is the fuzzy membership value of object $j$ belonging to cluster $i$, $x_j$ is the coordinate vector of object $j$, $v_i$ and $v_k$ are the coordinates of the centers of clusters $i$ and $k$, and $N_i$ is the number of objects that belong to cluster $i$.

The goal of clustering is to categorize objects with similar attributes into the same group, and we chose this index because it covers both aspects of that goal: to group papers with as many similarities as possible, and to separate different groups as far from each other as possible. According to the definition of the Partition Index, a lower value indicates a better partition. The Partition Index was calculated for the series of clustering results, with the number of clusters ranging from 3 to 15.

[Figure: Validation curve using Partition Index (SC). X-axis: number of clusters (2 to 16); Y-axis: Partition Index value (about 1.4 to 3), with the marked minimum-slope point at X: 7, Y: 1.47.]

We want to cluster the data to achieve good performance with as few clusters as possible. If the number of clusters is too large, the number of papers in each cluster becomes too small to be used in generalizing the characteristics of each cluster as a classification model for future publication analysis. As the figure shows, the Partition Index decreases sharply as the number of clusters goes from 3 to 7, and decreases slowly for numbers of clusters greater than 7. Although cluster numbers greater than 7 result in lower Partition Index values, the performance gain is not as prominent as for cluster numbers smaller than 7, so 7 is the "elbow" point of the curve, where the absolute value of the slope suddenly drops. Although the Partition Index would be lowest among the points on this curve if the data set were partitioned into 12 clusters, considering the total of 185 papers in this data set and the need to generalize the characteristics of every cluster, we choose 7 as the best number of clusters.

Best number of clusters: 7
Partition Index value for 7 clusters: 1.4703
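The sketch below illustrates this validation loop, reusing the fuzzy_c_means function from the previous sketch: compute SC for each candidate number of clusters from 3 to 15 and look for the elbow. The `points` array (the 185 x 4 coordinate matrix) is assumed to have been built with the formulas above, and N_i is taken as the crisp count of papers assigned to cluster i, as the definition above states.

```python
# Partition Index (SC): sum over clusters of fuzzy compactness divided by
# cluster cardinality times separation from the other cluster centers.
import numpy as np

def partition_index(X, centers, U, m=2.0):
    c = centers.shape[0]
    labels = U.argmax(axis=0)
    sc = 0.0
    for i in range(c):
        compactness = np.sum((U[i] ** m) * np.sum((X - centers[i]) ** 2, axis=1))
        n_i = max(int(np.sum(labels == i)), 1)   # N_i: papers in cluster i
        separation = np.sum(np.sum((centers - centers[i]) ** 2, axis=1))
        sc += compactness / (n_i * separation)
    return sc

# Validation loop over candidate cluster counts (points: the 185 x 4 matrix):
# for k in range(3, 16):
#     centers, U = fuzzy_c_means(points, k)
#     print(k, partition_index(points, centers, U))
```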
5) Clustering Evaluation and Data Visualization

With the best number of clusters, 7, the data set was re-clustered. Each paper has 7 membership values indicating its degree of belonging to each cluster, and the sum of these 7 values is 1. By selecting the greatest membership value among the seven, each paper was labeled with the number of the cluster it belongs to. We use 3-dimensional graphs to display the result of clustering; 4 graphs are drawn to cover all aspects of the MIS-Paper Attributes Space. In these graphs, different clusters are represented by different colors (Cluster 1 - green, Cluster 2 - blue, Cluster 3 - cyan, Cluster 4 - red, Cluster 5 - yellow, Cluster 6 - purple, Cluster 7 - black).

3. Result Analysis

The clustering result is a partition of all 185 papers based on similarity across the 4 pairs of features: Theoretical vs. Applied, Rigor vs. Relevance, Review vs. Innovation, and Technical vs. Behavioral. Papers in the same cluster are similar on all of these features, and all papers can be categorized into 7 groups. By calculating the mean coordinates of all the papers in each cluster, the center coordinates of each cluster are obtained, and the characteristics of each category can be described in terms of the different feature pairs. The center coordinates and the number of papers in each cluster are:

Cluster | x1 | x2 | x3 | x4 | N
1 | 0.6052 | 0.5752 | 0.8217 | 1.1602 | 28
2 | 1.2773 | 0.5368 | 0.2232 | 1.4327 | 24
3 | 0.2477 | 0.0712 | 0.3298 | -1.3236 | 29
4 | -0.9300 | -0.3595 | 0.9106 | -1.7341 | 26
5 | -0.1861 | 0.6770 | -0.8188 | -0.4133 | 28
6 | 1.0888 | 0.6974 | 0.9997 | -1.6909 | 29
7 | 0.1839 | 0.9297 | -0.8653 | 1.1222 | 21

The characteristics of the different clusters can be summarized as follows. To describe the degree of each feature, we map absolute coordinate values to descriptions.
For the absolute value of each coordinate:

[0, 0.1): slight
[0.1, 1): moderate
[1, 1.5): normal
[1.5, 2]: extreme

The characteristics of each cluster can then be translated into the following table:

Cluster | Theoretical | Applied | Rigorous | Relevant | Review | Innovation | Technical | Behavioral
Cluster 1 | --- | Moderate | --- | Moderate | --- | Moderate | --- | Normal
Cluster 2 | --- | Normal | --- | Moderate | --- | Moderate | --- | Normal
Cluster 3 | --- | Moderate | Slight | Slight | --- | Moderate | Normal | ---
Cluster 4 | Moderate | --- | Moderate | --- | --- | Moderate | Extreme | ---
Cluster 5 | Moderate | --- | --- | Moderate | Moderate | --- | Moderate | ---
Cluster 6 | --- | Normal | --- | Moderate | --- | Moderate | Extreme | ---
Cluster 7 | --- | Moderate | --- | Moderate | Moderate | --- | --- | Normal

Furthermore, we mapped the clustering result against the categorized domains of the papers:

Domain | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 | Sum
Artificial Intelligence | 0 | 1 | 5 | 1 | 2 | 4 | 0 | 13
Collaboration | 4 | 5 | 1 | 1 | 0 | 1 | 5 | 17
Data Management | 1 | 1 | 12 | 10 | 7 | 8 | 0 | 39
Decision Sciences | 2 | 1 | 1 | 4 | 3 | 0 | 3 | 14
eCommerce | 2 | 0 | 1 | 0 | 1 | 0 | 1 | 5
Economics of Information | 4 | 2 | 0 | 0 | 4 | 0 | 1 | 11
HCI | 3 | 2 | 1 | 0 | 2 | 1 | 1 | 10
Information Assurance | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
Knowledge Management | 1 | 5 | 1 | 0 | 0 | 0 | 1 | 8
Operations Management | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 2
Social Informatics | 3 | 1 | 0 | 0 | 2 | 0 | 4 | 10
Supply Chain Management | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 2
System Analysis & Design | 6 | 0 | 4 | 4 | 3 | 13 | 4 | 34
Workflow/Business Process Management | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 3
OTHER | 0 | 5 | 3 | 2 | 4 | 1 | 1 | 16
Sum | 28 | 24 | 29 | 26 | 28 | 29 | 21 | 185

Pie charts (Appendix 6) were also generated to display the data explicitly. In these pie charts, the percentages represent the proportion of papers in a specific domain that belong to each cluster. From these pie charts and the table of cluster characteristics, one can easily inspect the attribute structure of the papers in every MIS domain. Combined with the detailed information about every paper (Appendix 7), such as the author, the author's affiliation, and the journal, we can derive the following:

1) Author's research map. By analyzing an author's paper distribution across these pie charts and the characteristics table, the domains and attributes of that author's research can be extracted, where the attributes are defined as in the MIS-Paper Attributes Space.

2) Research map of a university. By analyzing the authors belonging to the same department of a university, we can likewise extract the domains and attributes of that university's research.

3) Types of papers in a journal. By analyzing the distribution of papers from the same journal, we can extract the journal's preferences in selecting papers.

4) Trend analysis and prediction. By adding a time dimension to the above results, we can analyze and predict trends in an author's research, a university's research, and a journal's preferences. By adding other information, such as changes in professors' university affiliations or in journals' editors, the analysis can also cover sudden changes in the trends due to these unpredictable factors.

With the results of this analysis, we can catch the latest research hotspots in every domain, follow changes in journals' preferences, acquire real-time information about the changing roles of universities and professors in the MIS community, and, most importantly, discover unexplored domains in the MIS area, all in order to make better decisions in choosing our research fields and managing our research portfolios.

4. Discussion and Future Work

There are two major difficulties in applying this analysis method.
First, due to the unsupervised nature of the clustering method, information from other sources is needed to make a reasonable interpretation of the result. In the clustering procedure, the algorithm automatically "finds" patterns to use as the standard of partitioning, and these patterns are often invisible; the result may therefore be meaningless unless it is analyzed with the help of other information. Manual work is usually required in this analysis process. Second, the attributes of every paper are scored by reviewers, which introduces bias depending on the reviewer's own research domain and familiarity with the paper's research domain. Because the paper attributes are the only data in our clustering analysis, this bias affects the quality of the results.

Future work may cover three aspects:

a) Select new attributes to evaluate the papers, which may be more efficient and less biased.
b) Examine the effect of bias in the rating of each attribute, and design a better approach, either manual (decision tree) or automatic (text mining), to rate the papers and eliminate the bias.
c) Replace some of the manual analysis with automatic processes, such as text mining and social network analysis, to analyze the content of the papers and the relationships between authors by machine.

Citation Analysis

As one method of determining the completeness of our corpus, we looked at the number of articles in our corpus published in each year. The chart below (see Figure "Percentage of Articles in Corpus by Year Published") suggests that our corpus might be over-represented by articles published from the mid-1980s through the 1990s. Before the mid-1980s, it could be argued that the MIS field was still very young, so there were fewer MIS articles published. After the 1990s, we are probably left to conclude that our corpus needs to be updated with recent papers that have had a significant impact on the field. Of course, it could also be argued that papers published after 2000 have not had enough time to make an impact, but we feel that the corpus probably needs updating.

[Figure: Percentage of Articles in Corpus by Year Published. Bar chart over publication years from 1937 to 2008; percentages range from 0% to about 7%.]

To further analyze the completeness and representation of our corpus, the two figures below show the article count by decade, grouped by category. It is interesting to see the growth in data management and information assurance articles from the 1970s through the 1990s. Again, from these charts we can see the dip in article representation in the 2000s.
[Two figures: Article count by decade, grouped by category (Artificial Intelligence, Collaboration, Data Management, Decision Sciences, eCommerce, Economics of Information, HCI, Information Assurance, Knowledge Management, Operations Management, OTHER, Social Informatics, Supply Chain Management, System Analysis & Design, Workflow/Business Process Management), for the decades 1930-1939 through 2000-2009; counts range from 0 to 14. The underlying data appear in Appendix 3.]

Next, we analyzed our corpus according to the citation counts from Web of Knowledge and Google. Web of Knowledge is more specific to the social sciences and applies more readily to the MIS field. In addition, Web of Knowledge limits the citation counts to articles published in top journals, whereas Google counts any citation it finds in its database. As such, Web of Knowledge is probably a better indicator of the importance and depth of an article's impact, while the Google citation count seems to capture more of the breadth of an article's influence. In any case, we placed more emphasis on analyzing the Web of Knowledge citation counts.

As can be seen from the pie charts below, in both Web of Knowledge (see Figures "Top 10 Categories by % of WK Citations" and "Top 10 Categories by Web of Knowledge Citation Count") and Google citations (see Figure "Top 10 Categories by % of Google Citations"), the articles in the "other" category accounted for the largest share of the citations. To determine whether this result arose because we grouped more papers into the "other" category or because the "other" papers were more heavily cited, we looked at the number of papers from each category in our corpus. Figures "Top 10 Categories by % of Article Count" and "Paper Counts in Categories" show that the data management category contained the most articles, accounting for roughly 23 percent of the articles in the corpus, whereas "other" papers accounted for only 9.3 percent. These findings may suggest that while the "other" papers may not fit squarely into the current MIS field, they are very influential in defining it.
[Pie chart: Top 10 Categories by % of WK Citations. OTHER 16.12%, Data Management 14.67%, System Analysis & Design 12.14%, Knowledge Management 10.77%, Collaboration 9.38%, Economics of Information 8.96%, Decision Sciences 8.84%, Workflow/Business Process Management 8.60%, Artificial Intelligence 6.30%, Social Informatics 4.22%]

Top 10 Categories by Web of Knowledge Citation Count:

Category | WK Citations
OTHER | 2809
Data Management | 2556
System Analysis & Design | 2116
Knowledge Management | 1877
Collaboration | 1635
Economics of Information | 1561
Decision Sciences | 1540
Workflow/Business Process Management | 1499
Artificial Intelligence | 1097
Social Informatics | 736

[Pie chart: Top 10 Categories by % of Google Citations. OTHER 24.75%, Data Management 19.97%, System Analysis & Design 15.06%, Knowledge Management 11.19%, Economics of Information 7.24%, Collaboration 6.28%, Workflow/Business Process Management 4.42%, Decision Sciences 4.06%, Artificial Intelligence 3.52%, HCI 3.50%]

[Pie chart: Top 10 Categories by % of Article Count. Data Management 22.67%, System Analysis & Design 19.77%, Collaboration 9.88%, OTHER 9.30%, Decision Sciences 8.14%, Artificial Intelligence 7.56%, Economics of Information 6.40%, HCI 5.81%, Social Informatics 5.81%, Knowledge Management 4.65%]

Paper Counts in Categories:

Category | # of Papers
Data Management | 39
System Analysis & Design | 34
Collaboration | 17
OTHER | 16
Decision Sciences | 14
Artificial Intelligence | 13
Economics of Information | 11
HCI | 10
Social Informatics | 10
Knowledge Management | 8
eCommerce | 5
Workflow/Business Process Management | 3
Supply Chain Management | 2
Operations Management | 2
Information Assurance | 1

To further explore the influence that the "other" articles have in our corpus and in the MIS field, we compiled the titles of the "other" articles into the chart below (see Figure: Articles Categorized as "Other"), along with citation counts from Web of Knowledge and Google. If a citation count is listed as '-1', we did not find the article in that database. This chart could aid us in eliminating certain articles from the database. For instance, articles that did not appear in Web of Knowledge and had low Google citation counts (less than 200) may be candidates for removal. Articles such as Coase's "The Nature of the Firm," which did not appear in Web of Knowledge but is highly cited in Google, should most likely stay in the corpus, assuming they are related to or influence the MIS field. In addition, Im's 1998 article, though not cited often, is probably referred to (i.e., read) by many MIS researchers.
Articles Categorized as "Other":

Author | Title | Year | WK Citations | Google Citations
Teece et al | Dynamic capabilities and strategic management | 1997 | 1532 | 5536
Conner and Prahalad | A Resource-Based Theory of the Firm: Knowledge versus Opportunism | 1996 | 343 | 1219
Orlikowski | Using technology and constituting structures: A practice lens for studying technology in organizations | 2000 | 251 | 791
Morton et al | A taxonomy of part-whole relations | 1987 | 194 | 431
Dahl and Nygaard | Simula--an ALGOL-Based Simulation Language | 1966 | 176 | 524
Hevner et al | Design Science in Information Systems Research | 2004 | 131 | 490
Farrell and Shapiro | Dynamic Competition with Switching Costs | 1988 | 103 | 265
Teorey and Pinkerton | A comparative analysis of disk scheduling policies | 1972 | 56 | 187
Samuelson | Copyright's fair use doctrine and digital data | 1994 | 29 | 35
Kubik | On applications of differential equations in general problem solving | 1966 | 0 | 0
R. H. Coase | The Nature of the Firm | 1937 | -1 | 11151
Orlikowski and Baroudi | Studying information technology in organizations: Research approaches and assumptions | 1991 | -1 | 953
Dijkstra | A Note on Two Problems in Connection with Graphs | 1959 | -1 | 2190
Nunamaker et al | Systems Development in Information Systems Research | 1991 | -1 | 211
Parnas | Software Engineering Programs are not Computer Science Programs | 1999 | -1 | 103
Im et al | An Assessment of Individual and Institutional Research Productivity in MIS | 1998 | -1 | 31

Lastly, with regard to citation counts, we extracted several "Top 10" lists to provide a few more insightful views. The first two charts below (see Figures "Top 10 Articles by Web of Knowledge Citations" and "Top 10 Articles by Google Citations") show the top 10 most cited articles in Web of Knowledge and Google, respectively, sorted by number of citations. The following two charts (see Figures "Top 10 Articles by Average Web of Knowledge Citations per Year" and "Top 10 Articles by Average Google Citations per Year") show the top 10 articles by citations per year. We ran the citations-per-year analysis to take into account when an article was published: if article A was published in 1960 and has been cited one thousand times, it probably is not as significant as article B, published in 1999 and also cited one thousand times.

Top 10 Articles by Web of Knowledge Citations:

Title | Author | Year | Category | WK Citations | WK Citations per Year
Dynamic capabilities and strategic management | Teece et al | 1997 | OTHER | 1532 | 139
A Relational Model of Data for Large Shared Data Banks | Codd | 1970 | Data Management | 1269 | 33
Organizational Information Requirements, Media Richness and Structural Design | Daft and Lengel | 1986 | Workflow/Business Process Management | 1100 | 50
A Dynamic Theory of Organizational Knowledge Creation | Nonaka | 1994 | Knowledge Management | 1098 | 78
On the Criteria To Be Used in Decomposing Systems into Modules | Parnas | 1972 | System Analysis & Design | 680 | 19
Machine learning in automated text categorization | Sebastiani | 2002 | Artificial Intelligence | 663 | 111
The Lagrangian Relaxation Method for Solving Integer Programming Problems | Fisher | 1981 | Decision Sciences | 658 | 24
Electronic Markets and Electronic Hierarchies | Malone et al | 1987 | Economics of Information | 575 | 27
A Foundation for the Study of Group Decision Support Systems | DeSanctis and Gallupe | 1987 | Collaboration | 543 | 26
Internet paradox: A social technology that reduces social involvement and psychological well-being? | Kraut et al | 1998 | Social Informatics | 509 | 51
Top 10 Articles by Google Citations:

Title | Author | Year | Category | Google Citations | Google Citations per Year
The Nature of the Firm | Coase | 1937 | OTHER | 11151 | 157
Dynamic capabilities and strategic management | Teece et al | 1997 | OTHER | 5536 | 503
A Dynamic Theory of Organizational Knowledge Creation | Nonaka | 1994 | Knowledge Management | 5121 | 366
The entity-relationship model toward a unified view of data | Chen | 1976 | Data Management | 4583 | 143
A Relational Model of Data for Large Shared Data Banks | Codd | 1970 | Data Management | 4197 | 110
As We May Think | Bush | 1945 | Knowledge Management | 2972 | 47
Organizational Information Requirements, Media Richness and Structural Design | Daft and Lengel | 1986 | Workflow/Business Process Management | 2853 | 130
A Spiral Model of Software Development and Enhancement | Boehm | 1988 | System Analysis & Design | 2848 | 142
On the Criteria To Be Used in Decomposing Systems into Modules | Parnas | 1972 | System Analysis & Design | 2729 | 76
A Note on Two Problems in Connection with Graphs | Dijkstra | 1959 | OTHER | 2190 | 45

Top 10 Articles by Average Web of Knowledge Citations per Year:

Title | Author | Year | Category | WK Citations | WK Citations per Year
Review: Knowledge Management and Knowledge Management Systems: Conceptual Foundations and Research Issues | Alavi and Leidner | 2005 | Knowledge Management | 418 | 139
Dynamic capabilities and strategic management | Teece et al | 1997 | OTHER | 1532 | 139
Machine learning in automated text categorization | Sebastiani | 2002 | Artificial Intelligence | 663 | 111
A Dynamic Theory of Organizational Knowledge Creation | Nonaka | 1994 | Knowledge Management | 1098 | 78
User acceptance of information technology: Toward a unified view | Venkatesh et al | 2003 | HCI | 379 | 76
Internet paradox: A social technology that reduces social involvement and psychological well-being? | Kraut et al | 1998 | Social Informatics | 509 | 51
Organizational Information Requirements, Media Richness and Structural Design | Daft and Lengel | 1986 | Workflow/Business Process Management | 1100 | 50
A Relational Model of Data for Large Shared Data Banks | Codd | 1970 | Data Management | 1269 | 33
Frictionless Commerce? A Comparison of Internet and Conventional Retailers | Brynjolfsson and Smith | 2000 | eCommerce | 263 | 33
Design Science in Information Systems Research | Hevner et al | 2004 | OTHER | 131 | 33

Top 10 Articles by Average Google Citations per Year:

Title | Author | Year | Category | Google Citations | Google Citations per Year
Review: Knowledge Management and Knowledge Management Systems: Conceptual Foundations and Research Issues | Alavi and Leidner | 2005 | Knowledge Management | 1593 | 531
Dynamic capabilities and strategic management | Teece et al | 1997 | OTHER | 5536 | 503
A Dynamic Theory of Organizational Knowledge Creation | Nonaka | 1994 | Knowledge Management | 5121 | 366
Machine learning in automated text categorization | Sebastiani | 2002 | Artificial Intelligence | 2056 | 343
User acceptance of information technology: Toward a unified view | Venkatesh et al | 2003 | HCI | 1055 | 211
The Nature of the Firm | R. H. Coase | 1937 | OTHER | 11151 | 157
The entity-relationship model toward a unified view of data | Peter Pin-Shan Chen | 1976 | Data Management | 4583 | 143
A Spiral Model of Software Development and Enhancement | Boehm | 1988 | System Analysis & Design | 2848 | 142
Frictionless Commerce? A Comparison of Internet and Conventional Retailers | Brynjolfsson and Smith | 2000 | eCommerce | 1129 | 141
Organizational Information Requirements, Media Richness and Structural Design | Daft and Lengel | 1986 | Workflow/Business Process Management | 2853 | 130
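The per-year columns in these tables are consistent with dividing an article's total citations by its age, counted from publication year to 2008. A short sketch of that normalization, using three rows from the Web of Knowledge table above:

```python
# Citations-per-year normalization used in the "Top 10 ... per Year" lists:
# total citations divided by (2008 - publication year).
papers = [
    ("Dynamic capabilities and strategic management", 1997, 1532),
    ("A Relational Model of Data for Large Shared Data Banks", 1970, 1269),
    ("A Dynamic Theory of Organizational Knowledge Creation", 1994, 1098),
]

def citations_per_year(citations, year, current_year=2008):
    return citations / (current_year - year)

for title, year, cites in sorted(
        papers, key=lambda p: citations_per_year(p[2], p[1]), reverse=True):
    print(f"{title[:55]:<55} {citations_per_year(cites, year):5.0f}")
# Prints 139, 78, and 33 citations per year, matching the table above.
```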
We also looked at the breakdown of categories and dimensions. Our biggest takeaway here was a lesson learned: as can be seen in the table below, all the dimension assessments regressed toward three, the middle of the scale. We should either have expanded the scale to 7 or 9 points to allow for more spread, or kept the original dimension opposites together (for instance, rigor vs. relevance). The dimension analysis could also speak to our lack of knowledge and experience about the MIS field as Ph.D. students.

Category | Theoretical | Applied | Rigor | Relevance | Review | Innovation | Technical | Behavioral
Artificial Intelligence | 2 | 3 | 2 | 3 | 2 | 3 | 3 | 1
Collaboration | 3 | 3 | 2 | 3 | 3 | 3 | 2 | 4
Data Management | 3 | 3 | 3 | 3 | 2 | 3 | 4 | 1
Decision Sciences | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 2
eCommerce | 3 | 4 | 3 | 4 | 2 | 2 | 2 | 2
Economics of Information | 3 | 3 | 2 | 4 | 3 | 3 | 2 | 3
HCI | 3 | 3 | 2 | 3 | 2 | 3 | 3 | 3
Information Assurance | 3 | 4 | 2 | 4 | 1 | 3 | 1 | 5
Knowledge Management | 3 | 2 | 2 | 3 | 2 | 3 | 1 | 3
Operations Management | 3 | 3 | 3 | 3 | 2 | 4 | 3 | 3
OTHER | 3 | 2 | 2 | 3 | 3 | 2 | 2 | 2
Social Informatics | 2 | 3 | 2 | 4 | 3 | 2 | 1 | 4
Supply Chain Management | 4 | 2 | 5 | 2 | 1 | 5 | 5 | 1
System Analysis & Design | 2 | 3 | 2 | 3 | 2 | 3 | 3 | 2
Workflow/Business Process Mgt | 3 | 3 | 3 | 4 | 3 | 3 | 3 | 2

Limitations and Future Research

Our research approach was to take the 185 key papers in MIS and add them to a relational database in order to perform analyses such as grouping by category, text mining, and citation analysis. The greatest limitation of our research was that we did not have enough time to perform as thorough an analysis as we would have liked, and our limited knowledge of data mining techniques limited our ability to accomplish an in-depth analysis of the key articles. Nevertheless, we have built a foundation for future work, because the database we created can save a great deal of preliminary and repetitive work. Future classes can use the database to conduct deeper analysis and research on the key MIS papers.

Conclusion

In this project, we extracted key information from the 185 MIS papers, added our own assessments of the papers' various characteristics, and stored everything in a relational database. We conducted categorization, text mining, and citation analysis based on our database, and in the process we discovered some interesting and valuable findings. However, due to our lack of relevant experience, some limitations remain in our work. Since we have done much of the foundational work, we believe future groups can generate better ideas and more scientific models based on our contributions. Over the past semester, each team member has learned a great deal from both individual work and team collaboration. This project was extremely beneficial in familiarizing us with the history, development, and present state of MIS. Finally, we thank Dr. Nunamaker and Chris Diller for their valuable suggestions and sincere help.
Appendix 1: User Reading Guide

Article Reading/Scoring Guide

As you are reading your articles, please attempt to extract the following data:

Article/Reviewer Data:
1) Article Name: ______________________________________________
2) Primary Author: ______________________________________________
3) Team Member Reviewing Article: _________________________________

Information for the Database:
1) Decision Tree Result: ________________________________________
2) 5 Key Words that describe the content of the article:
   _____________________  _____________________  _____________________
   _____________________  _____________________
3) Domain Type: Based upon your interpretation of the article, please rate each of the domains below by circling the number that most closely approximates your opinion.
   Theoretic   1 2 3 4 5
   Application 1 2 3 4 5
   Rigorous    1 2 3 4 5
   Relevance   1 2 3 4 5
   Review      1 2 3 4 5
   Innovative  1 2 3 4 5
   Technical   1 2 3 4 5
   Behavioral  1 2 3 4 5
4) Should this article be considered for removal from the corpus?  Y  N
5) How many times has this article been cited? ______________

Appendix 2: 2007 Classification Model

[Figure: the Class of 2007's classification decision tree. The root question is "Does the research describe, conceptualize or theorize a technological component?" Branch questions include: whether the technological component is treated as an unchanging, discrete entity (black box); whether there is a technological factor in the context, motivation, or background; whether technology is a solely independent variable; whether the component is utilized or viewed as a tool; whether the focus is on universal System Analysis and Design methodologies or a specific system design and/or implementation; whether the focus is on the application or the value of the technological component; whether the focus is on organizational or behavioral aspects, on perception of technology, diffusion, or capital, on interactions between humans and computers, on economic impact, or on social impacts of the technological component. Leaves include System Analysis and Design, IS Design/Implementation, IS Strategy Research About..., IS Meta Research About..., IS Organizational Research, HCI, Economics of Information, Social Informatics, NOT MIS, and a "distinction according to functionality" step leading to Data Management, Collaboration, Workflow/Business Process Management, Bioinformatics, eCommerce, Decision Sciences, Artificial Intelligence, Healthcare Systems, Information Assurance, Supply Chain Management, Operations Management, and Accounting Information Systems.]
Appendix 3: General SQL Queries

Papers by Category:

Category | # of Papers
Data Management | 39
System Analysis & Design | 34
Collaboration | 17
OTHER | 16
Decision Sciences | 14
Artificial Intelligence | 13
Economics of Information | 11
HCI | 10
Social Informatics | 10
Knowledge Management | 8
eCommerce | 5
Workflow/Business Process Management | 3
Supply Chain Management | 2
Operations Management | 2
Information Assurance | 1

Average Dimension Ratings by Category:

Category | Theoretical | Applied | Rigor | Relevance | Review | Innovation | Technical | Behavioral
Artificial Intelligence | 2 | 3 | 2 | 3 | 2 | 3 | 3 | 1
Collaboration | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3
Data Management | 4 | 3 | 2 | 3 | 3 | 3 | 2 | 3
Decision Sciences | 3 | 3 | 4 | 4 | 3 | 2 | 3 | 2
eCommerce | 3 | 3 | 3 | 3 | 2 | 3 | 2 | 4
Economics of Information | 2 | 2 | 2 | 4 | 1 | 2 | 2 | 3
HCI | 3 | 3 | 3 | 4 | 2 | 2 | 3 | 4
Information Assurance | 2 | 1 | 3 | 3 | 3 | 1 | 3 | 5
Knowledge Management | 3 | 2 | 2 | 3 | 2 | 3 | 1 | 3
Operations Management | 3 | 3 | 3 | 3 | 2 | 4 | 3 | 3
OTHER | 3 | 2 | 4 | 2 | 3 | 2 | 2 | 2
Social Informatics | 5 | 3 | 4 | 2 | 3 | 3 | 1 | 2
Supply Chain Management | 2 | 5 | 2 | 1 | 5 | 2 | 4 | 1
System Analysis & Design | 2 | 3 | 2 | 3 | 2 | 3 | 3 | 2
Workflow/Business Process Management | 3 | 3 | 3 | 4 | 3 | 3 | 3 | 2

Category Counts by Decade:

Decade | Category | Count of Papers
1930-1939 | OTHER | 1
1940-1949 | Knowledge Management | 1
1950-1959 | OTHER | 1
1960-1969 | Artificial Intelligence | 1
1960-1969 | Decision Sciences | 1
1960-1969 | HCI | 2
1960-1969 | Knowledge Management | 1
1960-1969 | OTHER | 2
1960-1969 | System Analysis & Design | 7
1970-1979 | Artificial Intelligence | 1
1970-1979 | Collaboration | 1
1970-1979 | Data Management | 10
1970-1979 | HCI | 1
1970-1979 | OTHER | 1
1970-1979 | Social Informatics | 1
1970-1979 | System Analysis & Design | 8
1980-1989 | Artificial Intelligence | 2
1980-1989 | Collaboration | 6
1980-1989 | Data Management | 12
1980-1989 | Decision Sciences | 4
1980-1989 | Economics of Information | 3
1980-1989 | HCI | 3
1980-1989 | Knowledge Management | 1
1980-1989 | Operations Management | 1
1980-1989 | OTHER | 2
1980-1989 | Social Informatics | 1
1980-1989 | System Analysis & Design | 7
1980-1989 | Workflow/Business Process Management | 1
1990-1999 | Artificial Intelligence | 6
1990-1999 | Collaboration | 9
1990-1999 | Data Management | 13
1990-1999 | Decision Sciences | 6
1990-1999 | eCommerce | 2
1990-1999 | Economics of Information | 6
1990-1999 | HCI | 2
1990-1999 | Information Assurance | 1
1990-1999 | Knowledge Management | 3
1990-1999 | Operations Management | 1
1990-1999 | OTHER | 7
1990-1999 | Social Informatics | 5
1990-1999 | Supply Chain Management | 1
1990-1999 | System Analysis & Design | 7
1990-1999 | Workflow/Business Process Management | 2
2000-2009 | Artificial Intelligence | 3
2000-2009 | Collaboration | 1
2000-2009 | Data Management | 4
2000-2009 | Decision Sciences | 3
2000-2009 | eCommerce | 3
2000-2009 | Economics of Information | 2
2000-2009 | HCI | 2
2000-2009 | Knowledge Management | 2
2000-2009 | OTHER | 2
2000-2009 | Social Informatics | 3
2000-2009 | Supply Chain Management | 1
2000-2009 | System Analysis & Design | 5

Appendix 4: Database Design

EndNote:
1. Using EndNote, export the references to a .txt file using the "SQL Export" style. This creates a text file of references in an XML format.
2. Add this as the first line of the text file: <?xml version="1.0" encoding="UTF-8" ?><xml><records>
3. Add this as the last line of the text file: </records></xml>
4. Rename the file to a .xml extension.
5. Open the file in an XML viewer and make sure the XML is well formed. The most likely changes needed to fix the XML are:
   a. Change & to &amp;
   b. Change < in abstract text to &lt;
   c. Change > in abstract text to &gt;
A small script can automate steps 2 through 5; see the sketch after this list.
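A minimal sketch of such a script, assuming the EndNote export is in a file named references.txt (a placeholder): it wraps the export in the root elements from steps 2-3, escapes bare ampersands, and verifies well-formedness. Angle brackets inside abstract text, as in steps 5b-5c, are context-dependent and may still need manual attention.

```python
# Sketch automating steps 2-5: wrap the EndNote "SQL Export" output in a
# root element, escape bare & characters, and check that the result parses.
# File names are placeholders; < and > inside abstract text (steps 5b-5c)
# may still need manual fixes.
import re
import xml.etree.ElementTree as ET

with open("references.txt", encoding="utf-8") as f:
    body = f.read()

# Escape ampersands that are not already part of an entity reference.
body = re.sub(r"&(?!(?:amp|lt|gt|quot|apos);)", "&amp;", body)

xml_text = ('<?xml version="1.0" encoding="UTF-8" ?><xml><records>'
            + body + "</records></xml>")

ET.fromstring(xml_text)   # raises ParseError if the XML is not well formed

with open("references.xml", "w", encoding="utf-8") as f:
    f.write(xml_text)
```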
Appendix 4: Database Design

EndNote
1. Using EndNote, export the references to a .txt file using the "SQL Export" style. This creates a text file of references in XML format.
2. Add this as the first line of the text file: <?xml version="1.0" encoding="UTF-8" ?><xml><records>
3. Add this as the last line of the text file: </records></xml>
4. Rename the file with an .xml extension.
5. Open the file in an XML viewer and make sure the XML is well formed. The most likely changes needed to fix the XML are:
   a. Change & to &amp;
   b. Change < in abstract text to &lt;
   c. Change > in abstract text to &gt;

Access
6. Open a new Access database.
7. Select External Data > XML File for import and select the XML file from step 5. Choose "Structure Only" the first time, because some of the field formats need to be updated.
8. Open the record table in Design view and change the following fields:
   a. Change recnum to Number
   b. Change abstract to Memo
   c. Change note to Memo
   d. Change keywords to Memo
9. Select External Data > XML File for import, select the XML file again, and this time choose to append the data to the existing table.
10. Save the database as an .mdb file for import into SQL Server if you are using Office 2007 (whose default format is .accdb).

SQL
11. From the CodifyNet database, select Tasks > Import Data.
12. Choose Access as the input data source and select the .mdb database created above.
13. Choose the ADAR CodifyNet database as the destination data source (use a SQL login if doing this remotely).
14. Choose "Copy data from one or more tables or views."
15. For a first-time load:
   a. You will see the Access record table mapped to a SQL record table. Change the name of the destination table to Papers.
   b. Choose to create the table.
   c. Select Edit Mappings and change the following: reftype to nvarchar(50); refyear to nvarchar(20); volume to nvarchar(20); number to nvarchar(20); pages to nvarchar(20).
16. For a subsequent load:
   a. Select the Papers table from the dropdown.
   b. Choose to append data to the existing table.
17. Run the package.
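Steps 2 through 5a lend themselves to a small script. The sketch below (Python; the file names are assumed for illustration) wraps the exported records and escapes bare ampersands; stray < or > characters inside abstracts (steps 5b and 5c) will surface as parse errors and still need the hand fixes described above.

import re
import xml.dom.minidom

# File names are assumptions for illustration.
with open("endnote_export.txt", encoding="utf-8") as f:
    body = f.read()

# Step 5a: escape ampersands that are not already part of an entity.
body = re.sub(r"&(?!amp;|lt;|gt;|quot;|apos;|#\d+;)", "&amp;", body)

# Steps 2-4: wrap the records and save with an .xml extension.
with open("endnote_export.xml", "w", encoding="utf-8") as f:
    f.write('<?xml version="1.0" encoding="UTF-8" ?><xml><records>\n')
    f.write(body)
    f.write("\n</records></xml>")

# Step 5: verify the result parses; stray < or > in abstract text
# (steps 5b and 5c) will raise here and must be corrected by hand.
xml.dom.minidom.parse("endnote_export.xml")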
Appendix 5: Data Mining Analysis – Data

[Figures: characteristics-and-discrimination output generated by the Naïve Bayes model for each category, followed by the Microsoft Clustering cluster profiles:]
Naïve Bayes – Artificial Intelligence (Characteristics and Discrimination)
Naïve Bayes – Collaboration
Naïve Bayes – Data Management
Naïve Bayes – Decision Sciences
Naïve Bayes – eCommerce
Naïve Bayes – Economics of Information
Naïve Bayes – Human Computer Interaction
Naïve Bayes – Knowledge Management
Naïve Bayes – Operations Management
Naïve Bayes – Social Informatics
Naïve Bayes – Supply Chain Management
Naïve Bayes – Systems Analysis & Design
Naïve Bayes – Workflow/Business Process Management
Naïve Bayes – Other
Microsoft Clustering – Cluster 1
Microsoft Clustering – Cluster 2
Microsoft Clustering – Cluster 3
Microsoft Clustering – Cluster 4

Appendix 6: Clustering Analysis

[Figure: distribution of the Economics of Information papers across the seven clusters]
Cluster 1: 37%
Cluster 2: 18%
Cluster 3: 0%
Cluster 4: 0%
Cluster 5: 36%
Cluster 6: 0%
Cluster 7: 9%

Appendix 7: Clustering Results

Cluster | No. | Title | Year | Authors | Domain
1 | 45 | A Spiral Model of Software Development and Enhancement | 1988 | Barry W. Boehm | System Analysis & Design
1 | 217 | Paradox Lost? Firm-Level Evidence on the Returns to Information Systems Spending | 1996 | Erik Brynjolfsson, Lorin Hitt | Economics of Information
1 | 41 | Management Misinformation Systems | 1967 | Russell L. Ackoff | System Analysis & Design
1 | 232 | Information Technology Implementation Research: A Technological Diffusion Approach | 1990 | R. B. Cooper, R. W. Zmud | Information Assurance
1 | 218 | Pricing Computer Services: Queueing Effects | 1985 | Haim Mendelson | Economics of Information
1 | 119 | A Response to "Assessing Research Productivity: Important But Neglected Considerations" | 1998 | Kun Shin Im, Kee Young Kim, Joon S. Kim | Social Informatics
1 | 125 | Computer Support for Meetings of Groups Working on Unstructured Problems: A Field Experiment | 1988 | Sirkka L. Jarvenpaa, V. Srinivasan Rao, George P. Huber | Knowledge Management
1 | 235 | Information Technology Adoption Across Time: A Cross-Sectional Comparison of Pre-Adoption and Post-Adoption Beliefs | 1999 | E. Karahanna, D. W. Straub, N. L. Chervany | Social Informatics
1 | 133 | Trust-Related Arguments in Internet Stores: A Framework for Evaluation | 2003 | Dongmin Kim, Izak Benbasat | eCommerce
1 | 192 | An Experimental Investigation of the Impact of Computer Based Decision Aids on Decision Making Strategies | 1991 | Peter Todd, Izak Benbasat | Decision Sciences
1 | 194 | Computer Mediated Communication Requirements for Group Support | 1991 | M. Turoff | Collaboration
1 | 197 | Using a GDSS to Facilitate Group Consensus: Some Intended and Unintended Consequences | 1988 | Richard T. Watson, Geraldine DeSanctis, Marshall Scott Poole | Collaboration
1 | 219 | The Economics of Organization: The Transaction Cost Approach | 1981 | Oliver E. Williamson | Economics of Information
1 | 139 | Computerization and Social Transformations | 1991 | Rob Kling | Social Informatics
1 | 146 | A Theory of Attributed Equivalence in Databases with Application to Schema Integration | 1989 | J. A. Larson, S. B. Navathe, R. Elmasri | Data Management
1 | 148 | Winning the Last Mile of E-Commerce | 2001 | Hau L. Lee, Seungjin Whang | eCommerce
1 | 150 | Man-Computer Symbiosis | 1960 | J. C. R. Licklider | System Analysis & Design
1 | 151 | Electronic Markets and Electronic Hierarchies | 1987 | Thomas W. Malone, Joanne Yates, Robert I. Benjamin | Economics of Information
1 | 164 | The Usability Engineering Life-Cycle | 1992 | J. Nielsen | HCI
1 | 165 | Finding Usability Problems Through Heuristic Evaluation | 1992 | J. Nielsen | System Analysis & Design
1 | 167 | Improving System Usability Through Parallel Design | 1996 | J. Nielsen, J. M. Faber | System Analysis & Design
1 | 170 | Future Research in Group Support Systems: Needs, Some Questions and Possible Directions | 1997 | J. F. Nunamaker | Collaboration
1 | 4 | ZOG: A Man-Machine Communication Philosophy | 1977 | G. Robertson, Allen Newell, K. Ramakrishna | HCI
1 | 7 | Managing the Development of Large Software Systems: Concepts and Techniques | 1970 | Winston W. Royce | System Analysis & Design
1 | 15 | Critical Success Factor Analysis as a Methodology for MIS Planning | 1985 | Michael E. Shank, Andrew C. Boynton, Robert W. Zmud | Decision Sciences
1 | 21 | Direct Manipulation: A Step Beyond Programming Languages | 1983 | B. Shneiderman | HCI
1 | 27 | Reducing Social Context Cues: Electronic Mail in Organizational Communication | 1986 | Lee Sproull, Sara Kiesler | Collaboration
1 | 28 | Thinking About Implementation | 1986 | Lee S. Sproull, Kay R. Hofmeister | Operations Management
2 | 230 | Testing the Interactivity Model: Communication Processes, Partner Assessments, and the Quality of Collaborative Work | 1999 | J. K. Burgoon, J. A. Bonito, B. Bengtsson, A. Ramirez, N. E. Dunbar, N. Miczo | Collaboration
2 | 43 | As We May Think | 1945 | Vannevar Bush | Knowledge Management
2 | 63 | Evaluation of Strategic Investments in Information Technology | 1991 | Eric K. Clemons | Economics of Information
2 | 64 | The Nature of the Firm | 1937 | R. H. Coase | OTHER
2 | 222 | A Resource-Based Theory of the Firm: Knowledge versus Opportunism | 1996 | Kathleen R. Conner, C. K. Prahalad | OTHER
2 | 70 | Organizational Information Requirements, Media Richness and Structural Design | 1986 | Richard L. Daft, Robert H. Lengel | Workflow/Business Process Management
2 | 234 | Media, Tasks, and Communication Processes: A Theory of Media Synchronicity | 2008 | Alan R. Dennis, Robert M. Fuller, Joseph S. Valacich | Collaboration
2 | 96 | Augmenting Human Intellect: A Conceptual Framework | 1962 | D. C. Engelbart | Knowledge Management
2 | 98 | Dynamic Competition with Switching Costs | 1988 | Joseph Farrell, Carl Shapiro | OTHER
2 | 68 | Effects of Anonymity and Evaluative Tone on Idea Generation in Computer-Mediated Groups | 1990 | Terry Connolly, Leonard M. Jessup, Joseph S. Valacich | Collaboration
2 | 103 | Effective View Navigation | 1997 | George W. Furnas | Decision Sciences
2 | 106 | Electronic Brainstorming and Group Size | 1992 | R. B. Gallupe, A. R. Dennis, W. H. Cooper, J. S. Valacich, L. M. Bastianutti, J. F. Nunamaker | Collaboration
2 | 111 | The Impact of Information Systems on Organizations and Markets | 1991 | Vijay Gurbaxani, Seungjin Whang | Economics of Information
2 | 124 | Is Anybody Out There? Antecedents of Trust in Global Virtual Teams | 1998 | Sirkka L. Jarvenpaa, Kathleen Knoll, Dorothy E. Leidner | Collaboration
2 | 175 | Using Technology and Constituting Structures: A Practice Lens for Studying Technology in Organizations | 2000 | Wanda J. Orlikowski | OTHER
2 | 182 | Information Foraging | 1999 | P. Pirolli, S. Card | Data Management
2 | 228 | The Firm as a Distributed Knowledge System: A Constructionist Approach | 1996 | Haridimos Tsoukas | Knowledge Management
2 | 238 | User Acceptance of Information Technology: Toward a Unified View | 2003 | V. Venkatesh, M. G. Morris, G. B. Davis, F. D. Davis | HCI
2 | 236 | The Psychobiological Model: Towards a New Theory of Computer-Mediated Communication Based on Darwinian Evolution | 2004 | N. Kock | HCI
2 | 237 | Unraveling the Temporal Fabric of Knowledge Conversion: A Model of Media Selection and Use | 2006 | A. P. Massey, M. M. Montoya-Weiss | Knowledge Management
2 | 162 | Computer Science as Empirical Inquiry: Symbols and Search | 1976 | A. Newell, H. A. Simon | Artificial Intelligence
2 | 226 | A Dynamic Theory of Organizational Knowledge Creation | 1994 | Ikujiro Nonaka | Knowledge Management
2 | 10 | Toward a New Politics of Intellectual Property | 2001 | Pamela Samuelson | Social Informatics
2 | 227 | Dynamic Capabilities and Strategic Management | 1997 | David J. Teece, Gary Pisano, Amy Shuen | OTHER
3 | 52 | Frictionless Commerce? A Comparison of Internet and Conventional Retailers | 2000 | Erik Brynjolfsson, Michael D. Smith | eCommerce
3 | 59 | System R: Relational Approach to Database Management | 1976 | M. M. Astrahan, M. W. Blasgen, D. D. Chamberlin, K. P. Eswaran, J. N. Gray, P. P. Griffiths, W. F. King, R. A. Lorie, P. R. McJones, J. W. Mehl, G. R. Putzolu, I. L. Traiger, B. W. Wade, V. Watson | Data Management
3 | 55 | Data Model Issues for Object-Oriented Applications | 1987 | J. Banerjee, H. T. Chou, J. F. Garza, W. Kim, D. Woelk, N. Ballou, H. J. Kim | Data Management
3 | 60 | AI and Computational Science: Implementing Fuzzy Expert System for Intelligent Buildings | 2003 | Elva Corona, Carlos A. Reyes-Garcia | Artificial Intelligence
3 | 62 | The Entity-Relationship Model: Toward a Unified View of Data | 1976 | Peter Pin-Shan Chen | Data Management
3 | 82 | The Category Concept: An Extension to the Entity-Relationship Model | 1985 | R. Elmasri, J. Weeldreyer, A. Hevner | Data Management
3 | 84 | A Note on Two Problems in Connection with Graphs | 1959 | Edsger W. Dijkstra | OTHER
3 | 85 | Go To Statement Considered Harmful | 1968 | Edsger W. Dijkstra | System Analysis & Design
3 | 94 | On Spatial Database Integration | 1998 | Thomas Devogele, Christine Parent, Stefano Spaccapietra | Data Management
3 | 211 | Software Reuse Research: Status and Future | 2005 | W. B. Frakes, Kyo Kang | System Analysis & Design
3 | 102 | Generalized Fisheye Views | 1986 | G. W. Furnas | HCI
3 | 105 | Computer-Mediated Communication for Intellectual Teamwork: An Experiment in Group Writing | 1994 | J. Galegher, R. E. Kraut | Collaboration
3 | 223 | Design Science in Information Systems Research | 2004 | A. R. Hevner, S. T. March, J. Park, S. Ram | OTHER
3 | 131 | Knowledge Representation for Commonsense Reasoning with Text | 1989 | Kathleen Dahlgren, Joyce McDowell, Edward P. Stabler | Artificial Intelligence
3 | 134 | On Optimizing an SQL-like Nested Query | 1982 | Won Kim | Data Management
3 | 196 | The Inter-Database Instance Identification Problem in Integrating Autonomous Systems | 1989 | Y. Richard Wang, Stuart E. Madnick | Data Management
3 | 198 | Embedding Web-Based Statistical Translation Models in Cross-Language Information Retrieval | 2003 | Jian-Yun Nie, Wessel Kraaij, Michel Simard |
3 | 229 | A Taxonomy of Part-Whole Relations | 1987 | Morton E. Winston, Roger Chaffin, Douglas Herrmann | OTHER
3 | 141 | Dynamic Configuration for Distributed Systems | 1985 | Jeff Kramer, Jeff Magee | System Analysis & Design
3 | 143 | Managing Evolution in Distributed Systems | 1989 | Jeff Kramer, Jeff Magee, M. S. Sloman | System Analysis & Design
3 | 152 | Allocating Data and Operations to Nodes in Distributed Database Design | 1995 | Salvatore T. March, Sangkyu Rho | Data Management
3 | 154 | Distributed Data Base Management: Some Thoughts and Analyses | 1980 | C. Mohan | Data Management
3 | 159 | Report on a General Problem-Solving Program | 1960 | A. Newell, J. C. Shaw, H. A. Simon | Artificial Intelligence
3 | 8 | Automatic Information Retrieval | 1980 | G. Salton | Artificial Intelligence
3 | 12 | Machine Learning in Automated Text Categorization | 2002 | Fabrizio Sebastiani | Artificial Intelligence
3 | 13 | Knowledge Compilation and Theory Approximation | 1996 | Bart Selman, Henry Kautz | Knowledge Management
3 | 24 | View Integration: A Step Forward in Solving Structural Conflicts | 1994 | S. Spaccapietra, C. Parent | Data Management
3 | 25 | Model Independent Assertions for Integration of Heterogeneous Schemas | 1992 | Stefano Spaccapietra, Christine Parent, Yann Dupont | Data Management
3 | 189 | Ontologies for Conceptual Modeling: Their Creation, Use, and Management | 2002 | Vijayan Sugumaran, Veda C. Storey | Data Management
4 | 54 | Branch-and-Price: Column Generation for Solving Huge Integer Programs | 1998 | Cynthia Barnhart, Ellis L. Johnson, George L. Nemhauser, Martin W. P. Savelsbergh, Pamela H. Vance | Operations Management
4 | 47 | DSS Design: A Systematic View of Decision Support | 1985 | Gad Ariav, Michael J. Ginzberg | Collaboration
4 | 57 | An Evaluation of Research Productivity in Academic IT | 2000 | Susan Athey, John Plotnicki | Decision Sciences
4 | 221 | Why and Where: A Characterization of Data Provenance | 2001 | P. Buneman | Data Management
4 | 53 | Supply Chain Inventory Management and the Value of Shared Information | 2000 | Gerard P. Cachon, Marshall L. Fisher | Supply Chain Management
4 | 50 | SEQUEL: A Structured English Query Language | 1974 | Donald D. Chamberlin, Raymond F. Boyce | Data Management
4 | 209 | System Test Planning of Software: An Optimization Approach | 2006 | K. Chari, A. Hevner | System Analysis & Design
4 | 61 | A Machine Learning Approach to Inductive Query by Examples: An Experiment Using Relevance Feedback, ID3, Genetic Algorithms, and Simulated Annealing | 1998 | H. Chen, G. Shankaranarayanan, A. Iyer, L. She | Artificial Intelligence
4 | 65 | A Relational Model of Data for Large Shared Data Banks | 1970 | E. F. Codd | Data Management
4 | 66 | Relational Completeness of Data Base Sublanguages | 1972 | E. F. Codd | Data Management
4 | 67 | Extending the Database Relational Model to Capture More Meaning | 1979 | E. F. Codd | Data Management
4 | 72 | Simula: An ALGOL-Based Simulation Language | 1966 | Ole-Johan Dahl, Kristen Nygaard | OTHER
4 | 73 | Decomposition Principle for Linear Programs | 1960 | George B. Dantzig, Philip Wolfe | Decision Sciences
4 | 239 | The Working Set Model for Program Behavior | 1968 | Peter J. Denning | System Analysis & Design
4 | 101 | The Lagrangian Relaxation Method for Solving Integer Programming Problems | 1981 | Marshall L. Fisher | Decision Sciences
4 | 109 | An Overview of Workflow Management: From Process Modeling to Workflow Automation Infrastructure | 1995 | Dimitrios Georgakopoulos, Mark Hornick, Amit P. Sheth | Workflow/Business Process Management
4 | 113 | Implementing Data Cubes Efficiently | 1996 | Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman | Decision Sciences
4 | 114 | Multi-User View Integration System (MUVIS): An Expert System for View Integration | 1990 | Stephen Hayne, Sudha Ram | System Analysis & Design
4 | 145 | On Applications of Differential Equations in General Problem Solving | 1966 | Robert N. Kubik | OTHER
4 | 240 | Some Approaches to the Theory of Information Systems | 1963 | Borje Langefors | System Analysis & Design
4 | 147 | Information Distortion in a Supply Chain: The Bullwhip Effect | 1997 | Hau L. Lee, V. Padmanabhan, Seungjin Whang | Supply Chain Management
4 | 225 | The Semantic Data Model: A Modelling Mechanism for Data Base Applications | 1978 | Michael Hammer, Dennis McLeod | Data Management
4 | 14 | InfoHarness: An Information Integration Platform for Managing Distributed, Heterogeneous Information | 1999 | K. Shah, Amit P. Sheth | Data Management
4 | 23 | Database Abstractions: Aggregation and Generalization | 1977 | J. M. Smith, D. C. P. Smith | Data Management
4 | 186 | The Design of the POSTGRES Storage System | 1987 | Michael Stonebraker | Data Management
4 | 191 | A Logical Design Methodology for Relational Databases Using the Extended Entity-Relationship Model | 1986 | Toby J. Teorey, Dongqing Yang, James P. Fry | Data Management
5 | 44 | Information Technologies and Business Value: An Analytic and Empirical Investigation | 1995 | Anitesh Barua, Charles H. Kriebel, Tridas Mukhopadhyay | Economics of Information
5 | 48 | A Comparative Analysis of Methodologies for Database Schema Integration | 1986 | C. Batini, M. Lenzerini, S. B. Navathe | Data Management
5 | 58 | Kevin Bacon, Degrees-of-Separation, and MIS Research | 2002 | Paul Beckman, Asa Forsman | Social Informatics
5 | 46 | The Productivity Paradox of Information Technology | 1993 | Erik Brynjolfsson | Economics of Information
5 | 216 | The Evolution of Research on Information Systems: A Fiftieth-Year Survey of the Literature in Management Science | 2004 | Rajiv D. Banker, Robert J. Kauffman | Decision Sciences
5 | 231 | Interactions Between System Evaluation and Theory Testing: A Demonstration of the Power of a Multifaceted Approach to Information Systems Research | 2006 | J. W. Cao, J. M. Crews, M. Lin, A. Deokar, J. K. Burgoon, J. F. Nunamaker | System Analysis & Design
5 | 69 | Process Modeling | 1992 | Bill Curtis, Marc I. Kellner, Jim Over | Data Management
5 | 74 | Information Technology and Economic Performance: A Critical Review of the Empirical Evidence | 2003 | Jason Dedrick, Vijay Gurbaxani, Kenneth L. Kraemer | Economics of Information
5 | 83 | Information Technology and Productivity: Evidence from Country-Level Data | 2000 | Sanjeev Dewan, Kenneth L. Kraemer | Economics of Information
5 | 92 | Office Information Systems and Computer Science | 1980 | Clarence A. Ellis, Gary J. Nutt | System Analysis & Design
5 | 99 | Dendral and Meta-Dendral: Roots of Knowledge Systems and Expert System Applications | 1993 | E. A. Feigenbaum, B. G. Buchanan | Artificial Intelligence
5 | 100 | Inconsistency Handling in Multiperspective Specifications | 1994 | Anthony C. W. Finkelstein, Dov Gabbay, Anthony Hunter, Jeff Kramer, Bashar Nuseibeh | Data Management
5 | 104 | The Vocabulary Problem in Human-System Communication | 1987 | G. W. Furnas, T. K. Landauer, L. M. Gomez, S. T. Dumais | Data Management
5 | 115 | Decision Making and Problem Solving | 1986 | Herbert A. Simon, George B. Dantzig, Robin Hogarth, Charles R. Plott, Howard Raiffa, Thomas C. Schelling, Kenneth A. Shepsle, Richard Thaler, Amos Tversky, Sidney Winter | Decision Sciences
5 | 118 | Semantic Database Modeling: Survey, Applications, and Research Issues | 1987 | Richard Hull, Roger King | Data Management
5 | 120 | An Assessment of Individual and Institutional Research Productivity in MIS | 1998 | Kun Shin Im, Kee Young Kim, Joon S. Kim |
5 | 130 | Natural Language Processing for Information Retrieval | 1996 | K. S. Jones | Artificial Intelligence
5 | 135 | Comparing Data Modeling Formalisms | 1995 | Y. G. Kim, S. T. March | Data Management
5 | 172 | Systems Development in Information Systems Research | 1990-1991 | J. F. Nunamaker, Jr., Minder Chen, Titus D. M. Purdin | OTHER
5 | 180 | Software Engineering Programs Are Not Computer Science Programs | 1999 | David Lorge Parnas | OTHER
5 | 140 | What Is Social Informatics and Why Does It Matter? | 1999 | Rob Kling | Social Informatics
5 | 149 | The Computer as a Communication Device | 1968 | J. C. R. Licklider, R. W. Taylor, E. Herbert | HCI
5 | 157 | A Brief History of Human-Computer Interaction Technology | 1998 | Brad A. Myers | HCI
5 | 1 | Guest Editor's Introduction: Heterogeneous Distributed Database Systems | 1991 | Sudha Ram | Data Management
5 | 9 | Copyright's Fair Use Doctrine and Digital Data | 1994 | Pamela Samuelson | OTHER
5 | 26 | A Framework for the Development of Decision Support Systems | 1980 | Ralph H. Sprague | Decision Sciences
5 | 241 | Computer Simulation: Discussion of the Technique and Comparison of Languages | 1966 | Daniel Teichroew, John F. Lubin | System Analysis & Design
5 | 207 | E-Commerce: Structures and Issues | 1996 | Vladimir Zwass | eCommerce
6 | 56 | Semantics and Implementation of Schema Evolution in Object-Oriented Databases | 1987 | Jay Banerjee, Won Kim, Hyoung-Joo Kim, Henry F. Korth | Data Management
6 | 78 | Programming-in-the-Large Versus Programming-in-the-Small | 1975 | Frank DeRemer, Hans Kron | System Analysis & Design
6 | 86 | Notes on Structured Programming | 1969 | Edsger W. Dijkstra | System Analysis & Design
6 | 95 | Aspect-Oriented Programming: Introduction | 2001 | Tzilla Elrad, Robert E. Filman, Atef Bader | System Analysis & Design
6 | 97 | Display-Selection Techniques for Text Manipulation | 1967 | W. K. English, D. C. Engelbart, M. L. Berman | HCI
6 | 110 | Context Interchange: Overcoming the Challenges of Large-Scale Interoperable Database Systems in a Dynamic Environment | 1994 | Cheng Hian Goh, Stuart E. Madnick, Michael D. Siegel | Data Management
6 | 210 | Service-Oriented Computing: Key Concepts and Principles | 2005 | M. N. Huhns, M. P. Singh | System Analysis & Design
6 | 224 | Information Systems Interoperability: What Lies Beneath? | 2004 | Jinsoo Park, Sudha Ram | Data Management
6 | 132 | Querying Object-Oriented Databases | 1992 | Michael Kifer, Won Kim, Yehoshua Sagiv | Data Management
6 | 177 | On the Criteria To Be Used in Decomposing Systems into Modules | 1972 | David Lorge Parnas | System Analysis & Design
6 | 178 | A Technique for Software Module Specification with Examples | 1972 | David Lorge Parnas | System Analysis & Design
6 | 179 | On the Design and Development of Program Families | 1976 | David Lorge Parnas | System Analysis & Design
6 | 181 | Machine Learning Comprehension Grammars for Ten Languages | 1996 | Patrick Suppes, Lin Liang, Michael Buettner | Artificial Intelligence
6 | 201 | A Language/Action Perspective on the Design of Cooperative Work | 1988 | Terry Winograd | System Analysis & Design
6 | 142 | The Evolving Philosophers Problem: Dynamic Change Management | 1990 | Jeff Kramer, Jeff Magee | System Analysis & Design
6 | 153 | The Determination of Efficient Record Segmentations and Blocking Factors for Shared Data Files | 1977 | Salvatore T. March, Dennis G. Severance | Data Management
6 | 158 | The Draco Approach to Constructing Software from Reusable Components | 1984 | James M. Neighbors | System Analysis & Design
6 | 2 | Intelligent Database Design Using the Unifying Semantic Model | 1995 | Sudha Ram | Data Management
6 | 3 | DENDRAL: A Case Study of the First Expert System for Scientific Hypothesis Formation | 1993 | Robert K. Lindsay, Bruce G. Buchanan, Edward A. Feigenbaum, Joshua Lederberg | Artificial Intelligence
6 | 5 | A Methodology for Learning Across Application Domains for Database Design Systems | 2002 | V. C. Storey, D. Dey | Data Management
6 | 6 | Learning to Reason | 1997 | Dan Roth, Roni Khardon | Artificial Intelligence
6 | 18 | Semantic Content Management for Enterprises and the Web | 2002 | Amit P. Sheth, C. Bertram, D. Avant, B. Hammond, K. Kochut, Y. Warke | Artificial Intelligence
6 | 19 | Managing Heterogeneous Multi-System Tasks to Support Enterprise-Wide Operations | 1995 | Amit P. Sheth, N. Krishnakumar | Workflow/Business Process Management
6 | 184 | Beyond the Chalkboard: Computer Support for Collaboration and Problem Solving in Meetings | 1987 | M. Stefik, G. Foster, D. G. Bobrow, K. Kahn, S. Lanning, L. Suchman | Collaboration
6 | 185 | Structured Analysis (SA): A Language for Communicating Ideas | 1976 | Douglas T. Ross | System Analysis & Design
6 | 187 | Structured Design | 1974 | W. P. Stevens, G. J. Myers, Larry L. Constantine | System Analysis & Design
6 | 188 | The Design and Implementation of INGRES | 1976 | Michael Stonebraker, Gerald Held, Eugene Wong, Peter Kreps | Data Management
6 | 190 | A Comparative Analysis of Disk Scheduling Policies | 1972 | Toby J. Teorey, Tad B. Pinkerton | OTHER
6 | 204 | Program Development by Stepwise Refinement | 1971 | Niklaus Wirth | System Analysis & Design
7 | 32 | Anchoring the Software Process | 1996 | Barry W. Boehm | System Analysis & Design
7 | 42 | Software Risk Management | 1997 | Barry W. Boehm, Tom DeMarco | System Analysis & Design
7 | 220 | Review: Knowledge Management and Knowledge Management Systems: Conceptual Foundations and Research Issues | 2005 | M. Alavi, D. E. Leidner | Knowledge Management
7 | 51 | Bundling Information Goods: Pricing, Profits, and Efficiency | 1999 | Yannis Bakos, Erik Brynjolfsson | eCommerce
7 | 233 | User Acceptance of Computer Technology: A Comparison of Two Theoretical Models | 1989 | F. D. Davis, R. P. Bagozzi, P. R. Warshaw | Social Informatics
7 | 79 | Toward Friendly User MIS Implementation | 1983 | Geraldine DeSanctis, James F. Courtney | System Analysis & Design
7 | 80 | A Foundation for the Study of Group Decision Support Systems | 1987 | Gerardine DeSanctis, R. Brent Gallupe | Collaboration
7 | 81 | Using Computing in Quality Team Meetings: Initial Observations from the IRS-Minnesota Project | 1991 | Geraldine DeSanctis, Marshall Scott Poole, Howard Lewis, George Desharnais | Decision Sciences
7 | 89 | Groupware: Some Issues and Experiences | 1991 | Clarence A. Ellis, Simon J. Gibbs, Gail Rein | Collaboration
7 | 116 | Issues in the Design of Group Decision Support Systems | 1984 | G. P. Huber | Collaboration
7 | 126 | Social Evaluations of Teleconferencing | 1977 | R. Johansen | Social Informatics
7 | 173 | Information Technology for Negotiating Groups: Generating Options for Mutual Gain | 1991 | J. F. Nunamaker, A. R. Dennis, J. S. Valacich, D. R. Vogel | System Analysis & Design
7 | 174 | Electronic Meeting Systems to Support Group Work | 1991 | J. F. Nunamaker, A. R. Dennis, J. S. Valacich, D. R. Vogel, J. F. George | Collaboration
7 | 176 | Studying Information Technology in Organizations: Research Approaches and Assumptions | 1991 | Wanda J. Orlikowski, J. J. Baroudi | OTHER
7 | 193 | Delphi and Its Potential Impact on Information Systems | 1971 | M. Turoff | Collaboration
7 | 138 | An Empirical Assessment of the Organization of Transnational Information Systems | 1999 | William R. King, Vikram Sethi | Decision Sciences
7 | 144 | Internet Paradox: A Social Technology That Reduces Social Involvement and Psychological Well-Being? | 1998 | Robert Kraut, Michael Patterson, Vicki Lundmark, Sara Kiesler, Tridas Mukophadhyay, William Scherlis | Social Informatics
7 | 161 | The Prospects for Psychological Science in Human-Computer Interaction | 1985 | Allen Newell, Stuart K. Card | HCI
7 | 171 | Lessons from a Dozen Years of Group Support Systems Research: A Discussion of Lab and Field Findings | 1996-1997 | J. F. Nunamaker, Jr., R. O. Briggs, D. D. Mittleman, D. R. Vogel, P. A. Balthazard | Collaboration
7 | 11 | Social Informatics in the Information Sciences: Current Activities and Emerging Directions | 2000 | S. Sawyer, H. Rosenbaum | Social Informatics
7 | 16 | Versioning: The Smart Way to Sell Information | 1998 | Carl Shapiro, Hal R. Varian | Economics of Information
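For reference, the per-cluster percentages reported in Appendix 6 follow from a simple tally of the Appendix 7 rows for a single domain. A minimal sketch (Python), with only a few of the 185 rows shown:

from collections import Counter

# (cluster, domain) pairs as in Appendix 7; only a few rows are listed here.
rows = [
    (1, "Economics of Information"),   # Paradox Lost? (no. 217)
    (1, "Economics of Information"),   # Pricing Computer Services (no. 218)
    (2, "Economics of Information"),   # Evaluation of Strategic Investments (no. 63)
    (1, "System Analysis & Design"),   # A Spiral Model (no. 45)
]

domain = "Economics of Information"
counts = Counter(cluster for cluster, d in rows if d == domain)
total = sum(counts.values())
for cluster in sorted(counts):
    print(f"Cluster {cluster}: {counts[cluster] / total:.0%}")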