CLOUD DATABASE QUERY

A Project

Presented to the faculty of the Department of Computer Science
California State University, Sacramento

Submitted in partial satisfaction of the requirements for the degree of

MASTER OF SCIENCE
in
Computer Science

by
Jinsheng Li

FALL 2013

Approved by:
Ying Jin, Ph.D., Committee Chair
Mary J Lee, Ph.D., Second Reader

Student: Jinsheng Li

I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and credit is to be awarded for the project.

Nikrouz Faroughi, Ph.D., Graduate Coordinator
Department of Computer Science

Abstract of CLOUD DATABASE QUERY by Jinsheng Li

The birth of cloud computing marks a shift in computing architecture and in data processing mechanisms. The computing infrastructure has moved from local machines to the network in order to manage large-scale data in the cloud environment. Traditional database systems are challenged by the gap between the large amount of data being produced and the relatively limited size of the traditional databases used to store it. Additionally, the needs for data availability, reliability, and scalability have given rise to a new movement in data models and data storage: NoSQL (Not Only SQL, or non-relational databases).

In this project, an example with designed scenarios is applied to study queries over four selected database systems from various popular vendors. These four database systems are: Oracle Database 11g Express Edition, Oracle Public Cloud, Amazon SimpleDB, and Google App Engine Datastore.
We studied the data definition, database population, and query languages in all four systems with the scenarios designed from our example. A discussion of data definition, query syntax, data types in queries, and database population over these four database systems, based on our study, is also provided. The aim of this project is to help our students learn and understand how data is stored in and retrieved through queries from cloud databases. Two applications are developed and implemented in Java with the Eclipse IDE to show how queries are processed at the application level.

Ying Jin, Ph.D., Committee Chair

ACKNOWLEDGEMENTS

Since I began my study in the Master's program, I have received a great deal of invaluable assistance from the people around me. I want to say a BIG THANK YOU to those who have helped and encouraged me in completing my Master's degree. I would like to thank my project advisor Dr. Ying Jin for her guidance throughout this project and for her constructive input. I would also like to thank my second advisor Dr. Mary J Lee for taking time to review my report and give me advice. I am also very thankful to my parents and my fiancé for their unlimited support and encouragement throughout my entire graduate study. Special thanks to my friends Steve Hubbard and Wayne Fong for their friendship and assistance over the years.

TABLE OF CONTENTS

Acknowledgements
List of Tables
List of Figures
Chapter
1. INTRODUCTION
2. BACKGROUND
   2.1 Cloud Databases
   2.2 Research of Cloud Data Services on Vendors
       2.2.1 Data Models
       2.2.2 Deployment Models
3. CASE STUDY
   3.1 Motivating Example
   3.2 Implementation
       3.2.1 Oracle Database 11g Express Edition
             3.2.1.1 Data Definition and Populating Database
             3.2.1.2 Queries
       3.2.2 Oracle Public Cloud (Oracle Database Cloud Service)
             3.2.2.1 Data Definition and Populating Database
             3.2.2.2 Queries
       3.2.3 Amazon SimpleDB
             3.2.3.1 Data Definition and Populating Database
             3.2.3.2 Queries
       3.2.4 Google App Engine (GAE) Datastore
             3.2.4.1 Data Definition and Populating Database
             3.2.4.2 Queries
   3.3 Database Comparison
       3.3.1 Comparison of Data Definition
       3.3.2 Comparison of Query Syntax
       3.3.3 Comparison of Data Types
       3.3.4 Comparison of Database Population
4. CONCLUSION AND FUTURE WORK
Appendix A. SQL Statement
Appendix B. Source Code
Appendix C. Query Language
Bibliography

LIST OF TABLES

Table 1 Relational Schema

LIST OF FIGURES

Figure 1 EER Diagram
Figure 2 Scenario 2 Query Statement and Result Generated in Oracle 11g Express
Figure 3 Scenario 4 Query Statement and Result Generated in Oracle 11g Express
Figure 4 Scenario 5 Query Statement and Result Generated in Oracle 11g Express
Figure 5 Scenario 6 Result Generated After Procedure Executed in Oracle 11g Express
Figure 6 SQL Workshop in APEX in Oracle Public Cloud
Figure 7 INSERT Statement Entered in Oracle Public Cloud - SQL Commands
Figure 8 SQL Load Data in Oracle Public Cloud
Figure 9 Function in Scenario 10 Created in Oracle Public Cloud SQL Commands Window
Figure 10 The Result Generated after Function in Scenario 10 is Executed
Figure 11 Create Domain “Artist” in Amazon SimpleDB
Figure 12 Populating Data into Domain “Artist” via PutAttributes in Amazon SimpleDB
Figure 13 Populating Data into Domain “Artist” via BatchPutAttributes in Amazon SimpleDB
Figure 14 Query 1.1 Select Expression in Javascript Scratchpad
Figure 15 Result Returned from Query 1.1 in Javascript Scratchpad
Figure 16 Query 1.2 Select Expression in Javascript Scratchpad
Figure 17 Result Returned from Query 1.2 in Javascript Scratchpad
Figure 18 Java Codes to Implement Query 1.1 and 1.2 in Amazon SimpleDB
Figure 19 Result for Query 1.1 and 1.2 Generated by Running Java Application “SimpleDB_ArtShow” in Amazon SimpleDB
Figure 20 Java Codes to Implement Query 4.1 in Amazon SimpleDB
Figure 21 Java Codes to Implement Query 4.2 in Amazon SimpleDB
Figure 22 Java Codes to Implement Query 4.3 in Amazon SimpleDB
Figure 23 Result for Scenario 4 Generated by Running Java Application “SimpleDB_ArtShow” in Amazon SimpleDB
Figure 24 Java Codes to Create a Kind (“Artist”) in GAE Datastore
Figure 25 Web Page for Populating with Artist Data (Generated by Application “ExArtShow”)
Figure 26 Artist Entities Displaying in Datastore Viewer
Figure 27 Java Codes to Add a New Attribute to an Item
Figure 28 A New Attribute "Email Address" is added to Item "Artist_05"

Chapter 1
INTRODUCTION

Cloud computing has been a hot topic and trend since the latter part of 2007. The birth of cloud computing indicates a major transition in computing architecture that has shifted the location of computing infrastructure from local to the network.
Correspondingly, the data processing mechanism has also shifted in order to manage the large amounts of data that reside in the cloud environment. This raises several questions. Is data stored and retrieved in the cloud environment in the same way as in traditional database systems? Can the concepts we use in traditional databases be applied to cloud databases? Are there any advantages to storing and querying data in cloud databases compared to traditional databases?

In this project, we apply an example with designed scenarios to different cloud database systems from various popular vendors. The aim is to help our students learn and understand how data is stored in and retrieved through queries from these cloud databases. We study the differences, advantages, and disadvantages of relational database (RDB) queries in cloud environments versus traditional RDB queries, and of the SQL queries used in traditional RDBs versus the NoSQL queries used in several popular vendors' cloud database services. The comparison is made through direct database statements and queries rather than at the application level, which is more intuitive for learning databases. We also discuss the possibility of applying the knowledge and concepts we have already learned from traditional RDBMSs to cloud databases. Some common characteristics of cloud database queries are highlighted.

In this project, we study only the data storage and queries used in cloud database services that are provided and maintained by cloud database providers, such as Oracle Public Cloud and Amazon SimpleDB, rather than those used in databases that run on the cloud independently, such as on Amazon's EC2.
Amazon EC2 allows clients to use web service interfaces to launch instances with a variety of operating systems, load them with the client's application environment, and run a preconfigured, templated Amazon Machine Image on as many systems as the client wants [1].

Chapter 2 of this paper discusses the background of this project and the cloud database systems on the market. Chapter 3 shows the implementation of query processing on each of the selected database systems through a case study. Finally, Chapter 4 gives a brief summary and outlines potential future work on cloud database queries.

Chapter 2
BACKGROUND

Many vendors in the market provide cloud database services, such as Amazon, Oracle, Google, and IBM. Cloud databases are not exclusive to these big players. Some open source projects also provide cloud databases to their clients, with the databases either running on the cloud independently or offered as services that the projects maintain.

2.1 Cloud Databases

A cloud database usually refers to a database that users access over the Internet in a cloud environment, with services provided by a cloud database provider on its servers, such as Amazon EC2 or Oracle Public Cloud.

We live in a data age. The amount of data in use is much larger than in the past. The gap between the large amount of data being produced and the relatively limited size of the traditional databases used to store it challenges traditional database systems. Meanwhile, data management faces new requirements: availability, reliability, and scalability. These requirements have also given rise to a new movement in data models and data storage: NoSQL (Not Only SQL, or non-relational databases). Many key players in the cloud computing services market have included NoSQL technology in some of the cloud services they provide.
2.2 Research of Cloud Data Services on Vendors

We researched the vendors that currently provide cloud database services on the market, including Amazon, Google, Yahoo, Microsoft, Oracle, IBM, and some popular open source projects. The research covered the type of database service, data model, query interface, size of the database service, and pricing. Based on the results, we picked three of those vendors for studying queries in our project: Oracle, Amazon, and Google.

Oracle started rolling out its public-cloud infrastructure in North America in 2011 and as of 2012 had about four data centers on the continent. The company, however, is continuing to expand the data center capacity supporting its cloud services “quarter by quarter” [2].

Google App Engine (GAE) is a platform provided by Google for running applications on Google's infrastructure; its Datastore is the cloud database service studied in this project [3]. Google App Engine is free up to a certain level of consumed resources. Fees are charged for additional storage, bandwidth, or instance hours required by the application. GAE was first released as a preview version in April 2008, and came out of preview in September 2011 [3].

2.2.1 Data Models

There are mainly two data models available for databases on the cloud: SQL-based (pure relational databases) and NoSQL.

The NoSQL movement uses technologies that differ considerably from a traditional RDBMS to support various types of data stores, such as key-value stores, object stores, document stores, and graph stores [4]. NoSQL databases are often highly optimized key-value stores intended for simple retrieval and appending operations, with the goal of significant performance benefits in terms of latency and throughput [5]. NoSQL databases are finding significant and growing industry use in big data and real-time web applications.
A NoSQL database provides a mechanism for storage and retrieval of data that employs less constrained consistency models than traditional relational databases; one of the motivations for this approach is simplicity of design [5].

2.2.2 Deployment Models

There are two common deployment models. Users can run databases on the cloud independently, using a virtual machine image on a service such as Amazon's EC2 (Oracle provides a ready-made machine image with an installation of Oracle Database 11g Enterprise Edition on Amazon EC2). Alternatively, they can purchase access to a database service that is maintained by a cloud database provider, such as Oracle Public Cloud (data management as a service).

In October 2012, Oracle announced that it would bring its cloud database service to market in the early part of 2013. In the IaaS space, Oracle faces competition from other gorilla-caliber players. They include Amazon Web Services, Google, Microsoft, and HP, although the latter three are also relatively new to the space. A key point for Oracle is that the company wants to give customers the choice of deploying the same applications on premises, in a private cloud, or in its public cloud [2].

Chapter 3
CASE STUDY

In this chapter, we design a set of scenarios based on the example we selected. Against these scenarios, we study how data is stored and queried in the selected database systems mentioned in Chapter 2.

3.1 Motivating Example

We modified an example from the textbook of one of our database classes [6] as follows:

In an art center, artists present their artwork in the exhibitions held by the art center. The personal information of the artists, such as name, birth date, and country of origin, is stored in the art center's database system. Each artist can have artwork presented in different exhibitions. Each exhibition has its recorded event start date, end date, and the number of tickets sold.
Even though all artwork is divided into three categories (Painting, Sculpture, and Other), the categories have some attributes in common. For example, each art object has an artwork id number, year created, artwork title, artist's name, etc.

To simplify the situation, we make the following assumptions:

- All artwork is categorized into one of three types: Painting, Sculpture, and Other.
- All artwork is created individually by the artists; no artwork is a collaboration.
- No artist has the same name as any other artist.

Below are the scenarios we design for this example:

Scenario 1: The manager of the art center wants to know the exhibition name, exhibition start date, exhibition end date, and number of paintings exhibited in the exhibition “The Beauty of Form”.

Scenario 2: The manager of the art center wants to know the name, description, and country of origin of all artists whose artwork is categorized as “Painting” in the exhibition named “The Beauty of Form”.

Scenario 3: The manager of the art center wants a list of artists and the exhibitions in which each artist is showing their work, ordered by the artist's name and, for each artist, alphabetically by the title of the work.

Scenario 4: The manager of the art center wants to find the maximum, minimum, and average number of tickets sold among all exhibitions held in this art center since 2010/01/01.

Scenario 5: The manager of the art center wants to know the name of each artist who exhibits both paintings and sculpture.

Scenario 6: The manager of the art center wants to print the name(s) of the exhibition(s) with the largest number of tickets sold among those whose start date and end date fall within a given time period.

Scenario 7: The manager of the art center wants to know the number of paintings exhibited in a particular exhibition.
3.2 Implementation

In this section, we apply our example and the scenarios designed above to the following database and cloud database services. We focus on data definition and data queries, which are described in detail in the following subsections.

3.2.1 Oracle Database 11g Express Edition

Oracle Database is a widely used enterprise database, and Oracle Database 11g is one of the most popular traditional relational databases. Its Express Edition is a free version that can be installed and run on a local machine. We present the database schema, the creation of tables, views, and triggers, database population, query processing, procedures, and functions through our designed scenarios.

3.2.1.1 Data Definition and Populating Database

Database Schema

Based on the example described in section 3.1, we define the schema shown in Table 1:

Table 1 Relational Schema

    Artist (name, date_born, date_died, description, country_of_origin)
    Art_Object (id_no, year, title, description, artist_name)
        foreign key (artist_name) references Artist (name)
    Others (id_no, type, style)
        foreign key (id_no) references Art_Object (id_no)
    Painting (id_no, paint_type, style, drawn_on)
        foreign key (id_no) references Art_Object (id_no)
    Sculpture (id_no, material, weight_kg, height_cm)
        foreign key (id_no) references Art_Object (id_no)
    Exhibition (title, start_date, end_date, no_of_tickets_sold)
    Shown_at (art_id_no, exhibition_title)
        foreign key (art_id_no) references Art_Object (id_no),
        foreign key (exhibition_title) references Exhibition (title)

The corresponding EER diagram is shown in Figure 1:

Figure 1 EER Diagram

Creation of Tables and Views

According to the above schema, we enter SQL statements in the SQL Workshop -> SQL Commands screen of Oracle Database 11g Express to create tables and views (see the CREATE statements listed in Appendix A.1 of this report).

Database Population

After all the tables and views we need are created based on the schema, we populate the tables with data by using the INSERT statements listed in Appendix A.2.

3.2.1.2 Queries

Query Processing

The queries that answer the manager's questions in the scenarios designed in section 3.1 are as follows:

Scenario 1: Retrieve the exhibition name, exhibition start date, exhibition end date, and number of paintings exhibited in the exhibition “The Beauty of Form”.

Query 1 (Implement the GROUP BY Clause and COUNT Aggregate Function):

    SELECT e.Title, e.Start_Date, e.End_Date, COUNT(*)
    FROM Painting p, Exhibition e, Shown_At sa
    WHERE sa.Exhibition_Title = 'The Beauty of Form'
      AND p.Id_No = sa.Art_Id_No
      AND sa.Exhibition_Title = e.Title
    GROUP BY e.Title, e.Start_Date, e.End_Date;

Scenario 2: Retrieve the name, description, and country of origin of all artists who have artwork listed as “Painting” in the exhibition named “The Beauty of Form”. Also list how many paintings in “The Beauty of Form” were produced by each such artist.
Query 2 (Implement the HAVING Clause):

    SELECT a.Name, a.Description, a.Country_of_Origin, COUNT(*)
    FROM Artist a, Art_Object ao, Painting p, Shown_At sa
    WHERE sa.Exhibition_Title = 'The Beauty of Form'
      AND p.Id_No = sa.Art_Id_No
      AND ao.Id_No = p.Id_No
      AND a.Name = ao.Artist_Name
    GROUP BY a.Name, a.Description, a.Country_of_Origin
    HAVING COUNT(*) > 0;

OR (Implement the JOIN operation):

    SELECT a.Name, a.Description, a.Country_of_Origin, COUNT(*)
    FROM ((Painting_View pv JOIN Shown_At sa ON pv.Id_No = sa.Art_Id_No)
          JOIN Artist a ON pv.Artist_Name = a.Name)
    WHERE sa.Exhibition_Title = 'The Beauty of Form'
    GROUP BY a.Name, a.Description, a.Country_of_Origin
    HAVING COUNT(*) > 0;

Result:

Figure 2 Scenario 2 Query Statement and Result Generated in Oracle 11g Express

Scenario 3: Retrieve a list of artists and the exhibitions in which each has artwork shown, ordered by the artist's name and, for each artist, alphabetically by the title of the artwork.

Query 3 (Implement the ORDER BY Clause):

    SELECT ao.Artist_Name, sa.Exhibition_Title
    FROM Art_Object ao, Shown_At sa
    WHERE ao.Id_No = sa.Art_Id_No
    ORDER BY ao.Artist_Name, ao.Title;

Scenario 4: Find the maximum, minimum, and average number of tickets sold among all exhibitions held in this art center since 2010/01/01.

Query 4 (Implement Aggregate Functions):

    SELECT MAX(No_of_Tickets_Sold), MIN(No_of_Tickets_Sold), AVG(No_of_Tickets_Sold)
    FROM Exhibition e
    WHERE Start_Date >= TO_DATE('2010/01/01', 'yyyy/mm/dd');

Result:

Figure 3 Scenario 4 Query Statement and Result Generated in Oracle 11g Express

Scenario 5: Retrieve the name of each artist who exhibits both paintings and sculpture.
Query 5 (Implement the EXISTS Function):

    SELECT ao.Artist_Name
    FROM Art_Object ao
    WHERE EXISTS (SELECT * FROM Painting_View pv WHERE pv.Artist_Name = ao.Artist_Name)
      AND EXISTS (SELECT * FROM Sculpture_View sv WHERE sv.Artist_Name = ao.Artist_Name)
    GROUP BY ao.Artist_Name;

Result:

Figure 4 Scenario 5 Query Statement and Result Generated in Oracle 11g Express

Procedures

Scenario 6: Print the name(s) of the exhibition(s) with the largest number of tickets sold among those whose start date and end date fall within a given time period. In this case, the manager wants to print the name of the exhibition that sold the largest number of tickets and was held between Jan 1, 2010 and Mar 31, 2013.

Create A Procedure:

    CREATE OR REPLACE PROCEDURE MOSTTICKET_SOLD (startdate IN DATE, enddate IN DATE) AS
      -- Exhibitions held within the period whose ticket count equals the maximum for that period
      CURSOR exhibitions IS
        SELECT e.Title, e.No_of_Tickets_Sold
        FROM Exhibition e
        WHERE e.Start_Date >= startdate AND e.End_Date <= enddate
          AND e.No_of_Tickets_Sold =
              (SELECT MAX(e2.No_of_Tickets_Sold)
               FROM Exhibition e2
               WHERE e2.Start_Date >= startdate AND e2.End_Date <= enddate);
    BEGIN
      FOR exhi IN exhibitions LOOP
        DBMS_OUTPUT.PUT_LINE(exhi.Title || ', ' || TO_CHAR(exhi.No_of_Tickets_Sold));
      END LOOP;
    END;

Execute the Procedure:

    BEGIN
      -- Explicit TO_DATE avoids depending on the session's default date format
      MOSTTICKET_SOLD(TO_DATE('2010/01/01', 'yyyy/mm/dd'), TO_DATE('2013/03/31', 'yyyy/mm/dd'));
    END;

Result:

Figure 5 Scenario 6 Result Generated After Procedure Executed in Oracle 11g Express

Functions

Scenario 7: Retrieve the number of paintings exhibited in a particular exhibition. In this case, the manager wants to retrieve the exhibition name, exhibition start date and end date, and the number of paintings exhibited in the exhibition “The Beauty of Form”.
Create A Function:

    CREATE OR REPLACE FUNCTION No_Of_Painting_exhibited (exhibition_name IN VARCHAR2)
    RETURN NUMBER AS
      no_of_paintings NUMBER;
    BEGIN
      -- No GROUP BY here: COUNT(*) then yields 0 instead of raising NO_DATA_FOUND
      -- when the exhibition has no paintings
      SELECT COUNT(*) INTO no_of_paintings
      FROM Art_Object ao, Painting p, Shown_At sa
      WHERE sa.Exhibition_Title = exhibition_name
        AND ao.Id_No = sa.Art_Id_No
        AND p.Id_No = ao.Id_No;
      RETURN no_of_paintings;
    END;

Call the Function (in the SELECT clause of a query):

    SELECT e.Title, e.Start_Date, e.End_Date, No_Of_Painting_exhibited('The Beauty of Form')
    FROM Exhibition e
    WHERE e.Title = 'The Beauty of Form';

3.2.2 Oracle Public Cloud (Oracle Database Cloud Service)

Besides continuing to develop and support on-premises Oracle Database software, Oracle has begun competing in all three of the public-cloud markets: Software-, Platform-, and Infrastructure-as-a-Service. The company started offering a cloud-based compute service earlier this year, although its compute offering is quite new compared to those of other cloud providers, such as Amazon Web Services [2].

There are four main categories of cloud computing solutions: Infrastructure-as-a-Service (IaaS), Database-as-a-Service (DBaaS), Platform-as-a-Service (PaaS), and Software-as-a-Service (SaaS). Oracle Public Cloud is a Platform-as-a-Service (PaaS) and provides the features of a PaaS offering, such as a fully managed service that requires no operational effort for the underlying Oracle Database, and programmatic access to the underlying Oracle Database through SQL or PL/SQL executed from inside the Oracle Cloud or through RESTful Web Services [7].

Oracle Public Cloud includes two different tools: Oracle Application Express (APEX) and SQL Developer. SQL Developer is not used in our project. APEX is a rapid application development tool that is included in, and runs as part of, the Oracle Database Cloud. It is accessed through a standard Web browser [8].
APEX includes SQL Workshop, shown in Figure 6, a set of tools and utilities for working with data structures and data in an underlying database. We apply our example to Oracle Public Cloud via APEX because we would like to see whether there are any differences in data queries between Oracle's on-premises database management system and its Public Cloud.

Figure 6 SQL Workshop in APEX in Oracle Public Cloud

3.2.2.1 Data Definition and Populating Database

We use the same data schema defined in section 3.2.1.1 to create tables and views.

APEX's SQL Workshop includes two utilities: SQL Commands and SQL Scripts. Database population in Public Cloud is done either by entering in SQL Commands the same SQL statements used in Oracle Database 11g Express, or by loading the respective SQL script files via SQL Scripts. Figure 7 below shows one of the INSERT statements entered in SQL Commands to populate table “Art_Object” with data. Figure 8 shows the SQL Load Data screen for loading the data from the respective SQL script files.

Figure 7 INSERT Statement Entered in Oracle Public Cloud - SQL Commands

Figure 8 SQL Load Data in Oracle Public Cloud

3.2.2.2 Queries

We processed the same query statements in Public Cloud as in Oracle Database 11g Express. The same PL/SQL procedure and function used in Oracle Database 11g Express are also executed in Public Cloud through SQL Commands. Figure 9 shows the function created in the SQL Commands window. The result generated after the function is executed is shown in Figure 10.

Figure 9 Function in Scenario 10 Created in Oracle Public Cloud SQL Commands Window

Figure 10 The Result Generated after Function in Scenario 10 is Executed

3.2.3 Amazon SimpleDB

Amazon SimpleDB is a managed NoSQL database service designed for smaller datasets [9].
To compare it with Oracle's on-premises database management system and Oracle Public Cloud, we apply the same example designed in section 3.1 to Amazon SimpleDB.

3.2.3.1 Data Definition and Populating Database

Database Schema and Table/View Creation

Because Amazon SimpleDB is a scalable, schema-less key-value store hosted by Amazon's Web Services division, the database schema used in the previous sections for a traditional relational database, such as Oracle Database 11g Express, is not applicable to Amazon SimpleDB. Amazon SimpleDB has no notion of tables or views.

Data Model

In Amazon SimpleDB, structured data is organized in domains. If we think of a customer account in Amazon SimpleDB as an entire Excel spreadsheet, then the domains are the worksheets of the spreadsheet. Domains are similar to tables. Rows of the worksheet represent Items. Items can be considered individual objects that have one or more attributes, which are name-value pairs. Columns of the worksheet represent Attributes. Cells of the worksheet represent Values, which can be considered the instances of attributes for Items. Unlike a spreadsheet cell, an attribute can have multiple values.

Because there are no schemas in SimpleDB, we can partition the data set used in our example into one or more domains (tables). For illustration purposes, we partition the same data set used in Oracle 11g Express and Public Cloud into the following two domains:

    Domain 1: Artist (Artist_Id, Name, Date_Born, Date_Died, Description, Country_of_Origin)
    Domain 2: Art_Object (Id_No, Year, Title, Description, Art_Object_Type, Type, Style, Drawn_On, Material, Weight_kg, Height_cm, Artist_Name, Exhibition_Title, Start_Date, End_Date, No_of_Tickets_Sold)

A domain can be created through Javascript Scratchpad, a simple HTML and Javascript application, which will be shown next.
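The domain, item, attribute, and value hierarchy described above can be sketched with nested maps. This is a minimal in-memory illustration, not the SimpleDB API: the class `DomainModelSketch`, its methods, the item name `Art_001`, and the second exhibition title are all hypothetical names introduced here for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** In-memory illustration of the domain / item / attribute / value hierarchy (not the SimpleDB API). */
class DomainModelSketch {
    // domain name -> item name -> attribute name -> list of values
    private final Map<String, Map<String, Map<String, List<String>>>> domains = new LinkedHashMap<>();

    /** Adds one value to an attribute of an item, creating the domain and item on first use. */
    void addValue(String domain, String item, String attribute, String value) {
        domains.computeIfAbsent(domain, d -> new LinkedHashMap<>())
               .computeIfAbsent(item, i -> new LinkedHashMap<>())
               .computeIfAbsent(attribute, a -> new ArrayList<>())
               .add(value);
    }

    /** Returns the values stored for an attribute, or an empty list if there are none. */
    List<String> values(String domain, String item, String attribute) {
        return domains.getOrDefault(domain, Collections.emptyMap())
                      .getOrDefault(item, Collections.emptyMap())
                      .getOrDefault(attribute, Collections.emptyList());
    }

    public static void main(String[] args) {
        DomainModelSketch db = new DomainModelSketch();
        // Hypothetical item: one attribute holds two values, which a spreadsheet cell could not.
        db.addValue("Art_Object", "Art_001", "Art_Object_Type", "Painting");
        db.addValue("Art_Object", "Art_001", "Exhibition_Title", "The Beauty of Form");
        db.addValue("Art_Object", "Art_001", "Exhibition_Title", "Hypothetical Show");
        System.out.println(db.values("Art_Object", "Art_001", "Exhibition_Title"));
    }
}
```

Because no schema is enforced, a second item in the same domain could carry a completely different set of attribute names; the sketch reflects this by creating attributes only when a value is first added.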
Database Population

A small group of API calls in the SimpleDB service provides the main functionality for building an application. These ten API calls [10] are:

- CreateDomain - Create domains to contain the data; an account can have up to 250 domains. Additional domains can be requested through Amazon's web site.
- DeleteDomain - Delete any of the domains
- ListDomains - List all domains within an account
- PutAttributes - Add, modify, or remove data within the Amazon SimpleDB domains
- BatchPutAttributes - Generate multiple put operations in a single call
- DeleteAttributes - Remove items, attributes, or attribute values from a domain
- BatchDeleteAttributes - Generate multiple delete operations in a single call
- GetAttributes - Retrieve the attributes and values of any item ID that you specify
- Select - Query the specified domain using a SQL SELECT expression
- DomainMetadata - View information about the domain, such as the creation date, number of items and attributes, and the size of attribute names and values

For the domains defined in the Data Model section above, we use the CreateDomain, PutAttributes, and BatchPutAttributes API calls through Javascript Scratchpad to populate our Amazon SimpleDB database with the same data set used in sections 3.2.1 and 3.2.2. Javascript Scratchpad is a simple HTML and Javascript application that allows the user to explore the Amazon SimpleDB API without writing any code. It can be downloaded from Amazon's web site; when run on a local machine, it connects to one of the Amazon data centers. The default data center is US-East (Northern Virginia).

Since the main purpose of this project is to study queries in our selected databases directly rather than at the application level, we issue the API calls from Javascript Scratchpad to populate the database. Three of the API calls we use to create Domain “Artist” and populate it with data in Javascript Scratchpad are shown in Figures 11, 12, and 13.
Figure 11 Create Domain “Artist” in Amazon SimpleDB

Figure 12 Populating Data into Domain “Artist” via PutAttributes in Amazon SimpleDB

Figure 13 Populating Data into Domain “Artist” via BatchPutAttributes in Amazon SimpleDB

The PutAttributes and BatchPutAttributes API calls are used to put data into a domain. The difference between them is that BatchPutAttributes performs multiple put operations in a single call, whereas PutAttributes performs one put operation per call. “Replace”, shown in Figures 12 and 13, is a flag specifying whether to replace the existing attribute/value pair or to add a new attribute/value pair. The default setting is false.

Because string is the only data type used in SimpleDB, data is compared lexicographically. Therefore, storing data in an appropriate format becomes important.

3.2.3.2 Queries

Amazon SimpleDB provides a simple custom query language that allows the user to retrieve the name-value pairs of data associated with items. A query expression in this language is similar to the standard SQL SELECT statement. Javascript Scratchpad also provides a user interface for running a Select query and displaying its result. Javascript Scratchpad, however, can only run one Select query and display its result at a time. In addition, because of the limitations of the Select query in Amazon SimpleDB (it does not support joins, aggregations other than COUNT, or nested queries), for some of the scenarios we have to implement part of the query on the application level to retrieve the requested information. The application we use (“SimpleDB_ArtShow”) is modified from the sample code project provided by Amazon SimpleDB.
Query Processing

In Javascript Scratchpad, we process the following queries for the scenarios we designed in section 3.1:

Scenario 1: Retrieve the exhibition name, exhibition start date, exhibition end date, and number of paintings exhibited in the exhibition “The Beauty of Form”.

Because there is no GROUP BY clause in SimpleDB, and COUNT(*) cannot be combined with an explicit list of attributes in the SELECT output list, one way to retrieve all the information listed above is to use two queries instead:

Query 1.1 (the output list is an explicit list of attributes):
SELECT Exhibition_Title, Start_Date, End_Date FROM Art_Object WHERE Exhibition_Title = 'The Beauty of Form' AND Art_Object_Type = 'Painting'

Figure 14 Query 1.1 Select Expression in Javascript Scratchpad

Result:

Figure 15 Result Returned from Query 1.1 in Javascript Scratchpad

Query 1.2 (Implement COUNT):
SELECT COUNT(*) FROM Art_Object WHERE Exhibition_Title = 'The Beauty of Form' AND Art_Object_Type = 'Painting'

Figure 16 Query 1.2 Select Expression in Javascript Scratchpad

Result:

Figure 17 Result Returned from Query 1.2 in Javascript Scratchpad

We can process the above two queries in our Java application “SimpleDB_ArtShow” to obtain the same result.
Figure 18 shows how this is implemented and the result is shown in Figure 19:

    //Scenario 1
    String selectExpression = "select Exhibition_Title, Start_Date, End_Date from `" + "Art_Object"
        + "` where Exhibition_Title = 'The Beauty of Form' and Art_Object_Type = 'Painting'";
    System.out.println("Selecting: " + selectExpression + "\n");
    SelectRequest selectRequest = new SelectRequest(selectExpression);
    //Run the query once and reuse the returned items (requires java.util.List)
    List<Item> items = sdb.select(selectRequest).getItems();
    //Query 1.2: the number of items returned is the number of paintings
    System.out.println(" The number of paintings: " + items.size() + "\n");
    //Query 1.1: print the selected attributes of each item
    for (Item item : items) {
        System.out.println(" Art_Object");
        System.out.println("  Name: " + item.getName());
        for (Attribute attribute : item.getAttributes()) {
            System.out.println("   Attribute");
            System.out.println("    Name: " + attribute.getName());
            System.out.println("    Value: " + attribute.getValue());
        }
    }

Figure 18 Java Code to Implement Query 1.1 and 1.2 in Amazon SimpleDB

Figure 19 Result for Query 1.1 and 1.2 Generated by Running Java Application “SimpleDB_ArtShow” in Amazon SimpleDB

Scenario 2: Retrieve the name, description, and country of origin of all artists who have artwork listed as “Painting” in the exhibition named “The Beauty of Form”.

In the data model we designed in section 3.2.3.1 with the two domains “Artist” and “Art_Object”, the information to be retrieved in this scenario (name, description, and country of origin of the artists) is in the domain Artist. However, the information needed to select the artists who have artwork listed as “Painting” in the exhibition named “The Beauty of Form” is in the domain Art_Object. Amazon SimpleDB has no join equivalent. The only way to retrieve the information requested in this scenario is to get data from both the Artist domain and the Art_Object domain through two queries, and then combine the results of these queries on the application level.
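The application-level combination described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code (which is in Appendix B.1): the class AppLevelJoinSketch, the ArtistRow type, and the join method are hypothetical, and plain in-memory lists with made-up values stand in for the results that the two SimpleDB queries would return.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AppLevelJoinSketch {

    // Hypothetical stand-in for one item returned by the query on the Artist domain.
    static final class ArtistRow {
        final String name;
        final String description;
        final String countryOfOrigin;

        ArtistRow(String name, String description, String countryOfOrigin) {
            this.name = name;
            this.description = description;
            this.countryOfOrigin = countryOfOrigin;
        }
    }

    // Keeps only the artists whose name appears among the Artist_Name values
    // returned by the query on the Art_Object domain (a semi-join on the name).
    static List<ArtistRow> join(List<ArtistRow> artists, Collection<String> artistNamesFromArtObjects) {
        // A set removes duplicate names (one artist may have several paintings)
        // and gives constant-time membership checks.
        Set<String> wanted = new HashSet<>(artistNamesFromArtObjects);
        List<ArtistRow> result = new ArrayList<>();
        for (ArtistRow artist : artists) {
            if (wanted.contains(artist.name)) {
                result.add(artist);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the two query results.
        List<ArtistRow> artists = Arrays.asList(
                new ArtistRow("Artist A", "Painter", "France"),
                new ArtistRow("Artist B", "Sculptor", "Italy"));
        List<String> namesWithPaintings = Arrays.asList("Artist A", "Artist A");

        for (ArtistRow row : join(artists, namesWithPaintings)) {
            System.out.println(row.name + ", " + row.description + ", " + row.countryOfOrigin);
        }
    }
}
```

Matching on the artist's name mirrors the data model of section 3.2.3.1, where Art_Object stores Artist_Name rather than a key; the sketch therefore assumes that artist names are unique.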
Below are the two queries we need for retrieving the information requested in this scenario:

Query 2.1:
SELECT Name, Description, Country_of_Origin FROM Artist

Query 2.2:
SELECT Artist_Name FROM Art_Object WHERE Exhibition_Title = 'The Beauty of Form' AND Art_Object_Type = 'Painting'

The Java code to process the above two queries in our application “SimpleDB_ArtShow” is shown in APPENDIX B.1.

Scenario 3: Retrieve a list of artists and the exhibitions in which each artist is showing their work, ordered by the artist’s name, and for each artist’s work arranged alphabetically by title.

Amazon SimpleDB supports sorting data only on a single attribute, in either ascending (by default) or descending order. To retrieve the information requested in this scenario, we can use one query to return the result ordered by artist name and then sort the results alphabetically by the art object’s title on the application level.

Query 3 (Implement the ORDER BY Clause):
SELECT Artist_Name, Exhibition_Title FROM Art_Object WHERE Exhibition_Title != 'N/A' ORDER BY Artist_Name

Scenario 4: Find the maximum number of tickets sold, the minimum number of tickets sold, and the average tickets sold among all exhibitions held in this art center since 2010/01/01.

Except for COUNT(*), there are no aggregate functions in Amazon SimpleDB. One way to obtain the information requested in this scenario is to use two queries (Query 4.1 and 4.2) to retrieve the maximum and minimum number of tickets sold respectively by adding a LIMIT clause after the ORDER BY clause (the “limit” is the maximum number of results to return; the default is 100 and the maximum is 2500) [11], and another two queries (Query 4.3 and 4.4) to retrieve all the numbers of tickets sold and the total number of exhibitions held since 2010/01/01. The problem is that the domain “Art_Object” contains duplicated exhibition data, because in the data model we designed in section 3.2.3.1 every art object shown in an exhibition repeats that exhibition’s attributes.
Therefore, we cannot apply the results returned from the last two queries (a list of numbers of tickets sold and the total number of exhibitions held) directly on the application level to obtain the correct average number of tickets sold. We have to “manually” implement an operation inside our application that is similar to “GROUP BY” in SQL to prevent a duplicated exhibition from being counted more than once. Figure 22 below shows how this is handled. Because SimpleDB compares strings lexicographically, the date literal in each query must use the same format as the stored dates; all queries below use the form '2010-01-01', as in the Java code.

Query 4.1:
SELECT No_of_Tickets_Sold FROM Art_Object WHERE Start_Date >= '2010-01-01' AND No_of_Tickets_Sold is not null ORDER BY No_of_Tickets_Sold desc limit 1

Implementing the above Query 4.1 in the “SimpleDB_ArtShow” Java application:

    //Scenario 4
    //Implement Query 4.1:
    String selectExpression4_1 = "select No_of_Tickets_Sold from `" + "Art_Object"
        + "` where Start_Date >= '2010-01-01' AND "
        + "No_of_Tickets_Sold is not null ORDER BY No_of_Tickets_Sold desc limit 1";
    System.out.println("Selecting: " + selectExpression4_1 + "\n");
    SelectRequest selectRequest4_1 = new SelectRequest(selectExpression4_1);
    Item item1 = sdb.select(selectRequest4_1).getItems().get(0);
    System.out.println("The maximum number of tickets sold: "
        + item1.getAttributes().get(0).getValue());
    System.out

Figure 20 Java Code to Implement Query 4.1 in Amazon SimpleDB

Query 4.2:
SELECT No_of_Tickets_Sold FROM Art_Object WHERE Start_Date >= '2010-01-01' AND No_of_Tickets_Sold is not null ORDER BY No_of_Tickets_Sold asc limit 1

Implementing the above Query 4.2 in the “SimpleDB_ArtShow” Java application:

    //Scenario 4
    //Implement Query 4.2:
    String selectExpression4_2 = "select No_of_Tickets_Sold from `" + "Art_Object"
        + "` where Start_Date >= '2010-01-01' AND "
        + "No_of_Tickets_Sold is not null ORDER BY No_of_Tickets_Sold asc limit 1";
    System.out.println("Selecting: " + selectExpression4_2 + "\n");
    SelectRequest selectRequest4_2 = new SelectRequest(selectExpression4_2);
    Item item2 = sdb.select(selectRequest4_2).getItems().get(0);
    System.out.println("The minimum number of tickets sold: "
        + item2.getAttributes().get(0).getValue());

Figure 21 Java Code to Implement Query 4.2 in Amazon SimpleDB

Query 4.3:
SELECT No_of_Tickets_Sold FROM Art_Object WHERE Start_Date >= '2010-01-01'

Query 4.4:
SELECT COUNT(*) FROM Art_Object WHERE Start_Date >= '2010-01-01'

The above queries (Query 4.3 and 4.4) cannot be used as they are in our application “SimpleDB_ArtShow” to produce the correct result, because both would count a duplicated exhibition more than once. Figure 22 shows how the average can be computed on the application level instead:

    //Implement Query 4.3 (requires java.util.HashSet and java.util.Set):
    String selectExpression4_3 = "select Exhibition_Title, No_of_Tickets_Sold from `" + "Art_Object"
        + "` where Start_Date >= '2010-01-01'";
    System.out.println("Selecting: " + selectExpression4_3 + "\n");
    SelectRequest selectRequest4_3 = new SelectRequest(selectExpression4_3);
    int sumTickets = 0;
    int count = 0;
    //Titles already counted; a set also catches duplicates that are not adjacent in the result
    Set<String> countedTitles = new HashSet<String>();
    for (Item item3 : sdb.select(selectRequest4_3).getItems()) {
        //Get attribute: Exhibition_Title
        Attribute attrib = item3.getAttributes().get(0);
        //Get attribute: No_of_Tickets_Sold
        Attribute attribute3 = item3.getAttributes().get(1);
        int numTicket = Integer.parseInt(attribute3.getValue());
        //Only sum up the number of tickets once for each different Exhibition_Title
        //to prevent a duplicated exhibition from being counted again
        if (countedTitles.add(attrib.getValue())) {
            sumTickets += numTicket;
            count++;
        }
    }
    //Cast before dividing so that integer division does not truncate the average
    float avgTickets = (count == 0) ? 0f : (float) sumTickets / count;
    System.out.println("The average number of tickets sold: " + avgTickets);

Figure 22 Java Code to Implement Query 4.3 in Amazon SimpleDB

Figure 23 shows the result generated after running the Java application “SimpleDB_ArtShow” for Scenario 4:

Figure 23 Result for Scenario 4 Generated by Running Java Application “SimpleDB_ArtShow” in Amazon SimpleDB

Scenario 5: Retrieve the name of each artist who exhibits both paintings and sculpture.
Because Amazon SimpleDB does not support nested queries, the query we used in Oracle 11g Express and Public Cloud (section 3.2.1.2 and section 3.2.2.2) cannot be applied to SimpleDB. Based on the domains we designed in section 3.2.3.1, one way to obtain the information requested in this scenario is to run the following two queries and intersect their results on the application level:

Query 5.1:
SELECT Artist_Name FROM Art_Object WHERE Art_Object_Type = 'Painting'

Query 5.2:
SELECT Artist_Name FROM Art_Object WHERE Art_Object_Type = 'Sculpture'

The Java code to process the above two queries in our application “SimpleDB_ArtShow” is shown in APPENDIX B.1.

Procedures and Functions

Amazon SimpleDB has no database programming language like Oracle’s PL/SQL. Instead, the programming structures used in the Oracle procedures and functions in our example become part of the programming on the application level.

3.2.4 Google App Engine (GAE) Datastore

App Engine is a platform as a service (PaaS) that uses familiar technologies to build and host applications on the same infrastructure used at Google [12]. Again, to compare with Oracle’s on-premises database management system, Public Cloud, and Amazon SimpleDB, we apply the same example designed in section 3.1 to GAE Datastore.

3.2.4.1 Data Definition and Populating Database

Database Schema and Table/View Creation

GAE datastore provides a NoSQL schemaless object datastore, with a query engine and atomic transactions. The database schema used in previous sections for a traditional relational database, such as Oracle Database 11g Express, is not applicable to GAE datastore. As in Amazon SimpleDB, there are no tables or views in GAE datastore. In the “Database Population” subsection below, we will show the creation of a kind and an entity.
Data Model

Unlike a traditional relational database, the GAE datastore uses data objects called “entities” that have a kind and a set of properties. The structure of data entities is provided by and enforced by the application code. The data model of GAE datastore is easier to understand through the following analogy: a kind is a data type or a table, an entity is a data object or a row in that table, a property is a field or a column in that table, and the key is like the primary key in a relational database.

Since datastore entities are "schemaless", as in Amazon SimpleDB, we can categorize the data set used in our example into one or more kinds (tables) of entities. To show the concept of the datastore’s data model simply and clearly, we partition the same data set used in previous sections (Oracle 11g Express, Public Cloud, and Amazon SimpleDB) into the following two kinds (tables):

Kind 1: Artist (Artist_Id, Name, Date_Born, Date_Died, Description, Country_of_Origin)

Kind 2: Art_Object (Id_No, Year, Title, Description, Art_Object_Type, Type, Style, Drawn_On, Material, Weight_kg, Height_cm, Artist_Name, Exhibition_Title, Start_Date, End_Date, No_of_Tickets_Sold)

After logging in to a GAE account, the Datastore Viewer in the Admin Console can be used to view the list of entities. Unfortunately, this utility does not support kind creation. Before we can see the entities by populating our data set into the datastore, we have to create a small web application, modified from Google’s sample code, to create the kinds. As we mentioned in section 3.2, GAE supports runtime environments in several programming languages. We choose Java as the main language, with JavaScript and jQuery, to create this web application, “ExArtShow”. Additionally, the relation between Art_Object and Exhibition is simplified from m:n to m:1 to make our web application easier to use for database population.
The Java code in our web application “ExArtShow” shown in Figure 24 creates a kind (“Artist”) in the datastore:

Figure 24 Java Code to Create a Kind (“Artist”) in GAE Datastore

Similarly, we create another kind, “Art_Object”, in the datastore.

Database Population

After both kinds (tables) we designed are created, we populate the same example data set into the datastore through the web page generated by our Java web application “ExArtShow”, as shown in Figure 25:

Figure 25 Web Page for Populating with Artist Data (Generated by Application “ExArtShow”)

Figure 26 shows all the Artist kind entities stored in GAE datastore after the datastore is populated:

Figure 26 Artist Entities Displaying in Datastore Viewer

3.2.4.2 Queries

The distributed NoSQL data storage service provided by GAE features a query engine and transactions [13]. A query language, “GQL”, can be used to query data from the datastore. GQL is a SQL-like language for retrieving entities or keys from the scalable GAE datastore. Even though GQL's features are different from those of a query language for a traditional relational database, the GQL syntax is similar to that of SQL [14].

Query Processing

Besides displaying and creating entities, the Datastore Viewer in the Admin Console also lets the account subscriber run queries in GQL. We use GQL in the Datastore Viewer to study data queries in GAE datastore by applying the same scenarios used in the Oracle databases and Amazon SimpleDB.

In the datastore, every query computes its results using one or more indexes. (An index is defined on a list of properties of a given entity kind, with a corresponding order, ascending or descending, for each property.) If the application makes any changes to the entities, the indexes are updated incrementally to reflect those changes. This reduces the computation needed at query time and allows the datastore to return the correct results of all queries immediately.
The details of how to configure an index will be discussed in Scenario 2 below.

Scenario 1: Retrieve the exhibition name, exhibition start date, exhibition end date, and number of paintings exhibited in the exhibition “The Beauty of Form”.

As in Amazon SimpleDB, there is no GROUP BY clause in GQL; moreover, there is no COUNT(*) function either. The only way to retrieve all the information listed above is to implement it on the application level, for example in Java or Python code. Inside the application code, we can count the number of entities returned in the result of a query.

There are also limitations on datastore queries. One of them is that a projection on a property cannot be combined with an equality filter on the same property. The following query is therefore invalid, because the projected property “exhibitionTitle” is also used in the equality filter “exhibitionTitle = 'The Beauty of Form'” in the WHERE clause:

SELECT exhibitionTitle, start_date, end_date FROM ArtObject WHERE exhibitionTitle = 'The Beauty of Form' AND artType = 'Painting'

In order to retrieve the name, start date, and end date of the exhibition in addition to the number of paintings, we need to iterate through the results of the query below on the application level:

Query 1 (Return all the entities that satisfy the conditions described in this scenario):
SELECT * FROM ArtObject WHERE exhibitionTitle = 'The Beauty of Form' AND artType = 'Painting'

Scenario 2: Retrieve the name, description, and country of origin of all artists who have artwork listed as “Painting” in the exhibition named “The Beauty of Form”.
As in Amazon SimpleDB, in the data model we designed in section 3.2.4.1 with kind Artist and kind Art_Object, the information to be retrieved in this scenario (name, description, and country of origin of the artists) is in the kind “Artist”, but the information needed to select the artists who have artwork listed as “Painting” in the exhibition named “The Beauty of Form” is in the kind “Art_Object”. Like the query language in Amazon SimpleDB, GQL has no join operation. The only way to retrieve the information requested in this scenario is to get data from both “Artist” and “Art_Object” through two queries, and then combine the results of these queries on the application level. Below are the two queries we need for retrieving the information requested in this scenario:

Query 2.1:
SELECT * FROM Artist

Query 2.2:
SELECT artist FROM ArtObject WHERE exhibitionTitle = 'The Beauty of Form' AND artType = 'Painting'

As we mentioned at the beginning of this section, indexes need to be defined for processing the queries. By default, the datastore automatically predefines an index for each property of each entity kind. The automatically predefined indexes are stored in the file datastore-indexes-auto.xml. Usually, these predefined indexes are sufficient to perform many simple queries (such as Query 2.1). Other queries, however, require indexes defined by the application. These are specified in an index configuration file named datastore-indexes.xml. For example, in our Java application “ExArtShow”, the file “datastore-indexes.xml” is located in the war/WEB-INF/ directory. When a query cannot be executed with the available indexes (either predefined or specified in the index configuration file), the query is invalid.
Query 2.2 above becomes valid, and the datastore returns the correct result, after the following index is added to the configuration file datastore-indexes.xml and the application is deployed to App Engine:

    <datastore-index kind="ArtObject" ancestor="false">
        <property name="artType" direction="asc" />
        <property name="exhibitionTitle" direction="asc" />
        <property name="artist" direction="asc" />
    </datastore-index>

Scenario 3: Retrieve a list of artists and the exhibitions in which each artist displays artwork, ordered by the artist’s name, and for each artist ordered alphabetically by title of artwork.

Unlike the query language in Amazon SimpleDB, the optional ORDER BY clause in GQL can sort the results by several properties, each in either ascending (ASC) or descending (DESC) order. If not specified, the order is ascending. We can apply the following SELECT statement in GQL and iterate through the results returned from this query on the application level to retrieve the information requested in this scenario. Compared to the implementation for this scenario in Amazon SimpleDB, ordering with GQL is much simpler, because no additional sorting is needed in the application.

Query 3 (Implement the ORDER BY Clause):
SELECT * FROM ArtObject ORDER BY artist ASC, title ASC

Scenario 4: Find the maximum number of tickets sold, the minimum number of tickets sold, and the average tickets sold among all exhibitions held in this art center since 2010/01/01.

In GQL, there are no aggregate functions. Like the query language in Amazon SimpleDB, however, GQL has a LIMIT clause. Without the restrictions on queries in the datastore, we could have applied the method we used for the same scenario in SimpleDB (section 3.2.3.2) to obtain the information requested, as shown below (Query 4.1 and 4.2). Because of the restrictions on ordering and inequality filters in the datastore, Query 4.1 and 4.2 are invalid.
Query 4.1 (not valid):
SELECT no_of_tickets_sold FROM ArtObject WHERE start_date >= '2010-01-01' ORDER BY no_of_tickets_sold DESC LIMIT 1

Query 4.2 (not valid):
SELECT no_of_tickets_sold FROM ArtObject WHERE start_date >= '2010-01-01' ORDER BY no_of_tickets_sold ASC LIMIT 1

One of the restrictions on datastore ordering is that the first ordering property must be the same as the inequality filter property. In Query 4.1 and 4.2, the first ordering property is “no_of_tickets_sold”, but it would have to be “start_date” for the queries to be valid. Another limitation of the datastore is that only one property per query may have inequality filters, such as <=, >=, <, >, and !=. Because of this restriction, we cannot apply a filter like “no_of_tickets_sold != 0” in the WHERE clause, since the same query already needs an inequality filter on another property, “start_date”. Together, these restrictions mean that the maximum and minimum number of tickets sold cannot be obtained directly from the datastore through a GQL query and must be computed in the application.

Moreover, unlike in the Amazon SimpleDB section, we cannot get the total number of exhibitions held since 2010/01/01 from a query, because there is no COUNT() operation in GQL. This means that, besides the maximum and minimum number of tickets sold, we also need to calculate the total number of desired exhibitions in the application code. This can be done by counting the entities whose property “start_date” is not earlier than 2010/01/01. Finally, in the application, we are able to calculate the average number of tickets sold by iterating through the results returned from the query below:

Query 4.3:
SELECT no_of_tickets_sold FROM ArtObject WHERE start_date >= '2010-01-01'

Scenario 5: Retrieve the name of each artist who exhibits both paintings and sculpture.
As in Amazon SimpleDB, to obtain the information requested in this scenario, we need to combine the results returned from the two queries below on the application level, by iterating through the results of the first query (Query 5.1) and checking each artist found against the results of the second query (Query 5.2).

Query 5.1:
SELECT artist FROM ArtObject WHERE artType = 'Painting'

Query 5.2:
SELECT artist FROM ArtObject WHERE artType = 'Sculpture'

Procedures and Functions

As with Amazon SimpleDB, GAE has no database programming language like Oracle’s PL/SQL. Instead, the programming structures used in the Oracle procedures and functions in our example become part of the programming on the application level.

3.3 Database Comparison

Based on the query implementation discussed in section 3.2, we compare these four database systems on the following aspects: data definition, query syntax, data types, and database population.

3.3.1 Comparison of Data Definition

Of these four database systems (Oracle Database 11g Express Edition, Oracle Public Cloud, Amazon SimpleDB, and Google App Engine Datastore), the first two use the relational (SQL) data model, and the last two are a key-value store and an object store respectively (NoSQL). In a relational database, the database schema is the description of a database that is specified during database design and is not expected to change often [6]. In contrast to a relational database, a NoSQL database is schema-less. In Oracle 11g and Public Cloud, it is not possible to store multiple values in one column of a table row. This no longer holds in some NoSQL databases, such as Amazon SimpleDB. In a SimpleDB domain, one attribute (like a column in a table) can contain multiple values.

A table in a relational database has a primary key and often foreign key relationships.
In a heavily normalized relational database, tables have many foreign key relationships, which cause many levels of indirection when the data is queried. In a NoSQL database, the relationships between domains (as in Amazon SimpleDB) and between kinds (as in GAE datastore) are very sparse. One advantage of this loose coupling is that data modifications, such as inserts, updates, and deletes, become much easier. For example, if we would like to add one more field such as “email address” to the “Artist” table, it takes much more work in an Oracle table than in a SimpleDB domain because of the table constraints in the relational database. In SimpleDB, if this new attribute applies only to some of the items in our data set rather than all of them, we can add the attribute (such as “email address”) to those items only. This can be done via the Javascript Scratchpad we showed in section 3.2.3.1 or with the Java code in our application “SimpleDB_ArtShow” shown in Figure 27. In a relational database, by contrast, we have to add a column named “email address” to the “Artist” table, and the column then exists for all rows even though some rows might not need it. Furthermore, if the “Artist” table is related to other tables or views, we have to make sure those tables or views are updated as well. This is no longer an issue in SimpleDB: no changes need to be made to other domains. Also, because there is no schema in SimpleDB, changes made to items in the domain “Artist” do not affect items in other domains. Database updates in SimpleDB become much more flexible and convenient.

Figure 27 Java Code to Add a New Attribute to an Item

Figure 28 A New Attribute "Email Address" is added to Item "Artist_05"

One of the techniques used in table creation in a relational database is normalization.
The main purpose of data normalization is to minimize duplication of data by breaking data into smaller chunks and using references to the data instead of the data itself. The database schema designed in section 3.2.1.1 is based on this technique. The technique is generally not applied in NoSQL databases. By contrast, we usually need to denormalize the data set and put frequently used and related data into one table regardless of data duplication and redundancy. This often makes queries simpler: comparing the two queries for our Scenario 1 listed in section 3.2.1.2 (Oracle 11g) and section 3.2.3.2 (Amazon SimpleDB), the query used in SimpleDB is much simpler.

Another purpose of normalizing the data in a relational database is to reduce data storage space by removing data duplication. In a cloud environment, the infrastructure that supports data storage has shifted to the network and is backed by many servers around the world, so storage usage has become less critical, which makes duplicated data storage acceptable. As mentioned in section 3.2, it is possible to put everything in one table by combining the information in the “Artist” table with the “Art_Object” table. To better illustrate how to handle the JOIN operation in a NoSQL database query, we keep them in two tables.

Moving from a relational database to a NoSQL database like SimpleDB or GAE datastore is evidently a paradigm shift for developers when defining and modeling data. Such a shift requires developers to think ahead about how to fit their cases into the constraints the system imposes, but in return it gives them more assurance that queries run quickly and that the data store remains available and scalable.

3.3.2 Comparison of Query Syntax

The SELECT query statements in all of the databases we studied are very similar in format.
Even though neither Amazon SimpleDB nor GAE datastore is a relational database, both provide SQL-like query languages. APPENDIX C [6], [11], [14] shows the syntax of these three query languages.

The ORDER BY operation differs between SimpleDB and GAE datastore. Amazon SimpleDB supports sorting data on a single attribute only, in ascending (default) or descending order. In GAE datastore, the ORDER BY clause can specify multiple sort orders as a comma-delimited list, evaluated from left to right, each in ascending (default) or descending order.

Comparing the queries of Scenario 4 in section 3.2.3.2 and section 3.2.4.2, SimpleDB’s query language has the COUNT(*) aggregate function but GQL does not. GQL does not provide a COUNT function because of the way Google Bigtable (the underlying storage for Datastore) is structured: it does not support row counts as a fundamental concept [15]. Moreover, the index-based query mechanism in GAE datastore imposes certain limitations and restrictions on what a query can do, as we have shown in section 3.2.4.2.

A big difference in data query between relational databases and the NoSQL databases we studied is that the latter support neither JOIN operations nor general aggregate functions. In a relational database, JOIN is often used to combine data from different tables into one result set. Since JOIN is not supported, to query the same data in NoSQL we have to use multiple queries and handle the returned results on the application level instead. Foreign keys do not exist in NoSQL. In our GAE application “ExArtShow”, a reference defined in the constructor of class “ArtObject” refers to the corresponding “Artist” object, which has a 1:m relationship with “ArtObject”. This is shown in the file “ArtObject.java” in APPENDIX B.2.
Also, because there are no aggregate functions in NoSQL (except for the COUNT(*) function in Amazon SimpleDB), in some cases it becomes almost impossible to obtain directly from the query itself the results that can be obtained that way in relational databases such as Oracle 11g Express. For example, in Scenarios 2 and 4, it is much easier and takes less effort to obtain the requested information in Oracle 11g Express because SQL offers more functions and operations, whereas a NoSQL database has to rely on the application to complete the query and achieve the same goal. The approach used in NoSQL to obtain the result of a query thus places more responsibility on application developers.

There are also some minor differences in query syntax between SQL and the SQL-like query languages. As with SQL, GQL keywords are case insensitive, but kind and property names are case sensitive in GQL. This is different from SQL, where attribute names are case insensitive.

3.3.3 Comparison of Data Types

The data types in Oracle 11g Express and Public Cloud include VARCHAR2, Boolean, date, numeric, and others. By contrast, only one data type is used in Amazon SimpleDB: the UTF-8 string. Developers therefore have more work to do in SimpleDB whenever data needs to be compared in a query. Compared to Oracle’s relational database, GAE datastore provides fewer data types as well. As a result, the data types and data structure are enforced by application developers (on the application level) rather than by the database, which gives applications more flexibility. At the same time, this becomes one of the challenges on the application level when populating data into NoSQL databases and querying information from them, as shown in Scenario 4.
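The extra work that string-only comparison requires can be illustrated with a small sketch. It shows that plain numeric strings sort incorrectly in lexicographic order and that a fixed-width, zero-padded encoding with an added offset restores the numeric order, including for negative numbers. The class LexicographicEncodingSketch, the methods encode and decode, and the chosen width of 6 digits are assumptions made for illustration; they are not part of the SimpleDB API or of the project's code.

```java
import java.util.Locale;

public class LexicographicEncodingSketch {

    // Assumed fixed width of every stored value, in digits.
    static final int WIDTH = 6;
    // Assumed offset; it must exceed the absolute value of the smallest expected negative number.
    static final long OFFSET = 10000L;

    // Shifts the value by OFFSET and pads it with leading zeros to WIDTH digits,
    // so that string comparison agrees with numeric comparison.
    static String encode(long value) {
        long shifted = value + OFFSET;
        if (shifted < 0 || String.valueOf(shifted).length() > WIDTH) {
            throw new IllegalArgumentException("value out of supported range: " + value);
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", shifted);
    }

    // Reverses encode(): parses the stored string and removes the offset.
    static long decode(String stored) {
        return Long.parseLong(stored) - OFFSET;
    }

    public static void main(String[] args) {
        // Unpadded strings: "9" sorts after "10" because '9' > '1'.
        System.out.println("9".compareTo("10") > 0);                  // prints true

        // Encoded strings compare in numeric order.
        System.out.println(encode(9).compareTo(encode(10)) < 0);      // prints true
        System.out.println(encode(-1000).compareTo(encode(-500)) < 0); // prints true

        // The original value can be recovered after reading it back.
        System.out.println(decode(encode(-500)));                     // prints -500
    }
}
```

The sketch handles integers only; decimal values would additionally need a fixed number of fractional digits so that all stored strings keep the same layout.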
3.3.4 Comparison of Database Population

Because NoSQL databases use far fewer data types than relational databases, populating the data in a correct format becomes important. For example, in Amazon SimpleDB, all sort operations are performed in lexicographical order. This is why, in our example, the value of “No_of_Tickets_Sold” is padded with leading zeros before it is populated into the domains. If the data set contains negative numbers, an offset has to be added to all the data in the data set in order to obtain the correct result after sorting. For example, if a data set (-1000, 200, -500, 350, 25.6) that contains negative numbers (-1000, -500) is stored in Amazon SimpleDB, an offset of 10000 can be added to every value in the data set for proper comparison in a query. In this way, every value becomes positive or zero and, once zero-padded to a common width, can be compared in lexicographical order to produce the correct result of a query. The recommendation is that the offset be larger than the absolute value of the smallest expected negative number in the data set [16].

When populating our data set into GAE datastore, we had to use our Java application “ExArtShow”, which is modified from one of the sample code projects from Google. Currently, GAE supports only four programming languages: PHP, Java, Python, and Go. For people who are not familiar with these four languages, such as C# developers, populating the datastore can be hard and cumbersome, because they must develop and deploy an application in an unfamiliar language before they can actually access the datastore.

Chapter 4
CONCLUSION AND FUTURE WORK

In this project, the concepts of deployment models and data models of cloud databases were presented.
Also, the cloud data services from the current major cloud providers were surveyed. We applied an example over our four selected database systems (one on-premise database and three cloud database systems) to study the features of these systems. These four database systems are: Oracle Database 11g Express Edition, Oracle Public Cloud, Amazon SimpleDB, and Google App Engine Datastore. Our study covers database schema design and table creation in the relational databases, data population, and the query languages of all four systems, as applied to the scenarios designed from our example. At the end of the project, we also discussed the data definition, query syntax, data types in query, and database population of these four database systems based on our study in Chapter 3.

After studying data query and related areas on the above four database systems, we found no differences between Oracle 11g Express and Oracle Public Cloud in table creation, data population, and data query through the SQL Workshop. Even though the data model in Amazon SimpleDB is different from the one used in GAE Datastore, they have some common characteristics: both have sparse relationships between “tables”, both are schema-less, and neither has a “join” operation in queries. These characteristics can benefit cloud databases in data usage and data storage. Last but not least, the basic database concepts we learned from traditional RDBMS help us get a quick start on understanding data storage and query in cloud database systems.

And yet, due to the limitations of the example we designed, some query operations could not be studied and compared, for example the [<offset>,] <count> in the LIMIT clause in GQL. Also, the data read and write consistency has not been discussed.
Security issues related to database query in the cloud environment, and the technologies used in the cloud databases for data storage and safety, are two more areas that need further study.

APPENDIX A
SQL Statement

A.1 OracleScriptCreateTables.sql

-------- Create Tables --------
CREATE TABLE Artist (
    name VARCHAR2(30) NOT NULL,
    date_born date NOT NULL,
    date_died date,
    description VARCHAR2(500) NOT NULL,
    country_of_origin VARCHAR2(30),
    CONSTRAINT pk_Artist PRIMARY KEY (name)
);

CREATE TABLE Art_Object (
    id_no numeric(6) NOT NULL,
    year numeric(4) NOT NULL,
    title VARCHAR2(200) NOT NULL,
    description VARCHAR2(500) NOT NULL,
    artist_name VARCHAR2(30) NOT NULL,
    CONSTRAINT pk_ArtObject PRIMARY KEY (id_no),
    CONSTRAINT fk_ArtObject FOREIGN KEY (artist_name) REFERENCES Artist(name)
);

CREATE TABLE Others (
    id_no numeric(6) NOT NULL,
    type VARCHAR2(100) NOT NULL,
    style VARCHAR2(150) NOT NULL,
    CONSTRAINT pk_Others PRIMARY KEY (id_no),
    CONSTRAINT fk_Others FOREIGN KEY (id_no) REFERENCES Art_Object(id_no)
);

CREATE TABLE Painting (
    id_no numeric(6) NOT NULL,
    paint_type VARCHAR2(100) NOT NULL,
    style VARCHAR2(150) NOT NULL,
    drawn_on VARCHAR2(100) NOT NULL,
    CONSTRAINT pk_Painting PRIMARY KEY (id_no),
    CONSTRAINT fk_Painting FOREIGN KEY (id_no) REFERENCES Art_Object(id_no)
);

CREATE TABLE Sculpture (
    id_no numeric(6) NOT NULL,
    material VARCHAR2(150) NOT NULL,
    weight_kg decimal(4, 1) NOT NULL,
    height_cm decimal(4, 1) NOT NULL,
    CONSTRAINT pk_Sculpture PRIMARY KEY (id_no),
    CONSTRAINT fk_Sculpture FOREIGN KEY (id_no) REFERENCES Art_Object(id_no)
);

CREATE TABLE Exhibition (
    title VARCHAR2(200) NOT NULL,
    start_date date NOT NULL,
    end_Date date NOT NULL,
    no_of_tickets_sold numeric(10) NOT NULL,
    CONSTRAINT pk_Exhibition PRIMARY KEY (title)
);

CREATE TABLE Shown_At (
    art_id_no numeric(6) NOT NULL,
    exhibition_title VARCHAR2(200) NOT NULL,
    CONSTRAINT pk_ShownAt PRIMARY KEY (art_id_no, exhibition_title),
    CONSTRAINT fk1_ShownAt FOREIGN KEY (art_id_no) REFERENCES Art_Object(id_no),
    CONSTRAINT
    fk2_ShownAt FOREIGN KEY (exhibition_title) REFERENCES Exhibition(title)
);

A.2 OracleScriptCreateViews.sql

-------- Create Views --------
---- Views related to the specialization: Others_View, Painting_View, Sculpture_View --
Create view Others_view as
Select a.Id_no, a.year, a.title, a.description, o.type, o.style, a.artist_name
From Art_Object a, Others o
Where a.id_no = o.id_no;

Create view Painting_View as
Select a.id_no, a.year, a.title, a.description, p.paint_type, p.style, p.drawn_on, a.artist_name
From Art_object a, Painting p
Where a.id_no=p.id_no;

Create view Sculpture_View as
Select a.id_no, a.year, a.title, a.description, s.material, s.weight_kg, s.height_cm, a.artist_name
From Art_object a, Sculpture s
Where a.id_no = s.id_no;

------ View of Artist_No_Of_Sculpture
Create view Artist_No_Of_Sculpture_View as
Select artist_name, count(*) as "No_Of_Sculpture"
From Sculpture_View
Group by artist_name
Having count(*) >=2;

A.3 OracleScriptPopulateTables.sql

---------- Some INSERT statements used in Oracle database 11g Express
---------- and Oracle Public Cloud

--------=========== Artist ============--------------------
INSERT INTO Artist
    (Name, Date_Born, Date_Died, Description, Country_of_Origin)
VALUES
    ('Norman Rockwell', TO_DATE('1894/02/03', 'yyyy/mm/dd'), TO_DATE('1978/11/08', 'yyyy/mm/dd'),
     '20th-century American painter and illustrator', 'United States');

--------=========== Art_Object ============--------------------
INSERT INTO Art_Object
    (Id_No, Year, Title, Description, Artist_Name)
VALUES
    (1000, '1921', '"No Swimming"', 'Saturday Evening Post (magazine) cover', 'Norman Rockwell');

------------======Others================----------------
INSERT INTO Others
    (Id_No, Type, Style)
VALUES
    (1006, 'Prints', 'Traditional Realism');

-----------=============== Painting=====-------------------
INSERT INTO Painting
    (Id_No, Paint_Type, Style, Drawn_On)
VALUES
    (1000, 'Oil', 'Traditional', 'Canvas');

-----------=============== Sculpture =====-------------------
INSERT INTO
Sculpture
    (Id_No, Material, Weight_kg, Height_cm)
VALUES
    (1010, 'Plaster', 0.0, 26.0);

-----------=============== Exhibition =====-------------------
INSERT INTO Exhibition
    (Title, Start_Date, End_Date, No_of_Tickets_Sold)
VALUES
    ('Just Concluded Art Show', to_date('2012/11/10', 'yyyy/mm/dd'), to_date('2013/2/3', 'yyyy/mm/dd'), 89003);

-----------=============== Shown_At =====-------------------
INSERT INTO Shown_At
    (Art_Id_No, Exhibition_Title)
VALUES
    (1000, 'Just Concluded Art Show');

APPENDIX B
Source Code

B.1 Source Code for Application “SimpleDB_ArtShow”

SimpleDB_ArtShow.java

/*
 * This file “SimpleDB_ArtShow.java” is modified from the project file that is
 * provided by Amazon.com, Inc. or its affiliates and is licensed under
 * the Apache License, Version 2.0 (the "License").
 *
 * A copy of the License is located at
 *
 * http://aws.amazon.com/apache2.0
 *
 */
import java.util.ArrayList;
import java.util.List;

import com.amazonaws.AmazonClientException;
import com.amazonaws.AmazonServiceException;
import com.amazonaws.auth.ClasspathPropertiesFileCredentialsProvider;
import com.amazonaws.regions.Region;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.simpledb.AmazonSimpleDB;
import com.amazonaws.services.simpledb.AmazonSimpleDBClient;
import com.amazonaws.services.simpledb.model.Attribute;
import com.amazonaws.services.simpledb.model.BatchPutAttributesRequest;
import com.amazonaws.services.simpledb.model.Item;
import com.amazonaws.services.simpledb.model.ReplaceableAttribute;
import com.amazonaws.services.simpledb.model.ReplaceableItem;
import com.amazonaws.services.simpledb.model.SelectRequest;

/**
 * The following codes show that how to make the requests to Amazon
 * SimpleDB using the AWS SDK for Java and to implement the queries
 * from the scenarios designed with Java.
 *
 *
 * The prerequisites for running this project are the same as what is
 * stated in the project files provided by Amazon.com, Inc.
 * as the following:
 *
 * <p>
 * <b>Prerequisites:</b> You must have a valid Amazon Web Services
 * developer account, and be signed up to use Amazon SimpleDB.
 * For more information on Amazon SimpleDB,
 * see http://aws.amazon.com/simpledb.
 * <p>
 * <b>Important:</b> Be sure to fill in your AWS access credentials in the
 * AwsCredentials.properties file before you try to run this
 * sample.
 * http://aws.amazon.com/security-credentials
 */
public class SimpleDB_ArtShow {

    public static void main(String[] args) throws Exception {
        /*
         * As stated in Amazon.com, Inc's original sample code file:
         * This credentials provider implementation loads your AWS credentials
         * from a properties file at the root of your classpath.
         *
         * Important: Be sure to fill in your AWS access credentials in the
         * AwsCredentials.properties file before you try to run this
         * sample.
         * http://aws.amazon.com/security-credentials
         */
        AmazonSimpleDB sdb = new AmazonSimpleDBClient(new ClasspathPropertiesFileCredentialsProvider());
        Region usEast1 = Region.getRegion(Regions.US_EAST_1);
        sdb.setRegion(usEast1); //Connected to the data center in US-East (Northern Virginia)

        System.out.println("===========================================");
        System.out.println("Amazon SimpleDB: Art Show Example");
        System.out.println("===========================================\n");

        try {
            // List domains
            System.out.println("Listing all domains in your account:\n");
            for (String domainName : sdb.listDomains().getDomainNames()) {
                System.out.println(" " + domainName);
            }
            System.out.println();

            //*** Scenario 1 ***
            System.out.println("**************** Scenario 1 ***************");
            String selectExpression = "select Exhibition_Title, Start_Date, End_Date from `" + "Art_Object" +
                    "` where Exhibition_Title = 'The Beauty of Form' and Art_Object_Type = 'Painting'";
            System.out.println("Selecting: " + selectExpression + "\n");
            SelectRequest selectRequest = new SelectRequest(selectExpression);

            //Output the result for Scenario 1
            System.out.println(" The number of paintings: " +
                    sdb.select(selectRequest).getItems().size() + "\n");
            for (Item item : sdb.select(selectRequest).getItems()) {
                System.out.println(" Art_Object");
                System.out.println(" Name: " + item.getName());
                for (Attribute attribute : item.getAttributes()) {
                    System.out.println(" Attribute");
                    System.out.println(" Name: " + attribute.getName());
                    System.out.println(" Value: " + attribute.getValue());
                }
            }
            System.out.println();

            //*** Scenario 2 ***
            System.out.println("**************** Scenario 2 ***************");
            //Implement Query 2.1:
            String selectExpression2_1 = "select Name, Description, Country_of_Origin from " + "`Artist`";
            System.out.println("Selecting: " + selectExpression2_1 + "\n");
            SelectRequest selectRequest2_1 = new SelectRequest(selectExpression2_1);
            System.out.println();

            //Implement Query 2.2:
            String selectExpression2_2 = "select Artist_Name from `" + "Art_Object" +
                    "` where Exhibition_Title = 'The Beauty of Form' and Art_Object_Type = 'Painting'";
            System.out.println("Selecting: " + selectExpression2_2 + "\n");
            SelectRequest selectRequest2_2 = new SelectRequest(selectExpression2_2);

            List<String> result = new ArrayList<String>();
            for (Item item2 : sdb.select(selectRequest2_2).getItems()) {
                //get the value of attribute: Artist_Name in the result that is returned from Query 2.2
                Attribute attribute2 = item2.getAttributes().get(0);
                for (Item item1 : sdb.select(selectRequest2_1).getItems()) {
                    //get the value of attribute: Name in the result that is returned from Query 2.1
                    Attribute attribute1 = item1.getAttributes().get(0);
                    if (attribute1.getValue().equals(attribute2.getValue())) {
                        String rlString = "Name: " + attribute1.getValue() + ", " +
                                "Description: " + item1.getAttributes().get(1).getValue() + ", " +
                                "Country_of_Origin: " + item1.getAttributes().get(2).getValue();
                        result.add(rlString);
                        break;
                    }
                }
            }

            //Output the result for Scenario 2
            System.out.println();
            System.out.println("Artist(s): ");
            for(String artistInfo: result){
                System.out.println(artistInfo);
            }
            System.out.println();

            //*** Scenario 4 ***
            System.out.println("**************** Scenario 4 ***************");
            //Implement Query 4.1:
            String selectExpression4_1 = "select No_of_Tickets_Sold from `" + "Art_Object" +
                    "` where Start_Date >= '2010-01-01' AND " +
                    "No_of_Tickets_Sold is not null ORDER BY No_of_Tickets_Sold desc limit 1";
            System.out.println("Selecting: " + selectExpression4_1 + "\n");
            SelectRequest selectRequest4_1 = new SelectRequest(selectExpression4_1);
            Item item1 = sdb.select(selectRequest4_1).getItems().get(0);
            System.out.println("The maximum number of tickets sold: " +
                    item1.getAttributes().get(0).getValue());
            System.out.println();

            //Implement Query 4.2:
            String selectExpression4_2 = "select No_of_Tickets_Sold from `" + "Art_Object" +
                    "` where Start_Date >= '2010-01-01' AND " +
                    "No_of_Tickets_Sold is not null ORDER BY No_of_Tickets_Sold asc limit 1";
            System.out.println("Selecting: " + selectExpression4_2 + "\n");
            SelectRequest selectRequest4_2 = new SelectRequest(selectExpression4_2);
            Item item2 = sdb.select(selectRequest4_2).getItems().get(0);
            System.out.println("The minimum number of tickets sold: " +
                    item2.getAttributes().get(0).getValue());
            System.out.println();

            //Implement Query 4.3:
            String selectExpression4_3 = "select Exhibition_Title, No_of_Tickets_Sold from `" + "Art_Object" +
                    "` where Start_Date >= '2010-01-01'";
            System.out.println("Selecting: " + selectExpression4_3 + "\n");
            SelectRequest selectRequest4_3 = new SelectRequest(selectExpression4_3);

            int sumTickets = 0;
            int count = 0;
            String showName1 = "";
            for (Item item3 : sdb.select(selectRequest4_3).getItems()) {
                //Get attribute: No_of_Tickets_Sold
                Attribute attribute3 = item3.getAttributes().get(1);
                int numTicket = Integer.parseInt(attribute3.getValue()) ;
                //Get attribute: Exhibition_Title
                Attribute attrib = item3.getAttributes().get(0);
                //Filter out the exhibitions that have the same Exhibition_Title;
                //only sum up the number of tickets for each
                //different exhibition to prevent the duplicate
                //exhibition from being calculated
                if (!showName1.equals(attrib.getValue())){
                    sumTickets += numTicket;
                    count++;
                    showName1 = attrib.getValue();
                }
            }
            //Cast to float before dividing; integer division would truncate the average
            float avgTickets = (float) sumTickets / count;

            //Output the result for Scenario 4
            System.out.println("The average number of tickets sold: " + avgTickets);
            System.out.println();

            //*** Scenario 5 ***
            System.out.println("**************** Scenario 5 ***************");
            String selectExpression5_1 = "select Artist_Name from `" + "Art_Object" +
                    "` where Art_Object_Type = 'Painting'";
            System.out.println("Selecting: " + selectExpression5_1 + "\n");
            SelectRequest selectRequest5_1 = new SelectRequest(selectExpression5_1);
            System.out.println();

            String selectExpression5_2 = "select Artist_Name from `" + "Art_Object" +
                    "` where Art_Object_Type = 'Sculpture'";
            System.out.println("Selecting: " + selectExpression5_2 + "\n");
            SelectRequest selectRequest5_2 = new SelectRequest(selectExpression5_2);

            List<String> result5 = new ArrayList<String>();
            for (Item item5_1 : sdb.select(selectRequest5_1).getItems()) {
                Attribute attribute1 = item5_1.getAttributes().get(0);
                for (Item item5_2 : sdb.select(selectRequest5_2).getItems()) {
                    Attribute attribute2 = item5_2.getAttributes().get(0);
                    if (attribute1.getValue().equals(attribute2.getValue())) {
                        result5.add(attribute1.getValue());
                        break;
                    }
                }
            }

            //Output the result for Scenario 5
            System.out.println();
            System.out.println("Artist Name(s): ");
            for (String artistName: result5) {
                System.out.println(artistName);
            }
            System.out.println();

            //*** Add a new attribute ***
            System.out.println("**************** Add a new attribute to an item (Artist) ***************");
            System.out.println("Adding a new attribute 'Email Address' to Artist_05 (Michael Maczuga): \n");
            sdb.batchPutAttributes(new BatchPutAttributesRequest("Artist", addNewAttributeData()));

        } catch (AmazonServiceException ase) {
            System.out.println("Caught an AmazonServiceException, which means your request made it " +
                    "to Amazon SimpleDB, but was rejected with an error response for some reason.");
            System.out.println("Error Message: " + ase.getMessage());
            System.out.println("HTTP Status Code: " + ase.getStatusCode());
            System.out.println("AWS Error Code: " + ase.getErrorCode());
            System.out.println("Error Type: " + ase.getErrorType());
            System.out.println("Request ID: " + ase.getRequestId());
        } catch (AmazonClientException ace) {
            System.out.println("Caught an AmazonClientException, which means the client encountered " +
                    "a serious internal problem while trying to communicate with SimpleDB, " +
                    "such as not being able to access the network.");
            System.out.println("Error Message: " + ace.getMessage());
        }
    }

    /* Function: addNewAttributeData
     * @return an array of one new Attribute: "Email Address"
     */
    private static List<ReplaceableItem> addNewAttributeData() {
        List<ReplaceableItem> data = new ArrayList<ReplaceableItem>();
        data.add(new ReplaceableItem("Artist_05").withAttributes(
                new ReplaceableAttribute("Email Address", "[email protected]", true)
        ));
        return data;
    }
}

B.2 Source Code for Application “ExArtShow”

Artist.java

/*
 * This file “Artist.java” is modified from the project file that is provided by Google Inc. and is
 * licensed under the Apache License, Version 2.0 (the "License").
 *
 * A copy of the License is located at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 */
package com.google.appengine.codelab;

import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.Query.FilterPredicate;
import java.util.List;

/**
 * This class handles all the CRUD operations related to Artist entity.
 */
public class Artist {

    /**
     * Create or update the artist
     */
    public static void createOrUpdateArtist(String artistName, String dateBorn, String dateDied,
            String countryOfOrigin, String description) {
        Entity artist = getArtist(artistName);
        if (artist == null) {
            artist = new Entity("Artist", artistName);
            artist.setProperty("dateBorn", dateBorn);
            artist.setProperty("dateDied", dateDied);
            artist.setProperty("countryOfOrigin", countryOfOrigin);
            artist.setProperty("description", description);
        } else {
            artist.setProperty("dateBorn", dateBorn);
            artist.setProperty("dateDied", dateDied);
            artist.setProperty("countryOfOrigin", countryOfOrigin);
            artist.setProperty("description", description);
        }
        Util.persistEntity(artist);
    }

    /**
     * Return all the artists
     * @param kind : of kind Artist
     * @return artists
     */
    public static Iterable<Entity> getAllArtists(String kind) {
        return Util.listEntities(kind, null, null);
    }

    /**
     * Get artist entity
     * @param artistName : name of the artist
     * @return: artist entity
     */
    public static Entity getArtist(String artistName) {
        Key key = KeyFactory.createKey("Artist", artistName);
        return Util.findEntity(key);
    }

    /**
     * Get all ArtObjects for an artist
     * @param artistName: name of the artist
     * @return list of ArtObjects
     */
    public static List<Entity> getArtObjects(String artistName) {
        Query query = new Query();
        Key parentKey = KeyFactory.createKey("Artist", artistName);
        query.setAncestor(parentKey);
        query.setFilter(new FilterPredicate(Entity.KEY_RESERVED_PROPERTY,
                Query.FilterOperator.GREATER_THAN, parentKey));
        List<Entity> results = Util.getDatastoreServiceInstance()
                .prepare(query).asList(FetchOptions.Builder.withDefaults());
        return results;
    }

    /**
     * Delete artist entity
     * @param artistKey: artist to be deleted
     * @return status string
     */
    public static String deleteArtist(String artistKey) {
        Key key = KeyFactory.createKey("Artist",artistKey);
        List<Entity> artObjects = getArtObjects(artistKey);
        if (!artObjects.isEmpty()){
            return "Cannot delete, as there are artObjects associated with this artist.";
        }
        Util.deleteEntity(key);
        return "Artist deleted successfully";
    }
}

ArtistServlet.java

/*
 * This file “ArtistServlet.java” is modified from the project file that is provided by Google Inc.
 * and is licensed under the Apache License, Version 2.0 (the "License").
 *
 * A copy of the License is located at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 */
package com.google.appengine.codelab;

import java.io.IOException;
import java.io.PrintWriter;
import java.util.HashSet;
import java.util.Set;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.google.appengine.api.datastore.Entity;
import com.google.appengine.codelab.Util;

/**
 * This servlet responds to the request corresponding to artist entities. The servlet
 * manages the Artist Entity
 *
 *
 */
@SuppressWarnings("serial")
public class ArtistServlet extends BaseServlet {

    private static final Logger logger = Logger.getLogger(ArtistServlet.class.getCanonicalName());

    /**
     * Get the entities in JSON format.
     */
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        super.doGet(req, resp);
        logger.log(Level.INFO, "Obtaining artist listing");
        String searchFor = req.getParameter("q");
        PrintWriter out = resp.getWriter();
        Iterable<Entity> entities = null;
        // Compare string contents with equals(); == would compare object references
        if (searchFor == null || searchFor.equals("") || "*".equals(searchFor)) {
            entities = Artist.getAllArtists("Artist");
            out.println(Util.writeJSON(entities));
        } else {
            Entity artist = Artist.getArtist(searchFor);
            if (artist != null) {
                Set<Entity> result = new HashSet<Entity>();
                result.add(artist);
                out.println(Util.writeJSON(result));
            }
        }
    }

    /**
     * Create the entity and persist it.
     */
    protected void doPut(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        logger.log(Level.INFO, "Creating Artist");
        PrintWriter out = resp.getWriter();
        String artistName = req.getParameter("artistName");
        String dateBorn = req.getParameter("dateBorn");
        String dateDied = req.getParameter("dateDied");
        String countryOfOrigin = req.getParameter("countryOfOrigin");
        String description = req.getParameter("description");
        try {
            Artist.createOrUpdateArtist(artistName, dateBorn, dateDied, countryOfOrigin, description);
        } catch (Exception e) {
            String msg = Util.getErrorMessage(e);
            out.print(msg);
        }
    }

    /**
     * Delete the artist entity
     */
    protected void doDelete(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String artistkey = req.getParameter("id");
        PrintWriter out = resp.getWriter();
        try{
            out.println(Artist.deleteArtist(artistkey));
        } catch(Exception e) {
            out.println(Util.getErrorMessage(e));
        }
    }

    /**
     * Redirect the call to doDelete or doPut method
     */
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String action = req.getParameter("action");
        if (action.equalsIgnoreCase("delete")) {
            doDelete(req, resp);
            return;
        } else if (action.equalsIgnoreCase("put")) {
            doPut(req, resp);
            return;
        }
    }
}

ArtObject.java

/*
 * This file “ArtObject.java” is modified from the project file that is provided by Google Inc.
 * and is licensed under the Apache License, Version 2.0 (the "License").
 *
 * A copy of the License is located at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 */
package com.google.appengine.codelab;

import java.util.List;

import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.Query.FilterOperator;
import com.google.appengine.api.datastore.Query.FilterPredicate;

/**
 * This class handles CRUD operations related to ArtObject entity.
 */
public class ArtObject {

    /**
     * Create or update ArtObject for a particular Artist. Artist entity has one to many
     * relationship with ArtObject entity
     *
     * @return updated ArtObject
     */
    public static Entity createOrUpdateArtObject(String artistName, String artObjectId, String year,
            String title, String description, String artType, String type, String style,
            String drawnOn, String material, String weight, String height,
            String exhibitionTitle, String startDate, String endDate, String noOfTicketsSold) {
        Entity artist = Artist.getArtist(artistName);
        Entity artObject = getSingleArtObject(artObjectId);
        if(artObject == null){
            artObject = new Entity("ArtObject",artist.getKey());
            artObject.setProperty("artId", artObjectId);
            artObject.setProperty("artist", artistName);
            artObject.setProperty("year", year);
            artObject.setProperty("title", title);
            artObject.setProperty("description", description);
            artObject.setProperty("artType", artType);
            artObject.setProperty("type", type);
            artObject.setProperty("style", style);
            artObject.setProperty("drawnOn", drawnOn);
            artObject.setProperty("material", material);
            artObject.setProperty("weight", weight);
            artObject.setProperty("height", height);
            artObject.setProperty("exhibitionTitle", exhibitionTitle);
            artObject.setProperty("start_date", startDate);
            artObject.setProperty("end_date", endDate);
            artObject.setProperty("no_of_tickets_sold",
                    noOfTicketsSold);
        } else{
            if (year != null && !"".equals(year)) {
                artObject.setProperty("year", year);
            }
            if (title != null && !"".equals(title)) {
                artObject.setProperty("title", title);
            }
            if (description != null && !"".equals(description)) {
                artObject.setProperty("description", description);
            }
            if (artType != null && !"".equals(artType)) {
                artObject.setProperty("artType", artType);
            }
            if (type != null && !"".equals(type)) {
                artObject.setProperty("type", type);
            }
            if (style != null && !"".equals(style)) {
                artObject.setProperty("style", style);
            }
            if (drawnOn != null && !"".equals(drawnOn)) {
                artObject.setProperty("drawnOn", drawnOn);
            }
            if (material != null && !"".equals(material)) {
                artObject.setProperty("material", material);
            }
            if (weight != null && !"".equals(weight)) {
                artObject.setProperty("weight", weight);
            }
            if (height != null && !"".equals(height)) {
                artObject.setProperty("height", height);
            }
            if (exhibitionTitle != null && !"".equals(exhibitionTitle)) {
                artObject.setProperty("exhibitionTitle", exhibitionTitle);
            }
            if (startDate != null && !"".equals(startDate)) {
                artObject.setProperty("start_date", startDate);
            }
            if (endDate != null && !"".equals(endDate)) {
                artObject.setProperty("end_date", endDate);
            }
            if (noOfTicketsSold != null && !"".equals(noOfTicketsSold)) {
                // Store the ticket count itself (the listing stored exhibitionTitle here by mistake)
                artObject.setProperty("no_of_tickets_sold", noOfTicketsSold);
            }
        }
        Util.persistEntity(artObject);
        return artObject;
    }

    /**
     * get All the artObjects in the list
     *
     * @param kind: ArtObject kind
     * @return all the ArtObjects
     */
    public static Iterable<Entity> getAllArtObjects() {
        Iterable<Entity> entities = Util.listEntities("ArtObject", null, null);
        return entities;
    }

    /**
     * Get the ArtObject by ArtObjectId, return an Iterable
     *
     * @param itemName: item name
     * @return Item Entity
     */
    public static Iterable<Entity> getArtObject(String artObjectId) {
        Iterable<Entity> entities = Util.listEntities("ArtObject", "artId", artObjectId);
        return entities;
    }

    /**
     * Get all the ArtObjects for an artist
     *
     * @param kind:
     *            ArtObject kind
     * @param artistName: name of the artist
     * @return: all ArtObjects created by that artist
     */
    public static Iterable<Entity> getArtObjectsForArtist(String kind, String artistName) {
        Key ancestorKey = KeyFactory.createKey("Artist", artistName);
        return Util.listChildren("ArtObject", ancestorKey);
    }

    /**
     * get ArtObject with artObjectId
     * @param artObjectId: get artObjectId
     * @return ArtObject entity
     */
    public static Entity getSingleArtObject(String artObjectId) {
        Query query = new Query("ArtObject");
        query.setFilter(new FilterPredicate("artId", FilterOperator.EQUAL, artObjectId));
        List<Entity> results =
                Util.getDatastoreServiceInstance().prepare(query).asList(FetchOptions.Builder.withDefaults());
        if (!results.isEmpty()) {
            return (Entity)results.remove(0);
        }
        return null;
    }

    public static String deleteArtObject(String artObjectKey) {
        Entity entity = getSingleArtObject(artObjectKey);
        if(entity != null){
            Util.deleteEntity(entity.getKey());
            return("ArtObject deleted successfully.");
        } else
            return("ArtObject not found");
    }
}

ArtObjectServlet.java

/*
 * This file “ArtObjectServlet.java” is modified from the project file that is provided by Google
 * Inc. and is licensed under the Apache License, Version 2.0 (the "License").
 *
 * A copy of the License is located at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 */
package com.google.appengine.codelab;

import java.io.IOException;
import java.io.PrintWriter;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.google.appengine.api.datastore.Entity;

/**
 * This servlet responds to the request corresponding to ArtObjects.
 * The class creates and manages the ArtObject Entity
 *
 *
 */
@SuppressWarnings("serial")
public class ArtObjectServlet extends BaseServlet {

    private static final Logger logger = Logger.getLogger(ArtObjectServlet.class.getCanonicalName());

    /**
     * Searches for the entity based on the search criteria and returns result in
     * JSON format
     */
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        super.doGet(req, resp);
        logger.log(Level.INFO, "Obtaining ArtObject listing");
        String searchBy = req.getParameter("artObject-searchby");
        String searchFor = req.getParameter("q");
        PrintWriter out = resp.getWriter();
        if (searchFor == null || searchFor.equals("")) {
            Iterable<Entity> entities = ArtObject.getAllArtObjects();
            out.println(Util.writeJSON(entities));
        } else if (searchBy == null || searchBy.equals("artId")) {
            Iterable<Entity> entities = ArtObject.getArtObject(searchFor);
            out.println(Util.writeJSON(entities));
        } else if (searchBy != null && searchBy.equals("artistName")) {
            Iterable<Entity> entities = ArtObject.getArtObjectsForArtist("ArtObject", searchFor);
            out.println(Util.writeJSON(entities));
        }
    }

    /**
     * Creates entity and persists the same
     */
    protected void doPut(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        logger.log(Level.INFO, "Creating ArtObject");
        String artObjectId = req.getParameter("artId");
        String artistName = req.getParameter("artist");
        String year = req.getParameter("year");
        String title = req.getParameter("title");
        String description = req.getParameter("description");
        String artType = req.getParameter("artType");
        String type = req.getParameter("type");
        String style = req.getParameter("style");
        String drawnOn = req.getParameter("drawnOn");
        String material = req.getParameter("material");
        String weight = req.getParameter("weight");
        String height = req.getParameter("height");
        String exhibitionTitle = req.getParameter("exhibitionTitle");
        String startDate =
        req.getParameter("start_date");
    String endDate = req.getParameter("end_date");
    String noOfTicketsSold = req.getParameter("no_of_tickets_sold");
    ArtObject.createOrUpdateArtObject(artistName, artObjectId, year, title, description,
        artType, type, style, drawnOn, material, weight, height,
        exhibitionTitle, startDate, endDate, noOfTicketsSold);
  }

  /**
   * Deletes the entity from the datastore and writes the outcome message to the
   * response. If the delete fails, the error message is written instead.
   */
  protected void doDelete(HttpServletRequest req, HttpServletResponse resp)
      throws ServletException, IOException {
    logger.log(Level.INFO, "deleting artObject");
    String artObjectKey = req.getParameter("id");
    PrintWriter out = resp.getWriter();
    try {
      out.println(ArtObject.deleteArtObject(artObjectKey));
    } catch (Exception e) {
      out.println(Util.getErrorMessage(e));
    }
  }

  /**
   * Redirects to delete or insert entity based on the action in the HTTP
   * request.
   */
  protected void doPost(HttpServletRequest req, HttpServletResponse resp)
      throws ServletException, IOException {
    String action = req.getParameter("action");
    // compare from the literal so that a missing "action" parameter does not
    // cause a NullPointerException
    if ("delete".equalsIgnoreCase(action)) {
      doDelete(req, resp);
      return;
    } else if ("put".equalsIgnoreCase(action)) {
      doPut(req, resp);
      return;
    }
  }
}

ajax.util.js

/*
 * This file "ajax.util.js" is modified from the project file that is provided by Google Inc. and is
 * licensed under the Apache License, Version 2.0 (the "License").
 *
 * A copy of the License is located at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 */
var HOME = 'home';
var ENTITY_ARTIST = 'artist';
var ENTITY_ARTOBJECT = 'artObject';

//function to initialize the page
var init = function() {
  //showing the home tab on initializing
  showTab(HOME);
  //adding event listeners to the tabs
  $('#tabs a').click(function(event) {
    showTab(event.currentTarget.id);
  });
};

//function to show the tab
var showTab = function(entity) {
  //remove the active class from all the tabs
  $('.tab').removeClass("active");
  //setting the active class to the selected tab
  $('#' + entity).addClass("active");
  //hiding all the tabs
  $('.g-unit').hide();
  //showing the selected tab
  $('#' + entity + '-tab').show();
  //hiding the create block
  showHideCreate(entity, false);
  if (entity != HOME)
    $('#' + entity + '-search-reset').click();
};

//function to show/hide create block for an entity in a tab
var showHideCreate = function(entity, show) {
  //checking if the block is shown or not
  if (show) {
    //hiding the search container
    $('#' + entity + '-search-ctr').hide();
    //hiding the list container
    $('#' + entity + '-list-ctr').hide();
    //showing the create container
    $('#' + entity + '-create-ctr').show();
  } else {
    //showing the search container
    $('#' + entity + '-search-ctr').show();
    //showing the list container
    $('#' + entity + '-list-ctr').show();
    //hiding the create container
    $('#' + entity + '-create-ctr').hide();
    //checking if the entity is not a home then populating the list of the entity
    if (entity != HOME)
      populateList(entity, null);
  }
};

//parameter object definition
var param = function(name, value) {
  this.name = name;
  this.value = value;
};

//function to add an entity when user clicks on the add button in UI
var add = function(entity) {
  $('.message').hide();
  $('#' + entity + '-reset').click();
  //display the create container
  showHideCreate(entity, true);
  $("span.readonly input").attr('readonly', false);
  $("select[id$=artObject-artist-list] > option").remove();
  //checking the entity to populate
  // the select box
  if (entity == ENTITY_ARTOBJECT) {
    //populating the artist and contact by making an ajax call
    populateSelectBox('artObject-artist-list', '/artist');
  }
};

//function to search an entity when user inputs the value in the search box
var search = function(entity) {
  $('.message').hide();
  // collecting the field values from the form
  var formEleList = $('form#' + entity + '-search-form').serializeArray();
  //assigning the filter criteria
  var filterParam = new Array();
  for (var i = 0; i < formEleList.length; i++) {
    filterParam[filterParam.length] = new param(formEleList[i].name, formEleList[i].value);
  }
  //calling population of the list through ajax
  populateList(entity, filterParam);
};

var showMessage = function(message, entity) {
  $('#' + entity + '-show-message').show().html('<p><b>' + message + '</b></p>');
};

var formValidate = function(entity) {
  var key;
  var formEleList = $('form#' + entity + '-create-form').serializeArray();
  key = formEleList[0].value;
  switch (entity) {
    case ENTITY_ARTOBJECT:
      var valueArtist = $('#artObject-artist-list').val();
      if (valueArtist == "" || key == "") {
        showMessage('please check the key and Artist values in the form', entity);
        return;
      }
      break;
    default:
      if (key == "") {
        showMessage('please check the values in the form', entity);
        return;
      }
      break;
  }
  save(entity);
  $('#' + entity + '-show-message').hide();
};

//function to save an entity
var save = function(entity) {
  $('#' + entity + '-show-message').hide();
  // creating the data object to be sent to backend
  var data = new Array();
  // collecting the field values from the form
  var formEleList = $('form#' + entity + '-create-form').serializeArray();
  for (var i = 0; i < formEleList.length; i++) {
    data[data.length] = new param(formEleList[i].name, formEleList[i].value);
  }
  //setting action as PUT
  data[data.length] = new param('action', 'PUT');
  //making the ajax call
  $.ajax({
    url : "/" + entity,
    type : "POST",
    data : data,
    success : function(data) {
      showHideCreate(entity, false);
    }
  });
  $('#' + entity + '-reset').click();
  //$('#artObject-artist-list').reset();//='';
};

//function to edit entity
var edit = function(entity, id) {
  var parameter = new Array();
  parameter[parameter.length] = new param('q', id);
  $.ajax({
    url : "/" + entity,
    type : "GET",
    data : parameter,
    success : function(resp) {
      var data = '';
      if (resp != '')
        data = resp.data[0];
      var formElements = $('form#' + entity + '-create-form :input');
      for (var i = 0; i < formElements.length; i++) {
        if (formElements[i].type != "button") {
          var ele = $(formElements[i]);
          //reading the property by its name instead of building code for eval
          var value = data[ele.attr('name')];
          if (ele.attr('name') == "artist") {
            $("select[id$=artObject-artist-list] > option").remove();
            ele.append('<option value="' + value + '">' + value + '</option>');
          } else
            ele.val(value);
        }
      }
      showHideCreate(entity, true);
      $("span.readonly input").attr('readonly', true);
    }
  });
};

//function called when user clicks on the cancel button
var cancel = function(entity) {
  $('.message').hide();
  //hiding the create container in the tab
  showHideCreate(entity, false);
};

//function to delete an entity
var deleteEntity = function(entity, id, parentid) {
  var parameter = new Array();
  parameter[parameter.length] = new param('id', id);
  parameter[parameter.length] = new param('parentid', parentid);
  parameter[parameter.length] = new param('action', 'DELETE');
  //making the ajax call
  $.ajax({
    url : "/" + entity,
    type : "POST",
    data : parameter,
    dataType : "html",
    success : function(resp) {
      showHideCreate(entity, false);
      if (resp != '') {
        showMessage(resp, entity);
      }
    },
    error : function(resp) {
      //the error callback receives the request object, so show its response text
      showMessage(resp.responseText, entity);
    }
  });
};

// function to get the data by setting url, filter, success function and error function
var getData = function(url, filterData, successFn, errorFn) {
  // making the ajax call
  $.ajax({
    url : url,
    type : "GET",
    data : filterData,
    success : function(resp) {
      //calling the user defined success function
      if (successFn)
        successFn(resp);
    },
    error : function(e) {
      //calling the user defined error function
      if (errorFn)
        errorFn(e);
    }
  });
};

// function to populate the select box which takes input as id of the selectbox
// element and url to get the data
var populateSelectBox = function(id, url) {
  //specifying the success function. When the ajax response is successful then the following function will be called
  var successFn = function(resp) {
    //getting the select box element
    var selectBox = $('#' + id);
    //setting the content inside as empty (a jQuery object has no innerHTML property)
    selectBox.empty();
    //getting the data from the response object
    var data = resp.data;
    //appending the first option as select to the select box
    selectBox.append('<option value="">Select</option>');
    //adding all other values
    for (var i = 0; i < data.length; i++) {
      selectBox.append('<option value="' + data[i].name + '">' + data[i].name + '</option>');
    }
  };
  //calling the getData function with the success function
  getData(url, null, successFn, null);
};

//function to populate the list of an entity
var populateList = function(entity, filter) {
  //specifying the success function. When the ajax response is successful then the following function will be called
  var successFn = function(resp) {
    var data = '';
    if (resp) {
      //getting the data from the response object
      data = resp.data;
    }
    //creating the html content
    var htm = '';
    if (data.length > 0) {
      for (var i = 0; i < data.length; i++) {
        //creating a row
        htm += '<tr>';
        switch (entity) {
          case ENTITY_ARTIST:
            htm += '<td>' + data[i].name + '</td><td>' + data[i].dateBorn + '</td><td>' + data[i].dateDied + '</td>' +
                '<td>' + data[i].description + '</td><td>' + data[i].countryOfOrigin + '</td>';
            break;
          case ENTITY_ARTOBJECT:
            htm += '<td>' + data[i].artId + '</td><td>' + data[i].year + '</td><td>' + data[i].title + '</td>' +
                '<td>' + data[i].description + '</td><td>' + data[i].artType + '</td><td>' + data[i].type + '</td>' +
                '<td>' + data[i].style + '</td><td>' + data[i].drawnOn + '</td><td>' + data[i].material + '</td>' +
                '<td>' + data[i].weight + '</td><td>' + data[i].height + '</td><td>' + data[i].exhibitionTitle + '</td>' +
                '<td>' + data[i].start_date + '</td><td>' + data[i].end_date + '</td><td>' + data[i].no_of_tickets_sold + '</td><td>' + data[i].artist + '</td>';
            break;
          default:
            htm += "";
        }
        if (entity == ENTITY_ARTOBJECT)
          htm += '<td><a href="#" class="delete-entity" onclick=\'deleteEntity("' + entity + '","' + data[i].artId + '","' + data[i].artist + '")\'>Delete</a> | <a href="#" class="edit-entity" onclick=\'edit("' + entity + '","' + data[i].artId + '")\'>Edit</a></td></tr>';
        else
          htm += '<td><a href="#" class="delete-entity" onclick=\'deleteEntity("' + entity + '","' + data[i].name + '")\'>Delete</a> | <a href="#" class="edit-entity" onclick=\'edit("' + entity + '","' + data[i].name + '")\'>Edit</a></td></tr>';
      }
    } else {
      //condition to show message when data is not available
      var thElesLength = $('#' + entity + '-list-ctr table thead th').length;
      htm += '<tr><td colspan="' + thElesLength + '">No artObjects found</td></tr>';
    }
    $('#' + entity + '-list-tbody').html(htm);
  };
  getData("/" + entity, filter, successFn, null);
};

APPENDIX C
Query Language

C.1 Oracle SQL Syntax

SELECT <attribute list>
FROM <table list>
[WHERE <condition>]
[GROUP BY <grouping attribute(s)>]
[HAVING <group condition>]
[ORDER BY <attribute list>]

C.2 Amazon SimpleDB Query Language

select output_list
from domain_name
[where expression]
[sort_instructions]
[limit limit]

The output_list can be any of the following:
  * (all attributes)
  itemName() (the item name only)
  count(*)
  An explicit list of attributes (attribute1,..., attributeN)

The domain_name is the domain to search.
The expression is the match expression.
The sort_instructions describe how to sort the results.
The limit is the maximum number of results to return (default: 100, max. 2500).
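The select form above can be assembled as a plain string before it is sent to SimpleDB. The following is a minimal sketch only: the class name, the helper names, and the "ArtObject" domain with its "title", "year", and "artist" attributes are hypothetical and chosen for illustration, and the code builds the query text without calling any AWS API. It assumes attribute values are wrapped in single quotes with an embedded single quote doubled.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: builds the text of a SimpleDB select statement.
// All names here are hypothetical; no AWS client library is used.
public class SimpleDbSelectSketch {

    // Wrap an attribute value in single quotes, doubling any embedded quote.
    static String quoteValue(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Assemble: select output_list from domain_name
    //           [where expression] [sort_instructions] [limit limit]
    // Optional parts are skipped when the corresponding argument is null.
    static String buildSelect(List<String> attributes, String domain,
                              String whereExpression, String sortInstructions,
                              Integer limit) {
        // An empty attribute list means "all attributes".
        String outputList = attributes.isEmpty() ? "*" : String.join(", ", attributes);
        StringBuilder query = new StringBuilder("select ")
                .append(outputList)
                .append(" from ")
                .append(domain);
        if (whereExpression != null) {
            query.append(" where ").append(whereExpression);
        }
        if (sortInstructions != null) {
            query.append(" ").append(sortInstructions);
        }
        if (limit != null) {
            query.append(" limit ").append(limit);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        String where = "artist = " + quoteValue("O'Keeffe");
        System.out.println(buildSelect(Arrays.asList("title", "year"), "ArtObject", where, null, 10));
    }
}
```

Keeping the optional clauses as nullable arguments mirrors the bracketed parts of the syntax, so a caller that passes only a domain gets the shortest valid form.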

The expression can be any of the following:
  <select expression> intersection <select expression>
  NOT <select expression>
  (<select expression>)
  <select expression> or <select expression>
  <select expression> and <select expression>
  <simple comparison>

C.3 Google GQL Syntax

SELECT [DISTINCT] [* | <property list> | __key__]
[FROM <kind>]
[WHERE <condition> [AND <condition> ...]]
[ORDER BY <property> [ASC | DESC] [, <property> [ASC | DESC] ...]]
[LIMIT [<offset>,]<count>]
[OFFSET <offset>]

<property list> := <property> [, <property> ...]
<condition> := <property> {< | <= | > | >= | = | != } <value>
<condition> := <property> IN <list>
<condition> := ANCESTOR IS <entity or key>
<list> := (<value> [, <value> ...])

BIBLIOGRAPHY

[1] "Amazon Web Services," Amazon Web Services, Inc., 2013. [Online]. Available: http://aws.amazon.com/ec2/. [Accessed 1 November 2013].
[2] Y. Sverdlik, "OpenWorld: Data Centers for Oracle's Public Cloud," 3 October 2012. [Online]. Available: http://www.datacenterdynamics.com/focus/archive/2012/10/openworld-datacenters-oracle%E2%80%99s-public-cloud. [Accessed 1 November 2013].
[3] "Google Intranet Integration And Related Applications," Intoweb Business (PTY) LTD, 2013. [Online]. Available: http://www.intoweb.com/articles/google_intranet_integration.php. [Accessed 1 November 2013].
[4] S. Sakr, A. Liu, D. M. Batista and M. Alomari, "A Survey of Large Scale Data Management Approaches in Cloud Environments," IEEE Communications Surveys & Tutorials, vol. 13, no. 3, pp. 311-335, 2011.
[5] M. Fowler, "Introduction to NoSQL Databases," 22 August 2013. [Online]. Available: http://www.datascienceassn.org/1/category/nosql/1.html. [Accessed 1 November 2013].
[6] R. Elmasri and S. B. Navathe, Fundamentals of Database Systems, 4th ed., Boston, MA: Pearson Education, Inc., 2004.
[7] R. Greenwald, "Oracle Cloud Computing," September 2012. [Online].
Available: http://www.oracle.com/technetwork/database/database-cloud/public/oracle-db-anddb-cloud-service-wp-1844127.pdf. [Accessed 1 November 2013].
[8] R. Greenwald, "Data Movement," April 2013. [Online]. Available: http://www.oracle.com/technetwork/database/database-cloud/public/datamovement-wp-1844121.pdf. [Accessed 1 November 2013].
[9] R. Lakshminarayanan, B. Kumar and M. Raju, "Cloud Computing Benefits for Educational Institutions," Cornell University. [Online]. Available: http://arxiv.org/ftp/arxiv/papers/1305/1305.2616.pdf. [Accessed 1 November 2013].
[10] Amazon Web Services, Inc., "Amazon SimpleDB Developer Guide (API Version 2009-04-15)," April 2009. [Online]. Available: http://docs.aws.amazon.com/AmazonSimpleDB/latest/DeveloperGuide/APISummary.html. [Accessed 1 November 2013].
[11] Amazon Web Services, "Building Amazon SimpleDB Queries," 23 March 2010. [Online]. Available: http://aws.amazon.com/articles/1231. [Accessed 1 November 2013].
[12] Google Inc., "Google App Engine," 24 June 2013. [Online]. Available: https://cloud.google.com/products/app-engine. [Accessed 1 November 2013].
[13] Google Inc., "Google App Engine," 15 October 2013. [Online]. Available: https://developers.google.com/appengine/docs/whatisgoogleappengine. [Accessed 1 November 2013].
[14] Google Inc., "GQL Reference," 23 October 2013. [Online]. Available: https://developers.google.com/appengine/docs/python/datastore/gqlreference. [Accessed 1 November 2013].
[15] N. Kumar, "Efficiently Getting a Count of All Entities in Google App Engine," 7 September 2011. [Online]. Available: http://thoughts.inphina.com/2011/09/07/efficiently-getting-a-count-of-all-entitiesin-google-app-engine/. [Accessed 1 November 2013].
[16] Amazon Web Services, "Tips and Tricks for Amazon SimpleDB Query," 6 April 2009. [Online]. Available: http://aws.amazon.com/articles/1232. [Accessed 1 November 2013].