* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download we know all the incoming query templates beforehand
Expense and cost recovery system (ECRS) wikipedia , lookup
Data analysis wikipedia , lookup
Data center wikipedia , lookup
Microsoft Access wikipedia , lookup
Information privacy law wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Semantic Web wikipedia , lookup
Data vault modeling wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Business intelligence wikipedia , lookup
Web analytics wikipedia , lookup
Open data in the United Kingdom wikipedia , lookup
Concurrency control wikipedia , lookup
Relational model wikipedia , lookup
Versant Object Database wikipedia , lookup
Performance Driven Database Design for Scalable Web Applications Jozsef Patvarczki, Murali Mani, and Neil Heffernan Abstract Scaling up web applications requires distribution of load across multiple application servers and across multiple database servers. Distributing load across multiple application servers is fairly straightforward; however distributing load (select and UDI queries) across multiple database servers is more complex because of the synchronization requirements for multiple copies of the data. Different techniques have been investigated for data placement across multiple database servers, such as replication, partitioning and de-normalization. In this poster, we describe our architecture that utilizes these data placement techniques for determining an optimum layout of data. Our solution is general, and other data placement techniques can be integrated within our system. Once the data is laid out on the different database servers, our efficient query router routes the queries to the appropriate database server/(s). Our query router maintains multiple connections for a database server so that many queries are executed simultaneously on a database server, thus increasing the utilization of each database server. Our query router also implements a locking mechanism to ensure that the queries on a database server are executed in order. We have implemented our solutions in our system, that we call SIPD (System for Intelligent Placement of Data). Preliminary experimental results illustrate the significant performance benefits achievable by our system. Proposed Solution • We propose a generic architecture for balancing load across multiple database servers for web applications; • There are two core parts of our system: (a) the intelligent data placement algorithm produces a layout structure, describing how the data needs to be laid out across multiple database servers to achieve better performance and performs the actual layout; (b) the query router that utilizes the layout structure produced by the intelligent data placement algorithm for routing queries and ensuring that the database servers are being utilized effectively; • To achieve a better performance, the data placement algorithm uses the given query workload, the time it takes to execute a select/UDI query type, and the time it takes to execute a select/UDI query if the table/(s) are partitioned, replicated or de-normalized; Problem Statement • A characteristic of web applications such as our ASSISTment system, is that we know all the incoming query templates beforehand as the users typically interact with the system through a web interface [1]; •There are thousands of web applications, and these systems need to figure out how to scale up their performance; • Web applications typically have a 3-tier architectures consisting of clients, application, and database server that are working together [2]; • The increasing load of the database layer can lead to slow response time, application error, and in the worst case, to different type of system crashes; • We have integrated our data placement algorithm and our query router into a prototype system that we call System for Intelligent Placement of Data (SIPD). Future Work • • • • Data modeling; Effective locking mechanisms; Distributed query processing; Considering fault tolerance as an application constrain and handling inconsistencies. References 1. Tobias Groothuyse, Swaminathan Sivasubramanian, Guillaume Pierre, “Globetp: template-based database replication for scalable web applications”, WWW07, Banff, Canada, pp. 301-310 2. Jozsef Patvarczki, Shane F. Almeida, Joseph E. Beck, and Neil T. Heffernan,, “Lessons Learned from Scaling Up a Web-based Intelligent Tutoring System”, ITS2008, Montreal, Canada, pp. 766-770 Contact: Jozsef Patvarczki, [email protected]