CISC 322
Assignment 3
Enhancement Proposal for PostgreSQL
December 04, 2010
S-Queue-L
Adam Abu Hijleh ([email protected])
Adam Ali ([email protected])
Khurrum Abdul Mujeeb ([email protected])
Stephen McDonald ([email protected])
Wissam Zaghal ([email protected])
Table of Contents
Abstract
Proposed Enhancement and Why It’s Beneficial
Current Implementation of Redundancy on PostgreSQL
Effects of Enhancement
Software Architecture Analysis Method (SAAM)
Alternate Implementation
Chosen Implementation Architectural Style and Design Patterns
Testing
Use Case 1
Use Case 2
Interactions with Other Features
Lessons Learned and Limitations
Conclusion
Glossary
References
Abstract
To enhance the PostgreSQL system, we propose adding a second back-end, such that there
are two identical back-ends running the exact same processes simultaneously. This report
touches on why we believe redundancy would be a beneficial add-on, how the add-on differs
from the present PostgreSQL implementation and an overall analysis of the effects of the
addition. A presentation of two possible implementations of our redundancy will be delineated,
and each will be taken through a SAAM analysis. Next, the new high and low level architectural
styles are discussed. Any components affected by the implementation are identified, and all new
interactions caused by redundancy with existing features will be mentioned. We then examine
the potential risks that redundancy comes with, followed by two diagrammed use cases that trace
common functions via data flow. The report concludes with a brief consideration of our lessons
learned and limitations, and a quick summation of all presented information.
Proposed Enhancement and Why It’s Beneficial
Although PostgreSQL’s system promotes maintainability, scalability and reliability, there is still
room for improvement. If an entire PostgreSQL server were to crash, the full process it was
working on would obviously be halted, and the user would have to wait for the server to restore
and replay the write-ahead log to return to the previous state. Thus we felt that an enhancement
was needed to ensure that there is a backup option in case the entire system were to crash.
One popular database feature used today is redundancy, which involves storing identical copies
of data and processes that are independent of each other (stored/performed on separate servers
that have no inter-communication). This can happen synchronously or asynchronously, and each
option has its advantages. Asynchronous data redundancy already exists in PostgreSQL and is
better known as data replication, a process whereby the master database adds new data and then
sends this newly added data to slave databases (2. “Deciding between Redundancy and
Clustering”, December 4, 2010). Synchronous data redundancy implies the use of two
independent databases that run the exact same processes, at the exact same time, on the exact
same data. In synchronous data redundancy, both databases store the exact same data as well.
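To make the distinction concrete, the two modes can be sketched in a few lines of Python. This is a toy model, not PostgreSQL code; every class and function name in it is our own invention.

```python
# Toy model of the two redundancy modes: a Replica stands in for a database
# server, and a list of (key, value) pairs stands in for the write-ahead log.

class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

def synchronous_write(replicas, key, value):
    """Synchronous redundancy: commit only after every replica applies it."""
    for r in replicas:
        r.apply(key, value)
    return "committed"

def asynchronous_write(master, pending_log, key, value):
    """Asynchronous redundancy (replication): commit once the master applies
    it; the change is only queued for the stand-by servers."""
    master.apply(key, value)
    pending_log.append((key, value))
    return "committed"

def ship_log(standbys, pending_log):
    """Later shipment of the queued changes to the stand-by servers."""
    while pending_log:
        key, value = pending_log.pop(0)
        for s in standbys:
            s.apply(key, value)
```

The crucial difference is the window after `asynchronous_write` returns but before `ship_log` runs: a standby promoted during that window has stale data, which is exactly the reliability gap this proposal targets.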
With the addition of a synchronous redundancy system, we propose that PostgreSQL should
utilize a layered architectural style, where the current version of PostgreSQL is the lowest layer,
and the topmost layer introduces a new component known as “Redundancy Control”. This
component, which is on its own machine, will take requests from users, duplicate them, and
spawn two separate and independent instances of PostgreSQL, each stored on a separate
machine, to perform the given task. Both PostgreSQL instances complete the same query and
have the same data. Should one of the PostgreSQL servers lose connection with the redundancy
control layer, the Failure Manager within redundancy control will pick up on the lost connection,
and know to use data from the other server. Once the connection becomes live again, the
Backend Manager will implement all changes on the newly live server that it missed while it was
down.
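The duplication, failure-detection and catch-up behaviour just described can be sketched as follows. This is an illustrative model of the proposed component only; every class and method name is hypothetical, not part of PostgreSQL.

```python
# Sketch of the proposed Redundancy Control: duplicate each request to two
# backends, serve the user from whichever backend answers, and replay missed
# queries on a backend once its connection comes back.

class Backend:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.executed = []

    def execute(self, query):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.executed.append(query)
        return f"result of {query}"

class RedundancyControl:
    def __init__(self, primary, secondary):
        self.backends = [primary, secondary]
        self.missed = {b.name: [] for b in self.backends}

    def submit(self, query):
        """Duplicate the query; answer the user from any backend that responds."""
        result = None
        for b in self.backends:
            try:
                result = b.execute(query)
            except ConnectionError:
                # Failure Manager role: note what the dead backend missed.
                self.missed[b.name].append(query)
        if result is None:
            raise RuntimeError("both backends are down")
        return result

    def recover(self, backend):
        """Backend Manager role: replay missed queries on a revived backend."""
        backend.alive = True
        for q in self.missed[backend.name]:
            backend.execute(q)
        self.missed[backend.name] = []
```

The two roles map onto the components named above: `submit` records failures the way the Failure Manager would, and `recover` performs the Backend Manager's catch-up replay.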
This feature would be beneficial to PostgreSQL in many ways. Firstly, it would greatly reduce the
likelihood of a total system outage, as a second database will always be up to date and running.
This would reassure users that, no matter what changes they make, at least one database will
remain up to date in all but the rarest cases. It also means that if one server crashes, the user
would not even notice, as the process they are performing would still be running perfectly on the
other server (assuming the crash is not due to reaching hardware limits).
Also, by placing the second database on a completely different system, we make it easier to
scale-up. By having two servers with the same information, we can take one down and upgrade
it while the other server performs all queries, and then update it when it becomes live again. This
is similar to the “Google Dance”, where one instance of Google can be taken down, and
changed, while other instances remain live, leaving the user unaffected.
Current Implementation of Redundancy on PostgreSQL
The latest version of PostgreSQL (Version 9.0) introduced a form of redundancy that was not
found in earlier versions of the software. This implementation works much like a Master-Slave
pattern, where the main server is the Master that regularly updates the stand-by servers (slaves)
via write-ahead logging. These stand-by servers are also given the ability to run read-only
queries, which, in previous versions, was only allowed once a stand-by server became the new
master (3. “Postgres Documentation: Manuals”, December 3, 2010).
This implementation of redundancy works asynchronously via a “Hot Standby” server method.
The “Hot Standby” server method is a backup server that receives regular updates and is
standing by ready to take the role of a master server immediately in the event of a failover. A
“Hot Standby” method is generally conceived as a synchronous method of running a master and
stand-by server (4. “What is hot standby?”, December 3, 2010), but is used asynchronously in
PostgreSQL due to the “Hot Standby” server being updated via the write-ahead log. The writeahead log contains all modifications that are to be applied to the main server before its actual
application on the server.
In our conceptual architecture, this implementation is performed via the Replication
component of the Process Manager subsystem (Figure 1). The Walsender and Walreceiver
sub-components are the ones that allow redundancy to take place amongst multiple servers.
The Walreceiver is the process on the stand-by server that is in charge of receiving the write-ahead
logs from the master server, whereas the Walsender is the process running on the master
server that sends write-ahead logs to the stand-by servers.
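For reference, the PostgreSQL 9.0 streaming-replication setup discussed here is driven by a handful of configuration parameters. A minimal sketch might look like the following; the host name and user are placeholders, not values from any real deployment:

```ini
# postgresql.conf on the master:
wal_level = hot_standby        # log enough WAL detail for a hot standby
max_wal_senders = 3            # allow walsender processes to serve standbys

# postgresql.conf on the stand-by:
hot_standby = on               # allow read-only queries during recovery

# recovery.conf on the stand-by:
standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432 user=replicator'
```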
The problem we found with this form of redundancy is that data replication between the
master and stand-by servers is done asynchronously. Although this implementation does not
hinder performance, it does affect the reliability and availability of the system. Reliability is
affected negatively, as some data integrity may be lost during a failover. Availability is affected
negatively for the very same reason, as there might be a period of downtime during a failover.
Figure 1 – The Replication Component of the Process Manager subsystem
Effects of Enhancement
The enhancement will affect PostgreSQL both positively and negatively. As mentioned before,
reliability, availability and scalability will increase as a second server running with the exact
same process and data is present, and should one server fail, the overall system will still be
available.
Our enhancement also affects the security of PostgreSQL in a very interesting way. Since our
enhancement is meant to ensure both servers constantly have the same database, any type of
malicious data corruption in one database will not mean a total loss of data since another
PostgreSQL server will contain up to date information on another server. However, if the
malicious data corruption comes through the normal input, it will affect both servers in the exact
same way.
Despite those advantages, performance is seriously impacted as redundancy control will have to
remain in communication with the two PostgreSQL servers. This can be problematic if there are
many users connected to one server, as the redundancy control component will constantly have
to relay information and runs the risk of taking too long to communicate back to the user,
becoming a bottleneck.
Our enhancement will also lead to an increase in cost, as more servers will have to be purchased.
Our redundancy control could be difficult to implement, as it completely changes how a process
is begun. In addition, all machines (for the PostgreSQL servers) will have to be equally
powerful to minimize hardware differences between servers.
Another possible problem arises: What if both servers crash? Since both servers will be run on
the same type of machines, both hardware and software crashes in one server may lead to the
same type of crash in the other server. Unfortunately, our enhancement cannot account for such a
situation; we can only recommend that both servers be powerful enough to handle the many
users connecting to a server at any given time.
While our proposal decreases the reliance on one PostgreSQL system, it increases its reliance on
the redundancy control component. This means that if the redundancy control component were to
crash, the whole system would go down with it.
There is also the possibility that a server may have sudden hardware problems and take a great
deal of time to complete a query. This can result in one server accumulating a stack of queries to
process while the other server completes the changes on time, and on read/write queries we want
to make sure that both databases are up to date. To eliminate such a problem, we have allowed
the redundancy control to kill the connection to that server if it takes too long to respond,
signaling that the server may have technical problems.
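This kill-on-timeout rule is simple to express. The sketch below is purely illustrative and simulates response times rather than measuring them; the names and the timeout value are our own assumptions.

```python
# Sketch of the timeout rule: a backend whose reply takes longer than the
# timeout is treated as failed and its connection is killed, so one slow
# server cannot stall the redundancy control.

TIMEOUT_SECONDS = 5.0

def collect_results(responses, timeout=TIMEOUT_SECONDS):
    """responses maps backend name -> (seconds_taken, result).
    Returns the usable results and the backends whose connection was killed."""
    results, killed = {}, []
    for name, (elapsed, result) in responses.items():
        if elapsed > timeout:
            killed.append(name)      # treat as a failed or hung backend
        else:
            results[name] = result
    return results, killed
```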
Software Architecture Analysis Method (SAAM)
Software Architecture Analysis Method (SAAM) was used to evaluate our two implementations
based upon the non-functional requirements of a software system. In the case of our
enhancement, we particularly aimed to enhance reliability and availability.
The following steps were taken in order to perform our SAAM analysis.
Stakeholders & Non-Functional Requirements
A stakeholder is anyone that affects or is affected by a project. It could be a person, group or an
organization. To perform an effective SAAM analysis, we need to point out the stakeholders of
PostgreSQL that our enhancement may either positively or negatively affect.
A non-functional requirement is a requirement that specifies standards that can be used to judge
the operation of a system. Standards such as Maintainability, Reliability and Availability can be
considered non-functional requirements. In the SAAM analysis, we point out the
non-functional requirements for each respective stakeholder.
PostgreSQL Developers
These are developers of the PostgreSQL software who are responsible for maintaining and
scaling of the PostgreSQL system. Thus, maintainability and scalability of the software are of
utmost importance to the developers; any changes made to PostgreSQL must leave the
system maintainable and scalable.
End Users of PostgreSQL
These are the users of PostgreSQL that could range from corporations to private use. We thought
that regardless of what type of user is using PostgreSQL, one major non-functional requirement
is reliability. Having a reliable database management system is of utmost importance to all types
of users, especially corporations, as a reliable system constitutes availability and security as well.
Performance and usability are key non-functional requirements as well.
CommitFest Reviewers
These are the reviewers who evaluate patches during the CommitFest period, which focuses on
patch review. These stakeholders require a given patch to be testable code; hence testability is
of importance to CommitFest reviewers.
Network Administrators of Backend Server
The network administrators are the people whose job is to maintain the backend servers. Thus,
reliability and scalability are of utmost importance to them.
Comparison of Implementations
Comparisons of both our implementations were made based upon the stakeholders’
non-functional requirements. Reliability and availability were given the highest priority.
1. Layered Architecture with a Redundancy Control Sub-system
- Maintainability: The system is still very maintainable, as all changes occur within the redundancy control sub-system.
- Scalability: Allows the system to scale out by creating further redundant servers.
- Reliability: Both servers are synchronously updated and are independent of each other, which maintains data integrity; in the case of a failover, the other working server immediately takes over, so reliability increases.
- Availability: As reliability has improved and system uptime has increased, this implementation increases availability.
- Performance: Performance was always a design trade-off we were willing to make; this implementation decreases the overall performance of the system because the new subsystem increases the number of tasks compared with PostgreSQL before this implementation.
- Usability: No change in usability.
- Security: The redundancy control adds measures of security to ensure data integrity is maintained between both servers.
- Testability: Testability of the system remains the same, as all changes occur within the redundancy control sub-system.
2. Back-Up Database Server
- Maintainability: Very little difference in maintainability, as the backup database server has very few components besides a database; however, the PostgreSQL server needs to be functional at all times.
- Scalability: The system can be scaled out and scaled up, as the backup database will accumulate more and more information and needs to exist separately on a very powerful system.
- Reliability: Not as reliable as the other implementation, as a failure during query processing means the whole system crashes; the second server can do nothing besides re-running the query all over again.
- Availability: No change to availability, as only the database can be accessed.
- Performance: Slower performance, as the system has to communicate with another server.
- Usability: Similar to the original PostgreSQL implementation.
- Security: Less secure, as the backup database can be accessed maliciously without the knowledge of the PostgreSQL system.
- Testability: Not very testable, as there is no way of ensuring that both databases are up to date concurrently.
Alternate Implementation
Our initial implementation idea, using a Back-Up Database server, requires only two servers: a
regular PostgreSQL system on one server and a back-up server that holds a copy of the
master database. When a query is requested by a user, it is sent to the regular Process Manager,
which starts the query on the regular PostgreSQL system and sends a record of the query to the
second server to keep note of what query the user requested. Once the query result is
ready to be stored, it is sent both to storage on the regular PostgreSQL system
and to the back-up server. This implementation ensures that there are always two
databases with up-to-date information available, and that if the regular PostgreSQL server were to
crash, the second server could respawn it with the same query, since the query is stored. The
disadvantage here is that the query would have to restart.
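Under these assumptions, the back-up design could be modelled as below. All names are invented for illustration, and a "query" is reduced to a single key/value write.

```python
# Sketch of the alternate design: one live server, one backup that records
# every requested query and mirrors committed data, so a crashed live server
# can be respawned and the in-flight query rerun from the recorded copy.

class BackupServer:
    def __init__(self):
        self.query_log = []   # record of what the live server was asked to run
        self.storage = {}     # mirror of committed data

    def note_query(self, query):
        self.query_log.append(query)

    def store(self, key, value):
        self.storage[key] = value

class LiveServer:
    def __init__(self, backup):
        self.backup = backup
        self.storage = {}

    def run(self, query, key, value):
        self.backup.note_query(query)    # record the query before running it
        self.storage[key] = value        # "process" the query
        self.backup.store(key, value)    # commit the result to both stores

def respawn_after_crash(backup):
    """Rebuild a live server from the backup; the interrupted query (the last
    one recorded) must be rerun from scratch -- the stated disadvantage."""
    fresh = LiveServer(backup)
    fresh.storage = dict(backup.storage)
    last_query = backup.query_log[-1] if backup.query_log else None
    return fresh, last_query
```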
Chosen Implementation Architectural Style and Design Patterns
Figure 2 – Depicts the highest-level
architectural style, which is layered.
Redundancy control consists of the first
layer, and the two backend servers,
Backend 1 and Backend 2, are in the
second layer.
To implement system redundancy, we have chosen to use a layered architecture at the most
abstract level. Each subsystem portrayed in Figure 2 runs on its own machine. The topmost layer
consists of the actual redundancy control that dictates how each of the back-ends is used. The
architecture of this level resembles the object-oriented style (see Figure 3). The lower layer
consists of the current PostgreSQL system in its entirety. Within this lower layer, an
object-oriented architectural style exists where all subsystems rely on each other in some form.
At the highest abstraction level, the current architecture for PostgreSQL is object-oriented,
differing from the layered style we feel fits the addition of redundancy. The change is
deceptive, however, as the architecture of the current PostgreSQL remains the same
(object-oriented) and is simply moved to the lower layer of the layered style; “Backend” simply
refers to a server that contains exactly what PostgreSQL currently has implemented (see Figure 4).
Figure 3– Depicts inner components within the
redundancy control layer and their dependencies.
These components model an implicit invocation style:
the backend manager waits for status updates from
the failure manager to determine whether both
backends are running or have lost a connection.
Figure 4– Depicts inner components of
each of the backends and their
dependencies. It is important to note that
both depend on the same redundancy
control subsystem. The backend maintains
an object-oriented architectural style, and
has the same functionality as the current
version of PostgreSQL.
Testing
In order to make sure that PostgreSQL does not become less effective after the integration of our
enhancement than in previous releases, regression testing would be required to ensure no
functionality is lost. Regression testing can be broken down into three main parts: functionality
tests, failure tests and operational tests. First of all, we’ll consider functionality tests. One test
type in this
group that would be carried out is Black Box Testing, a testing technique whereby the internal
workings of the system being tested are not known by the tester (1. “What is Black Box Testing”,
December 3, 2010). Running a test on a certain part of the system when a feature is added,
changed or extended, also is a part of the functionality tests. The next test group to consider is
failure tests. Failure tests help find failures by examining how past failures occurred, either
through the current system response or by searching documentation to see whether the failure
occurred in the past. Once the cause has been identified, it has to be documented for future
reference in case the system fails again for the same reason. Once all failure tests are passed, the
whole
system is re-tested. Finally, we will look into operational tests. These kinds of tests mainly catch
accidental or unintentional changes that could have occurred during the process of integrating
our enhancement into PostgreSQL. These tests ensure that existing and intended functionality
and behaviors are not lost from within the system. The three types of regression testing
described above should be sufficient to make sure the system is fully functional after the
integration of our enhancement into PostgreSQL.
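As a concrete illustration of the black-box style, the tester exercises only the query interface against recorded input/expected-output pairs, never the internals. The `run_query` function below is a toy stand-in for the system under test, not real PostgreSQL, and every name is our own.

```python
# Toy black-box regression check: run_query fakes the system under test, and
# regression_suite replays input/expected-output pairs recorded from the
# previous release, comparing only observable outputs.

def run_query(database, query):
    """Stand-in query interface: "GET key" reads a value from the database."""
    op, key = query.split()
    if op == "GET":
        return database.get(key)
    raise ValueError(f"unsupported query: {query}")

def regression_suite(system):
    """Replay recorded cases; the tester never inspects system internals."""
    db = {"name": "postgres", "version": "9.0"}
    cases = [("GET name", "postgres"),
             ("GET version", "9.0"),
             ("GET missing", None)]
    return all(system(db, q) == expected for q, expected in cases)
```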
Use Case 1
Use Case 1 – A data-flow trace of a query during which one of the backend servers (Server 1) crashes; the surviving server completes the query.
Use Case 1 exemplifies how PostgreSQL would respond to one of the servers crashing. Initially,
the user requests to log in to the redundancy control. Once the user is logged in, the redundancy
control creates two mirrored Postgres systems, and the library interface requests a server space
and a communication channel to be established for the user. Now that the user has a
communication channel, a query can be sent through redundancy control to each of the
library interfaces. Now each of the library interfaces will send off the received query to their
corresponding backends. Both processes are carried out in parallel. As the query is being
processed, server 1 experiences technical difficulties and crashes. Since server 2 is independent
of server 1, the query is carried out by only server 2’s backend. Once the query is executed, it is
then sent back to the redundancy control, which sends it off to the user.
Use Case 2
Use Case 2 – A data-flow trace of a query carried out without any crashes; both backend servers execute the query in parallel.
Use Case 2 exemplifies how PostgreSQL would carry out a query without any problems or
crashes occurring. Initially, the user requests to log in to the redundancy control. Once the user is
logged in, the interface requests a server space and a communication channel to be established
for the user. This server request now sets up two servers for the user to use. Now that the user
has a communication channel, a query is sent off to the redundancy control, which
in turn sends it off to both running library interfaces. These library interfaces send the query on to
their corresponding backends. Once the query is executed, it is then sent back to the
redundancy control. Since both servers execute the query at the same time, for a read-only use
case whichever server returns the result first is sent back to the user, but in a read/write
query, both results are written to disk.
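The read/write rule at the end of this use case can be sketched as follows; backend responses are simulated tuples, and all names are illustrative rather than taken from PostgreSQL.

```python
# Sketch of the response rule: for a read-only query the first backend to
# answer wins; for a read/write query both backends' results must be
# committed to disk before the user is answered.

def is_read_only(query):
    return query.strip().upper().startswith("SELECT")

def complete(query, responses):
    """responses: list of (arrival_order, backend_name, result),
    one entry per backend."""
    if is_read_only(query):
        _, winner, result = min(responses)   # earliest arrival wins
        return {"result": result, "served_by": winner, "committed_on": []}
    # read/write: every backend's result is written to disk
    committed = [name for _, name, _ in responses]
    return {"result": responses[0][2], "served_by": None,
            "committed_on": committed}
```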
Interactions with Other Features
As seen in “Use Case 1”, one of the created servers (server 1) crashes, but the query is still
executed by the other server. In the redundancy control, we have a component called the “Failure
Manager” that constantly keeps a communication path in order to find out if any connection is
disrupted with the servers. Since the first system does crash in use case 1, when the failure
manager checks on the server, it receives feedback from the first server saying that an
error has occurred and the first server has been shut down. That way the redundancy control will
not wait for a response from both servers, as one of them has already crashed. Once the query is
executed and returned to the redundancy control, and then to the user, a component in
the redundancy control named the “Backend Manager” will try to re-spawn the crashed server
with updated data from any queries run while the server was down. This is so both
servers are running again and ready for a new query to be sent by the user.
Lessons Learned and Limitations
Over the course of our research, we quickly learned that implementing change is rarely a win-win
scenario; to improve one aspect, another must be reduced. For example, introducing
redundancy to PostgreSQL through the methods described in this report will obviously allow
PostgreSQL to be more reliable. On the other hand, redundancy would require more physical
equipment, as there must be two identical PostgreSQL systems running on separate, identical
machines. This would incur an increased cost, and is the main tradeoff.
The main limitation faced was the lack of experience all group members had using database
management systems. It was difficult to pinpoint what a true enhancement would be, as we had
limited knowledge on actually using PostgreSQL.
Conclusion
The current architecture for PostgreSQL is object-oriented, and this remains unchanged in our
proposal, as the low layer (which is the current PostgreSQL system) keeps all current
functionality. The difference between PostgreSQL with our implementation and the current
PostgreSQL lies in the introduction of the redundancy control: the low-level architectural style
remains the same as in the original PostgreSQL system, while the top layer is a completely new
system being introduced.
Glossary
Backend: A synonym for one current PostgreSQL system on one server. It is important
to note that in the use cases we have separated the Library Interface from the backend only to
show that it is called exclusively at times; it is still technically part of the backend.
SQL: Structured Query Language
Redundancy: Having multiple instances of the same thing.
Failover: The capability to switch over automatically to a redundant or standby
server in the case of termination of the main server.
References
1. "What Is Black Box Testing? - A Word Definition From the Webopedia Computer
Dictionary." Webopedia: Online Computer Dictionary for Computer and Internet Terms and
Definitions. Web. 03 Dec. 2010.
<http://www.webopedia.com/TERM/B/Black_Box_Testing.html>.
2. "Deciding Between Redundancy and Clustering (Sun Java System Directory Server
Enterprise Edition 6.1 Deployment Planning Guide) - Sun Microsystems." Sun Microsystems
Documentation. Web. 04 Dec. 2010.
<http://docs.sun.com/app/docs/doc/820-0377/gcbtw?l=en&a=view>.
3. "PostgreSQL: Documentation: Manuals: PostgreSQL 9.0: High Availability, Load
Balancing, and Replication." PostgreSQL: The World's Most Advanced Open Source
Database. Web. 03 Dec. 2010.
<http://www.postgresql.org/docs/9.0/interactive/high-availability.html>.
4. "What Is Hot Standby? - A Word Definition From the Webopedia Computer Dictionary."
Webopedia: Online Computer Dictionary for Computer and Internet Terms and
Definitions. Web. 03 Dec. 2010.
<http://www.webopedia.com/TERM/H/hot_standby.html>.