CISC 322
Assignment 3
Enhancement Proposal for PostgreSQL
December 04, 2010
S-Queue-L
Adam Abu Hijleh ([email protected])
Adam Ali ([email protected])
Khurrum Abdul Mujeeb ([email protected])
Stephen McDonald ([email protected])
Wissam Zaghal ([email protected])
Table of Contents
Abstract
Proposed Enhancement and Why It’s Beneficial
Current Implementation of Redundancy on PostgreSQL
Effects of Enhancement
Software Architecture Analysis Method (SAAM)
Alternate Implementation
Chosen Implementation Architectural Style and Design Patterns
Testing
Use Case 1
Use Case 2
Interactions with Other Features
Lessons Learned and Limitations
Conclusion
Glossary
References
Abstract
To enhance the PostgreSQL system, we propose adding a second back-end, such that there
are two identical back-ends running the exact same processes simultaneously. This report
touches on why we believe redundancy would be a beneficial add-on, how the add-on differs
from the present PostgreSQL implementation and an overall analysis of the effects of the
addition. A presentation of two possible implementations of our redundancy will be delineated,
and each will be taken through a SAAM analysis. Next, the new high and low level architectural
styles are discussed. Any components affected by the implementation are identified, and all new
interactions caused by redundancy with existing features will be mentioned. We then examine
the potential risks that redundancy comes with, followed by two diagrammed use cases that trace
common functions via data flow. The report concludes with a brief consideration of our lessons
learned and limitations, and a quick summation of all presented information.
Proposed Enhancement and Why It’s Beneficial
Although PostgreSQL’s system promotes maintainability, scalability and reliability, there is still
room for improvement. If an entire PostgreSQL server were to crash, the full process it was
working on would obviously be halted, and the user would have to wait for the server to restore
and replay the write-ahead log to return to the previous state. Thus we felt that an enhancement
was needed to ensure that there is a backup option in case the entire system were to crash.
One popular database feature used today is redundancy, which involves storing identical copies
of data and processes that are independent of each other (stored/performed on separate servers
that have no inter-communication). This can happen synchronously or asynchronously, and each
option has its advantages. Asynchronous data redundancy already exists in PostgreSQL and is
better known as data replication, a process whereby the master database adds new data and then
sends this newly added data to slave databases (2. “Deciding between Redundancy and
Clustering”, December 4, 2010). Synchronous data redundancy implies the use of two
independent databases that run the exact same processes, at the exact same time, on the exact
same data. In synchronous data redundancy, both databases store the exact same data as well.
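To make the distinction concrete, the two modes can be sketched in a few lines of Python. This is a toy model, not PostgreSQL code; every class and function name in it is our own invention.

```python
# Toy model of the two redundancy modes: a Replica stands in for a database
# server, and a list of (key, value) pairs stands in for the write-ahead log.

class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

def synchronous_write(replicas, key, value):
    """Synchronous redundancy: commit only after every replica applies it."""
    for r in replicas:
        r.apply(key, value)
    return "committed"

def asynchronous_write(master, pending_log, key, value):
    """Asynchronous redundancy (replication): commit once the master applies
    it; the change is only queued for the stand-by servers."""
    master.apply(key, value)
    pending_log.append((key, value))
    return "committed"

def ship_log(standbys, pending_log):
    """Later shipment of the queued changes to the stand-by servers."""
    while pending_log:
        key, value = pending_log.pop(0)
        for s in standbys:
            s.apply(key, value)
```

The crucial difference is the window after `asynchronous_write` returns but before `ship_log` runs: a standby promoted during that window has stale data, which is exactly the reliability gap this proposal targets.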
With the addition of a synchronous redundancy system, we propose that PostgreSQL should
utilize a layered architectural style, where the current version of PostgreSQL is the lowest layer,
and the topmost layer introduces a new component known as “Redundancy Control”. This
component, which is on its own machine, will take requests from users, duplicate them, and
spawn two separate and independent instances of PostgreSQL, each stored on a separate
machine, to perform the given task. Both PostgreSQL instances complete the same query and
have the same data. Should one of the PostgreSQL servers lose connection with the redundancy
control layer, the Failure Manager within redundancy control will pick up on the lost connection,
and know to use data from the other server. Once the connection becomes live again, the
Backend Manager will implement all changes on the newly live server that it missed while it was
down.
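The duplication, failure-detection and catch-up behaviour just described can be sketched as follows. This is an illustrative model of the proposed component only; every class and method name is hypothetical, not part of PostgreSQL.

```python
# Sketch of the proposed Redundancy Control: duplicate each request to two
# backends, serve the user from whichever backend answers, and replay missed
# queries on a backend once its connection comes back.

class Backend:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.executed = []

    def execute(self, query):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.executed.append(query)
        return f"result of {query}"

class RedundancyControl:
    def __init__(self, primary, secondary):
        self.backends = [primary, secondary]
        self.missed = {b.name: [] for b in self.backends}

    def submit(self, query):
        """Duplicate the query; answer the user from any backend that responds."""
        result = None
        for b in self.backends:
            try:
                result = b.execute(query)
            except ConnectionError:
                # Failure Manager role: note what the dead backend missed.
                self.missed[b.name].append(query)
        if result is None:
            raise RuntimeError("both backends are down")
        return result

    def recover(self, backend):
        """Backend Manager role: replay missed queries on a revived backend."""
        backend.alive = True
        for q in self.missed[backend.name]:
            backend.execute(q)
        self.missed[backend.name] = []
```

The two roles map onto the components named above: `submit` records failures the way the Failure Manager would, and `recover` performs the Backend Manager's catch-up replay.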
This feature would be beneficial to PostgreSQL in many ways. Firstly, it would greatly reduce the
likelihood of a total system outage, as a second database will always be up to date and running.
This would reassure users that, no matter what changes they make, at least one database will
remain up to date in all but the rarest cases. It also means that if one server crashes, the user
would not even notice, as the process they are performing would still be running perfectly on the
other server (assuming the crash is not due to reaching hardware limits).
Also, by placing the second database on a completely different system, we make it easier to
scale-up. By having two servers with the same information, we can take one down and upgrade
it while the other server performs all queries, and then update it when it becomes live again. This
is similar to the “Google Dance”, where one instance of Google can be taken down, and
changed, while other instances remain live, leaving the user unaffected.
Current Implementation of Redundancy on PostgreSQL
The latest version of PostgreSQL (Version 9.0) introduced a form of redundancy that was not
found in earlier versions of the software. This implementation works much like a Master-Slave
pattern, where the main server is the Master that regularly updates the stand-by servers (slaves)
via write-ahead logging. These stand-by servers are also given the ability to run read-only
queries, which, in previous versions, was only allowed once a stand-by server became the new
master (3. “Postgres Documentation: Manuals”, December 3, 2010).
This implementation of redundancy works asynchronously via a “Hot Standby” server method.
The “Hot Standby” server method is a backup server that receives regular updates and is
standing by ready to take the role of a master server immediately in the event of a failover. A
“Hot Standby” method is generally conceived as a synchronous method of running a master and
stand-by server (4. “What is hot standby?”, December 3, 2010), but is used asynchronously in
PostgreSQL due to the “Hot Standby” server being updated via the write-ahead log. The writeahead log contains all modifications that are to be applied to the main server before its actual
application on the server.
In our conceptual architecture, this implementation is performed via the Replication
component of the Process Manager subsystem (Figure 1). The Walsender and Walreceiver
sub-components are the ones that allow redundancy to take place amongst multiple servers.
The Walreceiver is the process on the stand-by server that is in charge of receiving the write-ahead
logs from the master server, whereas the Walsender is the process running on the master
server that sends write-ahead logs to the stand-by servers.
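For reference, the PostgreSQL 9.0 streaming-replication setup discussed here is driven by a handful of configuration parameters. A minimal sketch might look like the following; the host name and user are placeholders, not values from any real deployment:

```ini
# postgresql.conf on the master:
wal_level = hot_standby        # log enough WAL detail for a hot standby
max_wal_senders = 3            # allow walsender processes to serve standbys

# postgresql.conf on the stand-by:
hot_standby = on               # allow read-only queries during recovery

# recovery.conf on the stand-by:
standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432 user=replicator'
```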
The problem we found with this form of redundancy is that data replication between the
master and stand-by servers is done asynchronously. Although this implementation does not
hinder performance, it does affect the reliability and availability of the system. Reliability is
affected negatively, as some data integrity may be lost during a failover. Availability is affected
negatively for the very same reason, as there might be a period of downtime during a failover.
Figure 1 – The Replication Component of the Process Manager subsystem
Effects of Enhancement
The enhancement will affect PostgreSQL both positively and negatively. As mentioned before,
reliability, availability and scalability will increase as a second server running with the exact
same process and data is present, and should one server fail, the overall system will still be
available.
Our enhancement also affects the security of PostgreSQL in a very interesting way. Since our
enhancement is meant to ensure both servers constantly have the same database, any type of
malicious data corruption in one database will not mean a total loss of data since another
PostgreSQL server will contain up to date information on another server. However, if the
malicious data corruption comes through the normal input, it will affect both servers in the exact
same way.
Despite those advantages, performance is seriously impacted as redundancy control will have to
remain in communication with the two PostgreSQL servers. This can be problematic if there are
many users connected to one server, as the redundancy control component will constantly have
to relay information and runs the risk of taking too long to communicate back to the user,
becoming a bottleneck.
Our enhancement will also lead to an increase in cost, as more servers will have to be purchased.
Our redundancy control could be difficult to implement, as it completely changes how a process
is begun. In addition, all machines (for the PostgreSQL servers) will have to be equally
powerful to minimize hardware differences between servers.
Another possible problem arises: What if both servers crash? Since both servers will be run on
the same type of machines, both hardware and software crashes in one server may lead to the
same type of crash in the other server. Unfortunately, our enhancement cannot account for such a
situation; we can only recommend that both servers be powerful enough to handle the many
users connecting to a server at any given time.
While our proposal decreases the reliance on one PostgreSQL system, it increases its reliance on
the redundancy control component. This means that if the redundancy control component were to
crash, the whole system would go down with it.
There is also the possibility that a server may have sudden hardware problems and take a great
deal of time to complete a query. This can result in one server accumulating a stack of queries to
process while the other server completes the changes on time, and on read/write queries we want
to make sure that both databases are up to date. To eliminate such a problem, we have allowed
the redundancy control to kill the connection to that server if it takes too long to respond,
signaling that the server may have technical problems.
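This kill-on-timeout rule is simple to express. The sketch below is purely illustrative and simulates response times rather than measuring them; the names and the timeout value are our own assumptions.

```python
# Sketch of the timeout rule: a backend whose reply takes longer than the
# timeout is treated as failed and its connection is killed, so one slow
# server cannot stall the redundancy control.

TIMEOUT_SECONDS = 5.0

def collect_results(responses, timeout=TIMEOUT_SECONDS):
    """responses maps backend name -> (seconds_taken, result).
    Returns the usable results and the backends whose connection was killed."""
    results, killed = {}, []
    for name, (elapsed, result) in responses.items():
        if elapsed > timeout:
            killed.append(name)      # treat as a failed or hung backend
        else:
            results[name] = result
    return results, killed
```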
Software Architecture Analysis Method (SAAM)
Software Architecture Analysis Method (SAAM) was used to evaluate our two implementations
based upon the non-functional requirements of a software system. In the case of our
enhancement, we particularly aimed to enhance reliability and availability.
The following steps were taken in order to perform our SAAM analysis.
Stakeholders & Non-Functional Requirements
A stakeholder is anyone that affects or is affected by a project. It could be a person, group or an
organization. To perform an effective SAAM analysis, we need to point out the stakeholders of
PostgreSQL that our enhancement may either positively or negatively affect.
A non-functional requirement is a requirement that specifies standards that can be used to judge
the operation of a system. Standards such as Maintainability, Reliability and Availability can be
considered non-functional requirements. In the SAAM analysis, we point out the
non-functional requirements for each respective stakeholder.
PostgreSQL Developers
These are developers of the PostgreSQL software who are responsible for maintaining and
scaling of the PostgreSQL system. Thus, maintainability and scalability of the software are of
utmost importance to the developers; any changes made to PostgreSQL must leave the
system maintainable and scalable.
End Users of PostgreSQL
These are the users of PostgreSQL that could range from corporations to private use. We thought
that regardless of what type of user is using PostgreSQL, one major non-functional requirement
is reliability. Having a reliable database management system is of utmost importance to all types
of users, especially corporations, as a reliable system constitutes availability and security as well.
Performance and usability are key non-functional requirements as well.
CommitFest Reviewers
These are the reviewers who evaluate patches during the CommitFest period, which focuses on
patch review. These stakeholders require a given patch to be testable code; hence testability is
of importance to CommitFest reviewers.
Network Administrators of Backend Server
The network administrators are the people whose job is to maintain the backend servers. Thus,
reliability and scalability are of utmost importance to them.
Comparison of Implementations
Comparisons of both our implementations were made based upon the stakeholders’
non-functional requirements. Reliability and availability were given the highest priority.
1. Layered Architecture with a Redundancy Control Sub-system
- Maintainability: The system is still very maintainable, as all changes occur within the redundancy control sub-system.
- Scalability: Allows the system to scale out by creating further redundant servers.
- Reliability: Both servers are synchronously updated and are independent of each other, which maintains data integrity; in the case of a failover, the other working server immediately takes over, so reliability increases.
- Availability: As reliability has improved and system uptime has increased, this implementation increases availability.
- Performance: Performance was always a design trade-off we were willing to make; this implementation decreases the overall performance of the system because the new subsystem increases the number of tasks compared with PostgreSQL before this implementation.
- Usability: No change in usability.
- Security: The redundancy control adds measures of security to ensure data integrity is maintained between both servers.
- Testability: Testability of the system remains the same, as all changes occur within the redundancy control sub-system.
2. Back-Up Database Server
- Maintainability: Very little difference in maintainability, as the backup database server has very few components besides a database; however, the PostgreSQL server needs to be functional at all times.
- Scalability: The system can be scaled out and scaled up, as the backup database will accumulate more and more information and needs to exist separately on a very powerful system.
- Reliability: Not as reliable as the other implementation, as a failure during query processing means the whole system crashes; the second server can do nothing besides re-running the query all over again.
- Availability: No change to availability, as only the database can be accessed.
- Performance: Slower performance, as the system has to communicate with another server.
- Usability: Similar to the original PostgreSQL implementation.
- Security: Less secure, as the backup database can be accessed maliciously without the knowledge of the PostgreSQL system.
- Testability: Not very testable, as there is no way of ensuring that both databases are up to date concurrently.
Alternate Implementation
Our initial implementation idea, using a Back-Up Database server, requires only two servers: a
regular PostgreSQL system on one server and a back-up server that holds a copy of the
master database. When a query is requested by a user, it is sent to the regular Process Manager,
which starts the query on the regular PostgreSQL system and sends a record of the query to the
second server to keep note of what query the user requested. Once the query result is
ready to be stored, it is sent both to storage on the regular PostgreSQL system
and to the back-up server. This implementation ensures that there are always two
databases with up-to-date information available, and that if the regular PostgreSQL server were to
crash, the second server could respawn it with the same query, since the query is stored. The
disadvantage here is that the query would have to restart.
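Under these assumptions, the back-up design could be modelled as below. All names are invented for illustration, and a "query" is reduced to a single key/value write.

```python
# Sketch of the alternate design: one live server, one backup that records
# every requested query and mirrors committed data, so a crashed live server
# can be respawned and the in-flight query rerun from the recorded copy.

class BackupServer:
    def __init__(self):
        self.query_log = []   # record of what the live server was asked to run
        self.storage = {}     # mirror of committed data

    def note_query(self, query):
        self.query_log.append(query)

    def store(self, key, value):
        self.storage[key] = value

class LiveServer:
    def __init__(self, backup):
        self.backup = backup
        self.storage = {}

    def run(self, query, key, value):
        self.backup.note_query(query)    # record the query before running it
        self.storage[key] = value        # "process" the query
        self.backup.store(key, value)    # commit the result to both stores

def respawn_after_crash(backup):
    """Rebuild a live server from the backup; the interrupted query (the last
    one recorded) must be rerun from scratch -- the stated disadvantage."""
    fresh = LiveServer(backup)
    fresh.storage = dict(backup.storage)
    last_query = backup.query_log[-1] if backup.query_log else None
    return fresh, last_query
```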
Chosen Implementation Architectural Style and Design Patterns
Figure 2 – Depicts the highest-level
architectural style, which is layered.
Redundancy control consists of the first
layer, and the two backend servers,
Backend 1 and Backend 2, are in the
second layer.
To implement system redundancy, we have chosen to use a layered architecture at the most
abstract level. Each subsystem portrayed in Figure 2 runs on its own machine. The topmost layer
consists of the actual redundancy control that dictates how each of the back-ends is used. The
architecture of this level resembles the object-oriented style (see Figure 3). The lower layer
consists of the current PostgreSQL system in its entirety. Within this lower layer, an
object-oriented architectural style exists where all subsystems rely on each other in some form.
At the highest abstraction level, the current architecture for PostgreSQL is object-oriented,
differing from the layered style we feel fits the addition of redundancy. The change is
deceptive, however, as the architecture of the current PostgreSQL remains the same
(object-oriented) and is simply moved to the lower layer of the layered style; “Backend” simply
refers to a server that contains exactly what PostgreSQL currently has implemented (see Figure 4).
Figure 3– Depicts inner components within the
redundancy control layer and their dependencies.
These components model an implicit invocation style:
the backend manager waits for status updates from
the failure manager to determine whether both
backends are running or have lost a connection.
Figure 4– Depicts inner components of
each of the backends and their
dependencies. It is important to note that
both depend on the same redundancy
control subsystem. The backend maintains
an object-oriented architectural style, and
has the same functionality as the current
version of PostgreSQL.
Testing
In order to make sure that PostgreSQL does not become less effective after the integration of our
enhancement than in previous releases, regression testing would be required to ensure no
functionality is lost. Regression testing can be broken down into three main parts: functionality
tests, failure tests and operational tests. First of all, we’ll consider functionality tests. One test
type in this
group that would be carried out is Black Box Testing, a testing technique whereby the internal
workings of the system being tested are not known by the tester (1. “What is Black Box Testing”,
December 3, 2010). Running a test on a certain part of the system when a feature is added,
changed or extended, also is a part of the functionality tests. The next test group to consider is
failure tests. Failure tests help find failures by examining how past failures occurred, either
through the current system response or by searching documentation to see whether the failure
occurred in the past. Once the cause has been identified, it has to be documented for future
reference in case the system fails again for the same reason. Once all failure tests are passed, the
whole
system is re-tested. Finally, we will look into operational tests. These kinds of tests mainly catch
accidental or unintentional changes that could have occurred during the process of integrating
our enhancement into PostgreSQL. These tests ensure that existing and intended functionality
and behaviors are not lost from within the system. The three types of regression testing
described above should be sufficient to make sure the system is fully functional after the
integration of our enhancement into PostgreSQL.
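As a concrete illustration of the black-box style, the tester exercises only the query interface against recorded input/expected-output pairs, never the internals. The `run_query` function below is a toy stand-in for the system under test, not real PostgreSQL, and every name is our own.

```python
# Toy black-box regression check: run_query fakes the system under test, and
# regression_suite replays input/expected-output pairs recorded from the
# previous release, comparing only observable outputs.

def run_query(database, query):
    """Stand-in query interface: "GET key" reads a value from the database."""
    op, key = query.split()
    if op == "GET":
        return database.get(key)
    raise ValueError(f"unsupported query: {query}")

def regression_suite(system):
    """Replay recorded cases; the tester never inspects system internals."""
    db = {"name": "postgres", "version": "9.0"}
    cases = [("GET name", "postgres"),
             ("GET version", "9.0"),
             ("GET missing", None)]
    return all(system(db, q) == expected for q, expected in cases)
```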
Use Case 1
Use Case 1 – A data-flow trace of a query during which one of the backend servers (Server 1) crashes; the surviving server completes the query.
Use Case 1 exemplifies how PostgreSQL would respond to one of the servers crashing. Initially,
the user requests to log in to the redundancy control. Once the user is logged in, the redundancy
control creates two mirrored Postgres systems, and the library interface requests a server space
and a communication channel to be established for the user. Now that the user has a
communication channel, a query can be sent through redundancy control to each of the
library interfaces. Now each of the library interfaces will send off the received query to their
corresponding backends. Both processes are carried out in parallel. As the query is being
processed, server 1 experiences technical difficulties and crashes. Since server 2 is independent
of server 1, the query is carried out by only server 2’s backend. Once the query is executed, it is
then sent back to the redundancy control, which sends it off to the user.
Use Case 2
Use Case 2 – A data-flow trace of a query carried out without any crashes; both backend servers execute the query in parallel.
Use Case 2 exemplifies how PostgreSQL would carry out a query without any problems or
crashes occurring. Initially, the user requests to log in to the redundancy control. Once the user is
logged in, the interface requests a server space and a communication channel to be established
for the user. This server request now sets up two servers for the user to use. Now that the user
has a communication channel, a query is sent off to the redundancy control, which
in turn sends it off to both running library interfaces. These library interfaces send the query on to
their corresponding backends. Once the query is executed, it is then sent back to the
redundancy control. Since both servers execute the query at the same time, for a read-only use
case whichever server returns the result first is sent back to the user, but in a read/write
query, both results are written to disk.
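The read/write rule at the end of this use case can be sketched as follows; backend responses are simulated tuples, and all names are illustrative rather than taken from PostgreSQL.

```python
# Sketch of the response rule: for a read-only query the first backend to
# answer wins; for a read/write query both backends' results must be
# committed to disk before the user is answered.

def is_read_only(query):
    return query.strip().upper().startswith("SELECT")

def complete(query, responses):
    """responses: list of (arrival_order, backend_name, result),
    one entry per backend."""
    if is_read_only(query):
        _, winner, result = min(responses)   # earliest arrival wins
        return {"result": result, "served_by": winner, "committed_on": []}
    # read/write: every backend's result is written to disk
    committed = [name for _, name, _ in responses]
    return {"result": responses[0][2], "served_by": None,
            "committed_on": committed}
```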
Interactions with Other Features
As seen in “Use Case 1”, one of the created servers (server 1) crashes, but the query is still
executed by the other server. In the redundancy control, we have a component called the “Failure
Manager” that constantly keeps a communication path in order to find out if any connection is
disrupted with the servers. Since the first system does crash in use case 1, when the failure
manager checks on the server, it receives feedback from the first server saying that an
error has occurred and the first server has been shut down. That way the redundancy control will
not wait for a response from both servers, as one of them has already crashed. Once the query is
executed and returned to the redundancy control, and then to the user, a component in
the redundancy control named the “Backend Manager” will try to re-spawn the crashed server
with updated data from any queries run while the server was down. This is so both
servers are running again and ready for a new query to be sent by the user.
Lessons Learned and Limitations
Over the course of our research, we quickly learned that implementing change is rarely a win-win
scenario; to improve one aspect, another must be reduced. For example, introducing
redundancy to PostgreSQL through the methods described in this report will obviously allow
PostgreSQL to be more reliable. On the other hand, redundancy would require more physical
equipment, as there must be two identical PostgreSQL systems running on separate, identical
machines. This would incur an increased cost, and is the main tradeoff.
The main limitation faced was the lack of experience all group members had using database
management systems. It was difficult to pinpoint what a true enhancement would be, as we had
limited knowledge on actually using PostgreSQL.
Conclusion
The current architecture for PostgreSQL is object-oriented, and this remains unchanged in our
proposal, as the low layer (which is the current PostgreSQL system) keeps all current
functionality. The difference between PostgreSQL with our implementation and the current
PostgreSQL lies in the introduction of the redundancy control: the low-level architectural style
remains the same as in the original PostgreSQL system, while the top layer is a completely new
system being introduced.
Glossary
Backend: A synonym for one current PostgreSQL system on one server. It is important
to note that in the use cases we have separated the Library Interface from the backend only to
show that it is called exclusively at times; it is still technically part of the backend.
SQL: Structured Query Language
Redundancy: Having multiple instances of the same thing.
Failover: The capability to switch over automatically to a redundant or standby
server in the case of termination of the main server.
References
1. "What Is Black Box Testing? - A Word Definition From the Webopedia Computer
Dictionary." Webopedia: Online Computer Dictionary for Computer and Internet Terms and
Definitions. Web. 03 Dec. 2010.
<http://www.webopedia.com/TERM/B/Black_Box_Testing.html>.
2. "Deciding Between Redundancy and Clustering (Sun Java System Directory Server
Enterprise Edition 6.1 Deployment Planning Guide) - Sun Microsystems." Sun Microsystems
Documentation. Web. 04 Dec. 2010.
<http://docs.sun.com/app/docs/doc/820-0377/gcbtw?l=en&a=view>.
3. "PostgreSQL: Documentation: Manuals: PostgreSQL 9.0: High Availability, Load
Balancing, and Replication." PostgreSQL: The World's Most Advanced Open Source
Database. Web. 03 Dec. 2010.
<http://www.postgresql.org/docs/9.0/interactive/high-availability.html>.
4. "What Is Hot Standby? - A Word Definition From the Webopedia Computer Dictionary."
Webopedia: Online Computer Dictionary for Computer and Internet Terms and
Definitions. Web. 03 Dec. 2010.
<http://www.webopedia.com/TERM/H/hot_standby.html>.