* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Database Farming
Oracle Database wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Ingres (database) wikipedia , lookup
Microsoft Access wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Concurrency control wikipedia , lookup
Functional Database Model wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Clusterpoint wikipedia , lookup
Relational model wikipedia , lookup
Database Farming For Improved Performance Presented By: Russell Yong Supervisor: Prof Wentworth Problem at Hand Database solution for a large corporation Expensive software (Oracle Database Enterprise Edition +- US $40k) Top-end hardware Microsoft’s SQL Server 2000 Not same level of confidence Solution Adapt the popular technique of backend server farming Apply it to databases – to create a high performance database web service Backend setup being invisible to the user Hypothesis Technique will create a more cost effective database farm Eradicate some problems associated with dealing with large databases Our Plan Standard 3-Tier Model Our Plan Adapted 3-Tier Model Conceptually Pool of Connections http request Web-server DataSet DataSet http request DataSet DataSet DataSet http request Clients Multiple Threads of Execution Web Service Farm of Databases DataSet Object In-memory cache of data Comparable to a mini-database Multiple tables Relationships Constraints DataSet Object Easily serialized into and back out of XML Structure (tables, columns, etc) described in an XML schema View and manipulate using either relational or XML methods (unified programming model) Compatible with other XML speaking applications DataSet Object Disconnected Model Sub-queries fill individual datasets Collector Object Collect and merge individual sub-queries Returned to the client Typed DataSet Has an implicit schema Allows for more efficient filling Faster access Created via Form Designer, programmatically, or at run time via XSD XSD File <?xml version="1.0" encoding="utf-8"?> <xs:schema id="PingDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemasmicrosoft-com:xml-msdata"> <xs:element name="PingDataSet" msdata:IsDataSet="true"> <xs:complexType> <xs:choice maxOccurs="unbounded"> <xs:element name="PingInfo"> <xs:complexType> <xs:sequence> <xs:element name="ip" type="xs:string" /> <xs:element name="seen" type="xs:string" /> </xs:sequence> </xs:complexType> </xs:element> </xs:choice> </xs:complexType> <xs:unique name="PK_PingPrimaryKey" msdata:PrimaryKey="true"> <xs:selector xpath=".//PingInfo" /> <xs:field xpath="ip" /> <xs:field xpath="seen" /> </xs:unique> </xs:element> </xs:schema> Implications for Web-Applications Resource sensitive approach “Bulk” approach to communication Access local cache Ideal for non-volatile data Implications for Web-Applications Optimistic concurrency model Most applications ? Improved performance (no locking) No persistent connection required (resources) Minimize required server resources Connections used more effectively Exceptions are dealt with accordingly Our Database Excess of “10 Million Records” Network traffic information Partitioned in 10 segments Initial difficulty Distributed over 3 machines (SQL Server 2000) Simulating a completely distributed environment Data Providers SQL Server .NET Data Providers SqlConnection SqlDataAdapter SqlCommand OLE DB .NET Data Providers ODBC .NET Data Providers (separate download) Data Providers Our Framework MyQueryHandler Farming Layer An instance for each individual user query Distributor (spawns threads) Collector, merging DataSets as they return All encompassing DataSet Pluggable MyThreadHandler Represents individual threads Fills separate DataSets for each of the partitions in the farm Returns DataSet to QueryHandler Pluggable Specifying Queries Couple queries hard-coded Defined according to a parameter Future Extensions… Tests and Results Ran queries 100 times Gauge mean Filter out any possible influencing factors Influencing factors Network traffic Active machines Testing and Results Simple query “SELECT * FROM ping WHERE (ip = 2464643887) OR (ip = 2464643464) OR (ip = '2464639301') OR (ip = '2464625293')” Returning 11 853 rows Farming Method Averaged 35 seconds Normal Method Averaged 94 seconds Testing and Results 01:17.760 01:09.120 01:00.480 00:51.840 00:43.200 00:34.560 00:25.920 00:17.280 00:08.640 00:00.000 Normal Method Time to Query 1 12 23 34 45 56 67 78 89 100 Query Number Query Time Query Time Farming Method 02:18.240 02:00.960 01:43.680 01:26.400 01:09.120 00:51.840 00:34.560 00:17.280 00:00.000 Time to Query 1 12 23 34 45 56 67 78 89 100 Query Number “SELECT * FROM ping WHERE (ip = 2464643887) OR (ip = 2464643464) OR (ip = '2464639301') OR (ip = '2464625293')” Hypothesis Technique will create a more cost effective database farm Eradicate some problems associated with dealing with large databases Possible Extensions Full access to DB via HTTPS Front-end Query construction wizard Investigate partitioning techniques “Intelligent” querying Questions ?