Download Database Farming

Database Farming For Improved Performance Presented By: Russell Yong Supervisor: Prof Wentworth Problem at Hand  Database solution for a large corporation  Expensive software (Oracle Database Enterprise Edition +- US $40k)  Top-end hardware  Microsoft’s SQL Server 2000  Not same level of confidence Solution  Adapt the popular technique of backend server farming  Apply it to databases – to create a high performance database web service  Backend setup being invisible to the user Hypothesis  Technique will create a more cost effective database farm  Eradicate some problems associated with dealing with large databases Our Plan Standard 3-Tier Model Our Plan Adapted 3-Tier Model Conceptually Pool of Connections http request Web-server DataSet DataSet http request DataSet DataSet DataSet http request Clients Multiple Threads of Execution Web Service Farm of Databases DataSet Object  In-memory cache of data  Comparable to a mini-database  Multiple tables  Relationships  Constraints DataSet Object  Easily serialized into and back out of XML  Structure (tables, columns, etc) described in an XML schema  View and manipulate using either relational or XML methods (unified programming model)  Compatible with other XML speaking applications DataSet Object  Disconnected Model  Sub-queries fill individual datasets  Collector Object  Collect and merge individual sub-queries  Returned to the client Typed DataSet     Has an implicit schema Allows for more efficient filling Faster access Created via Form Designer, programmatically, or at run time via XSD XSD File <?xml version="1.0" encoding="utf-8"?> <xs:schema id="PingDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemasmicrosoft-com:xml-msdata"> <xs:element name="PingDataSet" msdata:IsDataSet="true"> <xs:complexType> <xs:choice maxOccurs="unbounded"> <xs:element name="PingInfo"> <xs:complexType> <xs:sequence> <xs:element name="ip" type="xs:string" /> <xs:element name="seen" type="xs:string" /> </xs:sequence> </xs:complexType> </xs:element> </xs:choice> </xs:complexType> <xs:unique name="PK_PingPrimaryKey" msdata:PrimaryKey="true"> <xs:selector xpath=".//PingInfo" /> <xs:field xpath="ip" /> <xs:field xpath="seen" /> </xs:unique> </xs:element> </xs:schema> Implications for Web-Applications     Resource sensitive approach “Bulk” approach to communication Access local cache Ideal for non-volatile data Implications for Web-Applications  Optimistic concurrency model       Most applications ? Improved performance (no locking) No persistent connection required (resources) Minimize required server resources Connections used more effectively Exceptions are dealt with accordingly Our Database  Excess of “10 Million Records”  Network traffic information  Partitioned in 10 segments  Initial difficulty  Distributed over 3 machines (SQL Server 2000)  Simulating a completely distributed environment Data Providers  SQL Server .NET Data Providers  SqlConnection  SqlDataAdapter  SqlCommand  OLE DB .NET Data Providers  ODBC .NET Data Providers (separate download) Data Providers Our Framework MyQueryHandler       Farming Layer An instance for each individual user query Distributor (spawns threads) Collector, merging DataSets as they return All encompassing DataSet Pluggable MyThreadHandler  Represents individual threads  Fills separate DataSets for each of the partitions in the farm  Returns DataSet to QueryHandler  Pluggable Specifying Queries  Couple queries hard-coded  Defined according to a parameter  Future Extensions… Tests and Results  Ran queries 100 times  Gauge mean  Filter out any possible influencing factors  Influencing factors  Network traffic  Active machines Testing and Results  Simple query  “SELECT * FROM ping WHERE (ip = 2464643887) OR (ip = 2464643464) OR (ip = '2464639301') OR (ip = '2464625293')”  Returning 11 853 rows  Farming Method  Averaged 35 seconds  Normal Method  Averaged 94 seconds Testing and Results 01:17.760 01:09.120 01:00.480 00:51.840 00:43.200 00:34.560 00:25.920 00:17.280 00:08.640 00:00.000 Normal Method Time to Query 1 12 23 34 45 56 67 78 89 100 Query Number Query Time Query Time Farming Method 02:18.240 02:00.960 01:43.680 01:26.400 01:09.120 00:51.840 00:34.560 00:17.280 00:00.000 Time to Query 1 12 23 34 45 56 67 78 89 100 Query Number “SELECT * FROM ping WHERE (ip = 2464643887) OR (ip = 2464643464) OR (ip = '2464639301') OR (ip = '2464625293')” Hypothesis  Technique will create a more cost effective database farm   Eradicate some problems associated with dealing with large databases  Possible Extensions  Full access to DB via HTTPS  Front-end  Query construction wizard  Investigate partitioning techniques  “Intelligent” querying Questions ?

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Database Farming