Data consistency for Applications using FroNtier/Squid
Luis Ramos, CERN
3D Meeting, January 2006

Agenda
  1. Frontier Basics
  2. Cache consistency issues with Frontier/Squid
  3. Inconsistency Scenarios
  4. Application Restrictions
  5. Summary
  6. Conclusions
  Appendix: Invalidation Mechanism

Frontier Basics
  The Frontier servlet generates query results as XML documents from database queries submitted by clients.
  Frontier Client is a C/C++ API for sending requests to the Frontier servlet.
  FrontierAccess (the Frontier POOL “plug-in”) uses Frontier Client to access Frontier servlets.
  In this context, Frontier is a web-based approach for generic DB access.

Example - Frontier servlet (HTTP)
  QUERY: http://pcitdb03.cern.ch:8080/Frontier/Frontier?type=frontier_request:1:DEFAULT&encoding=BLOB&p1=...(SQL query encoded in base64)
  REPLY: [XML document shown on the slide; not reproduced in this transcript]

Squid
  Squid cache servers are placed between the clients and the Frontier servlet.
  Squid caches query results (XML documents) and serves them to clients that ask for exactly the same query.

Cache Consistency Problem
  Squid caches database query results for a fixed time (HTTP TimeToLive) set by the Frontier server (7 days). This is time-based cache invalidation.
  When the backend database changes, Squid keeps serving stale data to clients until the TTL expires.
  If tables are created in the database, new queries will refer to them and their results will not be in the cache because the tables are new: no problem.
  If tables are dropped, the cached results will be wrong.
  BUT, if inserts or updates are made in existing tables, the cached data in the Squids becomes stale!
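The staleness problem described above can be reproduced with a toy model. The sketch below is not Squid or Frontier code: `TtlCache`, the `tab1` list, and `query` are hypothetical stand-ins, and the only values taken from the slides are the 7-day time-to-live and the exact-match lookup on the query string.

```python
from dataclasses import dataclass, field

SEVEN_DAYS = 7 * 24 * 3600  # TTL quoted on the slides, in seconds


@dataclass
class TtlCache:
    """Toy stand-in for Squid: exact-match keys and a fixed time-to-live."""
    ttl: float
    entries: dict = field(default_factory=dict)  # key -> (stored_at, value)

    def get(self, key, now, fetch):
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]           # cache hit: the backend is not consulted
        value = fetch()             # miss or expired entry: ask the backend
        self.entries[key] = (now, value)
        return value


tab1 = [1, 2]                       # hypothetical backend table


def query():
    """Plays the role of 'select * from tab1' against the backend."""
    return list(tab1)


cache = TtlCache(ttl=SEVEN_DAYS)
first = cache.get("select * from tab1", now=0, fetch=query)
tab1.append(3)                      # insert into tab1 values (3)
stale = cache.get("select * from tab1", now=3600, fetch=query)
fresh = cache.get("select * from tab1", now=SEVEN_DAYS + 1, fetch=query)
```

One hour after the insert, `stale` still holds the rows cached at time 0. The new row only shows up in `fresh`, once the entry has outlived its TTL.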
Scenario: CREATE TABLE - OK
  Cached query:    select * from tab1, tab2 where …
  Database change: create table tab3 (…);
  New query:       select * from tab1, tab3 where …
  Query not cached, OK

Scenario: DROP TABLE - KO
  Cached query:    select * from tab1, tab2 where …
  Database change: drop table tab1 (…);
  New query:       select * from tab1, tab2 where …
  If query is cached, KO: wrong result

Scenario: INSERT - KO
  Cached query:    select * from tab1, tab2 where …
  Database change: insert into tab1 values (…);
  New query:       select * from tab1, tab2 where …
  If query is cached, KO: stale data

Scenario: UPDATE - KO
  Cached query:    select * from tab1, tab2 where …
  Database change: update tab1 set … where …;
  New query:       select * from tab1, tab2 where …
  If query is cached, KO: stale data

Scenario: new object - OK
  Cached query:    select * from objs, attribs where objs.ID = attribs.OBJ_ID and objs.ID = X
  Database change: insert into objs values (Y, …);
                   insert into attribs values (.., Y, …);
  New query:       select * from objs, attribs where objs.ID = attribs.OBJ_ID and objs.ID = Y
  Query for object Y is not cached, OK
  For queries on IDs of static objects, a static cache is OK

Scenario: new attribute - KO
  Cached query:    select * from objs, attribs where objs.ID = attribs.OBJ_ID and objs.ID = X
  Database change: insert into attribs values (.., X, …);
  New query:       select * from objs, attribs where objs.ID = attribs.OBJ_ID and objs.ID = X
  Query for object X might be cached, KO
  For queries on IDs of non-static objects, a static cache is KO

Restrictions with static cache
  Table drops can lead to wrong query results.
  Data updates can lead to wrong query results.
  Inserts need special care:
    ID-based queries are OK.
    Otherwise, KO: when inserting into “attribs”, a forced refresh is needed at user application level for queries over “objs”.
  Will user applications respect these restrictions?

Present Status - problem
  The POOL Frontier plug-in has two types of queries: DB dictionary data and user data.
  To avoid stale cached data, the plug-in does a client-side cache refresh for metadata queries.
  Stale data in the cache may still appear in user data queries.

Invalidation Mechanism
  Build a cache content invalidation mechanism over Squid/Frontier/OracleDB: a way to invalidate cached query results when the respective tables are changed.
  The basic steps of the invalidation mechanism are:
    Detect database changes
    Detect which cache content is stale
    Send invalidation messages to Squids
    Purge cached content in Squids

Conclusions
  Frontier alone does not guarantee data consistency.
  Applications must follow a set of rules to keep data consistent (see “Restrictions with static cache”).
  An invalidation mechanism could be developed; some ideas follow in the appendix.

Appendix - Invalidation Steps
  1. Database changes detection
  2. Stale cached queries detection
  3. Invalidation propagation to Squids
  4. Purge cached content in Squids

1. Database changes detection
  Options:
    Database triggers
      Data manipulation triggers (DML operations) can only be set up at table level (not at database or schema level).
    View ALL_TAB_MODIFICATIONS
      This view is updated off-line, with up to 3 hours delay between a table update and its registration in all_tab_modifications.
    Database auditing
      AUDIT INSERT TABLE, UPDATE TABLE, DELETE TABLE BY ACCESS WHENEVER SUCCESSFUL;
    Oracle LogMiner
      More information available and less performance overhead than auditing.
      Not as simple as DB auditing, and implies setup time overhead.

1. Database changes detection (continued)
  Database auditing
    Simple to configure.
    Trigger over the table sys.aud$.
    The trigger fires a stored procedure to start the invalidation procedure.

2. Stale cached queries detection
  How to find the pages to invalidate in the Squids, given the name of a modified table?
  A mapping between tables and queries is needed.
    Frontier servlet query strings could be modified to ease this mapping.
    Whenever there is a query to the servlet, it must store the query and the tables somewhere.
    When a table is modified, all queries involving that table are invalidated.
  Danger of invalidating objects that are still valid (over-invalidation).
  The invalidation procedure can be tricky (invalidation rules).

2. Stale cached queries detection (continued)
  Logging of queries, clients and tables affected.
  Two logging options:
    A log module in the Frontier servlet (as a servlet wrapper)
    OR
    Some script running over the Apache logs

3. Invalidation propagation to Squids
  Once we have a list of queries to invalidate, we need to know:
    Which caches requested the query?
      Easy to register, except with hierarchical caches.
    Where are those caches?
      Caches must be registered in the server.
      The cache hierarchy (topology) must also be registered.

4. Purge cached content in Squids
  Two options:
    Purge HTTP command
      One object at a time.
    Squid purge tool
      Regular expressions for purging multiple objects with one command.
  Performance tests could be done.

Questions
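The table-to-query mapping from appendix step 2 can be sketched in a few lines. Everything here is an assumption made for illustration: `tables_in`, `QueryRegistry`, and the regular expression are hypothetical, the parser only understands a single comma-separated FROM clause, and nothing is actually sent to a Squid. The returned set is simply the list of cached queries that would then have to be purged.

```python
import re
from collections import defaultdict

# Naive pattern: captures the text between FROM and WHERE (or end of statement).
FROM_CLAUSE = re.compile(r"\bfrom\s+(.+?)(?:\s+where\b|;|$)",
                         re.IGNORECASE | re.DOTALL)


def tables_in(sql):
    """Return the lower-cased table names of a simple single-FROM query."""
    match = FROM_CLAUSE.search(sql)
    if match is None:
        return set()
    parts = match.group(1).split(",")
    return {part.split()[0].lower() for part in parts if part.strip()}


class QueryRegistry:
    """Hypothetical store the servlet could fill on every incoming query."""

    def __init__(self):
        self.by_table = defaultdict(set)  # table name -> recorded query strings

    def record(self, sql):
        for table in tables_in(sql):
            self.by_table[table].add(sql)

    def invalidate(self, table):
        """Return, and forget, every recorded query that reads `table`."""
        stale = self.by_table.pop(table.lower(), set())
        for queries in self.by_table.values():
            queries.difference_update(stale)
        return stale


registry = QueryRegistry()
q_x = "select * from objs, attribs where objs.ID = attribs.OBJ_ID and objs.ID = X"
q_y = "select * from objs, attribs where objs.ID = attribs.OBJ_ID and objs.ID = Y"
q_t = "select * from tab1, tab2 where tab1.id = tab2.id"  # made-up join condition
for q in (q_x, q_y, q_t):
    registry.record(q)

to_purge = registry.invalidate("attribs")  # e.g. after an insert for object X
```

Here `to_purge` contains both `q_x` and `q_y`, although only object X changed. This is the over-invalidation risk the slides mention: a table-level mapping cannot tell which rows a cached query depends on.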