Services: White Paper: Implementing a Taxonomy

Implementing a Taxonomy
A Comparison of Database Approaches
Vignette Content Management Blueprint
White Paper. December 2002. v.1.0

Table of Contents
Introduction
Taxonomy Approaches
    String-Based Approach
    Dimension-Based Approach
Data Approaches
    Editorial
    Transactional
Schema Approaches
    Single-Table Approach
    Single-Table Example Schemas
    Multi-Table Approach
    Multi-Table Example Schemas
Conclusion
Appendix A. Detailed Database Schema

Copyright 2002 Vignette Corporation. All rights reserved. U.S. Patent Pending. This document is confidential, and is an unpublished work and trade secret of Vignette. This document is for internal use only and may not be distributed to third parties.
Vignette and the V Logo are trademarks or registered trademarks of the Vignette Corporation in the United States and other countries. All other company, product, and service names and brands are the trademarks or registered trademarks of their respective owners.

Introduction

The concept of a content classification taxonomy is crucial to the proper implementation and maintenance of a content management solution. A content taxonomy provides order to a large volume of content, and allows business users to navigate and manage content more efficiently. This paper addresses two basic approaches for providing taxonomy support in the content management application, and explains how an implementation should be performed. The Appendix provides a sample detailed database schema.

This paper assumes that readers have read “Designing an Integrated Content Management Solution: A Taxonomy-Based Approach” and are familiar with various taxonomy descriptions and terminology.

Taxonomy Approaches

There are basically two types of taxonomies. The first is a string-based taxonomy. In a string-based approach, taxonomy nodes are represented by a string that designates the position of a category in the hierarchy. The second type is a dimension-based approach that allows the categories in the taxonomy to be broken down into discrete areas. A dimension-based approach allows more flexibility in placing content into various nodes represented by the dimension to which they belong. Dimensions are most commonly utilized in a navigation taxonomy to present multiple views of the same content. For example, dimensions in a retail taxonomy may be product type (sweaters, shoes, etc.), gender (male, female), or type (casual, evening, special, etc.). Content items can be classified using nodes in one or all of these dimensions.

String-Based Approach

In a string-based approach, nodes are formed by a separate string that defines the node and its position in the hierarchy. For example, “product > sweaters > women’s > casual”. One advantage of this approach is that it allows for a linear view of the content categories. It also provides flexibility in adding new nodes and ensures that new nodes can be added without affecting other taxonomy constraints. A string-based approach also provides a clearer view of the taxonomy when performing searches.

However, there are some drawbacks to the string-based approach. First, this approach forces the taxonomy to be linear by design, and although this helps build in structure, it limits the robustness of the taxonomy. Second, utilizing a string to represent the taxonomy makes it difficult to provide multiple tagging capabilities for specific pieces of content. Lastly, the string-based approach typically does not enforce referential integrity and thus makes it more difficult to maintain the taxonomy.

Dimension-Based Approach

A dimension-based taxonomy approach provides more flexibility in tagging to categories in specific dimensions. It also enforces a more structured approach to the hierarchy of the taxonomy. Dimensions also typically build in referential integrity, which aids in the maintenance of the taxonomy.

Some of the disadvantages of this approach involve performance and tagging constraints. Performance may become an issue if items are tagged across several dimensions, because retrieving them requires SQL joins across multiple tables and complicated queries. Tagging may also become an issue if several dimensions are built into the taxonomy. If a content producer must tag content for multiple dimensions, this approach increases the content tagging work significantly.

Data Approaches

Another key area to consider when deciding how to implement the database design to support a taxonomy is the approach used for storing the data that represents the taxonomy. There are two basic types of data approaches: an editorial approach, in which data is viewed as mere content that gets published and pushed out to the Web, and a transactional approach, in which data is created by transactions.

Editorial

Editorial data is the easiest of all to deal with in terms of entering, storing, and displaying because it does not require extensive overhead to ensure transactional compliance. An editorial approach typically aligns well with a generic view of data. Editorial data can use a single-table approach, since a single table can view all the content as basically having the same behavior, with the only difference being the taxonomy category to which the content items are tagged.

Transactional

Transactional data involves a more complex data model because entities behave differently and have specific relationships to other data entities. The transactional approach favors a multi-table view of the data because it creates a need for transactional overhead on specific tables and a need to create one-to-many views of the data. The data is differentiated by the entities in which it lies and forms a foundation for securing and maintaining transactional compliance.

Schema Approaches

Once the taxonomy and data approaches have been selected, it is much easier to decide which schema approach to take. The following lists the various approaches and their affinity to the various schema approaches:

■ String-based taxonomy with Editorial Data approach favors MULTI-TABLE
■ Dimension-based taxonomy with Editorial Data approach favors SINGLE-TABLE
■ String-based taxonomy with Transactional Data approach favors MULTI-TABLE
■ Dimension-based taxonomy with Transactional Data approach favors MULTI-TABLE

Single-Table Approach

The single-table schema approach utilizes a single table to hold content and typically provides a single table to manage the taxonomy. There are several advantages to utilizing this approach. First, it provides an easy and flexible mechanism for publishing content. Second, it allows greater flexibility for use of taxonomy dimensions. Lastly, it provides faster time to deployment.

The disadvantages of this approach include scalability, performance, and documentation issues. Scalability issues may arise if a certain threshold is reached within the content table, depending on the specific DBMS used. Performance issues may also be a concern depending upon how the data is queried and retrieved from the database. Documentation can also be an issue if a developer needs to understand how to retrieve specific content items.

Single-Table Example Schemas. The schemas below show examples of the single-table approach.

String-Based Editorial

CONTENT_TYPE: TYPE_ID (PK), ENABLED, DESCRIPTION, TEMPLATE_PATH
NAVIGATION_TAXONOMY: NAV_ID (PK), NAV_STRING
CONTENT: CONTENT_ID (PK), TITLE, TEASER, BODY, IMAGE, CONT_ID (FK1), TYPE_ID (FK2)
NAV_MAP: NAV_ID (PK, FK1), CONTENT_ID (PK, FK2)
CONTENT_TAXONOMY: CONT_ID (PK), CONTENT_STRING

This is a very simplistic view of how a string-based taxonomy can be implemented in the database schema. The Navigation and Content taxonomies are kept separate. A content item can be assigned to multiple navigation taxonomy strings, whereas it can only be associated with one content taxonomy string. This approach ensures that content items are only associated with one node within the content taxonomy, which creates a more structured view of the taxonomy for managing the content.

Dimension-Based Editorial

In this example, dimensions located within the NAVIGATION_TAXONOMY and CONTENT_TAXONOMY tables represent the taxonomy. The dimensions are represented by the recursive relationship that allows a node to be a parent of nodes below it and to have parent nodes above it. This approach allows for a more structured view of the taxonomy and the various dimensions that make up the taxonomy.

CONTENT_TYPE: TYPE_ID (PK), ENABLED, DESCRIPTION, TEMPLATE_PATH
NAVIGATION_TAXONOMY: NAV_ID (PK), PARENT_ID, NODE
CONTENT: CONTENT_ID (PK), TITLE, TEASER, BODY, IMAGE, CONT_ID (FK1), TYPE_ID (FK2)
NAV_MAP: NAV_ID (PK, FK1), CONTENT_ID (PK, FK2)
CONTENT_TAXONOMY: CONT_ID (PK), PARENT_ID, NODE

Multi-Table Approach

Using multiple tables is the classical data model approach and is one that will maintain transactional compliance for data exchange and storage. This approach is generally taken when a client wishes to utilize the database beyond mere content management. Most DBAs will also prefer this approach because it allows them to maintain better control over performance and maintenance of the data model.

Multi-Table Example Schemas. Below are some examples of how the multi-table schema might be approached. Refer to the schema in Appendix A for further details.

String – Multi-Table

COMPANY_INFO: COMPANY_ID (PK), NAME, DESCRIPTION, CONTENT_TAXID (FK1)
CONTENT_TAXONOMY: CONTENT_TAXID (PK), TAX_STRING
PRODUCT: PRODUCT_ID (PK), TITLE, TEASER, BODY, IMAGE, PRICE, CONTENT_TAXID (FK1)
SERVICE: SERVICE_ID (PK), TITLE, TEASER, BODY, PRICE, CONTENT_TAXID (FK1)
NAV_TAX_MAP: PRODUCT_ID (PK, FK1), NAV_ID (PK, FK2), SERVICE_ID (FK3), COMPANY_ID (FK4)
NAVIGATION_TAXONOMY: NAV_ID (PK), NAV_STRING
ORDER: ORDER_ID (PK), DATE_ORDERED, STATUS, DATE_SHIPPED, COMPANY_ID (FK1)
ORDER_DETAIL: PRODUCT_ID (FK1), ORDER_ID (FK2)

In this data model, products, services and companies are all separate entities. This separation is done for transactional purposes to maintain integrity on the information that is being stored, but there is content in these tables which should also be viewed on the Web Site. In order to display the content on the Web Site, the tables should be associated with the navigation taxonomy nodes, and with the content taxonomy nodes for backend content management tagging. This model uses a string-based approach in order to place new products into distinctive nodes without relying on a structured taxonomy.

The example below shows how a transactional schema might look for a dimension-based approach. Product, Service and Company Info all relate to the Content Taxonomy and provide organization for content around these core components of the schema. The navigation taxonomy is used to specify the contents of specific views of product and services data as it applies to company info.

Dimension – Multi-Table

COMPANY_INFO: COMPANY_ID (PK), NAME, DESCRIPTION
CONTENT_TAXONOMY: CONT_ID (PK), PARENT_ID, NODE
PRODUCT: PRODUCT_ID (PK), TITLE, TEASER, BODY, IMAGE, PRICE
SERVICE: SERVICE_ID (PK), TITLE, TEASER, BODY, PRICE
NAV_TAX_MAP: PRODUCT_ID (PK, FK1), SERVICE_ID (FK2), COMPANY_ID (FK3)
NAVIGATION_TAXONOMY: NAV_ID (PK), PARENT_ID, NODE
ORDER: ORDER_ID (PK), DATE_ORDERED, STATUS, DATE_SHIPPED, COMPANY_ID (FK1)
ORDER_DETAIL: PRODUCT_ID (FK1), ORDER_ID (FK2)

Conclusion

This paper has presented several approaches to implementing a taxonomy through database design, addressed the pros and cons of the approaches, and provided examples of database schema models. The Appendix has a more detailed implementation database schema that provides a greater level of detail and real-world requirements.
Appendix A. Detailed Database Schema

Dimension-Based Editorial

Channel: channel_id, sequence, parent_channel_id (FK) (IE), item_name, workflow_status, workflow_oid
Channel_Item_Map: channel_item_id, publish_date (IE), expire_date (IE), sequence, item_id (FK) (IE), channel_id (FK) (IE), relationship_modifier (FK) (IE), priority (FK) (IE), item_name, workflow_status, workflow_oid
Channel_Display: channel_title (IE), channel_description, channel_id (FK) (IE), channel_template_id (FK) (IE), display_modifier (FK) (IE)
Channel_Template: channel_template_id, template_path (IE), template_name
Provider_Channel_Map: channel_id (FK) (IE), provider_channel_modifier (FK) (IE), content_provider_id (FK) (IE)
Content_Provider: content_provider_id, provider_name, editable, contact, address, phone_number, language_id (FK) (IE), encoding_id (FK), comments, deleted
Item: item_id, audit_user, audit_date, item_status (IE), item_type_id (FK) (IE), item_modifier (FK) (IE), content_provider_id (FK) (IE)
Item_Type: item_type_id, item_type_name, item_type_description
Item_Item_Map: parent_item_id (FK), child_item_id (FK) (IE), relationship_type (FK) (IE), sequence
Item_Attribute_Usage: item_type_id (FK), attribute_id (FK) (IE), attribute_usage_modifier (FK) (IE), allow_multiple, sequence
Attribute_Type: attribute_id, attribute_name, attribute_description, attribute_data_type, uses_value_short
Attribute_Value: item_id (FK), attribute_id (FK) (IE), sequence, value_short, value_long
Lookup: lookup_id, type_description
Sub_Lookup: sub_lookup_id, subtype_description, lookup_id (FK) (IE)
Workflow_Audit: table_name, table_column, table_id, audit_user, audit_date, action (FK) (IE)
Country: country_id, country_name
Province: province_id, country_id, province_name
User_Weather: city_id, user_id
Stock_Data: ticker_symbol, exchange, company_name, current_price, currency_type, volume, bid_price, ask_price, net_change, percent_change, yield, open_price, close_price, close_date, high_price, low_price, dividend, year_high, year_low, earnings, pe_ratio, trade_time, trade_date, ric_code, delayed
Label: label_id, label_description, label_key
Label_Display: label_text, label_id (FK) (IE), label_display_modifier (FK) (IE)

The diagram also shows the tables vgn_ur and User_Profile. The assignment of the following column groups to those tables is not recoverable from the diagram layout:
session_id, expire_date
id, country_id (FK), language_id, login, name, password, email, passwordcreated, canchangepassword, ch_lname, ch_fname, en_fname, en_lname, address1, address2, address3, city, province_id, postal_code, phone, mobile, pager, birthday, gender, email2, income, education, fax, maritial_status, occupation
user_id, ticker_symbol, sequence

SVCWP_IMPL_TAX_1202

Vignette Corporate Headquarters
1601 South MoPac Expressway
Austin, TX 78746-5776
512.741.4300 Tel
512.741.4500 Fax
888.608.9900 Toll-Free
Email info@vignette.com

Vignette Latin America
305.789.6603 Tel
305.789.6612 Fax

Vignette Europe / Middle-East / Africa
44.1628.77.2000 Tel
44.1628.77.2266 Fax

Vignette Asia-Pacific
1.800.800.848 Tel
61.2.9455.5200 Fax

Publication date: June 2002.

Vignette does not warrant, guarantee, or make representations concerning the contents of this document. All information is provided “AS-IS,” without express or implied warranties of any kind. Vignette reserves the right to change the contents of this document and the features or functionalities of its products at any time without obligation to notify anyone of such changes.

Copyright 1997-2001 Vignette Corporation. All rights reserved. Vignette, the V Logo, www.vignette.com, StoryServer, netCustomer, and Centerstage are trademarks or registered trademarks of Vignette Corporation in the United States and foreign countries. VGM, VPS, and Vignette Village are servicemarks of Vignette Corporation in the United States and foreign countries. All other brands, products and company names mentioned are the trademarks of their respective owners.