WEB DATABASE DESIGN STRATEGIES

When working with databases for the web you are in a unique position. You provide the data link between the normal business world, where databases have traditionally been accessed by desktop applications, and the web, where you use HTML to create your own interfaces. In the business world, many RDBMSs provide GUIs for accessing the data and for creating SQL statements, views and forms, and routinely hide the complexity of the application from all but the most serious users. On the web, where speed and efficiency are highly important, all the warm fuzzy interfaces are gone. If we want them, we usually have to create them ourselves. Frequently these two paradigms clash, and the clash becomes entirely our problem as web developers.

You may need to go deep into the business logic of the way a business database works just to determine whether you can help a business. Some jobs may involve a database that is highly normalized in its offline (or online) design. These jobs tend to be the more challenging ones. If you are designing the database from scratch (for a startup, for example) you can choose to include as much normalization as you can stand! For our purposes, we’ll imagine an online apparel store. We need to be able to accommodate clothing items in varying sizes and colors.

NORMALIZATION ISSUES

Normalization is a 3 step process for arriving at a more efficient database design. The efficiency is in the area of data storage, however; in the process we may slow down data access and complicate the SQL queries for selecting, updating and deleting data. Normalization is usually described in 3 forms. Each successive form is said to be further normalized. Most databases in use are somewhere between the 2nd and 3rd normal forms. Very few can claim to be (or may even WANT to be) fully in the 3rd normal form. Here is a layman’s view of the 3 forms:

First Normal Form: Each column can only contain one value.
For this to be true, we need to separate compound info, like an address field, into separate fields for street address, city, state, etc. This is easy and always a good idea.

Second Normal Form: Every column in a table that is not a key can only apply to the primary key. Foreign keys are allowed here, but every other piece of data should relate directly to the primary key. We somewhat violate this in our example, since we have included the business logic that a shipping address (and the name to be shipped to) can be different from that of the person making the purchase. This allows a customer to order a gift and ship it direct. It also preserves the correct shipping address if the customer later changes their address info. We also violate this with Dprice in tblOrderDetail. The Dprice (item detail price) is the price applied to the item at the time of purchase. If we relied entirely on the price as it currently stands in the database, we would lose track of what an item that was on sale actually cost at the time of purchase. If the price changes while the order is still in process, the order keeps the price recorded in the detail row, not the current price of the item.

Third Normal Form: Every non-key item is completely independent of every other non-key item. To do this, we would need to split out attributes that can take varying values into separate tables. The most obvious for us would be to split out Sizes and Colors into separate tables. The tables would then have a “link table” whose sole purpose would be to link size, color and clothing items together. For instance, if we had an item called “shirt”, would it come in red? If so, would it come in Small, Medium and Large? Each of these colors, sizes and clothing items could be cross-referenced in a link table. Price and quantity could also live in this link table, since size may determine price, and quantity would need to be tracked by both size and color, so that we don’t think we have small red shirts when we only have large!
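As a rough illustration of the link table just described, here is a minimal sketch using Python's built-in sqlite3 module. Only tblClothing is named in the text; every other table name, column name, price and stock figure below is an assumption made up for the example, and a real store would likely need more columns.

```python
import sqlite3
from itertools import product

# All table and column names except tblClothing are hypothetical.
SCHEMA = """
CREATE TABLE tblClothing (ClothingID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE tblSize     (SizeID     INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE tblColor    (ColorID    INTEGER PRIMARY KEY, Name TEXT NOT NULL);
-- The link table: one row per (item, size, color) combination.
CREATE TABLE tblClothingLink (
    ClothingID INTEGER NOT NULL REFERENCES tblClothing(ClothingID),
    SizeID     INTEGER NOT NULL REFERENCES tblSize(SizeID),
    ColorID    INTEGER NOT NULL REFERENCES tblColor(ColorID),
    PriceCents INTEGER NOT NULL,
    Quantity   INTEGER NOT NULL,
    PRIMARY KEY (ClothingID, SizeID, ColorID)
);
"""

# Sample data, invented for the sketch.
ITEMS = ["shirt", "pants", "coat"]
SIZES = ["Small", "Medium", "Large"]
COLORS = ["red", "blue", "yellow"]


def build_demo_db():
    """Create an in-memory database with every combination at zero stock."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    for table, names in (("tblClothing", ITEMS), ("tblSize", SIZES), ("tblColor", COLORS)):
        # Explicit IDs 1..3 so the link rows below can refer to them directly.
        conn.executemany(
            f"INSERT INTO {table} VALUES (?, ?)",
            list(enumerate(names, start=1)),
        )
    ids = (1, 2, 3)
    conn.executemany(
        "INSERT INTO tblClothingLink VALUES (?, ?, ?, ?, ?)",
        [(c, s, k, 1999, 0) for c, s, k in product(ids, ids, ids)],
    )
    # Sample stock: the only thing on hand is large red shirts.
    conn.execute(
        "UPDATE tblClothingLink SET Quantity = 5 "
        "WHERE ClothingID = 1 AND SizeID = 3 AND ColorID = 1"
    )
    return conn


def in_stock(conn, item, size, color):
    """Return True if the exact item/size/color combination has stock."""
    row = conn.execute(
        """
        SELECT l.Quantity
        FROM tblClothingLink AS l
        JOIN tblClothing AS c ON c.ClothingID = l.ClothingID
        JOIN tblSize     AS s ON s.SizeID     = l.SizeID
        JOIN tblColor    AS k ON k.ColorID    = l.ColorID
        WHERE c.Name = ? AND s.Name = ? AND k.Name = ?
        """,
        (item, size, color),
    ).fetchone()
    return row is not None and row[0] > 0
```

With this sample data, in_stock(conn, "shirt", "Large", "red") is true while in_stock(conn, "shirt", "Small", "red") is false, which is exactly the distinction the link table exists to capture. Note the three joins needed just to answer one stock question; that is the cost side of the trade.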
The link table would also hold a potentially much greater number of records. If you go with 3 clothing items (shirt, pants, coat), and say each comes in 3 sizes (small, medium, large) and 3 colors (red, blue, yellow), you would now be entering 27 items. You potentially multiply the number of items times the number of colors times the number of sizes (3*3*3=27), and your three items become quite a bit more data, and labor to input! However, to accurately keep track of quantity and price differences between sizes, this is probably the best way to go.

Normalization can be diametrically opposed to speed and to ease of use for developer and user alike. When a DB is highly normalized (nearly complete 3rd form) the number of tables is high, and the number of fields per table is very low. To see the data in a relevant manner, views can be created that present the results of complicated table joins across the many tables. The number of link tables (tables that exist only to join 2 or more unique ID numbers from different tables) greatly exacerbates this problem. When a database is running natively on a desktop RDBMS, “views” and “forms” can be created at different levels to give users access to the data in a relevant manner. When we are developing for the web, many of these “views” and “forms” have to be custom built by the developer. The extra development time involved often pushes the design toward a less normalized DB that is more suitable for the web.

STORING DATA IN CODE vs. STORING DATA IN THE DB

As a web developer, you may be faced with the decision to store data in a database, or to store the data in the actual application you develop. For instance, imagine you had created the database as described above, with the link table, and the customer (the one who pays you) says, “oh, by the way, can you add a 5% surcharge on the customers who order from New York?”
This could mean either yet another table and much more involved code, or a simple “if/then” or “case select” statement in your code. When presented with an issue like this, the relevant questions could be: How many customers do you anticipate in New York? Do you anticipate other surcharges? Any of different amounts? Creating a new table, making all the requisite changes to the code (on possibly multiple pages) AND adding extra DB hits you don’t need could all be avoided by that “case select” that is suddenly sounding like such a good idea!

If you choose to go that route, you have MANY things riding in your favor! First, your application can usually run through hundreds of lines of code MUCH faster than it can make a single trip to the database. Because of that, the fewer hits to the DB the better, and your application will be faster for it! Second, you can position your “surcharge” in the page where it is easy to get to and edit later. If you create a variable at the top of the page, comment it clearly, and then create a “case select” so that more items (and multiple surcharge prices) can be added later, you maintain code flexibility and speed. What is sacrificed is the ability of the customer to click into an administrative page and make the change at will. If the item is not going to change frequently, maybe it belongs in the code, not in the DB.

STORING ITEMS IN DELIMITED STRINGS

One way around the “link” table concept is to put multiple options/items in a single field with a “separator” between the items. This would apply to our “size & color” quandary from before. We could include “Sizes” and “Colors” as simple fields in the “tblClothing” table. This is very helpful in a number of ways. When an admin opens the “tblClothing” table, they would see in the “Sizes” field:

Small,Medium,Large

A very simple “table editor” can be created to view this table, so gone is all the work to add/edit/delete the now unnecessary link table, etc.
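Both shortcuts described so far, a surcharge kept in code and a comma-separated options field, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the "NY" state code and the use of Decimal for money are all invented for the example, and a dictionary lookup stands in for the “case select” mentioned above.

```python
from decimal import Decimal

# Surcharge rates kept at the top of the module so they are easy to find
# and edit later. The "NY" key is an assumed state code; add more entries
# here if other surcharges ever appear.
STATE_SURCHARGES = {
    "NY": Decimal("0.05"),  # 5% surcharge for orders from New York
}


def apply_surcharge(subtotal, state):
    """Return the subtotal plus any surcharge configured for the state."""
    rate = STATE_SURCHARGES.get(state.strip().upper(), Decimal("0"))
    # Round to whole cents after applying the rate.
    return (subtotal * (1 + rate)).quantize(Decimal("0.01"))


def split_options(field, separator=","):
    """Split a delimited field such as 'Small,Medium,Large' into a list.

    Surrounding whitespace is trimmed and empty entries are dropped, so a
    stray trailing comma typed by an admin does not produce a blank choice.
    """
    return [part.strip() for part in field.split(separator) if part.strip()]
```

The list returned by split_options is what would feed a set of radio buttons or a drop down box, and no database table had to be added for either feature. The trade-off is the one already noted: changing a surcharge now means editing code rather than clicking through an admin page.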
To edit, the admin person would simply add another size:

Small,Medium,Large,Extra Large

To access this in the code, you would “split” the string into an array on the delimiter (separator) between the items. In our case, this would be the comma. Gone are the multiple entries of the link table, and the overhead of having it there in the first place! When the user clicks on the item, you would split the values out on the separator, and run the array values into a set of radio buttons or a drop down box to allow the user to choose among the valid choices for that particular piece of clothing.

There are limitations to this method, for all its speed and convenience. If quantity MUST be tracked, then this route is not an option. This is assuming quantity follows the particular size or color! We may be out of red smalls, but we may have red mediums! The other limitation is price breaks between sizes. This one may not be as big an issue. If all sizes (for example) cost the same, except for “over sizes” (XXL, etc.), you could build a “surcharge” into the code to reflect that price. This all depends on the frequency of change, the number of items involved, etc.

Many vendors who sell items online may use this model, even with the limitations on “quantity”. How many times have you placed an order and been told the item is on back order, or out of stock? If you routinely stock an item, you may make a business decision to allow the user to order a size that is out of stock anyway. If the size/color is permanently removed, you know how easy that change is!

DATA DISCONNECT

If there is a “data disconnect” between your online data and the business processing of that data, you will find your hands are much freer in the database design aspects.
For instance, if your customer takes orders over the internet, but decides to process the orders manually, separate from the web database, or just enters each order into another system, there is a “data disconnect”. While this is not an optimal situation, you have options. You can try to get the disparate systems to talk (XML, anyone?), get the customer to switch systems, build the entire DB online instead, or just tell the customer that if this becomes a big issue, it will be the problem you wish you had!

THE PROBLEM YOU WISH YOU HAD

This refers to the concept that if a customer has too many customers, they have the problem they wished they had. Too busy. Too much money. Oh, what to do?? Perhaps the customer will want to pay YOU to do a redesign? If the customer on the other hand is busy WITHOUT money, they may want to examine their business model. In this case, they probably CAN’T afford to have you help them redesign!

HOW BIG IS YOUR SHOVEL?

This is a corollary to the “Problem You Wish You Had”. It refers to the discussion of which technology/database to suggest for a customer. There is a saying that goes, “You can have any 2 of the following: Good, Cheap, Fast. You can never have more than 2!” If a customer is a startup, you might not want to suggest running on Oracle with a dedicated server and high monthly charges for an ecommerce plugin, for a site that attracts few customers. If this is in fact a startup, it makes the most sense to design the site to be “scalable”, meaning it is designed to grow with the customer, but in general has few overhead expenses at the start. Once the paying customers roll on in, as your customer wants, then perhaps they are having “The Problem You Wish You Had”, and need to scale up, due to making far too much money!!

SO WHICH TECHNOLOGY REALLY?

So you are fluent in more than one? The technology you use to build the site should be the one you are most fluent in.
If not, change horses, and become fluent in the one you decide to support! What good does it do to advise a customer that you can’t help them? From the customer’s perspective, they will likely need ongoing support of some kind, no matter which technology you build in, so don’t build it and move to South America… (unless of course you have a web connection there…!)