WEB DATABASE DESIGN STRATEGIES
When working with databases for the web you are in a unique position. You provide the data link between the normal
business world where databases have been traditionally accessed by desktop applications and the web, where you
are using HTML to create your own interfaces.
In the business world, many RDBMSs provide GUIs for accessing the data and for creating SQL statements, views and
forms, and routinely hide the complexity of the application from all but the most serious users. On the web, where
speed and efficiency are highly important, gone are all the warm fuzzy interfaces. If we want them, we usually have to
create them ourselves.
Frequently these 2 paradigms clash, and the clash becomes entirely our problem as web developers. You may need to go
deep into the business logic of the way a business database works in order to even determine if you can help a business.
Some jobs may include a database that is highly normalized in its offline (or online) design. These jobs would be the
more challenging. If you are designing the database from scratch (for a startup, for example) you can choose to
include as much normalization as you can stand!
For our purposes, we’ll imagine an online apparel store. We need to be able to accommodate clothing items in varying
sizes and colors.
NORMALIZATION ISSUES
Normalization is a 3 step process for making a database design more efficient. The efficiency is mainly in the area of
data storage, however, since in the process we may slow down data access and complicate the SQL queries for
selecting, updating and deleting data.
For practical purposes, normalization occurs in 3 forms. Each successive form is said to be further normalized. Most
databases in use are officially between the 2nd and 3rd normal forms. Very few can claim to be (or may even WANT to
be) fully in the 3rd normal form. Here is a layman’s view of the 3 forms:
First Normal Form: Each column can only contain one value.
For this to be true, we need to separate compound info, like an address field, into separate fields for street address,
city, state, etc. This is easy and always a good idea.
Second Normal Form: Every column in a table that is not a key must depend on the primary key (the whole key, not just part of it).
Here foreign keys are allowed, but each other piece of data should relate directly to the primary key. We somewhat
violate this in our example, since we have included the business logic that a shipping address (and name to be
shipped to) can be different than the person making the purchase. This allows a customer to order and ship a gift
direct. This also saves a correct address of shipment if the customer later changes their address info.
We also violate this in the area of Dprice in tblOrderDetail. The Dprice (item detail price) is the price applied to
the item at the time of purchase. If we relied entirely on the price as it currently stands in the database, we would
lose track of the price an item was selling for at the time of purchase. If the price changes while the order is still
in process, the order keeps the price stored with the detail line, not the current price of the item.
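The Dprice idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual schema: the dictionary and list stand in for the item table and tblOrderDetail, and every name other than Dprice (catalog, add_to_order, the price values) is hypothetical.

```python
# Hypothetical in-memory stand-ins for the item table and tblOrderDetail.
catalog = {"shirt": {"price": 20.00}}
order_details = []

def add_to_order(item_name, quantity):
    """Copy the current catalog price into the order line as Dprice."""
    line = {
        "item": item_name,
        "quantity": quantity,
        "Dprice": catalog[item_name]["price"],  # snapshot at purchase time
    }
    order_details.append(line)
    return line

line = add_to_order("shirt", 2)

# The catalog price changes after the order was placed...
catalog["shirt"]["price"] = 25.00

# ...but the order still totals at the price stored on the detail line.
order_total = line["Dprice"] * line["quantity"]
```

The duplication is deliberate: the order line records a historical fact, while the catalog price is free to change.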
Third Normal Form: Every non-key item is completely independent of every other non-key item.
To do this, we would need to split out attributes that can vary into separate tables. The most obvious for us
would be to split out Sizes and Colors into separate tables. There would then be a “link table” whose sole
purpose would be to link size, color and clothing items together. For instance, if we had an item called “shirt”, would
it come in red? If so, would it come in Small, Medium and Large? Each of these colors, sizes and clothing items could
be cross-referenced on a link table.
Price and quantity could also follow on this link table, since size may determine price, and quantity would need to be
by both size and color, so that we don’t think we have small red shirts, when we only have large!
There would also be a potentially greater number of records in the link table. If you go with 3 clothing items (shirt,
pants, coat), and say each comes in 3 sizes (small, medium, large) and 3 colors (red, blue, yellow), you would now be
entering 27 records. You potentially multiply the number of items times the number of colors times the number of
sizes (3*3*3=27), and your three items become quite a bit more data, and labor to input!
However, to accurately keep track of quantity and price differences between sizes, this is probably the best way to go.
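A link table along these lines could look something like the following sketch, which uses Python's built-in sqlite3 module so it can run anywhere. The table and column names (tblItem, tblSize, tblColor, tblItemLink, and so on) and the prices and quantities are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblItem  (ItemID  INTEGER PRIMARY KEY, Name TEXT UNIQUE);
    CREATE TABLE tblSize  (SizeID  INTEGER PRIMARY KEY, Name TEXT UNIQUE);
    CREATE TABLE tblColor (ColorID INTEGER PRIMARY KEY, Name TEXT UNIQUE);

    -- The link table: one row per item/size/color combination,
    -- carrying the price and quantity for that exact combination.
    CREATE TABLE tblItemLink (
        ItemID   INTEGER REFERENCES tblItem(ItemID),
        SizeID   INTEGER REFERENCES tblSize(SizeID),
        ColorID  INTEGER REFERENCES tblColor(ColorID),
        Price    REAL,
        Quantity INTEGER,
        PRIMARY KEY (ItemID, SizeID, ColorID)
    );
""")

for table, names in [
    ("tblItem", ["shirt", "pants", "coat"]),
    ("tblSize", ["small", "medium", "large"]),
    ("tblColor", ["red", "blue", "yellow"]),
]:
    conn.executemany(f"INSERT INTO {table} (Name) VALUES (?)",
                     [(name,) for name in names])

# Fill the link table with every combination, starting at zero stock.
conn.execute("""
    INSERT INTO tblItemLink (ItemID, SizeID, ColorID, Price, Quantity)
    SELECT i.ItemID, s.SizeID, c.ColorID, 20.0, 0
    FROM tblItem i CROSS JOIN tblSize s CROSS JOIN tblColor c
""")
link_rows = conn.execute("SELECT COUNT(*) FROM tblItemLink").fetchone()[0]  # 27

# Receive five large red shirts.
conn.execute("""
    UPDATE tblItemLink SET Quantity = 5
    WHERE ItemID  = (SELECT ItemID  FROM tblItem  WHERE Name = 'shirt')
      AND SizeID  = (SELECT SizeID  FROM tblSize  WHERE Name = 'large')
      AND ColorID = (SELECT ColorID FROM tblColor WHERE Name = 'red')
""")

def in_stock(item, size, color):
    """True if that exact item/size/color combination has stock."""
    row = conn.execute("""
        SELECT l.Quantity
        FROM tblItemLink l
        JOIN tblItem  i ON i.ItemID  = l.ItemID
        JOIN tblSize  s ON s.SizeID  = l.SizeID
        JOIN tblColor c ON c.ColorID = l.ColorID
        WHERE i.Name = ? AND s.Name = ? AND c.Name = ?
    """, (item, size, color)).fetchone()
    return row is not None and row[0] > 0
```

With this data, in_stock("shirt", "large", "red") is true while in_stock("shirt", "small", "red") is false, which is exactly the distinction the link table exists to make. Note how even a simple stock check now needs a three-way join.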
Normalization can be diametrically opposed to speed and to ease of use for developer and user alike. When a DB is
highly normalized (nearly complete 3rd form) the number of tables is high, and the number of columns per table is very
low. To be able to see the data in a relevant manner, views can be created to show the results of complicated table
joins that relate data from the greater number of tables. The number of link tables (tables that exist only to join 2
or more unique ID numbers from different tables) greatly exacerbates this problem. When a database is running natively
on a desktop RDBMS system, “views” and “forms” can be created at different levels to give users access to the data in a
relevant manner.
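As a rough illustration of what such a view does, here is a small sketch using Python's built-in sqlite3 module. The schema is deliberately cut down to items and sizes, and all names (tblItemSize, vwStock, the sample rows) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblItem (ItemID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE tblSize (SizeID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE tblItemSize (
        ItemID INTEGER, SizeID INTEGER, Quantity INTEGER,
        PRIMARY KEY (ItemID, SizeID)
    );
    INSERT INTO tblItem VALUES (1, 'shirt'), (2, 'coat');
    INSERT INTO tblSize VALUES (1, 'small'), (2, 'large');
    INSERT INTO tblItemSize VALUES (1, 1, 0), (1, 2, 5), (2, 2, 3);

    -- The view hides the joins behind a single readable "table".
    CREATE VIEW vwStock AS
        SELECT i.Name AS Item, s.Name AS Size, l.Quantity
        FROM tblItemSize l
        JOIN tblItem i ON i.ItemID = l.ItemID
        JOIN tblSize s ON s.SizeID = l.SizeID;
""")

# Callers query the view as if the data had never been split apart.
rows = conn.execute(
    "SELECT Item, Size, Quantity FROM vwStock ORDER BY Item, Size"
).fetchall()
```

The link table stores only ID numbers, but the view returns names a person can read; on the web, the page code usually has to do this translation itself.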
When we are developing for the web, many of these “views” and “forms” may have to be custom built by the developer.
The extra development time involved often pushes the design toward a less normalized DB that is more suitable for the web.
STORING DATA IN CODE, vs STORING DATA IN THE DB
As a web developer, you may be faced with the decision to store data in a database, or to store the data in the actual
application you develop.
For instance, imagine you had created the database as described above, with the link table, and the customer (the one
who pays you) says for instance, “oh, by the way, can you add a 5% surcharge on the customers who order from New
York?”.
This could either mean yet another table, and much more involved code, or it could be a simple “if/then” or “case
select” statement in your code. When presented with an issue like this, the questions that are relevant could be:
How many customers do you anticipate in New York?
Do you anticipate other surcharges? Any of different amounts?
Creating a new table, making all the requisite changes to the code (on possibly multiple pages) AND adding extra DB
hits you don’t need could all be avoided by that “case select” that is suddenly sounding like such a good idea!
If you choose to go that route, you have MANY things riding in your favor! First, your application can run through
hundreds of lines of code MUCH faster than hitting any database, any time. Because of that, the fewer hits to the DB
the better, and your application will be faster for it!
Second, you can place your “surcharge” in the page where it is easy to find and edit later. If you create a
variable at the top of the page, comment it clearly, and then create a “case select” so that more items (and multiple
surcharge amounts) can be added later, you maintain code flexibility and speed. What is sacrificed is the ability of
the customer to click into an administrative page and make the change at will. If the value is not going to be changing
frequently, it may well belong in the code, not in the DB.
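A minimal sketch of the surcharge-in-code approach, written in Python. Python has no “case select” statement, so a lookup table kept at the top of the file plays the same role. The “NY” state code, the table name and the function name are assumptions, and Decimal is used only to avoid rounding surprises with money.

```python
from decimal import Decimal

# Surcharge rates keyed by state code. Kept at the top of the module so they
# are easy to find and edit later. The "NY" key is an assumption.
STATE_SURCHARGES = {
    "NY": Decimal("0.05"),  # the 5% New York surcharge from the example
}

def apply_surcharge(subtotal, state):
    """Return the subtotal with any state surcharge added, to the cent."""
    rate = STATE_SURCHARGES.get(state.strip().upper(), Decimal("0"))
    total = subtotal * (1 + rate)
    return total.quantize(Decimal("0.01"))

ny_total = apply_surcharge(Decimal("100.00"), "NY")     # Decimal("105.00")
other_total = apply_surcharge(Decimal("100.00"), "CA")  # Decimal("100.00")
```

Adding a second surcharge later means adding one line to the table, with no new DB table and no extra DB hit.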
STORING ITEMS IN DELIMITED STRINGS
One way around the “link” table concept is to put multiple options/items in a field with a “separator” between the items.
This would apply to our “size & color” quandary from before. We could include “Sizes” and “Colors” as simple fields in
the “tblClothing” table.
This is very helpful in a number of ways. When an admin opens the “tblClothing” table, they could see the field “Sizes”
and see:
Small,Medium,Large
A very simple “table editor” can be created to view this table, so gone is all the work to add/edit/delete the now
unnecessary link table, etc. To edit, the admin person would simply add another size:
Small,Medium,Large,Extra Large
To be able to access this in the code, you would “split” the string into an array on the delimiter (separator) between
the items. In our case, this would be the comma.
Gone are the multiple entries of the link table, and the overhead for having it there in the first place! When the user
clicks on the item, you would split the values out on the separator, and run the array values into a set of radio buttons
or drop down box to allow the user to choose among valid choices for that particular piece of clothing.
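The split-and-display step might look something like this in Python. It is a sketch only: the function names and the “size” field name are made up for illustration, and the sample string is the “Sizes” value shown earlier.

```python
from html import escape

def parse_options(field_value, separator=","):
    """Split a delimited field into a clean list, dropping empty entries."""
    return [part.strip() for part in field_value.split(separator) if part.strip()]

def render_select(name, options):
    """Build an HTML drop-down box from the list of options."""
    lines = [f'<select name="{escape(name)}">']
    for option in options:
        value = escape(option)  # guard against stray markup in the field
        lines.append(f'  <option value="{value}">{value}</option>')
    lines.append("</select>")
    return "\n".join(lines)

# The value as it would come back from the "Sizes" field of tblClothing.
sizes = parse_options("Small,Medium,Large,Extra Large")
html = render_select("size", sizes)
```

Stripping whitespace and dropping empty entries makes the page tolerant of small typing slips by the admin, such as a trailing comma.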
There are limitations to this method, for all its speed and convenience. If quantity MUST be tracked, then this route
would not be an option, assuming quantity follows the particular size or color! We may be out of
red smalls, but we may have red mediums!
The other limitation is price breaks between sizes. This one may not be as big an issue. If all sizes (for example) are
the same price, except for “over sizes” (XXL, etc.), you could build a “surcharge” into the code to reflect that price.
This all depends on the frequency of change, number of items involved, etc.
Many vendors who sell items online may use this model, even with the limitations of “quantity”. How many times have
you placed an order and been told the item is on back order, or out of stock? If you routinely stock an item, you may
make a business decision to allow the user to order a size that is out of stock, anyway. If the size/color is permanently
removed, you know how easy that change is!
DATA DISCONNECT
If there is a “data disconnect” between your online data, and the business processing of data, you will find your hands
are much more free in the database design aspects. For instance, if your customer takes orders over the internet, but
decides to process the order manually, separate from the web database, or just enters the order into another system,
there is a “data disconnect”.
While this is not an optimal situation, you have options: you can try to get the disparate systems to talk (XML
anyone?), get the customer to switch systems, or build the entire DB online instead. Or you can just tell the customer
that if this becomes a big issue, it will be the problem you wish you had!
THE PROBLEM YOU WISH YOU HAD
This refers to the concept that if a customer has too many customers, they have the problem they wished they had.
Too busy. Too much money. Oh, what to do?? Perhaps the customer will want to pay YOU to do a redesign?
If the customer on the other hand is busy WITHOUT money, they may want to examine their business model. In this
case, they probably CAN’T afford you to help them redesign!
HOW BIG IS YOUR SHOVEL?
This is a corollary concept to the “Problem You Wish You Had”. It refers to the discussion of which technology/database
to suggest for a customer. There is a saying that goes, “You can have any 2 of the following, Good, Cheap, Fast. You
can never have more than 2!”
If a customer is a startup, you might not want to suggest running on Oracle with a dedicated server and high monthly
charges for an ecommerce plugin for a site that attracts few customers. It makes the most sense to design (if this is in
fact a startup) the site to be “scalable”, meaning it is designed to grow with the customer, but in general has few
overhead expenses, at the start. Once the paying customers roll on in, as your customer wants, then perhaps they are
having “The Problem You Wish You Had”, and need to scale up, due to making far too much money!!
SO WHICH TECHNOLOGY REALLY?
So you are fluent in more than one? The technology you use to build the site should be the one you are most fluent
in. If not, change horses, and become fluent in the one you decide to support! What good does it do to advise a
customer that you can’t help them? From the customer’s perspective, they will likely need ongoing support of some
kind, no matter which technology you build in, so don’t build it and move to South America… (unless of course you
have a web connection there…!)